Research Collection Doctoral Thesis Quality aspects of multimodal communication user perception and acceptance thresholds Author(s): Zuberbühler, Hans-Jörg Publication Date: 2003 Permanent Link: https://doi.org/10.3929/ethz-a-004583162 Rights / License: In Copyright - Non-Commercial Use Permitted This page was generated automatically upon download from the ETH Zurich Research Collection . For more information please consult the Terms of use . ETH Library


Research Collection

Doctoral Thesis

Quality aspects of multimodal communication: user perception and acceptance thresholds

Author(s): Zuberbühler, Hans-Jörg

Publication Date: 2003

Permanent Link: https://doi.org/10.3929/ethz-a-004583162

Rights / License: In Copyright - Non-Commercial Use Permitted

This page was generated automatically upon download from the ETH Zurich Research Collection. For more information please consult the Terms of use.

ETH Library


DISS. ETH NO. 15124

QUALITY ASPECTS OF MULTIMODAL COMMUNICATION: USER PERCEPTION AND ACCEPTANCE THRESHOLDS

A dissertation submitted to the

SWISS FEDERAL INSTITUTE OF TECHNOLOGY ZURICH

for the degree of

Doctor of Natural Sciences

presented by

HANS-JÖRG ZUBERBÜHLER

Dipl. Umwelt-Natw. ETH

born 11.02.1968

citizen of Urnäsch (AR)

accepted on the recommendation of

Prof. Dr. Dr. Helmut Krueger, examiner

Prof. Dr. Albert Kündig, co-examiner

Dr. Sissel Guttormsen Schär, co-examiner

2003


Acknowledgement

This thesis would not exist without the support and cooperation of a number of people, whom I would like to thank:

First and foremost, Prof. Dr. Dr. Helmut Krueger, my promoter, for his keen observation and his valuable advice. He provided an excellent research environment for the achievement of this thesis.

Furthermore, Prof. Dr. Albert Kündig for the hours we spent discussing, and for funding the QED-project, within whose framework I wrote my thesis.

Great thanks also to Dr. Sissel Guttormsen Schär, who introduced me to the world of scientific research, and to my other colleagues in the research group man-machine interaction, who have contributed in one way or another to making my time at the ETH one that I will always look back on with great pleasure: Marc Arial, Morten Fjeld, Christine Hitzke, Pamela Ravasio, Sam Schluep, and Phillipe Zimmermann.

Thanks also go to the QED-team members Alexander Braun and Patrik Estermann for the work they have done to implement the videoconference setup and to run experiments.

A special thanks to Kent Riopelle, who proofread my thesis and provided valuable feedback to improve its comprehensibility.

Finally, I would like to thank my parents and friends who supported and encouraged me. Most of all, I thank my partner Ruth for her continuous and loving support.

Zürich, August 2003 Hans-Jörg Zuberbühler


Table of Contents

Abstract
Zusammenfassung
1 Transfer to Practice
  1.1 Regarding Human-Computer Interaction (HCI)
  1.2 Regarding Human-Human Interaction (HHI)
2 Introduction
  2.1 Background and Aims
  2.2 Scope of Investigation
    2.2.1 Delay as Quality of Service (QoS) Parameter
    2.2.2 Published Results for Perception and Acceptance of Delay
    2.2.3 A Psychophysical Approach
  2.3 Structure of the Thesis
3 Theory
  3.1 A Taxonomy of Communication
    3.1.1 Social context
    3.1.2 Orientation
    3.1.3 Coding
    3.1.4 Modality
    3.1.5 Timing
    3.1.6 Exemplification of the interpersonal communication model
  3.2 Processing Time of Auditory and Visual Stimuli
    3.2.1 Indirect: Reaction Time Differences
    3.2.2 Direct: Event-Related Potentials (ERPs)
  3.3 Mental Representation of Time
    3.3.1 Low Frequency Processing
    3.3.2 High Frequency Processing
  3.4 Neural and Cognitive Models of Time Perception
    3.4.1 Labelled Lines
    3.4.2 Population Clocks (Neural Networks)
    3.4.3 Pacemaker-Switch-Accumulator Models
  3.5 Psychophysical Theory for Measuring Thresholds
    3.5.1 Testing paradigms
    3.5.2 Specification of the Psychometric function ψ = f(φ)
    3.5.3 Adaptive Psychophysical Procedures
4 Experiments
  4.1 In Human-Computer Interaction (HCI) Mode
    4.1.1 Experimental Setup
    4.1.2 Procedure
    4.1.3 HCI-Results
  4.2 In Human-Human Interaction (HHI) Mode
    4.2.1 Experimental Setup
    4.2.2 Procedure
    4.2.3 HHI-Results
5 Discussion and Conclusions
  5.1 Regarding Relative Delays
    5.1.1 In Human-Computer Interaction (HCI)
  5.2 Regarding Absolute Delays
    5.2.1 In Human-Computer Interaction (HCI)
    5.2.2 In Human-Human Interaction (HHI)
  5.3 Further Research
    5.3.1 Relative Delay
    5.3.2 Absolute Delay
Annex
  Developed Software: The best-PEST Calculator
    Description
    Monte-Carlo-Simulations
References
Glossary
Index


Abstract

Recent trends in telecommunication networks indicate a shift away from circuit-switched networks towards packet-switched networks. This new networking environment will present end users with new characteristics such as variations in transmission delays and bit rates, as well as potential loss of data packets. These characteristics represent a challenge in the design and use of packet-switched networks, since they may lead to user impairments, depending on the kind of source coding and compression used in the end-systems.

It is generally agreed that very little is known about user expectations, perceptive mechanisms and user behaviour in this new situation. As a consequence, it is presently difficult to base network engineering on proper traffic forecasts and real user requirements. This lack of knowledge is the driving force behind our work, which aims to examine user perception and acceptance of the Quality of Service (QoS) parameters absolute and relative delays (also referred to as roundtrip delay and synchronisation error).

In this thesis we investigated the perception and the acceptance thresholds for particular delay parameters using psychophysical methods, i.e. thresholds are obtained by means of empirical determinations applying either 2-alternative forced-choice or yes-no paradigms, and using the adaptive psychophysical procedure called best-PEST. The experiments are conducted in the interaction modes Human-Computer-Interaction (HCI) and Human-Human-Interaction (HHI), which evoke different delay perceptions. HCI delay thresholds are obtained using an experimental set-up that includes stimulus presentation, best-PEST algorithm, and data acquisition. It is implemented using the object-oriented scripting language Lingo. The experiments conducted in the HHI mode comprise threshold determinations in which the experimental subjects interact with each other over a videoconference that uses an ATM-network infrastructure. The experimental set-up consists of two or three videoconference stations connected via fibre passing through a system called ARES, which emulates the behaviour of ATM channels in real-time with the possibility to emulate performance degradations, such as delay or errors.

In the HCI mode the following thresholds are determined:

• Relative delay between auditory stimuli preceding visual stimuli (AV).
• Relative delay between visual stimuli preceding auditory stimuli (VA).
• Absolute delay between voice input and visual computer-generated response (VoiVis).
• Absolute delay between mouse input and visual computer-generated response (MouVis).

In the HHI mode the following thresholds are determined:

• Absolute delay in basic auditory interaction between two subjects (AudBas).
• Absolute delay in basic visual interaction between two subjects (VisBas).
• Absolute delay in realistic audio-visual interaction between three subjects (AudVisReal).
• Absolute delay in realistic auditory interaction between three subjects (AudReal).

The thresholds for relative delays are 71 (±17) ms for the AV condition, and 105 (±25) ms for the VA condition. The thresholds for absolute delay in HCI are 115 (±23) ms for the VoiVis condition, and 78 (±14) ms for the MouVis condition. In HHI the thresholds for absolute delays are 216 (±44) ms for the AudBas condition, and 237 (±92) ms for the VisBas condition. When subjects accomplish a realistic task, the perception threshold is 1220 ms and the acceptance threshold is 2080 ms in the AudVisReal condition. In the AudReal condition the perception threshold is 970 (±310) ms, and the acceptance threshold is 1760 (±410) ms. Age and gender of the experimental subjects have no significant effect (p>0.05) on these results.

To obtain psychometric functions, the experimental data of each condition are fitted using a logistic model. The benefit of such functions is that network planners, as well as content and service providers, are given a means to estimate which percentage of users is expected to detect and to reject a specific delay. This 'political' question is influenced by economic considerations, namely which price/performance ratio is intended to be offered to the user.

Furthermore, the relative delay thresholds are discussed in the light of neural processing times for different modalities, and the absolute delay threshold is discussed regarding the task dependency represented by different degrees of interactivity.


Zusammenfassung

Telecommunication networks are being migrated from circuit-switched to packet-switched networks. As a result, users are confronted with changed network properties, such as variable throughput and transmission delay, but also with loss of data packets. These new properties pose a challenge for the design and use of packet-switched networks, since they can impair the familiar communication process.

So far, little well-founded knowledge exists in this field, neither about how users perceive this new situation nor about how they behave overall. This complicates the design and dimensioning of telecommunication networks, since well-founded assumptions about user needs and reliable forecasts of network load are not available. This knowledge gap is the driving force behind the present work, which investigates the perception and acceptance of the two quality-of-service parameters absolute and relative delay.

The perception and acceptance thresholds of the individual delay parameters are determined in empirical experiments using psychophysical methods. Either the 2-alternative forced-choice or the yes-no paradigm is applied, together with the adaptive psychophysical procedure best-PEST. The experiments are divided into the two interaction modes Human-Computer-Interaction (HCI) and Human-Human-Interaction (HHI), which evoke different delay perceptions. The HCI delay thresholds are determined with an experimental set-up that combines stimulus presentation, best-PEST algorithm and data acquisition, and that is programmed in the object-oriented scripting language Lingo. Experiments in the HHI mode, in contrast, are carried out with a videoconferencing application that runs over an emulated ATM network. This set-up consists of two or three videoconference stations connected via optical fibre to the so-called ARES system. (The ARES system emulates the real-time behaviour of ATM channels and makes it possible to simulate specific performance degradations in terms of delay and error behaviour.)

In the HCI mode the following thresholds are determined:

• Relative delay between auditory stimuli preceding visual stimuli (AV).
• Relative delay between visual stimuli preceding auditory stimuli (VA).
• Absolute delay between voice input and visual computer-generated response (VoiVis).
• Absolute delay between mouse input and visual computer-generated response (MouVis).

In the HHI mode the following thresholds are determined:

• Absolute delay in basic auditory interaction between two subjects (AudBas).
• Absolute delay in basic visual interaction between two subjects (VisBas).
• Absolute delay in realistic audio-visual interaction between three subjects (AudVisReal).
• Absolute delay in realistic auditory interaction between three subjects (AudReal).

The thresholds for relative delays are 71 (±17) ms in the AV condition and 105 (±25) ms in the VA condition. The thresholds for absolute delay in HCI are 115 (±23) ms in the VoiVis condition and 78 (±14) ms in the MouVis condition. In HHI the thresholds for absolute delay are 216 (±44) ms in the AudBas condition and 237 (±92) ms in the VisBas condition. When the subjects have to reproduce realistic conversation situations, their perception threshold is 1220 ms and their acceptance threshold is 2080 ms (AudVisReal condition). In the AudReal condition these values are 970 (±330) ms for perception and 1760 (±410) ms for acceptance. Neither the age nor the gender of the subjects has a significant influence (p>0.05) on the thresholds.

In order to obtain psychometric functions from the experimental data, logistic curves are fitted for all conditions. The benefit of these functions is that network planners, as well as content and service providers, can estimate what share of users will notice and/or reject particular delay values. This 'political' question is largely influenced by economic considerations, namely which price/performance ratio is to be offered to customers.

Furthermore, the relative delay thresholds are discussed in the light of neural processing speed for different modalities, and the absolute delay thresholds are discussed with regard to their dependence on the tasks to be performed, which are characterised by different degrees of interactivity.


1 Transfer to Practice

This chapter compiles the results of the thesis that are directly transferable to fields of practice. At first a brief summary of the background and then the motivation for the thesis is presented. Subsequently, qualitative results are discussed, and finally the quantitative results are listed in several tables, each giving user percentages for particular delay types in the two interaction modes, Human-Computer-Interaction (HCI) and Human-Human-Interaction (HHI).

In recent times, the underlying technology of public network infrastructures has undergone a radical change from circuit-switched to packet-switched technology. The original reason for this change is that with new packet-switched technologies, for instance IP[a], the infrastructure can be operated at better capacity utilisation, since several data streams can be multiplexed. This is in contrast to traditional circuit-switched technologies, with ISDN serving as the most advanced example in terms of services, where certain circuits are reserved for the respective services. Another reason for the change is that packet-switching better matches the characteristics of computer-generated data. This is crucial considering the spread of computers acting as end-systems. Regarding the characteristics of these two technologies, it appears that packet-switching results, on the one hand, in higher network efficiency but, on the other hand, causes lower network predictability in terms of the Quality of Service (QoS): variations in transmission delays and bit rates, as well as potential loss of data packets, are more likely to occur.

[a] All acronyms and abbreviations are explained in the glossary beginning on page 121.

These new characteristics may lead to user impairments, depending on the application used. For instance, real-time applications like voice-over-IP (VoIP) or videoconferences are considered most critical in terms of delay. In this context, the question is at which threshold value a delay becomes perceivable, and at which threshold value it becomes perturbing. In this thesis we decided to investigate these two delay thresholds by means of a psychophysical approach. As a quantifiable result of the thesis, so-called psychometric functions are obtained, which describe the user detectability and acceptance of different delay values. These functions are listed at the end of this chapter. Before that, we take a look at the results of the thesis that are of a rather qualitative nature.

The experiments showed that perception and consequently acceptance of delays are very much task-dependent. Therefore it is probably not helpful to recommend universal threshold values; rather, they should be suggested for different task categories. It seems that the degree of interactivity acts as the most delay-sensitive property of any communication scenario. For the time being, the choice of offered delay values should remain a business strategy of the service provider. In order to base this strategy on a reliable foundation, it will be helpful to classify the abundance of relevant task categories, and to assess the proper delay thresholds for these categories separately. With such knowledge it will be possible to adjust the delay values according to the measured degree of interactivity. Given knowledge of the associated psychometric function, the delay can be set according to a predefined (or negotiated) percentage of users perceiving or accepting this particular delay.
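As a rough illustration of setting a delay from a target user percentage, the following minimal sketch assumes a two-parameter logistic psychometric function. The parameterisation, the function names, and the way the two parameters are derived from the 50 % and 75 % values of the AV condition (77 ms and 101 ms, taken from Table 1 later in this chapter) are illustrative assumptions and not the fitting procedure used in the thesis.

```python
import math

# Assumed two-parameter logistic psychometric function (illustrative only):
#   psi(phi) = 1 / (1 + exp(-(phi - alpha) / beta))
# alpha: delay detected by 50 % of users [ms]; beta: spread of the curve [ms].


def logistic_params(phi_50_ms, phi_75_ms):
    """Derive (alpha, beta) from the delays detected by 50 % and 75 % of users."""
    alpha = phi_50_ms
    # logit(0.75) = ln(0.75 / 0.25) = ln 3
    beta = (phi_75_ms - phi_50_ms) / math.log(3.0)
    return alpha, beta


def share_detecting(phi_ms, alpha, beta):
    """Fraction of users expected to detect a delay of phi_ms."""
    return 1.0 / (1.0 + math.exp(-(phi_ms - alpha) / beta))


def delay_for_share(psi, alpha, beta):
    """Delay [ms] that a fraction psi (0 < psi < 1) of users is expected to detect."""
    return alpha + beta * math.log(psi / (1.0 - psi))


# Hypothetical usage with the AV values from Table 1 (50 %: 77 ms, 75 %: 101 ms).
ALPHA_AV, BETA_AV = logistic_params(77.0, 101.0)
budget_10_percent_ms = delay_for_share(0.10, ALPHA_AV, BETA_AV)
```

Under these assumptions, the delay computed for a 10 % target is about 29 ms, which agrees with the 10 % row of the AV column in Table 1. A service provider could negotiate the target percentage and read off the corresponding delay in the same way.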

Additionally, the experiments of this thesis showed that delay perception and acceptance are influenced not only by the degree of interactivity but also, to a great extent, by the number of communication channels the application offers: it seems that the visual channel in an audio-visual application acts as a distractor, i.e. the focus of attention is divided between the audio and the visual channel. Thus, the gain in 'media richness' in audio-visual communication has to be paid for by a loss of focussed perception. Furthermore, the number of participants in the communication event turned out to be another distractor: it seems that the focus of attention is divided among all communication members. With an increasing number of participants, this results in a decrease of attentional resources for the detection of delay.

In summary, it appears that the following three factors are responsible for the users' delay requirements. All of them result in high delay tolerance, since they evoke poor delay perception:

• Low degree of interactivity.
• Increasing number of participants.
• Transition from mono- to multimodal communication.


Suboptimal communication support, e.g. missing gaze awareness in the videoconference technology, could be mentioned as a fourth factor that leads to higher delay tolerance. Without gaze awareness the communication members are not sure when they are addressed, unless they are explicitly addressed verbally. This again lowers the degree of interactivity, and might be the reason why the results of the conducted experiments suggest acceptable delay values for realistic audio-visual tasks that are well above the values suggested elsewhere for audio-only tasks. Nevertheless, we have to bear two things in mind:

• Users still have little practice with multiperson, multimodal telecommunication services. With the upcoming use of such services, users will most probably improve their delay perception skills (i.e. they will avail themselves of free attentional resources that are no longer needed to cope with the new technology).
• The psychophysical methods used in the experiments do not allow conclusions about long-term effects regarding what Wilson calls user costs (Wilson et al., 2000). Although users may not perceive a particular delay as disturbing, it may subconsciously increase mental strain. A technology which evokes such effects contradicts the user-centred paradigm.

The remainder of this chapter presents the quantitative results of the conducted experiments. They provide insight into the users' perception performance for different delay types in different modalities. For the reasons mentioned above, the results of the realistic tasks (Table 4) should not be applied to situations where only two partners communicate.

The results are divided into Human-Computer-Interaction (HCI) and Human-Human-Interaction (HHI). These are two fundamental interaction modes that result in different delay perception. A further distinction concerns the types of delay: Relative delay is perceived between particular modalities, e.g. between the auditory and the visual channel. This delay is sometimes called intermedia synchronisation, synchronisation error, or lip-synchronisation. The other type is called absolute delay. It is perceived only in dialogue settings, between sending information and receiving an answer from the dialogue partner. This delay is sometimes called roundtrip delay or return trip delay.

The benefit of the following tables is that network planners and content providers are given a means to estimate which percentage of users is expected to detect and to reject a specific delay. This 'political' question is influenced by economic considerations, namely which price/performance ratio is intended to be offered to the user.

1.1 Regarding Human-Computer Interaction (HCI)

In HCI, we ran experiments determining the relative and the absolute delays. Results concerning relative delays are available for situations where audio precedes the associated visual stimulus (condition AV), as well as for the opposite stimulus order (condition VA) (see Table 1). The results are considered suitable for the most stringent requirements, i.e. for tasks facilitating the perception of asynchrony.

Further results concern the detection of absolute delays in situations where users experience a delay between their vocal input and a computer-generated visual response (condition VoiVis), or between their mouse input and a computer-generated visual response (condition MouVis) (see Table 2). These results are considered suitable for applications that, for example, enable voice recognition, or that are driven by mouse pointer or joystick inputs (e.g. database queries, browsing the WWW, or image processing software). Additionally, the absolute delay thresholds in HCI can also be used to analyse the HHI thresholds described later, since they represent a component inherent to all network-mediated HHI.


Table 1 Relative delay values perceived by particular percentages of users.

Reading example: It can be expected that not more than 25 % of users will detect an AV-delay of 53 ms, and a VA-delay of 74 ms, respectively.

Columns: ψ [%] is the percentage of users detecting asynchrony in HCI; AV φ [ms] is the extent of asynchrony when auditory precedes visual; VA φ [ms] is the extent of asynchrony when visual precedes auditory.

ψ [%]    AV φ [ms]    VA φ [ms]
  5         12           34
 10         29           50
 25         53           74
 33         61           67
 50         77           98
 67         92          113
 75        101          122
 90        125          146
 95        141          162

Table 2 Absolute delay values perceived by particular percentages of users.

Reading example: It can be expected that up to 75 % of users will detect an absolute delay of 146 ms when interacting by voice. And up to 75 % of users will detect an absolute delay of 96 ms when interacting by mouse clicks.

Columns: ψ [%] is the percentage of users detecting absolute delays in HCI; VoiVis φ [ms] is the absolute delay in the VoiVis interaction mode; MouVis φ [ms] is the absolute delay in the MouVis interaction mode.

ψ [%]    VoiVis φ [ms]    MouVis φ [ms]
 25          50               33
 33          67               45
 50          98               65
 67         129               85
 75         146               96
 90         195              128
 95         228              149


1.2 Regarding Human-Human Interaction (HHI)

In the HHI mode we ran experiments determining absolute delay thresholds. Results are available for delay perception in basic auditory and visual interaction (conditions AudBas and VisBas) (see Table 3). Since these experiments evoked a maximal degree of interactivity, the results are considered to represent the minimal delay users can perceive when interacting with each other. Further HHI experiments concern perception and acceptance of absolute delays when users execute realistic tasks (conditions AudVisReal and AudReal) (see Table 4). Note that these results hold only for the chosen task (free discussion about a familiar topic) involving three participants. Since other tasks might evoke different degrees of interactivity, they are assumed to allow for different delay values.

Table 3 Absolute delay values perceived by particular percentages of users.

Reading example: It can be expected that up to 75 % of users will detect an absolute delay of 228 ms in auditory HHI. And up to 75 % of users will detect an absolute delay of 239 ms in visual HHI.

Columns: ψ [%] is the percentage of users detecting absolute delays in HHI; AudBas φ [ms] is the absolute delay in basic auditory interaction; VisBas φ [ms] is the absolute delay in basic visual interaction.

ψ [%]    AudBas φ [ms]    VisBas φ [ms]
  5         109              109
 10         131              133
 25         164              169
 33         175              181
 50         196              204
 67         217              227
 75         228              239
 90         261              275
 95         283              299


Table 4 Absolute delay values perceived by particular percentages of users.

Reading example: It can be expected that not more than 33 % of users will detect an absolute delay of 734 ms when interacting audio-visually, and 535 ms when interacting solely in the auditory mode. And not more than 33 % of users will find that an absolute delay of 1610 ms is disturbing when interacting audio-visually, or 1430 ms when interacting in the auditory mode. Note that these values hold only for the chosen task.

Columns: ψ [%] is the percentage of users detecting or accepting absolute delays in HHI. All delay values φ are in ms. AudVisReal is the realistic audio-visual task; AudReal is the realistic audio-only task.

         Perception of absolute delay     Acceptance of absolute delay
ψ [%]    AudVisReal      AudReal          AudVisReal      AudReal
  5        n.a.            n.a.             n.a.            617
 10        n.a.            n.a.              629            889
 25         466             386             1350           1290
 33         734             535             1610           1430
 50        1220             800             2080           1690
 67        1710            1070             2550           1940
 75        1970            1220             2810           2090
 90        2730            1640             3530           2480
 95        3240            1920             4030           2750
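For delays that fall between the tabulated rows, one way to use Table 4 is to interpolate between neighbouring entries. The sketch below does this linearly for the AudVisReal columns. Linear interpolation, the clamping at the ends of the tabulated range, and all names are assumptions made for illustration; the thesis itself derives the values from fitted psychometric functions.

```python
from bisect import bisect_right

# Rows of Table 4 for which AudVisReal values are available (25 % to 95 %).
PERCENT = [25, 33, 50, 67, 75, 90, 95]
PERCEPTION_AUDVISREAL_MS = [466, 734, 1220, 1710, 1970, 2730, 3240]
ACCEPTANCE_AUDVISREAL_MS = [1350, 1610, 2080, 2550, 2810, 3530, 4030]


def share_at_delay(delay_ms, delays_ms, percents):
    """Estimate the user percentage for delay_ms by linear interpolation.

    Outside the tabulated range the nearest tabulated percentage is returned,
    so a result of 25.0 should be read as "at most 25 %" and 95.0 as
    "at least 95 %".
    """
    if delay_ms <= delays_ms[0]:
        return float(percents[0])
    if delay_ms >= delays_ms[-1]:
        return float(percents[-1])
    i = bisect_right(delays_ms, delay_ms)  # index of first row above delay_ms
    x0, x1 = delays_ms[i - 1], delays_ms[i]
    y0, y1 = percents[i - 1], percents[i]
    return y0 + (y1 - y0) * (delay_ms - x0) / (x1 - x0)


# Hypothetical question: how is a 1500 ms absolute delay received in the
# realistic audio-visual task?
perceiving = share_at_delay(1500, PERCEPTION_AUDVISREAL_MS, PERCENT)
disturbed = share_at_delay(1500, ACCEPTANCE_AUDVISREAL_MS, PERCENT)
```

Comparing the two estimates for the same delay shows the gap between users who notice a delay and users who find it disturbing. As stated above, the values apply only to the chosen task with three participants.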


2 Introduction

In this chapter we set out the reasons that motivated us to investigate quality issues in multimodal real-time communication. To begin, we briefly describe the state of the art in telecommunication technology and outline user impacts of such technology. Subsequently the generic approach is narrowed down to fit the actual scope of the investigation, pointing out the psychophysical approach for measuring delay perception by means of a global model of the users' perception and acceptance of environmental stimuli. Lastly the structure of the thesis is presented.

2.1 Background and Aims

Recent trends in telecommunication networks indicate a shift away from circuit-switched networks (with ISDN serving as the most advanced example in terms of technology and services) towards packet-switched networks (e.g. IP, ATM or MPLS) (Coffman et al., 1998). Thus, most operators of public networks plan to migrate their core network infrastructure to a universal, service-independent system operating in a packet-switched mode. The original motive for using packet-switching stemmed from the idea that the existing infrastructure could be used more efficiently by multiplexing several data streams. In the meantime, however, it has become clear that this focus is no longer sufficient. Rather, packet-switching better matches the characteristics of computer-generated data. The traditional telecommunication networks (e.g. ISDN) have essentially been characterized by the following properties:


• Almost constant and - in the case of wireline transmission - low delay.

• Very low error rates for the fixed network.

• Network services associated with constant bandwidth.

In contrast, the new networking environment will present end users with new characteristics such as:

• Variations in transmission delay.

• Variations in bit rates.

• Potential loss of data packets.

Thus it appears that with packet-switched networks users can no longer count on a stable Quality of Service (QoS)[b]. These new characteristics represent a challenge in the design and use of packet-switched networks, since they may lead to user impairments, depending on the kind of source coding and compression used in the end-systems.

It is generally agreed that very little is known about user expectations, perceptive mechanisms, and user behaviour in this new situation (Bouch et al., 2000b). Furthermore, it is not yet known how objective system quality relates to users' subjective perceptions of quality. The reason for this situation is that to date the majority of research on QoS is systems oriented, focusing on traffic analysis, scheduling, and routing. Relatively little attention has been paid to user-level QoS issues (Bouch et al., 2000a). Moreover, it is not yet known if and how users make trade-off decisions between varying quality performance and cost. As a consequence, it is presently difficult to base network engineering on proper traffic forecasts and real user requirements.

[b] The basic Quality of Service (QoS) parameters are: throughput, transit delay, jitter (delay variance), and error rate. For the numerous definitions of the QoS concept see Fluckiger (1995). Note that the QoS concept is also applied in a broader scope, including e.g. picture and sound quality as well as security aspects.

At the same time, the range of applications run by users is growing considerably (Odlyzko, 2000), from traditional point-to-point phone calls to sophisticated computer-based applications, involving both users and servers. In addition, the last few years have seen mobile communication become all-pervasive, with telephony and short message services (SMS) dominating. With mobile users, yet another phenomenon is observed: Many such users value the ability to communicate freely at least as highly as some performance measures for the actual service. Two examples may illustrate this observation:

• Audio quality of mobile end devices is obviously very often tolerated at a

level considerably below POTS standards.

• SMS users accept an extremely unwieldy user interface.

This observation is in some ways analogous to the well-known masking effects in auditory perception (Zwicker et al., 1967), where certain stimuli are not noticed when some other stimulus is present at a level above a certain threshold. Such effects may possibly be generalized to a fundamental 'masking principle' where impairments are judged in the light of the attained benefits, i.e. perturbing stimuli could be masked by more valued stimuli. However, the inverse effect also exists: a 'negative masking', referred to here as an 'amplifying principle', where negative circumstances amplify a perturbing stimulus. This might be the case, for example, in emergency situations, where a lack of effective communication quality may cause adverse effects. In contrast to the quality factors based on technological sources[c], masking and amplifying quality factors are much harder to control, since they are based on contextual and psychological causes.

To recapitulate, it appears that evolutions in telecommunication technology as well as the growing number of applications deployed by users reveal a broad field of unanswered questions concerning quality issues. This lack of knowledge is the driving force behind our work, which aims to examine the end-user's perception and acceptance of QoS parameters, thereby answering the question:

• How do network-induced impairments affect the interaction of two or more users of a telecommunication system?

The investigations to answer the above question are undertaken in the framework of a project called QED[d] (Kündig et al., 2001), which aims at making a substantial contribution towards quality-based network engineering, emphasising multimodal person-to-person and person-to-computer communication and thus a user-centred perspective.

[c] Which in fact are addressed by the Quality of Service (QoS) concept.

[d] QED is the acronym for Quality of Service Expectations for Real-Time Dialogue Communication, a project accomplished in collaboration with Albert Kündig and Alexander Braun from the Computer Engineering and Networks Laboratory (TIK) of the Swiss Federal Institute of Technology Zurich (ETHZ). For his part, Alexander wrote a thesis (Braun, 2003) with emphasis on technological aspects.

2.2 Scope of Investigation

We will now briefly describe the scope of the investigations undertaken in this thesis. For this purpose, the following hierarchical diagram (see Figure 1) classifies some outstanding attributes of communication. The chosen emphasis is drawn bold, whereas situations not in our focus are drawn grey.

[Figure 1: tree diagram. Surviving node labels: Communication; Technologically-Mediated; Real-Time (Synchronous).]

Figure 1 Tree diagram showing the chosen emphasis on technologically mediated dialogue communication in real-time. Note that not all possible connections are depicted.

The emphasis on technologically mediated dialogue communication in real-time has

been chosen for two essential reasons:

• Promising Future Applications

• Predictable User Expectations


In the following the two reasons are briefly explained.

Promising Future Applications


Real-time dialogue communication between users will - despite upcoming new types of applications - most probably remain an important and revenue-generating application in both fixed and mobile telecommunication, and in both private and business communication. Examples of such applications are pure videoconference applications, CSCW tools, or services using UMTS technologies.

Predictable User Expectations

Face-to-face communication between people - which is the unmediated counterpart to technologically mediated communication - requires extremely sophisticated and well-trained pattern recognition skills: in contrast to computer-based pattern recognition, humans are capable of interpreting very subtle variations in facial expression, voice pitch and timing. As a consequence, we are all experts in the recognition of behavioural deviations from what we consider normal[e]. For this reason it should be easy to model and predict user expectancies for technologically mediated interpersonal communication in real-time: Since the users compare such services with face-to-face communication, they are assumed to expect from the application a behaviour which is equivalent to natural face-to-face communication. This is in contrast to many other applications in the area of man-machine interaction (e.g. browsing the WWW), for which user expectations are difficult to predict. The reason could be that there is no natural equivalent for these kinds of applications.

In summary, it appears that applications should support the fundamental information exchange by the use of audio (hearing each other), video (seeing each other), and shared tools (such as chat or whiteboard, and application sharing), ideally without perceivable differences to the face-to-face situation (which of course is hard to attain). Ultimately, users expect telecommunication services to include the proper conveyance of relevant environmental aspects (e.g. background noise). Moreover - and probably most crucially - technologically mediated real-time communication is expected to allow for temporal patterns that are similar to those of face-to-face communication.

[e] On the other hand, a lifetime is probably not enough to attain perfection in face-to-face communication, in the sense that the communicating partners can be sure that the meanings of their statements are understood in the intended way.

As mentioned in the previous section, it is generally agreed that very little is known about user expectations in regard to QoS issues (Bouch et al., 2000b). On the other hand, it was assumed in this section that users expect an application behaviour similar to face-to-face communication. In fact, these contradictory statements delineate the objectives of this thesis: Under the assumption that users liken technologically mediated communication to face-to-face communication, we aim at measuring the boundary values of particular QoS parameters that should not be exceeded in order to allow for a 'feeling of naturalness'. This upper boundary - upper in regard to the lack of quality - is called the acceptance threshold.

Furthermore, it is assumed that people perceive maximum communication quality regarding QoS parameters in face-to-face situations. Or conversely, they do not perceive a lack of quality in face-to-face situations. This would mean that in face-to-face communication people are accustomed to maximal information throughput, no delay, no jitter and no errors. Of course this premise is somewhat out of touch with reality, and strictly speaking - in the case of delay - incorrect: There is always a minimal delay due to the propagation speed of sound and light. The following three reasons may illustrate why this premise nevertheless makes sense:

• The transmission delay in face-to-face communication is constant and negligibly small (approximately 3 ms per meter of communication distance).

• The comparison is drawn by means of an idealised face-to-face situation, where no disturbing outer influences are present.

• The face-to-face situation is chosen as an idealised point of reference, in order to position the quality perception in technologically mediated communication.
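The figure of approximately 3 ms per meter can be checked with a short calculation. The sketch below assumes a speed of sound of 343 m/s (dry air at about 20 °C), which is not stated in the text; the function names are illustrative.

```python
SPEED_OF_SOUND_M_PER_S = 343.0          # assumption: dry air at about 20 degrees C
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # speed of light in vacuum


def acoustic_delay_ms(distance_m):
    """Propagation delay of sound over distance_m, in milliseconds."""
    return distance_m / SPEED_OF_SOUND_M_PER_S * 1000.0


def optical_delay_ms(distance_m):
    """Propagation delay of light over distance_m, in milliseconds."""
    return distance_m / SPEED_OF_LIGHT_M_PER_S * 1000.0


if __name__ == "__main__":
    for distance in (1.0, 2.0, 5.0):
        print(distance, acoustic_delay_ms(distance), optical_delay_ms(distance))
```

Under this assumption the acoustic delay is about 2.9 ms per meter, while the optical delay is several orders of magnitude smaller, so the delay in a face-to-face situation is dominated by sound.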

With this premise in mind, we aim at investigating a second threshold, providing an answer to the question: Which degradations of particular QoS parameters are 'just' noticed by the users? This question is mainly important for economic reasons, since the knowledge of the so-called perception threshold provides network planners and content providers with a basis for decision. In fact, below perception threshold values, users will not benefit from optimisation of network and end-system infrastructure with respect to QoS parameters.


In summary, the scope of our investigations consists of perception and acceptance thresholds (see Figure 2) in technologically mediated, real-time communication. Note that in Figure 2 the face-to-face situation is assumed to be at the point where no perturbing stimulus is present.

[Figure 2: plot of user response (axis from 0 % to 100 %, with a 50 % mark) against the intensity of a perturbing stimulus. Labelled curves: Perception Threshold, Acceptance Threshold.]

Figure 2 Delineation of the scope of investigation for this thesis, showing the acceptance and perception thresholds for an arbitrary perturbing stimulus. The curves depict hypothetical user response behaviours.
Straight line: Perception of lack of quality.
Dashed line: Non-acceptance of lack of quality.
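The hypothetical response curves of Figure 2 can be sketched numerically. The following is a minimal illustration under stated assumptions: the logistic shape, the reading of a threshold as the stimulus level at which 50 % of users respond, and all numeric values are assumptions made for this sketch, not results from the thesis.

```python
import math


def response_share(stimulus, threshold, slope=1.0):
    """Share of users (0.0 to 1.0) responding to a perturbing stimulus.

    Assumed logistic curve; `threshold` is the stimulus level at which
    the share equals 50 %, `slope` controls how steep the transition is.
    """
    return 1.0 / (1.0 + math.exp(-slope * (stimulus - threshold)))


# Arbitrary illustrative values in arbitrary stimulus units (not measured data).
PERCEPTION_THRESHOLD = 1.0
ACCEPTANCE_THRESHOLD = 2.0

if __name__ == "__main__":
    for stimulus in (0.0, 1.0, 2.0, 3.0):
        perceived = response_share(stimulus, PERCEPTION_THRESHOLD, slope=3.0)
        not_accepted = response_share(stimulus, ACCEPTANCE_THRESHOLD, slope=3.0)
        print(f"stimulus={stimulus:.1f} perceived={perceived:.2f} "
              f"not accepted={not_accepted:.2f}")
```

Because the assumed perception threshold lies below the assumed acceptance threshold, the perception curve lies above the non-acceptance curve at every stimulus level, as in the figure.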

2.2.1 Delay as Quality of Service (QoS) Parameter

So far the question of the users' QoS perception has been addressed in a rather generic manner, incorporating the basic QoS parameters. The systematic investigation of all these parameters, including interdependencies such as masking and amplifying effects, would require a study design of exorbitant scale. Therefore the investigation is restricted to a selection of QoS parameters that is considered most relevant, necessary, and feasible. We decided to short-list the QoS parameter delay, which includes intermedia synchronisation as well as roundtrip delays. The reasons for this choice are set out in the following:


• The timing of interpersonal real-time communication contains important prosodic (or non-verbal) information about the frames of mind of the communicating partners. For example, a longer than accustomed delay between one partner's proposal and the other partner's answer leads to misinterpretations (e.g. that the latter (1) needs to think about what was said, (2) is not certain of the answer, or (3) simply reacts slowly). Thus, timing plays an important role in appraising individual characters and is therefore considered a key parameter in quality-based network engineering.

• Networked audio-visual communication requires - compared to audio only - more end-system and network resources, since encoding, transmission, and decoding of motion images are very data-intensive, requiring either high network throughput or adequate processing power for compression/decompression in the end-systems (it should be noted that compression/decompression operations usually introduce considerable additional delay). Thus, there are several interdependencies between throughput, compression, and delay. Whereas throughput rates and the extent of compression of underlying network and end-system configurations cannot be directly perceived by the end users, this is not the case for delay. Moreover - besides picture and sound quality - it is the delay of a particular network service that makes throughput and compression perceivable.

• Valid empirical data concerning perception and acceptance of various delay parameters in multimodal communication remain sparse. Although many statements have been made about the upper limit of delay for real-time communication, most of the values either refer to audio only or simply reflect the technical limits. An early example is the set of investigations conducted by Bell Laboratories (Helder, 1966), which were triggered by the introduction of satellites with their inherently big delays when high orbits are used. The results from these investigations are not fully convincing, since Bell Laboratories were probably somewhat biased, as they were certainly not interested in finding 'killing arguments' against satellite communication.


2.2.2 Published Results for Perception and Acceptance of Delay

The subsequent tables list selected studies concerning perception and acceptance of relative and absolute delays. There exists a trade-off between these two delay parameters: the relative delay can be set to zero by buffering the faster stream (usually audio), at the price of delaying that stream further and of additional consumption of network and end-system resources. Thus, in order to optimise the allocation of resources without impairing the users' quality perception, it is important to have profound knowledge of the detection and impairment potential of both delay parameters.
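The trade-off between relative and absolute delay can be made concrete with a small calculation. This is a sketch only: the function name, the dictionary keys and the example delays are hypothetical, and the model ignores any processing cost of the buffer itself.

```python
def synchronise(audio_delay_ms, video_delay_ms):
    """Buffer the faster stream so that the relative delay becomes zero.

    A positive relative delay means audio arrives before video (AV order).
    After buffering, both streams share the delay of the slower stream.
    """
    relative_ms = video_delay_ms - audio_delay_ms
    return {
        "relative_delay_before_ms": relative_ms,
        "audio_buffer_ms": max(relative_ms, 0),
        "video_buffer_ms": max(-relative_ms, 0),
        "absolute_delay_after_ms": max(audio_delay_ms, video_delay_ms),
    }


if __name__ == "__main__":
    # Hypothetical one-way delays, chosen only for illustration.
    print(synchronise(audio_delay_ms=100, video_delay_ms=300))
```

In this example the audio stream would have to be held back by 200 ms, so removing the synchronisation error raises the absolute delay of the audio channel from 100 ms to 300 ms.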

Relative Delay

As mentioned, due to different compression/decompression needs for audio and video data, the transmission of audio-visual data can result in considerable intermedia synchronisation errors (referred to here as relative delay). Table 5 lists some selected findings about the perception of relative delay for both asynchrony orders: auditory before visual (AV), and visual before auditory (VA). The results of the listed studies - except Steinmetz' findings - showed that AV stimuli were detected more easily than the opposite order. A further result concerned the type of the presented stimuli: synchronisation errors of distinct stimuli are detected more easily than synchronisation errors of the more complex lip reading.

Table 5 Excerpt of studies concerning the perception of asynchronies.

Authors                   Condition                         AV [ms]    VA [ms]
(Dixon et al., 1980)      Lipreading                        131        258
                          Hammer hitting peg                75         189
(McGrath et al., 1985)    Drawn moving lips                 79         138
(Lewkowicz, 1996)         Bouncing disk                     65         112
(Pandey et al., 1986)     Lipreading with masking noise     n.a.       Slump in performance at 80-120
(Steinmetz, 1996)         Lipreading                        ca. 80     ca. 80


The relative delay thresholds do not vary much, considering the different experimental designs used in the listed studies. Unfortunately, some of the results lack a detailed specification of confidence levels. Furthermore, no studies were encountered in the scanned literature that provide psychometric functions of the synchronisation errors. In fact, perception thresholds were rarely obtained by means of psychophysical methods.

Absolute Delay

While transmitting multimedia data from one place to another, it is inevitable that a certain amount of delay is introduced. For pure audio transmissions this delay can be kept very small, depending on the system architecture, the coding and compression of the signal. If there is video data in addition, more delay is added because of the greater complexity of the video information.

So far a lot of statements have been made about the upper limit of delay for real-time

communication that can be expected of the users. Unfortunately, most of the suggested

values do not reflect the needs of the users but result from technical limits. In Table 6

some selective findings about absolute delay are summed up.

Table 6 Excerpt of studies concerning roundtrip delay.

Authors                     Condition        Delay [ms]
(Yamaguchi et al., 1986)    audio            360
(Chen et al., 1989)         audio            600
(Gonsalves, 1989)           audio            400
(Ranta-aho et al., 1998)    audio-visual     1400 - 1920
(Alfano, 2000)              audio            300 - 800
(Wilkins et al., 1998)      audio-visual     20 (LAN)
                                             380 (WAN)
(Bouch et al., 2000b)       audio-visual     400
(Isaac et al., 1994)        audio-visual     640 - 840


2.2.3 A Psychophysical Approach

Investigating humans' perceptions of external events is an interdisciplinary undertaking involving several branches of study, such as physics, sensory physiology, cognitive and social psychology, and even cultural anthropology. The inclusion of all branches would of course go beyond the scope of this thesis. Therefore we will restrict ourselves to a feasible approach. In our opinion a psychophysical approach is suitable for the investigation of human perception and acceptance of perturbing influences from technologically mediated communication.

The following general definition of psychophysics and its interpretation is offered by John C. Baird and Elliot Noma (Baird et al., 1978) and used by the International Society of Psychophysics: »Psychophysics is commonly defined as the quantitative branch of the study of perception, examining the relations between observed stimuli and responses and the reasons for those relations. This is, however, a very narrow view of the influence it has had on much of psychology. Since its inception, psychophysics has been based on the assumption that the human perceptual system is a measuring instrument yielding results (experiences, judgments, responses) that may be systematically analysed. Because of its long history (over 100 years), its experimental methods, data analyses, and models of underlying perceptual and cognitive processes have reached a high level of refinement. For this reason, many techniques originally developed in psychophysics have been used to unravel problems in learning, memory, attitude measurement, and social psychology. In addition, scaling and measurement theory have adapted these methods and models to analyse decision making in contexts entirely divorced from perception.« Hence, according to this definition, it appears that the term psychophysics is used to denote both the substantive study of stimulus-response relationships and the methodologies used for this study.

In Figure 3 a global model (Krueger, 1994) is introduced in which the psychophysical approach is embedded. The model has been developed as a result of extensive research concerning deterioration of health and well-being, conducted at the Institute of Hygiene and Applied Physiology (IHA) at the Swiss Federal Institute of Technology (ETH). It offers a means to explain different user perceptions for objectively identical interfering environmental stimuli. The model is also assumed to be suitable for investigating technologically mediated communication, since the intermediate technology can be considered an artefact that evokes perturbing influences.


Figure 3 Global model (Krueger, 1994). The model explains the variance of user perceptions observed for objectively identical stimuli.

The model in Figure 3 expresses the basic message that psychological effects may not be disregarded when explaining environmental factors. That is, the objective world is communicated to a subjective world of mental constructs. Subjective assessments (attribution[f] and affective judgement) are made according to this mental construct system and not according to the objective world directly. As an entrance to the above model, the classical stimulus-response relationship measured by means of psychophysical methodology is diagrammed. This upper layer outlines the topic under investigation, whereas the deeper layers in the model are not subject to this thesis.

2.3 Structure of the Thesis

After having outlined the background and aims of this thesis, and after having delineated its scope, including the chosen psychophysical approach for measuring delay thresholds, the remaining components of the thesis are now briefly described.

[f] For which the Attribution Theory offers means of explanation. The Attribution Theory was founded by Fritz Heider (1958) and advanced by Harold Kelley (1973), both social psychologists. The theory is seen as relevant - among other things - to the study of event perception. It describes how people explain events and the behavioural and emotional consequences of those explanations.


Chapter 3 deals with the theoretical background in the fields of communication, cognitive psychology and psychophysics, which we consider necessary to elucidate. In the first part of this chapter a taxonomy of communication is developed, which is arranged in a layered order with higher positioned entities influencing the subjacent ones. Since diverse - and sometimes contradictory - theories deal with communication, we will give insight into an excerpt that is considered suitable and sufficient for our purposes. In the next part some concepts and conceptions are presented, dealing particularly with the neural processing speed of different modalities, and with mental representations of time resolved on the neurological level. Subsequently, chapter 3 details the psychophysical background necessary to understand the procedure of the experiments described in chapter 4.

Chapter 4 is subdivided into two parts, each describing procedures and results of experiments conducted in the mode of either human-computer interaction (HCI) or human-human interaction (HHI). The outcomes of these experiments are discussed in chapter 5, where concluding remarks are also presented with an emphasis on human factors in network engineering.

The following annex consists of a description of the software called best-PEST calculator, which has been programmed in order to run the threshold experiments and which has been developed further into a fully independent, browser-based application, with the aim of making it accessible to a broad audience.

A glossary, together with the cited references and an index of keywords, concludes the thesis.


3 Theory

In this chapter we give an overview of the theoretical background concerning the topic under investigation, particularly the experiments described in chapter 4. First we develop a taxonomy of communication, where a thread through the theories dealing with communication is established. Subsequently we give insight into current research on the mental representation of time, and on processing speed in different modalities. The last part of this chapter deals with psychophysical theory and the methods applied in the experiments.

3.1 A Taxonomy of Communication

Understanding an entity under investigation implies analysing and describing it. When this entity is too complex, models have to be created which classify objects and concepts. Of course a model is an approximation of reality; nevertheless it should provide enough resolution that the insights gained are reproducible in reality.

The entity under investigation here is communication. Since there is no widely accepted general taxonomy[g] of communication, we are about to develop one in the sense of a 'coordinate system', allowing for a concise description of specific communication settings. For the users' quality expectations, the communication setting is considered crucial. Therefore, it is important to have appropriate models for such settings.

In the next sections an approach is described, consisting of the five aspects social context, orientation, coding, modality, and timing. It is considered suitable for the investigation of interpersonal communication. It will be shown that these five aspects can - to some extent - be ordered in a layered fashion as shown in Figure 4. The suggested communication layers will be defined and insight into their theoretical provenance will be given. Thereafter the layered order is exemplified, pointing out the interface between the interpersonal communication model and the OSI reference model. Parts of this chapter are published in (Guttormsen Schar et al., 2002).

[g] Taxonomy is the science of the laws of classification. The notion of taxonomy means establishing classes within a set; classes may form partitions, overlapping, or nested subsets.

Technically skilled readers will recognise the interpersonal communication model of Figure 4 as a variant of the famous 7-layer OSI reference model (see the Glossary for a description). In fact there is a resemblance, and the concepts of these two models are not too far from each other, although they cannot be transformed one-to-one. Rather, they belong to different systems, as depicted in Figure 4, which differentiates between culture, individuals, and technology. The model of interpersonal communication can be thought of as being stacked above the OSI model (technology), and as being subordinate to the cultural context. Before we take a closer look at the interpersonal communication model we will briefly explain the other two systems, which are not in the focus of our investigation.

With OSI's technological approach, control is passed from one layer to the next. A communication begins with the application layer on one end (for example, a user working with a videoconference (VC) application). The information is passed through each of the seven layers down to the physical layer (which is the actual transmission of bits). On the receiving end, control passes back up the hierarchy.
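The passing of control down and up the layer stack can be illustrated with a toy model. The sketch below is a deliberate simplification and not an implementation of any real protocol stack: each layer merely prepends a textual tag, whereas the real physical layer transmits bits rather than adding a header. The function names are illustrative.

```python
# The seven OSI layers, ordered from the top (application) to the bottom.
OSI_LAYERS = [
    "application", "presentation", "session",
    "transport", "network", "data link", "physical",
]


def send(payload):
    """Pass the payload down the stack; each layer wraps it in its own tag."""
    frame = payload
    for layer in OSI_LAYERS:
        frame = f"<{layer}>{frame}"
    return frame


def receive(frame):
    """Pass the frame back up the stack; each layer removes its own tag."""
    for layer in reversed(OSI_LAYERS):
        tag = f"<{layer}>"
        if not frame.startswith(tag):
            raise ValueError(f"expected tag of layer {layer!r}")
        frame = frame[len(tag):]
    return frame


if __name__ == "__main__":
    wire = send("hello")
    print(wire)
    print(receive(wire))
```

The tag added last on the sending side (the lowest layer) is the first to be removed on the receiving side, which mirrors the description of control passing back up the hierarchy.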

In the system culture, we distinguish between individualistic and collectivistic cultures. Individualism holds that the individual is the primary unit of reality and the ultimate standard of value. This view does not deny that societies exist or that people benefit from living in them, but it sees society as a collection of individuals, not something over and above them. Collectivism holds that the group - the nation, the community, the race, etc. - is the primary unit of reality and the ultimate standard of value. This view does not deny the reality of the individual. But ultimately, collectivism holds that the groups one interacts with determine one's identity.


[Figure 4: layered diagram. Application: interpersonal real-time communication. Social context: formal / informal. Orientation: person / non-person. Coding: verbal / non-verbal. Modality: visual / auditory. Timing: synchronous / asynchronous.]

Figure 4 Outline of communication taxonomy in a layered fashion with typical examples. Attributes bounded by dashed lines are not in the focus of our investigation. Reading example: communication settings between people take place in either a formal or an informal context (Short et al., 1976). The transmitted information concerns the relation between the partners (person-oriented) or the content (non-person-oriented) (Watzlawick et al., 1967). It is encoded verbally or non-verbally, and is received with the aid of either the visual or the auditory sense organs. The timing decides, among other things, which applications take place in the considered communication setting.
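The layered taxonomy can also be written down as a small data structure, which may help when an experimental setting has to be described unambiguously. The following is a minimal sketch, not part of the original model: the class and attribute names are hypothetical, and assigning exactly one value per layer is a simplification, since codings and modalities can be used concurrently.

```python
from dataclasses import dataclass
from enum import Enum


class SocialContext(Enum):
    FORMAL = "formal"
    INFORMAL = "informal"


class Orientation(Enum):
    PERSON = "person"
    NON_PERSON = "non-person"


class Coding(Enum):
    VERBAL = "verbal"
    NON_VERBAL = "non-verbal"


class Modality(Enum):
    VISUAL = "visual"
    AUDITORY = "auditory"


class Timing(Enum):
    SYNCHRONOUS = "synchronous"
    ASYNCHRONOUS = "asynchronous"


@dataclass(frozen=True)
class CommunicationSetting:
    """One dominant attribute per layer, ordered from the highest to the most basic layer."""
    social_context: SocialContext
    orientation: Orientation
    coding: Coding
    modality: Modality
    timing: Timing

    def layers(self) -> list[str]:
        """Return the attribute chosen on each layer, top layer first."""
        return [
            self.social_context.value, self.orientation.value,
            self.coding.value, self.modality.value, self.timing.value,
        ]


# Illustrative setting: two designers exchanging sketches in a videoconference.
designers = CommunicationSetting(
    SocialContext.FORMAL, Orientation.NON_PERSON,
    Coding.NON_VERBAL, Modality.VISUAL, Timing.SYNCHRONOUS,
)
```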


3.1.1 Social context


The higher layers in the communication model (i.e. social context and orientation) comprise a rather broad range of features. As a consequence, the study of higher-level aspects of communication has been the source of many different theories. In what follows, a thread is established through some of these concepts. It should be noted that most of these approaches were defined for business communication, probably because this area is regarded as most influential when quality requirements are established.

In the following we describe social context in terms of its degree of formality, i.e. the extent to which formal rules or a code exist for the exchange of information.

Formal and informal communication

Several theorists distinguish between formal and informal communication. Smith (1972) defines formal communication channels as »those emanating from official sources and carrying official sanctions [...]. Formal messages usually flow through these channels, thus acquiring legitimacy and authenticity«. Informal communication channels, on the other hand, »are not specified rationally. They develop through accidents of spatial arrangement, through friendships«. Both formal and informal communication can follow an upward, downward, or horizontal path (to higher, lower, or equal authority). The purposes of formal communication are to command, to instruct, and to finalise matters through the application of regulations. The purposes of informal communication are to educate through information sharing, to motivate through personal contacts, and to resolve conflicts through participation and friendship. Informal communication seeks to involve participants in organizational matters as a means of maintaining their enthusiasm, loyalty, and commitment. Table 7 lists some characteristics of formal and informal communication.


Table 7 Characteristics of formal and informal communication.

Formal communication                     Informal communication
official, binding                        unofficial
precise, unlikely to be misunderstood    personal, inaccurate
traceable, can be preserved              hard to trace
can avoid embarrassment                  can refute rumours and gossip
restricted jargon, rigid                 more emotional
authoritarian, likely to be obeyed       less intimidating
fails to motivate                        promotes disclosure of underlying motives

3.1.2 Orientation

Orientation describes, to some extent, the purpose of a communication setting and the view the participants should share about the tacit assumptions needed to understand the topics being discussed. In some cases these assumptions may be very limited in range (e.g. some technical knowledge needed to solve a specific task), while in other cases a common worldview is necessary for a fruitful discussion.

The summary below shows theories and methods that can be attributed to the orientation layer. It is necessarily far from comprehensive; it should rather be seen as an indication that it is extremely important to define an experimental setting properly, using terminology and insight gained from the cited theories.

The Bales Categories

As early as the 1950s, Bales (1955) ran a series of experiments in which subjects held simulated meetings, and analysed the nature of the interactions that took place. From these experiments he derived four main categories: positive reactions, negative reactions, problem-solving attempts and questions. At the Communication Studies Group (CSG), Short (1976) reduced these four categories to two: Bales' positive and negative reactions were classed as person-oriented, and the two other categories, problem-solving attempts and questions, were classed as non-person-oriented. The CSG considers person-orientation to be the core category in understanding communication mediated by teleconferencing.
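The reduction from four categories to two amounts to a simple lookup. The sketch below assumes that each observed act has already been coded with one of the four Bales categories; the dictionary keys, the function name and the sample list of acts are hypothetical and serve only as illustration.

```python
from collections import Counter

# Reduction of the four Bales categories to the two CSG orientation classes.
BALES_TO_ORIENTATION = {
    "positive reaction": "person-oriented",
    "negative reaction": "person-oriented",
    "problem-solving attempt": "non-person-oriented",
    "question": "non-person-oriented",
}


def orientation_profile(coded_acts: list[str]) -> Counter:
    """Count how many coded acts fall into each orientation class."""
    return Counter(BALES_TO_ORIENTATION[act] for act in coded_acts)


# Invented sequence of coded acts from a simulated meeting.
acts = ["question", "problem-solving attempt", "positive reaction", "question"]
profile = orientation_profile(acts)
```

An act with an unknown category raises a `KeyError`, which is intentional here: a coding error should not silently end up in either class.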


A further step in developing classification schemes was the SYMLOG^h space. Elaborated from the large body of research conducted by Bales (1999), this approach identifies at least three bipolar characteristics that are fundamental to describing communication in small groups. The three dimensions spanning the SYMLOG space are:
• Dominance versus Submissiveness
• Friendliness versus Unfriendliness
• Acceptance versus Non-acceptance of Authority.

These characteristics have been implemented in standardised questionnaires and applied many times in different cultures. SYMLOG is therefore assumed to assess the person-oriented parts of small-group communication reliably.

The distinction between content and relationship

Watzlawick (1966) distinguishes between the content and the relation part of a message, thus establishing a direct link to the terms report and command introduced by Bateson (Ruesch et al., 1951). Watzlawick (1967) aptly points to the correspondence of these terms to the computer science terms data and control. Since control information specifies what is to be done with the data at hand, it can be regarded as 'information on information', i.e. metainformation. The following axiom describes this insight: »Every communication has a content and a relationship aspect such that the latter classifies the former and is therefore a metacommunication.« (Watzlawick et al., 1967). In the case of interpersonal communication, the exchange of control information could be seen as 'downloading applets' to be executed by the communicating persons. In that sense, Watzlawick's view is very close to the categories established by Bales, and we can possibly simplify our taxonomy by understanding a content-oriented approach to be non-person-oriented and, on the other hand, relation to be person-oriented.

h SYMLOG is the acronym for SYstematic Multiple Level Observation of Groups.

How can one express person-oriented and non-person-oriented information? The answer to this question leads to section 3.1.3, where coding is discussed. Before that, we take a closer look at the orientation layer by introducing the distinction between implicit and explicit information types. Both person-oriented and non-person-oriented information can be implicit or explicit, and both can be expressed by verbal as well as non-verbal coding. More precisely, no statement conveys solely implicit information; there always needs to be the 'carrying' explicit information too. In contrast, there are statements that convey solely explicit information. An example illustrates this: »Joe drank ten beers last night« is a statement that is by itself explicit and unequivocal. But depending on the context and the tone of voice in which it is said, it can be understood as »Incredible how much alcohol Joe always drinks!«, which is the implicit meaning. For the sake of simplicity, when we speak of implicit information we mean both the implicit and the carrying explicit information.

In our approach there is no sharp distinction between implicit and explicit information types. We differentiate the two by means of the degree of ambiguity, i.e. implicit information is strongly ambiguous, whereas explicit information is slightly ambiguous or not ambiguous at all. The use of the term ambiguity likewise implies that the communicating partners have mutual and tacit assumptions about the rules of their information exchange^i. This means that partners should agree on, and be aware of, the 'rules' of ambiguity^j; fulfilling this requirement is, among other things, a job of the education system, which imparts verbal and cultural literacy. Table 8 (page 33) gives some examples that clarify the distinction between implicit and explicit information types.

3.1.3 Coding

Information can be encoded and transmitted in many different ways, and different forms of communication with specific codes may be used concurrently. A voice signal can conceptually be decomposed into a verbal part and a non-verbal part (so-called prosodic information, such as pitch, melody, level and timing). On the one hand, non-verbal features of both the visual and the auditory modality convey information that allows a message to be interpreted properly (e.g. to differentiate between a question and an exclamation) or the speaking person to be made out (e.g. by means of moving lips); on the other hand, they help us to identify a known person or to guess his/her state of mind.

i Otherwise, as a consequence, they would have to accept an impaired conversation (as might happen between partners speaking different languages), or they might have to oversimplify the topic.
j We do not consider the particular meanings of an ambiguous statement to be vague, unclear, or obscure; far from it, they are very precise and clear. Ambiguity is caused by the fact that one is not sure which of the meanings can be accepted as true.


In addition, we should consider the different recognition capabilities associated with different codings^k. Weidenmann (1988) points out this aspect with reference to the learning characteristics of different media types. He states that, when choosing the appropriate learning media, our different familiarity in handling words and pictures comes into play. Whereas linguistic skills like reading and writing (i.e. verbal skills) are systematically trained in our educational system, the competence of handling instructional pictures (i.e. non-verbal skills) needs to be developed. As far as we can see, the two outstanding attributes on the coding level are the verbal and the non-verbal information types, as described below.

Verbal Coding

We speak of verbal coding when written and/or spoken languages and/or numbers are used, and when mechanisms (i.e. grammars or lexica) exist through which the correctness of a text or utterance and its meaning can be determined. Thus, verbal coding can itself be seen at different levels, as shown in Figure 5. It should be noted that the higher we move up in Figure 5, the more difficult it becomes to set up grammars and lexica as a formal and comprehensive basis. In fact, the complete interpretation of text and speech depends partly on the semantic level, and to some extent also on the pragmatic level, which is additionally represented by the orientation and the social context layers as discussed in sections 3.1.2 and 3.1.1 (page 27).

k Note that we are considering external coding here, unlike internal coding, which is used in cognitive psychology with emphasis on the mental coding of input and the resulting information processing.


[Figure 5: layered diagram, labelled Semiotics. Levels from top to bottom: Pragmatics (derivation of actions from the meaning); Semantics (generally accepted meaning of words, sentences and texts); Syntax (rules by which words are combined to make sentences and texts); lowest level, label not recoverable (rules by which signs are combined to make words).]

Figure 5 A layered view of verbal information. Note that the attributes given on the right side are typical examples.

Non-verbal Coding


Non-verbal coding is often associated with many different kinds of pictorial representation (e.g. gestures and facial expressions conveyed through video, graphs and pictures, animations, pictograms, icons, etc.). As already described, the prosodic features contained in a speech signal also represent non-verbally coded information, as does any non-verbal sound, for example instrumental music. Furthermore, it should be noted that various forms of background information (both visual and auditory) might supply important information about the context of a communication session. For example, hearing the background noise of a railway station makes a phone call more credible when the subject of the call is train delays, and even more so when the cabin is visible in the background, e.g. when a UMTS device is used.

3.1.4 Modality

One of the most basic conditions for participation in any communication event is the sensation and perception of the transmitted signals conveying information. This involves the human sense organs, which are able to detect light, sound, smell, taste, touch and position, each corresponding to one specific mode (often, the term channel is used alternatively). Although future developments in telecommunications might bring the introduction of olfactory (smell) and haptic (touch) modes in special contexts (e.g. telesurgery), we will restrict ourselves for the time being to the auditory and visual channels.

Multimodal Communication: Audio-Visual

Whenever the auditory and the visual channels are simultaneously invoked in a communication setting, we normally speak of multimedia communication. Increasingly, the alternative term multimodal communication is used, where 'multi' does not just imply 'sound and vision', but the fact that several different forms of communication (in the sense of submodes) can be implemented within both the auditory and visual channels. For example, the visual channel is involved when a video signal represents a 'head and shoulder' picture of the communication partners; it is used as well when text and graphical information are exchanged in shared workspace applications. The latter application usually comprises still another supporting communication mode in the form of a separate channel linking a mouse or a joystick simultaneously with a local and a remote pointer. These examples belong to the coding layer in our communication model, since the submodes differentiate themselves through different forms of coding.


Table 8 Exemplification of the three layers orientation, coding and modality (including the implicit/explicit distinction). The implicit messages are made explicit in the only explicit column (in the verbal-coding rows only). Note that the distinction between implicit and explicit is made by means of the degree of ambiguity.

Verbal coding, visual modality
  Person-oriented, implicit and explicit: Reading 'between the lines' some relational information. E.g. »Mr. Miller has extraordinary interpersonal skills.«
  Person-oriented, only explicit: Written text with relational information. E.g. »Big parts of his attendance time Mr. Miller was chatting with his colleagues.«
  Non-person-oriented, implicit and explicit: Reading 'between the lines' some task-related information. E.g. »Maintaining job security will be a big challenge in near future.«
  Non-person-oriented, only explicit: Written, task-related text. E.g. »We will have to lay off workers next month.«

Verbal coding, auditory modality
  Person-oriented, implicit and explicit: Hearing 'between the lines' some relational information. E.g. »Are you sure of what you are talking about?«
  Person-oriented, only explicit: Spoken text with relational information. E.g. »I doubt about your competence.«
  Non-person-oriented, implicit and explicit: Hearing 'between the lines' some task-related information. E.g. »The Porsche engine still uses traditional injection.«
  Non-person-oriented, only explicit: Spoken, task-related text. E.g. »The Porsche engine has to be reengineered.«

Non-verbal coding, visual modality
  Person-oriented, implicit and explicit: Extracting relational information from gazes, gestures, images, etc. E.g. avoiding eye contact.
  Person-oriented, only explicit: Gazes, gestures, images, emoticons [e.g. :-(] etc. with relational information. E.g. the referee showing the yellow card.
  Non-person-oriented, implicit and explicit: Extracting task-related information from gazes, gestures, images, etc. E.g. configuration of lines indicating a 3D-cube.
  Non-person-oriented, only explicit: Task-related gestures, images, icons, etc. E.g. showing a picture of a defect of an aeroplane.

Non-verbal coding, auditory modality
  Person-oriented, implicit and explicit: Extracting relational information from pitch, volume, staccato, etc. of voice and sounds. E.g. talking with higher pitch to someone.
  Person-oriented, only explicit: Pitch, volume, etc. of voice and sounds with relational information. E.g. hooting, cheering.
  Non-person-oriented, implicit and explicit: Extracting task-related information from pitch, volume, etc. of voice and sounds. E.g. rattle noise from a vehicle.
  Non-person-oriented, only explicit: Task-related pitch, volume, etc. of voice and sounds. E.g. the sound of a beep instead of a censored word in a spoken sentence.


3.1.5 Timing


As can be seen in Figure 4, we distinguish between synchronous and asynchronous interactions^l. Simplifying things, we note that the main difference between the two timing categories is the time magnitude of the interactions between the communicating partners. Whereas synchronous interaction is in the range of milliseconds to seconds, asynchronous interaction is in the range of minutes to hours, or even days to weeks. Since we restrict the model to synchronous and asynchronous timing, we implicitly restrict ourselves to dialog or interactive communication.

According to Fluckiger (1995), the timing of interaction decides which applications take place in the considered communication setting. Examples of synchronous or real-time interaction are:
• Interpersonal applications: Only two individuals are involved. Also called person-to-person applications, and sometimes one-to-one applications.
• Distribution applications: Sometimes called person-to-group applications, where multimedia information such as live audio and video is transmitted from one source to multiple recipients in a one-way mode (no return channel from the recipient to the source). This is analogous to TV broadcasting.
• Group teleconferencing: Sometimes also called group-to-group teleconferencing, a generic term referring to bi-directional conversational communication between two or more groups of people.

Examples of asynchronous interaction are:
• Multimedia e-mail: This is conventional e-mail in which the documents exchanged are not only plain text, but also include rich text, hyperlinks, and audio or video sequences.
• Asynchronous computer conferencing: Refers to a service where people exchange multimedia messages asynchronously. The technique often consists of submitting contributions to, or retrieving them from, centralised servers.

l We define a basic interaction unit as one reciprocal action, consisting of an action triggered by a source, echoed by a sink, and received by the source again.


Since asynchronous communication is not the focus of our investigation, we will consider only synchronous (i.e. real-time) settings in the following. Within these real-time settings we focus on network-mediated interpersonal communication as well as on people-to-systems communication.

Absolute Delay

The main issue of our investigation of real-time communication concerns absolute delay, which is only perceivable in dialog settings. Strictly speaking, typical one-way applications such as video-on-demand also have a dialog part, namely between sending the request and receiving the video stream. This means that one-way applications also let the user perceive absolute delays in an initial phase, but as soon as the connection is established the user is no longer aware of absolute delays, so that the term one-way for this kind of application is justifiable.

In Figure 6A we depict the definition of the absolute delay as users of real-time dialog applications experience it. The same figure also shows the technologically inspired definition of the term round-trip delay, which is sometimes used synonymously. We define absolute delay as the elapsed time between the expression of an auditory, visual or tactile trigger and the answer from a communicating partner (human or machine). That is, the acting user at the source sets a primary internal time marker when executing an expression, and a secondary one when perceiving the answer. The estimate of the elapsed time between these two markers is what the acting user perceives as absolute delay. For the time being it remains an open question whether the acting user sets the first time marker at the time he/she perceives his/her own expression, or at the time he/she is planning to produce it. The absolute delay consists of:
• two network transit delays (bi-directional)
• two times the depacketising delay
• the source encoding and decoding
• the sink echo processing
• the neural transit delay of the user at the source receiving the answer

In Figure 6B we depict a magnified view of the sink echo processing, consisting of:
• the sink encoding and decoding
• the reacting user's reaction time

The reaction time in turn consists of:
• the neural transit delay between peripheral excitation and conscious perception
• the cognitive processing time
• the time needed to produce and execute the output stimulus

Depending on the point of view, a particular user in a real-time dialog setting has both roles: for himself/herself that of the source, and for the partner that of the sink. Hence, when perceiving the absolute delay, the user estimates, besides the technologically generated delays, the reaction time of the partner but not his/her own reaction time. That is, the perceived and estimated source echo processing time is not equal to the sink echo processing time.
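The composition of the absolute delay listed above can be written as a simple sum. The following is a minimal sketch under stated assumptions: the class and field names are hypothetical, the codec terms lump encoding and decoding together, and all numerical values are invented for illustration and are not measurements.

```python
from dataclasses import dataclass


@dataclass
class ReactionTime:
    """Reaction time of the reacting user at the sink (all values in ms)."""
    neural_transit_ms: float           # peripheral excitation to conscious perception
    cognitive_processing_ms: float
    stimulus_production_ms: float      # producing and executing the output stimulus

    def total(self) -> float:
        return (self.neural_transit_ms
                + self.cognitive_processing_ms
                + self.stimulus_production_ms)


@dataclass
class AbsoluteDelay:
    """Components of the absolute delay as perceived by the acting user (ms)."""
    network_transit_1_ms: float        # source to sink
    network_transit_2_ms: float        # sink to source
    depacketising_ms: float            # counted twice, once per direction
    source_codec_ms: float             # source encoding and decoding
    sink_codec_ms: float               # sink encoding and decoding
    reaction: ReactionTime
    source_neural_transit_ms: float    # acting user perceiving the answer

    def sink_echo_processing(self) -> float:
        return self.sink_codec_ms + self.reaction.total()

    def total(self) -> float:
        return (self.network_transit_1_ms + self.network_transit_2_ms
                + 2 * self.depacketising_ms
                + self.source_codec_ms
                + self.sink_echo_processing()
                + self.source_neural_transit_ms)


# Invented example values, chosen only to make the arithmetic easy to follow.
example = AbsoluteDelay(
    network_transit_1_ms=40.0, network_transit_2_ms=40.0,
    depacketising_ms=10.0, source_codec_ms=30.0, sink_codec_ms=30.0,
    reaction=ReactionTime(50.0, 150.0, 100.0),
    source_neural_transit_ms=50.0,
)
```

With these invented values the reaction time of the partner (300 ms) dominates the sum, which illustrates why the user's estimate of the partner's reaction time matters when absolute delay is judged.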

[Figure 6: timing diagram. Panel A: timeline marking first bit transmitted by source, first bit received by sink, last bit received by sink, first bit transmitted by sink, and first bit received by source; segments: source coding, network transit delay 1, sink decoding, sink coding, network transit delay 2; spans: sink echo processing, round-trip delay, absolute delay. Panel B: neural transit delay, conscious perception of reacting subject, cognitive processing time and stimulus production (together the reaction time of the reacting subject), and neural transit delay of acting subject.]

Figure 6 A: Schematic diagram of the round-trip delay according to Fluckiger (1995), and of the absolute delay according to our definition. Grey shaded areas indicate human information processing time. B: Magnified view of the reacting subject's reaction time, and the neural transit delay of the acting subject.


Relative Delay


In contrast to absolute delay, relative delay is also perceivable in one-way settings. We define relative delay as the time difference between the appearance of the visual stimulus and the appearance of its associated auditory stimulus in an audio-visual presentation. Furthermore, we distinguish between the possible orders of the incoming stimuli: auditory precedes visual (AV), visual precedes auditory (VA), or they are in sync, i.e. there is no relative delay. Relative delay is sometimes referred to as intermedia synchronisation or lip synchronisation, pointing to the particular synchronisation requirements needed either to give the feeling of naturalness in audio-visual telecommunications, or to enable or enhance lip reading for hearing-impaired people (e.g. to optimise hearing aids). These areas comprise a rich body of literature, e.g. (McGrath et al., 1985; Pandey et al., 1986; Summerfield, 1992; Kouvelas et al., 1996; Steinmetz, 1996; Stone et al., 1999; Oviatt et al., 2000; Stone et al., 2001; Van Hoesel et al., 2002), which in fact rarely treats perception thresholds obtained by means of psychophysical methods. Further studies investigated intermedia synchronisation by means of distinct stimuli such as bouncing disks (Lewkowicz, 1996) or a hammer hitting a peg (Dixon et al., 1980). (See also section 2.2.2, Published Results for Perception and Acceptance of Delay.)
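The definition of relative delay and of the stimulus orders AV, VA and sync can be illustrated with a short sketch. The sign convention (positive when the visual stimulus comes first), the function names and the optional tolerance parameter are assumptions made for this example; they are not taken from the cited literature.

```python
def relative_delay_ms(visual_onset_ms: float, auditory_onset_ms: float) -> float:
    """Relative delay; positive when the visual stimulus appears first (assumed convention)."""
    return auditory_onset_ms - visual_onset_ms


def stimulus_order(visual_onset_ms: float, auditory_onset_ms: float,
                   tolerance_ms: float = 0.0) -> str:
    """Classify the order of the stimuli as 'AV', 'VA' or 'sync'."""
    delay = relative_delay_ms(visual_onset_ms, auditory_onset_ms)
    if delay > tolerance_ms:
        return "VA"    # visual precedes auditory
    if delay < -tolerance_ms:
        return "AV"    # auditory precedes visual
    return "sync"      # no relative delay (within the tolerance)
```

A non-zero `tolerance_ms` would express that small asynchronies are treated as in sync; which value is appropriate is exactly the kind of threshold question this investigation addresses, so no default other than zero is assumed here.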

3.1.6 Exemplification of the interpersonal communication model

In the following we explain the interpersonal communication model by means of a videoconference (VC) user. Furthermore, we point out the interface between the OSI model and the interpersonal communication model when we consider the interaction timing of different applications.

Before the VC user starts sending information through the videoconference application, s/he will be aware of the social context in which the communication setting takes place. In our approach, this means that s/he knows whether the communication partner belongs e.g. to the family, to the workmates, to the circle of friends, or to the circle of acquaintances. Consequently, s/he also has an idea of the hierarchical position of the communicating partner, of the overall importance of the event, and the like. We subsume these factors by saying that the user is aware of the degree of formality of the event. Furthermore, we posit that the degree of formality determines the communication process to come. That is, the communicating partners will choose an appropriate language^m as well as modify the topics of the conversation, voice pitch, gestures, gazes, and interaction timing. When we consider these modified aspects separately, deeper layers in the communication model are probed.

As stated before, communication between people comprises content information, including the problem or purpose, and metainformation, i.e. implicit information concerning the intended meaning of the verbal, usually ambiguous content. Metainformation usually reveals the relation between the partners and is therefore considered person-oriented information, unlike the non-person-oriented information, which is the 'real' content. The orientation of the information stream to be sent to the communication partner is the next junction the VC user has to pass. According to her/his appraisal of the actual communication setting (which also includes the problem to solve), s/he will direct the information flow more towards the partner or more towards the task, and s/he will choose a more explicit or a more implicit way to express her/his message. Again, this influences the following layers.

In order to illustrate how the higher layers influence the coding of a message, let us assume two examples for the use of a videoconference, both in a formal context:
• Two industrial designers are working on improving the ergonomics of a drilling machine.
• Superior and employee are talking about the employee's personal performance.

First of all, these examples show that the choice of verbal or non-verbal coding is determined by the problem to solve. The designers will primarily choose sketches and schemes to solve the problem, whereas the superior will talk to the employee before writing a letter of reference. Thus it appears that the purpose determines the orientation of the conversation, and in turn the suitable coding: the coding is mainly non-verbal in the case of the (non-person-oriented) designers, and mainly verbal in the case of the (person-oriented) superior. The chosen examples are not necessarily typical; there are probably more examples proving the contrary, e.g. using non-verbal coding in person-oriented communication and verbal coding in non-person-oriented communication. Hence these examples do not imply rules for the use of a particular coding. They only exemplify the layered order of our model. We suggest that, depending on the particular communication setting, there are tacit agreements about the accepted and optimal manner of encoding the message. For the two examples this means that it is probably not helpful to rely to a great extent on spoken and written language in order to improve the ergonomics of a drilling machine. And it is unusual, and probably not accepted by the employee, to be appraised only by charts and diagrams; a personal word is expected here.

m For instance, they restrict their vocabulary if they consider themselves in a formal conversation.

The next entity, modality, need not necessarily lie underneath coding in the sense of being determined by it. But the ordering makes sense when we consider the degree of consciousness that is necessary either to perceive or to decode a message. Perceiving visual and auditory information is handled by the sense organs, their corresponding neurological pathways, and the visual and auditory cortex, whereas decoding verbal and non-verbal information involves higher levels of information processing and consciousness. In short: an amoeba is capable of detecting light, but will fail to extract abstract information from a visual pattern.

Considering the relative delay of an audio-visual event in which sound precedes the visual component, we leave the 'natural' frame of reference: in a natural environment no sound precedes its corresponding visual event, whereas the contrary situation, in which sound lags the corresponding visual component, is familiar to everyone, e.g. first seeing a hammer hit a peg and then hearing the knock. A comparable reasoning applies to absolute delay: audio-visual communication in natural environments creates no transit delays greater than the time sound needs to travel across the range of vision, whereas in technologically mediated communication this delay can theoretically take any value above a minimal delay imposed by physical constraints. To recapitulate, fundamental characteristics of the timing layer concerning order or asynchrony are not found in face-to-face communication. In contrast, all characteristics of the higher layers of our communication model are found in face-to-face communication as well as in technologically mediated communication. This fact predestines timing to be the most basic layer, representing the interface to the system technology, which, for its part, is instantiated by the OSI model.
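As a rough illustration of the 'natural' frame of reference, the following sketch computes how far sound lags behind the visual event at a given viewing distance. The speed of sound (343 m/s, air at roughly 20 °C), the neglect of the travel time of light, the example distances and the function name are assumptions made for illustration; they are not values from this thesis.

```python
# Sketch: natural audio lag for an event observed at a distance.
# Assumption: sound travels at about 343 m/s; light arrives instantly.
SPEED_OF_SOUND_M_PER_S = 343.0


def natural_audio_lag_ms(distance_m: float) -> float:
    """Return the time (in ms) by which sound lags the visual event."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    return 1000.0 * distance_m / SPEED_OF_SOUND_M_PER_S


if __name__ == "__main__":
    # Hypothetical viewing distances in metres.
    for distance in (1.0, 10.0, 100.0):
        lag = natural_audio_lag_ms(distance)
        print(f"{distance:6.1f} m -> audio lags by {lag:6.1f} ms")
```

Under these assumptions the lag is never negative, which mirrors the observation that sound does not precede the visual event in a natural environment.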

The basic layers timing, modality and coding of the communication model in Figure 4 are

mainly of elementary nature. They can be regarded as absolute prerequisites for any

communication between people. Procedures for the investigation of these layers are ex­

pected to be manageable. This is not the case for the upper layers orientation and social


context, where many diverse situations are conceivable, usually very much depending on

the nature of the tasks performed. Moreover, at these layers psychological characteristics

of the involved persons will play an important role; thus, the character of the involved

people, and - after all - group dynamics may have to be taken into account when design­

ing experiments or interpreting their results.

It has been found that the investigation of most multi-participant (dialog or conversa­

tion type) settings is a move into 'terra incognita', i.e. generally accepted research ap­

proaches do not exist, and most often there is a lack of methods, taxonomies and even

proper definitions of the entities under investigation. This is especially true for psycho­

physics, where traditionally many problems associated with 'one-way' situations were in­

vestigated (humans as stimulus receivers), and where, on the other hand, research

concerning dialog settings appears to be extremely sparse.

3.2 Processing Time of Auditory and Visual Stimuli

When we consider the perception of events conveying coexistent information of different modalities, such as, in our case, auditory and visual, we have to take into account that different receptors and perceptual pathways are involved for different modalities. It is therefore natural to consider the possibility of different processing times in different modalities. In fact, such differences exist. In the following we present two ways of determining them: indirectly through differences in reaction time for different modalities, and directly through the measurement of Event-Related Potentials (ERPs).

3.2.1 Indirect: Reaction Time Differences

Reaction time has been a favourite subject of experimental psychologists since the

middle of the nineteenth century. Three basic kinds of reaction time experiments have been conducted:

• Simple reaction time experiments

• Recognition reaction time experiments and

• Choice reaction time experiments


In simple reaction time experiments, there is only one stimulus and one response. If the

stimulus appears the response is required as fast as possible. In recognition reaction time

experiments, there are some stimuli that should be responded to and others that should

be ignored. And in choice reaction time experiments, the experimental subject must give a re­

sponse that corresponds to the stimulus, such as pressing a key corresponding to a letter

if the letter appears on the screen.

Since the beginning of the reaction time research, many researchers have confirmed

that reaction to sound is faster than reaction to light. The accepted figures for mean sim­

ple reaction times for college-age individuals are about 190 ms for visual stimuli and

about 160 ms for auditory stimuli (Galton, 1899; Fieandt et al., 1956; Brebner et al., 1980;

Welford, 1980). Differences in reaction time between these types of stimuli persist

whether the subject is asked to make a simple response or a complex response (Sanders,

1998). The time for motor preparation (e.g., tensing muscles) and motor response is the

same in all three types of reaction time test, implying that the differences in reaction time

are due to processing time (Miller et al., 2001).

Hence, there is evidence from reaction time experiments that the mean processing

time of auditory stimuli is about 30 ms shorter than the mean processing time of visual

stimuli. On the other hand, there is also evidence that processing speeds are not fixed values but are influenced by various facilitation effects: the difference between reaction times to visual and auditory stimuli can be eliminated if a sufficiently high visual stimulus intensity is used (Kohfeld, 1971). Cross-modal facilitation can be demonstrated with experiments showing that responses to multimodal inputs presented in close spatial and temporal proximity are typically faster and more accurate than those made to the unimodal stimuli alone (Hershenson, 1962; Welch et al., 1986; Giard et al., 1999; McDonald et al., 2000).

3.2.2 Direct: Event-Related Potentials (ERPs)

Electroencephalography (EEG) provides a non-invasive technique to measure the processing speed of different modalities directly: embedded within EEG signals are short-term transient waves known as Event-Related Potentials (ERPs). These waveforms reflect the singular experience associated with an external stimulus such as an auditory or visual event.


When a stimulus is presented to a subject and brain activity is recorded following the presentation, an ERP can be obtained; that is, the voltage fluctuations recorded at the surface of the scalp contain elements specific to the presented stimulus.

Typically, ERPs are largely contaminated by other activities of the brain. By averaging

across several tens or hundreds of trials, individual ERPs become apparent. A specific

ERP becomes evident by adding a series of individual EEG samples time-locked to the

evoking stimulus. By summing these samples, the background brain activity, which is as­

sumed to vary randomly over time, will tend to average out.
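The averaging procedure described above can be sketched with simulated data. The ERP template, the noise level and all function names below are hypothetical; the sketch only assumes that the background activity is random and independent from trial to trial, in which case the residual noise in the average is expected to shrink roughly with the square root of the number of trials.

```python
import math
import random


def simulate_trial(template, noise_sd, rng):
    """One simulated epoch: fixed ERP template plus random background activity."""
    return [sample + rng.gauss(0.0, noise_sd) for sample in template]


def average_epochs(epochs):
    """Point-by-point mean of epochs that are time-locked to the stimulus."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]


def rms_error(a, b):
    """Root-mean-square difference between two equally long waveforms."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical ERP: a single Gaussian-shaped deflection, 100 samples long.
    template = [5.0 * math.exp(-((i - 50) / 8.0) ** 2) for i in range(100)]
    for n_trials in (1, 10, 100, 400):
        epochs = [simulate_trial(template, 20.0, rng) for _ in range(n_trials)]
        residual = rms_error(average_epochs(epochs), template)
        print(f"{n_trials:4d} trials: residual noise (RMS) = {residual:.2f}")
```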

Accepted figures for visual processing time derived from ERP studies lie between 45 ms and 55 ms, as represented by the onset of the earliest cortical potential (Clark et al., 1995; Clark et al., 1996; Foxe et al., 2002). The earliest auditory evoked potential, on the other hand, reaches the cortex between 9 ms and 15 ms (Celesia et al., 1971; Vaughan et al., 1988), i.e. in less than half the time of visual input and approximately 30 - 40 ms earlier than the visual stimulus. As a consequence of these different processing times, asynchronies of audio-visual events are perceived differently depending on the stimulus order: the same relative delay would evoke a bigger perceived delay when the auditory stimulus precedes the visual one than in the opposite order (see Figure 7).
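The asymmetry can be illustrated with a minimal sketch that simply adds a fixed processing latency to each modality. The latencies of 12 ms (auditory) and 50 ms (visual) are assumed midpoints of the ranges quoted above, the function name is hypothetical, and any early audio-visual interaction is disregarded.

```python
# Assumed midpoints of the processing times quoted in the text.
AUDITORY_LATENCY_MS = 12.0  # range given: 9 ms to 15 ms
VISUAL_LATENCY_MS = 50.0    # range given: 45 ms to 55 ms


def perceived_delay_ms(physical_delay_ms: float, first: str) -> float:
    """Perceived onset difference (second minus first modality) in ms.

    first = "A": the auditory stimulus leads by physical_delay_ms.
    first = "V": the visual stimulus leads by physical_delay_ms.
    A negative result means the perceived order is reversed.
    """
    if first == "A":
        return (physical_delay_ms + VISUAL_LATENCY_MS) - AUDITORY_LATENCY_MS
    if first == "V":
        return (physical_delay_ms + AUDITORY_LATENCY_MS) - VISUAL_LATENCY_MS
    raise ValueError("first must be 'A' or 'V'")


if __name__ == "__main__":
    for delay in (0.0, 40.0, 80.0):
        av = perceived_delay_ms(delay, "A")
        va = perceived_delay_ms(delay, "V")
        print(f"physical {delay:5.1f} ms -> perceived AV {av:5.1f} ms, VA {va:5.1f} ms")
```

With these assumed latencies, a physical delay of 40 ms corresponds to a perceived delay of 78 ms when audio leads but only 2 ms when video leads.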

However, this effect might be compensated according to recent findings: two studies (Giard et al., 1999; Molholm et al., 2002), which investigated the integration of audio-visual (AV) information by means of ERPs, showed an early AV effect after 46 ms over the right parieto-occipital scalp. This finding suggests that the auditory part of AV inputs modifies early visual sensory processing and leads to the following interpretation: first, auditory input activates primary auditory cortex (A1) within 15 ms after stimulus presentation and is then transmitted up the auditory processing stream. This input is then projected to visual areas. The critical issue is one of timing. The question is whether there is

sufficient time for auditory input to reach early visual areas to result in modulation of the

later arriving visual input. Given the above mentioned processing times between the ini­

tial auditory and visual inputs to their respective primary cortices, there is a window of 25

ms - 30 ms in which the auditory evoked process can prepare visual areas for arriving

visual evoked processes.


[Figure 7: timeline from 0 to 120 ms (t in ms) comparing the perceived relative delay AV with the perceived relative delay VA.]

Figure 7 Effect of the different processing times of auditory and visual stimuli in the human brain, disregarding the early AV effects described by Molholm (2002). Grey shaded areas indicate processing time for both modalities.

3.3 Mental Representation of Time

Synchronous interaction is immediate. As we know from real-life situations, the term immediate is used with considerable tolerance. In some situations the reaction to a request should be as fast as possible, whereas other situations allow for a reaction after a certain delay, e.g. after a work step in progress has been completed. In any case, whenever an immediate reaction is required, it is expected to be executed now. Therefore the term now, which means the present, is subject to considerable tolerance too.

This real-life experience has an analogy in the philosophical discourse: If one argues

on an abstract level, the present can be considered as the dimensionless border between

the past and the future, thus the present does not last since it is a timeless cut-off point.

On the other hand, we know by experience that the present has a certain duration, i.e. we

are aware of the present and we can easily distinguish between what is now, what has

been before and what is still to come. Otherwise we would be riven between past and future. This discrepancy between experience and theory represents a profound problem, and philosophers have dealt with it since antiquity (for some examples see Pöppel (1997a)). Since we focus on phenomenological reality, we are not treating the abstract


connotation of the present, but the experienced 'nowness', which is called subjective present (Stern, 1897; Pöppel, 1978).

3.3.1 Low Frequency Processing

Given that the subjective present is experienced as a certain amount of time, how can it be determined? Pöppel (1997a) describes some experiments dealing with the duration

of subjective time. In the following we give an excerpt of these experiments, concerning

the visual and the auditory modality.

Figure 8 shows an ambiguous line drawing, the Necker cube, named after Louis Albert Necker, who first described it. It is a wire-frame drawing of a cube in isometric perspective, which means that

parallel edges of the cube are drawn as parallel lines in the picture. When two lines cross,

the picture does not show which is in front and which is behind. This makes the picture

ambiguous, i.e. it can be interpreted in two different ways. When a person stares at the

picture, it will often seem to flip back and forth between the two valid interpretations. To reproduce what follows, it is helpful to become familiar with both perspectives. The black spot in the corner of the cube in Figure 8 is an aid to envisioning the two perspectives: in one perspective it is in the foreground of the cube, in the other it is in the background. Once we are capable of swapping deliberately between the two perspectives, an experiment can be conducted that demonstrates the scope of the human time integration capability: we stare at the cube and try to hold one perspective as long as possible. After a few seconds the perspective swaps automatically. Now we try to hold the swapped perspective as long as possible. We will notice that once again, after some seconds, the cube swaps against our wishes. One way to overcome the cube's forced swapping is to stare at an arbitrary point of the cube and try to think of something different. As a result, the cube remains stable, because we have banned it from consciousness.


Figure 8 The Necker Cube is an optical illusion first published in 1832 by the Swiss crystallographer Louis Albert Necker. It offers a means to estimate the duration of the subjective present.

The spontaneous alternation of ambiguous figures is an effect that is also observed in the auditory modality. A similar experiment can be conducted by interpreting e.g. the ambiguous phoneme sequence CU - BA - CU - .... For some seconds one hears BACU, whereupon for another couple of seconds one hears CUBA (Pöppel, 1997b). Such a spontaneous alternation rate in the two modalities suggests that a low-frequency mechanism binds successive events of up to 3 s (Pöppel, 1994) into perceptual units. After this period attentional mechanisms are elicited that open sensory channels for new information;

if the physical stimulus remains the same, the alternative interpretation of the stimulus

will gain control. Metaphorically, up to every 3 s the brain scans the sensory inputs and

asks: »what is new?«

Evidence for the 3-seconds-hypothesis is also supplied by experiments using other

paradigms. Studies on the temporal reproduction of stimuli with different duration show

that stimuli are reproduced almost truthfully up to 3 s. Longer stimuli are reproduced significantly shorter and with much greater variability (see Figure 9). Intervals of up to 3 s

can be mentally preserved, or grasped as a unit, whereas longer stimuli are likely to be

squeezed into the 3 s interval.


[Figure 9: plot; horizontal axis: Duration of stimulus (s), 0 to 7; vertical axis (label not recoverable), 0 to 7.]

Figure 9 Example for the reproduction of temporal stimuli between 0.5 and 7 s duration from one subject. Stimuli were given in random order. A continuous light was used as stimulus. At S=R, stimulus duration equals reproduction. ALth is the geometric mean of all stimulus durations. Note that for stimuli longer than 3 s temporal reproduction remains short. Data from Pöppel (1971).

3.3.2 High Frequency Processing

Evidence for a high-frequency processing system comes, in part, from studies on temporal order thresholds (Hirsh et al., 1961; von Steinbüchel et al., 1996). If experimental subjects have to indicate the temporal order of two stimuli, a threshold of 30 ms is observed, independent of sensory modality. Data picked up within 30 ms are treated as co-temporal, that is, a relationship between separate stimuli with respect to the before-after dimension can no longer be established. This does not mean that the central nervous system cannot process information for intervals shorter than 30 ms; the localisation of objects in auditory space, for example, requires a much higher temporal resolution (for detailed explanations concerning microsecond timing, see section 3.4.1). However, distinct events require a minimum of 30 ms to be perceived as successive.


Support for distinct system states comes from a variety of studies using different paradigms: under stationary conditions, response distributions of reaction time (Jokeit, 1990) or pursuit eye movements (Pöppel, 1986) show typical characteristics in the sense that frequencies of preferred response latencies are separated approximately by the 30 ms interval (see Figure 10). These effects can be explained on the basis of neuronal oscillations. After the transduction of a stimulus, an oscillation with a period of 30 ms is initiated that is

phase-locked to the stimulus. Such an oscillatory mechanism, under environmental

stimulus control, allows integration of information from different sensory modalities, i.e.,

data from various inputs can be collected within one period, which defines a basic system

state. The separate response modes possibly represent similar successive and discrete de­

cision-making stages, as is assumed in high-speed short-term memory scanning

(Sternberg, 1966).

[Figure 10: histogram; horizontal axis: Latency (ms), 0 to 350; arrows mark the preferred latencies.]

Figure 10 Histogram of 463 latencies of pursuit eye movements in three subjects. Data are summarised in 10 ms bins. Arrows indicate temporal positions of the preferred latencies that are separated by 30 to 40 ms. Data qualitatively from Pöppel (1986).

Further support for the 30-ms-hypothesis is supplied by neurophysiological observa­

tions. The auditory evoked potential in the midlatency region shows an oscillatory com­

ponent with a period of 30 ms (Galambos et al., 1981). This component is a sensitive

marker for the anaesthetic state because it selectively disappears during general anaesthesia (Madler et al., 1987). Thus, oscillations with a period of 30 ms represent functional

system states that are apparently necessary prerequisites for the establishment of events

(Schwender, 1994).

3.4 Neural and Cognitive Models of Time Perception

In a strict sense, time perception should not occur because receptors for what we refer to as 'time' do not exist. Following the reasoning of the previous section, where the notion of the subjective present was introduced, time can be regarded as a mental construction based on sensory processing. Conceptions of the underlying neural functioning as well as cognitive models of time perception positioned on a higher level of abstraction are the topics of this section.

A fundamental part of sensory processing is pattern recognition, that is, how central

neurons develop selective responses to spatial and temporal patterns of activity from en­

vironmental stimuli. Sensory stimuli can be decomposed into spatial and temporal com­

ponents. Spatial patterns refer to those that can be discriminated based on a static 'snap­

shot' of which neurons are active (e.g. retinotopy of cortical activation). Temporal pat­

terns refer to those in which the order, duration, or interval between the activation of

neurons is required for stimulus discrimination. The duration of flashed bars of light and the voice-onset time of phonemes are examples of temporal stimuli spanning only a few orders of magnitude in time. Altogether, the brain processes temporal information over a range of at least ten orders of magnitude - from microseconds to daily circadian rhythms (see Figure 11).


[Figure 11: logarithmic time axis in ms from 10^-3 to 10^9, with marks at 1 s, 1 min, 1 h and 1 d. Columns: TASK and APPROPRIATE MODEL.
Microsecond Processing (Sound Localisation): Delay Lines, Labelled Lines
Millisecond Processing (Speech Generation/Recognition, Motion Detection, Motor Coordination): Population Models
Second Processing (Conscious Time Estimation): Pacemaker-Switch-Accumulator Models
Circadian Rhythm (Appetite, Sleep-Wake)]

Figure 11 Scales of temporal processing. Humans process temporal information over a scale of at least ten orders of magnitude, executing tasks in the microsecond to the daily scope. At the right side of the figure, appropriate models for particular tasks are listed. There is no sharp border between the use of the appropriate models. Rather they are assumed to overlap the particular tasks. Modified from Buonomano et al. (2002).

3.4.1 Labelled Lines

The Labelled Lines models are used to explain microsecond temporal processing,

which is primarily responsible for the detection of interaural delays used to localise sound

sources. In humans it takes sound approximately 600 µs to 700 µs to travel the distance between the left and right ear. The auditory system uses these intervals to calculate the spatial location of the sound source. A relatively simple but extremely sensitive mechanism is used to determine these microsecond intervals: a sound arriving in each ear will activate neurons in the cochlear nucleus. The axons from these neurons function as delay

lines; that is, the distance an action potential has to travel is proportional to the time it

takes. Neurons in the medial superior olive function as coincidence detectors and use the

delays to respond selectively to different intervals. Together these neurons establish a to­

pographic map of auditory space (Carr, 1993). Whereas Labelled Lines models have


proven suitable to explain microsecond processing, they are not well suited for complex

forms of temporal processing such as sequences and speech (Buonomano et al., 2002).

Computationally, the Labelled Lines models are very effective, but only for simple tasks.
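The delay-line and coincidence-detector scheme can be sketched as follows. This is an idealised illustration, not a physiological model: the maximal interaural delay of 650 µs is an assumed value within the range given above, the sine relation between azimuth and interaural delay is a simplification, and the number of detectors, their tuning width and all function names are hypothetical.

```python
import math

MAX_ITD_US = 650.0  # assumed maximal interaural delay (text: about 600-700 µs)


def interaural_delay_us(azimuth_deg: float) -> float:
    """Simplified interaural delay for a source at the given azimuth.

    Assumption: the delay varies with the sine of the azimuth
    (-90 = far left, 0 = straight ahead, +90 = far right).
    """
    return MAX_ITD_US * math.sin(math.radians(azimuth_deg))


def detector_bank(n_detectors: int = 27):
    """Preferred delays (µs) of the coincidence detectors, evenly spaced."""
    step = 2 * MAX_ITD_US / (n_detectors - 1)
    return [-MAX_ITD_US + i * step for i in range(n_detectors)]


def coincidence_response(itd_us: float, preferred_us: float,
                         width_us: float = 50.0) -> float:
    """Response is maximal when the axonal delay cancels the interaural delay."""
    return math.exp(-((itd_us - preferred_us) / width_us) ** 2)


def localise(azimuth_deg: float) -> float:
    """Read the azimuth back from the most active detector (the 'labelled line')."""
    itd = interaural_delay_us(azimuth_deg)
    bank = detector_bank()
    responses = [coincidence_response(itd, preferred) for preferred in bank]
    best = bank[responses.index(max(responses))]
    return math.degrees(math.asin(best / MAX_ITD_US))


if __name__ == "__main__":
    for azimuth in (-60.0, 0.0, 20.0, 45.0):
        print(f"true {azimuth:6.1f} deg -> decoded {localise(azimuth):6.1f} deg")
```

The decoded angle is quantised by the spacing of the detectors, which illustrates why such a scheme is effective for a single simple quantity but does not extend naturally to sequences.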

3.4.2 Population Clocks (Neural Networks)

In Population Clocks (or population models), time is coded in the population activity

of a network of neurons, where any given neuron will contain little temporal information.

An additional difference from Labelled Line models is that there is not an explicit range

of time constants or time delays specifically set to capture specific intervals. These mod­

els generally rely on local network dynamics and time-dependent changes in network

states, which appear as a result of e.g. plasticity of synaptic delays. Central to 'biologically

feasible' population models are oscillatory pacemaker neurons. The idea of using oscilla­

tors to store an arbitrary temporal sequence was introduced in the sixties by Longuet­

Higgins (1968). Since then a series of refinements took place triggered by the use of

computer simulations.

Figure 12 shows a recent approach aiming to model stored time intervals (Miall, 1996).

This model relies on a large population of pacemakers with only a narrow distribution of

oscillation periods. A unique group of pacemakers is selected that have the appropriate

beat frequency to store any particular time interval. Consider a group of oscillators (pacemaker neurons), each with a slightly different frequency of oscillation, and each spiking for a brief part of each cycle. The beat frequency of any pair of these oscillators is then the frequency at which they spike simultaneously. Thus their beat frequency is much lower than their intrinsic oscillation frequency (footnote n). It is given by the difference between the frequencies of the two cells. For a population of oscillators the beat period is given by the lowest common multiple of the periods of their oscillations. A group of a few

hundred pacemaker cells, even with similar oscillation frequencies, can encode a wide

range of time intervals and can recall the interval at a later time.
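The arithmetic behind this scheme can be sketched in a few lines. The sketch assumes idealised oscillators with integer periods in milliseconds that spike instantaneously and are all reset at the start of the interval; the periods, the interval and the function names are hypothetical and chosen only for illustration.

```python
from functools import reduce
from math import gcd


def beat_frequency_hz(f1_hz: float, f2_hz: float) -> float:
    """Beat frequency of a pair of oscillators: difference of their frequencies."""
    return abs(f1_hz - f2_hz)


def lcm(a: int, b: int) -> int:
    """Lowest common multiple of two positive integers."""
    return a * b // gcd(a, b)


def group_beat_period_ms(periods_ms):
    """Time until all oscillators, reset together, spike simultaneously again."""
    return reduce(lcm, periods_ms)


def select_oscillators(periods_ms, interval_ms: int):
    """Oscillators that spike at the reset and again exactly at the interval end."""
    return [p for p in periods_ms if interval_ms % p == 0]


if __name__ == "__main__":
    # Two cells near 10 Hz beat at a much lower frequency.
    print(beat_frequency_hz(10.0, 10.5), "Hz")
    # Hypothetical population with periods of 90 ms to 110 ms (roughly 9-11 Hz).
    population = range(90, 111)
    chosen = select_oscillators(population, 990)
    print("selected periods:", chosen)
    print("group beat period:", group_beat_period_ms(chosen), "ms")
```

In this example a population whose periods differ by only about 20 % encodes an interval of roughly one second, an order of magnitude longer than any single oscillation period.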

n which is the requirement for storing time intervals in the second range. With such a model it is not necessary to assume pacemaker neurons with a great variability in the oscillation frequency, as other models do.


[Figure 12: A: spike trains of five oscillators (1 to 5) over time, with the interval to be stored marked between t0 and t1. B: network of oscillators projecting to an output neuron.]

Figure 12 Storing time with oscillating neurons. A: A schematic diagram of activity in five oscillators, indicated by short vertical bars. The interval t0 - t1 can be encoded by selection of those oscillators active both at t0 and at t1 (oscillators 1, 2, 5). B: The network: a heterogeneous population of oscillators mutually excite an output neuron, which sums incoming activity and fires when a threshold is reached. Modified from Miall (1996).

Computer simulations of this model show both impressive characteristics and severe weaknesses regarding its comparability with biological systems: with such a model it is

neither necessary to assume unrealistically accurate pacemaker neurons, nor to assume

them firing with unrealistic variability (e.g. from tens of milliseconds to tens of seconds).

Furthermore the model is very robust regarding noise: great random fluctuations of the

pacemaker neurons have little impact on the system's behaviour. But as soon as there is a

directional shift instead of random fluctuation of the unit's periods, recall is poor. A fur­

ther failure to mimic biology is the relationship between interval duration and accuracy:

The networks, as modelled, are either accurate or they fail. There is no distribution of re­

sponses about the desired time that might lead to the typical Weber's Law relationship

between errors and duration. The remaining difficulty with the model presented here is

that the group of selected units encoding a particular time interval or sequence needs to

be synchronously reset to allow recall of the stored interval. This is possible, but would

require some powerful reset signal to reach the entire group of oscillators.


3.4.3 Pacemaker-Switch-Accumulator Models


Positioned on a higher level of abstraction than Labelled Lines and Population Clocks is a class of models referred to here as Pacemaker-Switch-Accumulator models.

As the name suggests, central to these models is a three-step process beginning with a

pacemaker unit that emits pulses (whose rate can be increased or decreased). These

pulses are gated to an accumulator through a switch, which can be closed (so that

pulses pass) or open (pulses cannot pass). The closure of the switch is triggered by in­

coming significant temporal information, its opening by the end of the temporal episode

to be estimated (Church, 1984; Gibbon et al., 1984). The accumulator is a perceptual

store similar to an 'up' counter incremented by pulses which have passed the switch.

The Temporal Information Processing Model (TIP) of Figure 13 is an approach that uses the Pacemaker-Switch-Accumulator model at its core. It explains the variance of duration estimations in humans and animals. The model stemmed from animal timing behaviour experiments in which particular durations were reinforced by means of Classical Conditioning procedures. The TIP was developed from the animals' recall behaviour for reinforced durations. TIP also suits human duration estimation well, where consciously learned, rather than reinforced, durations have to be recalled, although some mechanisms concerning attention and arousal are not fully clarified. For attention- and arousal-related work see e.g. Treisman et al. (1990), Block (2001), or Zackay (1998).

[Figure 13: block diagram. Clock Level: Pacemaker, Switch, Accumulator. Memory Level: Working Memory (t), Reference Memory (n*). Decision Level: Comparator with threshold b*; YES if abs(t-n*)/n* < b*, NO if abs(t-n*)/n* > b*.]

Figure 13 The Temporal Information Processing Model (TIP) composed of the three interacting levels: clock, memory, and decision. Modified from Church (1984).


Following the Pacemaker-Switch-Accumulator level of the TIP model, two additional levels are introduced and discussed here. The memory level includes a short-term memory

store (working memory), which is functionally equivalent to the accumulator, and a long­

term store (reference memory), where reinforced (or learned) durations are transferred at

the end of a trial.

Finally, at the decision level, a comparator compares the number of pulses t currently in the short-term store with a random sample n* from the reference memory for the standard duration, which is represented as a Gaussian distribution. The decision as to whether or not to respond depends on the comparison of the absolute difference between t and n*, expressed as a fraction of n*, with a threshold b*, which is a random value drawn from a Gaussian distribution. Thus, the condition for responding is expressed as abs(t - n*)/n* < b*, with abs indicating the absolute value. If this normalised difference is less than the threshold, responding is initiated (see Church et al. (1994) for an application of this model).
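The decision rule can be written down directly. The following is a minimal sketch of the comparator only; the means and standard deviations used for the reference memory and the threshold, as well as the function names, are hypothetical illustration values and not parameters reported for the model.

```python
import random


def normalised_difference(t: float, n_star: float) -> float:
    """abs(t - n*) / n*: distance of the current count from the remembered one."""
    if n_star <= 0:
        raise ValueError("n* must be positive")
    return abs(t - n_star) / n_star


def respond(t: float, n_star: float, b_star: float) -> bool:
    """YES (True) if abs(t - n*) / n* < b*, otherwise NO (False)."""
    return normalised_difference(t, n_star) < b_star


def run_trial(t, ref_mean, ref_sd, thr_mean, thr_sd, rng) -> bool:
    """One trial: draw n* and b* from Gaussian distributions, then compare."""
    n_star = rng.gauss(ref_mean, ref_sd)
    while n_star <= 0:  # redraw in the unlikely case of a non-positive sample
        n_star = rng.gauss(ref_mean, ref_sd)
    b_star = rng.gauss(thr_mean, thr_sd)
    return respond(t, n_star, b_star)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical values: remembered count about 100 pulses, threshold about 0.2.
    for t in (60, 80, 100, 120, 140):
        yes = sum(run_trial(t, 100.0, 10.0, 0.2, 0.05, rng) for _ in range(1000))
        print(f"t = {t:3d} pulses: responded on {yes / 10:.1f} % of trials")
```

Because n* and b* are redrawn on every trial, the same pulse count t leads to a response on some trials and not on others, which is how the model produces variance in duration estimates.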

Pacemaker-Switch-Accumulator models, including TIP, account for (or have been devel­

oped due to) several effects in human time perception showing that the subjective dura­

tion of a stimulus can be influenced by factors in addition to its actual physical length.

For example, stimuli that are 'filled' (e.g. continuous tones) are usually perceived as

longer than equal-length stimuli that are 'empty' (Thomas et al., 1974). Likewise, moving stimuli have been judged as lasting longer than static ones (Goldstone et al., 1974; Brown, 1995), and presentations of familiar words were judged as lasting longer than unfamiliar ones (Witherspoon et al., 1985). Frequent results from the classical timing literature are that more intense stimuli tend to be judged as lasting longer than less intense

ones (Fraisse, 1964), as well as 'sounds are judged longer than lights' (Goldstone et al.,

1974). The latter refers to the phenomenon that auditory stimuli frequently appear to

have longer subjective durations than do visual stimuli of the same real-time length.

Explanations for these effects mainly concern either the closure latency of the switch,

or the speed of the pacemaker. The closure latency of the switch is supposed to exceed

its opening latency, and this difference might depend on the modality or the degree of

expectation of the temporal signal (Lejeune, 1998). Alternatively, pacemaker speed can be increased

or decreased with arousing or calming stimuli (Boltz, 1994; Wearden et al., 1999).


3.5 Psychophysical Theory for Measuring Thresholds

As mentioned in the introduction to this thesis, we consider the psychophysical ap­

proach suitable in order to measure the perception and acceptance thresholds of particu­

lar delay parameters. In what follows, the fundamentals of psychophysical testing, a

specification of the psychophysical function as well as the adaptive psychophysical pro­

cedure applied in the empirical part of the thesis are described.

3.5.1 Testing paradigms

Psychophysical procedures offer various testing paradigms, of which we describe the yes-no and the forced-choice (nAFC: n-Alternative-Forced-Choice) mode. In the yes-no mode subjects are given a series of trials, in each of which they must judge the presence or absence of a stimulus. The ratio between the number of trials containing a stimulus and the total number of trials is usually 0.5, but can be any other value. Usually this ratio is told to the subject in advance. The rate of yes-responses for all tested stimulus intensities is defined as the dependent variable.

A fundamentally different testing mode is the forced-choice mode: subjects are given n alternatives, from which they have to choose the one containing the stimulus. The alternatives are presented with either spatial or temporal coincidence, or without either coincidence. The subjects know that exactly one alternative contains the stimulus, and that the rest contain no stimulus. The differences between these two methods become obvious when the presented stimuli are faint. In the yes-no paradigm the proportion of yes-answers approaches zero, whereas in the forced-choice paradigm the proportion of correct answers approaches the value of equal probability for all alternatives, which is the reciprocal of the number of alternatives. This means that, e.g., in two-alternative forced-choice (2AFC) tasks the threshold is located where observers give 75% correct responses, since they already give 50% correct responses due to the guessing inherent in 2AFC. The basic advantage of 2AFC consists of its well-founded assumption that subjects will opt for the stimulus evoking the strongest perception, regardless of their tendency to say 'yes' or 'no'. This is in contrast to the yes-no paradigm, where decision making in the presence of uncertainty depends on the subject's psychological characteristics, e.g. prudence. Unlike the yes-no mode, the dependent variable of nAFC is the rate of correct responses for all tested stimuli instead of the rate of yes-responses. In the following we subsume both kinds of dependent variables under the term positive-response rate ψ.

For most psychophysical testing, be it in the clinic or in the research lab, efficiency is of great importance, i.e. the threshold should be estimated with satisfactory accuracy after as few trials as possible. The number of trials has to be kept small because, after a long run of trials, experimental subjects tend to become fatigued and bored, resulting in an apparent drift of their thresholds. For this reason, so-called adaptive psychophysical procedures have been developed, whose primary purpose is to minimize the number of trials. We will recapitulate the adaptive procedure called best-PEST in section 3.5.3; for more details about adaptive procedures see the overview of Treutwein (1995). In the next section we describe the theoretical background necessary to understand this procedure.

3.5.2 Specification of the Psychometric Function ψ = f(φ)

The psychometric function assigns a positive-response rate ψ to the range of stimulus intensities. The particular properties of this function are described in the following:

The range of ψ is bounded below by the probability of giving positive responses without perceiving the stimulus (false positive or false alarm rate). This false positive rate consists of a methodical part (only in nAFC) and the 'proper' false positive rate ε. The methodical part is equal to the reciprocal of the number of alternatives n. The upper limit of ψ is (1 − δ): large stimulus intensities elicit positive responses in virtually all cases, reduced only by the false negative rate (i.e. misses) δ. The error terms δ and ε are caused, for instance, by observers' inattention or fatigue.

\[ \psi_{-\infty} = P(\text{positive response} \mid \phi \to -\infty) = \frac{1}{n} + \varepsilon \qquad \text{eq (1)} \]

\[ \psi_{+\infty} = P(\text{positive response} \mid \phi \to +\infty) = 1 - \delta \qquad \text{eq (2)} \]

φ : stimulus intensity, φ ∈ ℝ
ε : false positive rate, ε ∈ ℝ, 0 ≤ ε ≤ 0.5
δ : false negative rate, δ ∈ ℝ, 0 ≤ δ ≤ 0.5
n : number of alternatives, n ∈ ℕ, 2 ≤ n ≤ 100 (see footnote o)

We define the threshold θ to be that value of stimulus intensity that yields a specified positive-response rate. For practical reasons in testing, the threshold is located at the steepest slope of the psychometric function (for the derivation see section 3.5.3). In the following we will exemplify the psychometric function by means of the logistic model, because this is the kernel function of the adaptive procedure best-PEST, which is the topic of section 3.5.3:

\[ \psi^{*}(\phi) = \left(1 + e^{\beta^{*}(\theta - \phi)}\right)^{-1} \qquad \text{eq (3)} \]

ψ* : kernel function
β* : steepness parameter
θ : threshold

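Since the logistic function is rotationally symmetric about the inflection point, the threshold is in the middle of the response range [ψ−∞, ψ+∞]. Therefore, the rate of positive responses at threshold is:

\[ \psi_{\theta} = 0.5\,(\psi_{-\infty} + \psi_{+\infty}) = 0.5\left(1 - \delta + \frac{1}{n} + \varepsilon\right) \qquad \text{eq (4)} \]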

In order to create a formal link between the two testing paradigms, the yes-no situation can be considered as a forced-choice situation with an infinite number of alternatives. In this case the threshold converges to the value where the positive-response rate is:

\[ \psi_{\theta}(\text{Yes/No}) = \lim_{n \to \infty} 0.5\left(1 - \delta + \frac{1}{n} + \varepsilon\right) = 0.5\,(1 - \delta + \varepsilon) \qquad \text{eq (5)} \]

o The number of alternatives is restricted to 100, since the practicability of experiments with more than 100 alternatives is doubtful.
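As a small numerical illustration, the positive-response rate at threshold is the midpoint between the lower asymptote 1/n + ε and the upper asymptote 1 − δ. The helper below is a sketch; its name and the convention that `n=None` stands for the yes-no case (an infinite number of alternatives) are assumptions made for this example.

```python
def response_rate_at_threshold(n=None, eps=0.0, delta=0.0):
    """Midpoint of the response range; n=None models the yes-no paradigm."""
    guess = 0.0 if n is None else 1.0 / n   # methodical guessing rate 1/n
    lower = guess + eps                     # lower asymptote
    upper = 1.0 - delta                     # upper asymptote
    return 0.5 * (lower + upper)

rate_2afc = response_rate_at_threshold(n=2)    # error-free 2AFC observer
rate_yes_no = response_rate_at_threshold()     # error-free yes-no observer
```

For an error-free observer this gives 0.75 for 2AFC and 0.5 for yes-no, the two values quoted in the text.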

The psychometric function ψ*(φ) has to be adjusted for the observers' false positive and false negative rates. For this purpose the kernel function is shifted by n^{-1} + ε and scaled to the response range [ψ−∞, ψ+∞], whose width is, according to eq (1) and eq (2), equal to |1 − δ − n^{-1} − ε|:

\[ \psi(\phi) = n^{-1} + \varepsilon + \left(1 - \delta - n^{-1} - \varepsilon\right)\left(1 + e^{\beta^{*}(\theta - \phi)}\right)^{-1} \qquad \text{eq (6)} \]

ψ(φ) : adjusted psychometric function

In order to deal with a well-known constant, which is comparable between different magnitudes of stimuli, we let β be the slope at the inflection point of the normalized psychometric function. We define the threshold to be at a stimulus intensity of 0.5; thus we normalize the stimulus intensity to two threshold units, with the result of obtaining the 'real' slope in an equal-scaled plot (i.e. the slope is equivalent to the tangent of the gradient angle):

\[ \beta = \left.\frac{d\psi}{d\phi}\right|_{\phi = \theta} = \frac{\beta^{*}\left(1 - \delta - n^{-1} - \varepsilon\right)}{4}, \quad \text{that is} \quad \beta^{*} = \frac{4\beta}{1 - \delta - n^{-1} - \varepsilon} \qquad \text{eq (7)} \]

β : slope of the psychometric function at threshold (inflection point)

eq (7) inserted in eq (6) leads to:

\[ \psi(\phi) = n^{-1} + \varepsilon + \left(1 - \delta - n^{-1} - \varepsilon\right)\left(1 + e^{\frac{4\beta(\theta - \phi)}{1 - \delta - n^{-1} - \varepsilon}}\right)^{-1} \qquad \text{eq (8)} \]

Equation eq (8) is the underlying, generic formula for the threshold estimation by the

best-PEST calculator. Figure 14 depicts the mapping of eq (8) with different parameter

settings:

n ∈ {2, 4, ∞}

β ∈ {1.5, 3, 7}

ε = 0.07

δ = 0.04

[Figure 14: positive-response rate ψ plotted against the normalized stimulus intensity φ [2θ].]

Figure 14 Logistic psychometric graphs depicting yes-no and forced-choice situations (nAFC). The asymptotes are at (1/n + ε) and at (1 − δ). The slope β is 3 (solid lines), and 7 and 1.5 (dashed lines), respectively. The stimulus intensity is normalized to 2 threshold units.
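Curves of this kind can be evaluated with a short Python sketch. It assumes the adjusted logistic form described in this section: the kernel is shifted by 1/n + ε, scaled by 1 − δ − 1/n − ε, and its steepness is set to 4β divided by that scaling, so that β is the slope at threshold. The function name, the argument names, and the convention `n=None` for the yes-no case are illustrative choices, not part of the thesis.

```python
import math

def psychometric(phi, theta=0.5, beta=3.0, n=None, eps=0.07, delta=0.04):
    """Adjusted logistic psychometric function on a normalised stimulus axis."""
    guess = 0.0 if n is None else 1.0 / n
    elevation = guess + eps                 # lower asymptote
    scale = 1.0 - delta - guess - eps       # width of the response range
    steepness = 4.0 * beta / scale          # kernel steepness for slope beta
    kernel = 1.0 / (1.0 + math.exp(steepness * (theta - phi)))
    return elevation + scale * kernel

# Sample the 2AFC curve with beta = 3 at a few normalised intensities.
curve_2afc = [psychometric(x / 10.0, n=2) for x in range(11)]
```

With these settings the value at φ = θ is the midpoint of the response range, and the slope there equals the chosen β regardless of n, ε and δ.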

Typically, psychometric functions are, as depicted in Figure 14, statistical in nature (unless they represent a Heaviside step function with its 'step' at the threshold value). That is, when an observer is presented on several occasions with the same stimulus, he or she is likely to respond yes on some trials and no on other trials. Thus, the threshold cannot be defined as the stimulus value below which detection never occurs and above which detection always occurs, but rather as the stimulus value which is perceptible in a predefined percentage (usually 50%) of the trials. Experimenters are confronted with the question of how to determine the psychometric function of an experimental subject or of a particular study cohort. For that purpose classical psychophysics offers several methods, which we will not explain here in detail. Readers interested in this topic may consult the standard work of Gescheider (1997). To recapitulate, with these methods we determine the detectability of several stimulus intensities and fit an appropriate sigmoid-shaped curve to the data to obtain the psychometric function. From this function the 50% threshold, for instance, can be read off.
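The curve-fitting step can be illustrated with a minimal least-squares sketch. It uses a plain grid search over candidate parameters instead of a numerical optimiser, and the data in the usage lines are generated from the model itself; they are hypothetical and are not measurements from this thesis.

```python
import math

def logistic(phi, theta, steepness):
    """Yes-no logistic: rises from 0 to 1 and equals 0.5 at phi == theta."""
    return 1.0 / (1.0 + math.exp(steepness * (theta - phi)))

def fit_logistic(intensities, rates, thetas, steepnesses):
    """Return the (theta, steepness) pair with the smallest squared error."""
    best = None
    for theta in thetas:
        for s in steepnesses:
            sse = sum((logistic(x, theta, s) - y) ** 2
                      for x, y in zip(intensities, rates))
            if best is None or sse < best[0]:
                best = (sse, theta, s)
    return best[1], best[2]

# Hypothetical detection rates at nine stimulus intensities (model-generated).
intensities = [0, 25, 50, 75, 100, 125, 150, 175, 200]
rates = [logistic(x, 80, 0.05) for x in intensities]
theta_50, fitted_steepness = fit_logistic(
    intensities, rates, range(0, 201, 5), [0.01, 0.02, 0.05, 0.1])
```

The fitted `theta_50` is the stimulus value at which the curve crosses 50%, i.e. the threshold that would be read off the fitted function.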

In order to measure the empirical threshold, the experimenter must decide what stimulus intensities should be used in the experiment. It should be clear that choosing intensities that are all far above or below the threshold would provide little information towards an accurate estimation of the threshold. In addition to the problem of requiring a large number of trials to obtain the threshold, wasted trials are likely to occur with these methods, unless the testing range is known in advance. An approach with these characteristics is far from optimally efficient, and consequently adaptive methods for measuring thresholds have evolved.

3.5.3 Adaptive Psychophysical Procedures

In all adaptive procedures, the intensity of a stimulus presented on a particular trial is

determined by the observer's performance in detecting stimuli presented on prior trials.

Except for one class of procedures called maximum-likelihood methods, all other methods

described in Gescheider (1997) rely on more or less heuristic rules that specify after how many trials

and by how much the presented stimulus intensity has to be adjusted. Even though it is a

characteristic of all adaptive procedures to recall information from the past history of an

experimental run, only the maximum-likelihood procedures determine the next stimulus

presentation based on a statistical estimation of the observer's threshold, which is made

from all of the results obtained from the beginning of the run. The statistical technique

of maximum-likelihood estimation assumes that the underlying psychometric function

has a specific form. For example it could be a Gaussian (the cumulative normal distribu­

tion), logistic, Weibull, or some other sigmoid-shaped function. Because these functions

have similar forms, the estimated thresholds are not greatly different, and the choice may

only be of importance if e.g. a particular perception model is under test. In the following


we describe the best-PEST method (Pentland, 1980), which uses the logistic function as

underlying model (see footnote P).

Maximum-Likelihood: best-PEST

In best-PEST the approach taken to the problem of determining a threshold is to

maximise the information gained with each measurement. In so doing the smallest possi­

ble number of measurements will be required. First we derive the choice of the sampling

point on the psychometric function:

For any value φ of the stimulus range [0, k], there is a probability ψ of a positive answer. Given N samples taken at φ, of which p were positive, our estimate of ψ is:

\[ \hat{\psi} = \frac{p}{N} \qquad \text{eq (9)} \]

ψ̂ : estimate of the probability of a positive response
p : number of positive responses
N : number of samples

the variance is

\[ \sigma^{2}_{\hat{\psi}} = \frac{\hat{\psi}\,(1 - \hat{\psi})}{N} \qquad \text{eq (10)} \]

σ²_ψ̂ : variance of estimation

and the confidence intervals are

\[ CI_{\hat{\psi}} = W \sqrt{\sigma^{2}_{\hat{\psi}}} \qquad \text{eq (11)} \]

CI_ψ̂ : width of the confidence interval about ψ̂
W : level of desired confidence (e.g. 0.95)

Equations eq (9) and eq (10) inserted in eq (11) lead to

\[ CI_{\hat{\psi}} = W \sqrt{\frac{p\,(N - p)}{N^{3}}} \qquad \text{eq (12)} \]

P PEST is the acronym for Parameter Estimation by Sequential Testing.

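To get the stimulus range φ corresponding to the confidence interval of the dependent variable, it has to be divided by the slope of the psychometric curve:

\[ CI_{\phi} = \frac{CI_{\hat{\psi}}}{d\psi / d\phi} = \frac{W}{d\psi / d\phi} \sqrt{\frac{p\,(N - p)}{N^{3}}} \qquad \text{eq (13)} \]

CI_φ : width of the confidence interval about φ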
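The two confidence-interval widths can be computed directly. The sketch below assumes the binomial variance ψ̂(1 − ψ̂)/N with ψ̂ = p/N, uses W as a plain multiplier as in the text, and takes the local slope of the psychometric curve as an input; the function and argument names are illustrative.

```python
import math

def ci_response(p, n_samples, w):
    """Width about the estimated response rate: W * sqrt(p (N - p) / N^3)."""
    return w * math.sqrt(p * (n_samples - p) / n_samples ** 3)

def ci_stimulus(p, n_samples, w, slope):
    """Width about the stimulus value: response-rate width divided by the slope."""
    return ci_response(p, n_samples, w) / slope

# Hypothetical example: 5 positive responses in 10 samples, W = 0.95.
narrow = ci_stimulus(5, 10, 0.95, slope=4.0)
wide = ci_stimulus(5, 10, 0.95, slope=1.0)
```

For the same responses, a steeper slope gives a proportionally narrower interval on the stimulus axis, which is the reason for sampling where the curve is steepest.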

Thus, in order to minimise the estimated confidence interval about the stimulus φ for a given number of trials we have to maximise the slope of the psychometric function. For all sigmoid-shaped functions, the steepest slope is located at the inflection point. In the rotationally symmetric logistic function used in best-PEST this point is at the centre of the curve. In the yes-no mode this is at 50% (if E=0 and S=1); in the 2AFC mode this is at 75% (if E=0.5 and S=0.5).
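The location of the steepest slope can be checked directly for the logistic kernel. The following short derivation is a sketch that uses ψ* for the kernel, β* for its steepness, θ for the threshold, and E and S for the elevation and scaling of the adjusted function:

```latex
\begin{align*}
\psi^{*}(\phi) &= \left(1 + e^{\beta^{*}(\theta - \phi)}\right)^{-1} \\
\frac{d\psi^{*}}{d\phi} &= \beta^{*}\,\psi^{*}(\phi)\,\bigl(1 - \psi^{*}(\phi)\bigr) \\
\max_{\phi}\frac{d\psi^{*}}{d\phi} &= \frac{\beta^{*}}{4}
  \quad\text{at } \psi^{*} = \tfrac{1}{2},\ \text{i.e. } \phi = \theta \\
\left.\frac{d\psi}{d\phi}\right|_{\phi=\theta} &= \frac{S\,\beta^{*}}{4},
  \qquad \psi(\phi) = E + S\,\psi^{*}(\phi)
\end{align*}
```

At φ = θ the adjusted function takes the value E + S/2, which is 0.5 for E=0, S=1 and 0.75 for E=0.5, S=0.5.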

In order to explain the best-PEST procedure we reformulate eq (8) and obtain the probability of getting a positive (if r=1) or negative (if r=-1) response at the i-th trial:

\[ P(r_i \mid \hat{\theta}_i, \phi) = E + S\left(1 + e^{\,r_i(\hat{\theta}_i - \phi)\,4\beta S^{-1}}\right)^{-1} \qquad \text{eq (14)} \]

r_i : response of the observer at the i-th trial, r_i ∈ {1, -1}
θ̂_i : i-th estimate of the threshold
E : elevation of the psychometric function according to eq (1)
S : scaling of the psychometric function to the response range according to eq (1) and eq (2)

The strategy in best-PEST is to calculate the likelihood of the sampling point being at each point within the testing range and to take as the new estimate the stimulus value that is assigned the highest likelihood. After N-1 trials, we find the N-th point of measurement by solving:

\[ \hat{\theta}_N = \arg\max_{\phi \in (0,k)} \prod_{i=1}^{N-1} P(r_i \mid \hat{\theta}_i, \phi) \qquad \text{eq (15)} \]

where (0, k) is the test range of the stimulus, P(r_i | θ̂_i, φ) is the response probability of eq (14), and (θ̂_i, r_i) denotes the result of the i-th measurement, which was taken at value θ̂_i.

The maximum likelihood estimator is known to be the most efficient unbiased estimator. One problem arises: the product of all the probability distributions approaches zero for large numbers of trials. To overcome this problem, we apply a logarithmic transformation to the likelihood function, thereby obtaining the sum instead of the product of all likelihood functions. That way, the log-likelihood functions do not run into underflow and need not be standardised to an overall probability of 1. Since the logarithmic transformation is strictly monotonically increasing, the locations of relative maxima are preserved:

\[ \arg\max_{x \in (a,b)} \prod_{i=1}^{N} f_i(x) = \arg\max_{x \in (a,b)} \sum_{i=1}^{N} \log f_i(x) \qquad \text{eq (16)} \]

For the case of the function used in eq (14), the N-th threshold estimate is calculated according to eq (15) and eq (16):

\[ \hat{\theta}_N = \arg\max_{\phi \in (0,k)} \sum_{i=1}^{N-1} \log\left(E + S\left(1 + e^{\,r_i(\hat{\theta}_i - \phi)\,4\beta S^{-1}}\right)^{-1}\right) \qquad \text{eq (17)} \]

Figure 15 depicts the expansion of the log-likelihood functions according to eq (17).

The parameter settings in Table 9 are used:

Table 9 Parameter settings of the curves depicted in Figure 15.

Parameter   A = 2AFC                              B = yes/no
N           10                                    10
E           0.5                                   0
S           0.5                                   1
β           2                                     2
r           {1, 1, -1, 1, 1, -1, 1, 1, 1, -1}     {1, -1, 1, -1, 1, 1, -1, -1, 1, -1}

[Figure 15: panels A and B; horizontal axis: stimulus intensity φ from 0 to k.]

Figure 15 Expansion of the log-likelihood functions in the stimulus interval [0, k] of the adaptive procedure best-PEST. Circles indicate the relative maxima; dashed lines show the progression of the threshold convergence. Bold lines represent the predefined initialisations; thin lines are calculated according to the responses r. A: 2-alternative forced-choice (2AFC) paradigm. B: yes-no paradigm.
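The general idea of a maximum-likelihood tracking procedure of this kind can be sketched in Python. This is not the implementation used in the thesis. It makes several assumptions that are labelled in the code: the likelihood of a negative response is taken as one minus the probability of a positive response, the steepness is given per stimulus unit, the candidate thresholds lie on a fixed grid, and the run is initialised with one imagined positive response at the top of the range and one imagined negative response at the bottom. The simulated observer is noise-free and purely illustrative.

```python
import math

def p_positive(stimulus, threshold, steepness, elevation, scale):
    """Logistic model: E + S / (1 + exp(steepness * (threshold - stimulus)))."""
    return elevation + scale / (1.0 + math.exp(steepness * (threshold - stimulus)))

class BestPestSketch:
    def __init__(self, k, step, steepness, elevation, scale):
        self.candidates = list(range(0, k + 1, step))   # candidate thresholds
        self.steepness, self.elevation, self.scale = steepness, elevation, scale
        self.log_lik = [0.0] * len(self.candidates)
        # Assumed initialisation: imagined responses at both ends of the range.
        self.update(k, +1)
        self.update(0, -1)

    def update(self, stimulus, response):
        """Add the log-likelihood of one response to every candidate threshold."""
        for j, c in enumerate(self.candidates):
            p = p_positive(stimulus, c, self.steepness, self.elevation, self.scale)
            p = min(max(p, 1e-9), 1.0 - 1e-9)            # keep log() finite
            # Assumption of this sketch: P(negative) = 1 - P(positive).
            self.log_lik[j] += math.log(p if response > 0 else 1.0 - p)

    def next_stimulus(self):
        """Candidate with the highest summed log-likelihood."""
        best = max(range(len(self.candidates)), key=self.log_lik.__getitem__)
        return self.candidates[best]

def simulate(true_threshold, n_trials=12):
    """Run the tracker against an idealised, noise-free yes-no observer."""
    tracker = BestPestSketch(k=400, step=10, steepness=0.05, elevation=0.0, scale=1.0)
    stimuli = []
    for _ in range(n_trials):
        x = tracker.next_stimulus()
        stimuli.append(x)
        tracker.update(x, +1 if x >= true_threshold else -1)
    return stimuli, tracker.next_stimulus()

presented, estimate = simulate(true_threshold=100)
```

Summing logarithms rather than multiplying probabilities keeps the computation away from numerical underflow, as discussed above, and the grid step plays the role of the smallest stimulus increment.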


4 Experiments

This chapter consists of the description and the results of the conducted threshold determination experiments. The first part concerns experiments in which single subjects interact with the computer, aiming to determine perception thresholds for relative and absolute delays. Relative delay thresholds are obtained for auditory and visual stimuli in both orders. Absolute delay thresholds are obtained for the interaction with either vocal or mouse input, and the corresponding visual response. The second part describes experiments in which pairs or triples of subjects interact over a videoconference using an emulated communication network. This part consists of experiments aiming to determine absolute delay thresholds for basic auditory and visual interaction as well as for realistic communication tasks.

4.1 In Human-Computer Interaction (HCI) Mode

The experiments conducted in the human-computer interaction (HCI) mode comprise threshold determinations where the experimental subjects receive computer-generated stimuli triggered by the subjects' inputs. Such experiments can be conducted without using an emulated communication network, and require single subjects only. This makes these experiments easier to control, since there is no group dynamic aspect present. Furthermore, due to the plain technical infrastructure of such experiments, much less effort is needed to install and calibrate the whole setup. The HCI experiments consist of the following threshold determinations:

• Relative delay thresholds: auditory before visual (condition AV), and visual before auditory (condition VA) (Zuberbühler et al., 2002).

• Absolute delay thresholds: vocal trigger - visual response (condition VocVis), and mouse trigger - visual response (condition MouVis) (Zuberbühler et al., 2003).

4.1.1 Experimental Setup

The experimental setup, including stimulus presentation, best-PEST algorithm, and data acquisition, was implemented in a fully computerised environment using Macromedia Director's object-oriented scripting language Lingo. The temporal resolution of the entire system was in the range of ±5 ms, whereas the minimal increment administered by the adaptive procedure was set to 10 ms. The settings of the particular parameters are listed in Table 10.

Table 10 Parameter settings for threshold determinations in HCI (for an explanation of the parameters see the Annex on page 101).

Parameter               Relative Delay (AV and VA)   Absolute Delay (VocVis and MouVis)
Mode                    2AFC                         2AFC
Start value k           400 ms                       610 ms
Smallest step size      10 ms                        10 ms
Termination criterion   12 trials                    12 trials
Slope of best-PEST β    1.75                         1.75
False negative δ        0                            0
False positive ε        0                            0
Mean of x trials        3                            3
Runs per subject        3                            2 + 1 training

4.1.2 Procedure

We used 2AFC tasks, and applied the adaptive procedure called best-PEST, suggested by Pentland (1980), in all experiments investigating HCI. Best-PEST is described in section 3.5.3 (page 59). 2AFC tasks are designed to eliminate the biasing influence of the observers' decision criterion, and best-PEST is assumed to deliver thresholds after the smallest possible number of trials. In all experiments the subjects received their instructions via written text, displayed at the appropriate time during the experimental run. Their task was to detect whether a delay appeared on the right or on the left side of the screen. The position of the delay was randomly balanced. In addition, every six trials we presented one intermittent trial with a large delay. This trial neither contributed to the results nor influenced the best-PEST estimation. This particular procedure was chosen based on insights gained in pre-tests, in which subjects tended to become bored after reaching their approximate threshold (at this point the stimulus is very faint). The presentation of an intermittent trial with a large delay gives the subject the experience of success, resulting in increased motivation.
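A trial schedule with these properties can be sketched as follows. The sketch rests on two interpretations that are assumptions, not statements from the thesis: "randomly balanced" is read as an equal number of left and right delay positions in shuffled order, and the intermittent easy trial is inserted after every sixth adaptive trial. All names are illustrative.

```python
import random

def make_schedule(n_trials, easy_every=6, rng=None):
    """Return a list of (kind, side) tuples for one experimental run.

    kind is 'adaptive' (delay set by the adaptive procedure) or 'easy'
    (large delay, ignored by the threshold estimation).
    """
    rng = rng or random.Random()
    sides = (['left', 'right'] * ((n_trials + 1) // 2))[:n_trials]
    rng.shuffle(sides)                        # balanced but unpredictable order
    schedule = []
    for i, side in enumerate(sides, start=1):
        schedule.append(('adaptive', side))
        if i % easy_every == 0:               # motivational trial, not analysed
            schedule.append(('easy', rng.choice(['left', 'right'])))
    return schedule

run = make_schedule(12, rng=random.Random(0))
```

Only the entries marked 'adaptive' would be passed on to the threshold estimation; the 'easy' entries are presented to the subject but otherwise discarded.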

In the following we describe those parts of the procedure that were not common to

all threshold determination experiments.

Relative Delay: Auditory before Visual (AV) and Visual before Auditory (VA)

The delay occurred between the presentation of a black disc (diameter of 4 arc degrees) on a yellow background and the presentation of a 1 kHz tone of 60 dBA with a rise/fall time of 10 ms. Both disc and tone lasted for 500 ms; therefore stimulus onsets as well as stimulus offsets served as cues for delay. Stimulus order (auditory before visual or vice versa) was randomly chosen. That way subjects could not gain insight into the logic of the best-PEST procedure, and should not have been able to predict the next trials.

[Figure 16: timeline of one trial showing the visual and the auditory stimulus for both presentations; time axis t [ms] from 0 to 1750.]

Figure 16 Test sequence of one trial. The duration of Δt is equal to the maximum likelihood of the threshold computed in the best-PEST procedure. The occurrence of Δt is randomly balanced between the left and the right side of the screen. The question after each sequence was: »On which side did you perceive a time difference between sound and picture?«

Seven female and nine male subjects (aged between 25 and 56, mean=32) participated in the 20-minute experiment. The experimental design was within-group, i.e. all subjects performed all conditions. The 1 kHz hearing threshold level and the visual acuity of the subjects were tested by means of the audiometer Bosch ST10 and the Landolt-ring acuity chart. All subjects had normal hearing and normal or corrected-to-normal vision. Each subject had to complete the threshold procedure three times for both orders, resulting in a total of 96 threshold estimations. The median of three threshold estimations per person was taken for further analysis.

Absolute Delay: Vocal Trigger - Visual Response (VocVis)

The delay occurred between a vocal trigger and the disappearance of a visual stimulus

on the screen. As visual stimulus, a black cipher with a height of 4 arc degrees appeared

on a white background. The subjects were told to pronounce the displayed cipher. As

soon as the sound level of their voice exceeded 65.5 dBA the cipher disappeared either

after a small, system-inherent delay or after the delay calculated by best-PEST. A sound

level of 65.5 dBA was chosen as the trigger point because (1) this sound level represented

the average voice level of subjects and (2) it is high enough to avoid the disappearance of

the cipher due to background noise.
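The trigger logic described above amounts to finding the first moment at which the measured sound level exceeds the trigger level. The sketch below assumes that the sound level is available as a sequence of successive readings in dBA; the function name and the sample values are hypothetical.

```python
def first_trigger_index(levels_dba, threshold_dba=65.5):
    """Index of the first reading above the trigger level, or None if there is none."""
    for i, level in enumerate(levels_dba):
        if level > threshold_dba:
            return i
    return None

# Hypothetical level readings: background noise followed by the onset of speech.
readings = [41.0, 43.5, 52.0, 65.5, 67.2, 70.1]
trigger_at = first_trigger_index(readings)
```

A reading exactly at 65.5 dBA does not trigger in this sketch, since the text speaks of the level exceeding the trigger point.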

[Figure 17: timeline of one trial for the left and the right presentation, showing the visual stimulus, the voice input, the trigger point T (trigger on 65.5 dBA) and the disappearance D of the stimulus; time axis t [ms].]

Figure 17 Test sequence of one trial of the voice-visual interaction threshold experiment. The duration of Δt is equal to the maximum likelihood of the threshold computed in the best-PEST procedure. The occurrence of Δt is randomly balanced between the left and the right side of the screen. t0 is the average response latency of the microphone device. The question after each sequence was: »On which side did you perceive a delayed disappearance of the cipher?«


Absolute Delay: Mouse Trigger - Visual Response (MouVis)

The delay occurred between a mouse trigger and the appearance of a visual stimulus

on the screen. The visual stimulus consisted of a red square with a side length of 5 arc

degrees on a white background. The subjects were told to click into a white square,

whereupon the square changed its colour to red, either immediately or after a delay calcu­

lated by best-PEST.

[Figure 18: timeline of one trial for the left and the right presentation, showing the visual stimulus and the trigger point T (trigger on mouse up); time axis t [ms].]

Figure 18 Test sequence of one trial of the click-visual interaction threshold experiment. The duration of Δt is equal to the maximum likelihood of the threshold computed in the best-PEST procedure. The occurrence of Δt is randomly balanced between the left and the right side of the screen. The question after each sequence was: »On which side did you perceive a delayed change of colour?«

Seven female and 17 male subjects (aged between 19 and 41) were recruited for the two experiments testing two different input modalities (VocVis and MouVis). The experimental design was within-group, i.e. all subjects performed all conditions. Each of the experiments lasted approximately 20 minutes. Each subject had to complete the threshold procedure three times for both modalities, resulting in a total of 144 threshold estimations. The first threshold estimation in each condition was considered as practice and was therefore excluded from further analysis.

4.1.3 HCI-Results

Relative Delay: A=Auditory before Visual (AV) and B=Visual before Auditory (VA)

Figure 19 shows the logistic psychometric functions for the perception of relative delays (curve fitting by means of the method of least squares). The 75% thresholds obtained are 74 ms (AV) and 98 ms (VA), respectively. The thresholds obtained by the adaptive procedure best-PEST are 71 (±17) ms for the AV-condition and 105 (±25) ms for the VA-condition (numbers in brackets stand for the 95% confidence intervals; see Figure 20). A one-sided, paired t-test shows that the mean of the VA-threshold is significantly higher (p<0.05) than the mean of the AV-threshold. Gender and age had no significant effect on the detection of relative delays.
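For readers who want to retrace this kind of comparison, the test statistic of a paired t-test can be computed with the Python standard library. The sketch below only computes the t value and the degrees of freedom; the p-value would have to be looked up in a t table or computed with a statistics package. The numbers in the usage lines are made up for illustration and are not data from this experiment.

```python
import math
import statistics

def paired_t(x, y):
    """t statistic and degrees of freedom for paired samples (mean of x - y)."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1

# Made-up per-subject thresholds in ms for two conditions (illustration only).
condition_a = [110, 95, 120, 105]
condition_b = [80, 90, 85, 70]
t_value, dof = paired_t(condition_a, condition_b)
```

In a one-sided test of whether the first condition yields higher thresholds, a large positive t value counts as evidence for the difference.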

[Figure 19: fitted psychometric curves for Auditory before Visual (AV) and Visual before Auditory (VA); horizontal axis: Delay (ms), 0 to 400.]

Figure 19 Psychometric functions for perception of relative AV delay (solid line) and relative VA delay (dashed line). Arrows indicate the 75% thresholds, which are at 74 ms (AV) and 98 ms (VA), respectively. The data are fitted with a logistic model.

[Figure 20: two panels (A: audio before visual, B: visual before audio; n=16) plotting the delay (ms, 0 to 200) against the number of trials (1 to 11), with plus and minus 95% C.I. and the final threshold marked.]

Figure 20 Delays calculated by best-PEST for every trial. The curves represent the mean of all subjects. Continuous lines show the progression of the threshold convergence. Dashed lines indicate the final thresholds; grey lines indicate the 95% confidence interval. A: Auditory before visual, B: Visual before auditory.

Absolute Delay: VocVis and MouVis

Figure 21 shows the logistic psychometric functions for the perception of absolute delays (curve fitting by means of the method of least squares). The 75% thresholds obtained are 98 ms when vocal inputs trigger visual responses (VocVis), and 65 ms when click inputs trigger visual responses (MouVis). The thresholds obtained by the adaptive procedure best-PEST are 115 (±23) ms for the VocVis-condition and 78 (±14) ms for the MouVis-condition, respectively (numbers in brackets stand for the 95% confidence intervals). A one-sided, paired t-test shows that the mean of the VocVis-threshold is significantly higher (p=0.01) than the mean of the MouVis-threshold. Gender and age had no significant effect on the detection of absolute delays in HCI.

[Figure 21, legend: Mouse Trigger / Visual Response; Voice Trigger / Visual Response. x-axis: Delay (ms), 0 to 600.]

Figure 21 Psychometric functions for delay perception between mouse trigger and visual response (straight line) as well as between voice trigger and visual response (dashed line). Arrows indicate the 75% thresholds.

[Figure 22, panels A (n=24, voice trigger-visual response) and B (n=24, mouse trigger-visual response): delay in ms (0 to 350) plotted against # of trials (1 to 12), with plus and minus 95% C.I. and the final threshold (voice-visual, mouse-visual) marked.]

Figure 22 Delays calculated by best-PEST for every trial. The curves represent the mean of all subjects. Continuous lines show the progression of the threshold convergence. Dashed lines indicate the final thresholds; grey lines indicate the 95% confidence interval. A: Voice trigger - Visual response, B: Mouse trigger - Visual response.

Table 11 summarises the results of the threshold determinations conducted in the HCI mode.

Table 11 Summary of results of the HCI experiments.

                                   Relative Delay                 Absolute Delay
                                   AV             VA              VocVis          MouVis
Threshold best-PEST                71 (±17) ms    105 (±25) ms    115 (±23) ms    78 (±14) ms
Threshold fitted model             77 ms          98 ms           98 ms           65 ms
Slope βs of the standardised
psychometric function
(threshold at 0.5)                 1.756          2.252           1.111           1.133
Experimental design                Within-group                   Within-group
Significance difference            AV < VA (p<0.05)               VocVis > MouVis (p=0.01)
Significance age (p<0.05)          no             no              no              no
Significance gender (p<0.05)       no             no              no              no

4.2 In Human-Human Interaction (HHI) Mode

The experiments conducted in the human-human interaction (HHI) mode comprise threshold determinations in which the experimental subjects interact with each other over a videoconference that uses an emulated ATM network infrastructure. Thus, in a strict sense, these experiments investigate network-mediated HHI. The participating subjects act as both stimulus producer and stimulus receiver. In contrast to the experiments described before, the intermediary computer system is not involved in producing stimuli, in the sense of newly created ones. Rather, it is used to process and transmit the human expressions, and to reproduce them as realistically as possible. The HHI experiments consist of the following threshold determinations:

• Absolute delay: Basic auditory interaction between two subjects (condition AudBas), and basic visual interaction between two subjects (condition VisBas).

• Absolute delay: Realistic audio-visual interaction between three subjects (condition AudVisReal), and realistic auditory interaction between three subjects (condition AudReal).

4.2.1 Experimental Setup

The experimental set-up depicted in Figure 23 consists of two (in the case of basic interactions) or three (in the case of audio-only and audio-visual interactions) videoconference stations. Each station accommodates the so-called ETHMICS Kubus (which contains the ETHMICS videoconferencing system developed at the Computer Engineering and Networks Laboratory (Rothlisberger, 1998), the ATM transmission hardware, and a built-in Macintosh computer), a monitor, a camera, a microphone and headphones. The workstations are connected via fibre passing through a system called ARES (Kurmann, 1997), which simulates the behaviour of ATM channels in real time, with performance degradations (such as delay or errors) for various network configurations and assumptions about background traffic. The whole set-up is supervised by a control station, which sends delay settings to ARES. The control station is also connected to the workstations, in order to ask the test participants periodically for their ratings of the delay (by means of a UDP-based client/server application). The ratings are sent back to the control station, where the next delay is calculated according to best-PEST.

[Figure 23, wiring diagram. Legend: Videoconference Network (ATM); Control Network (Ethernet); Recording (IEEE 845i); serial bus. Components shown: Kubus 2, Kubus 3, ARES, Recording station, Control station.]

Figure 23 Wiring of the experimental set-up.

Furthermore, the video signals from the videoconference cameras are displayed on a monitor in the observation area. Additionally, these signals are recorded on digital videotape and saved directly onto a hard drive. For further analysis, the data are encoded into MPEG-2 and burned to a DVD. The whole experimental set-up has a built-in one-way delay of about 65 ms (with buffered audio stream). This means that the no-delay situation presented to the subjects in fact has an absolute (sub-threshold) delay of 130 ms, i.e. twice the one-way delay. Table 12 lists the separate delays inherent in the particular videoconference components. It can be seen that the major delay contributions are due to the capturing and processing of visual information.
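As a quick plausibility check, the component delays of Table 12 can be summed to reproduce the one-way total of 65 ms and the 130 ms baseline. The following is only a sketch: the dictionary and function names are illustrative, and the ATM network entry, which Table 12 gives only as "<1 ms", is set to its upper bound of 1 ms as an assumption.

```python
# Component delays in ms, taken from Table 12 (Rothlisberger, 1998).
# Assumption: the ATM network entry ("<1 ms") is represented by its upper bound.
ONE_WAY_COMPONENTS_MS = {
    "CCD camera": 30.0,
    "Digitiser": 0.0,
    "JPEG encoder": 1.0,
    "Channel buffer (outgoing)": 0.5,
    "ATM network": 1.0,
    "Channel buffer (incoming)": 0.5,
    "JPEG decoder": 1.0,
    "Scaling and de-interlacing": 24.0,
    "Graphics card": 7.0,
}


def one_way_delay_ms(components):
    """Sum of all component delays along one direction of the link."""
    return sum(components.values())


def baseline_absolute_delay_ms(components):
    """Delay of the 'no delay' situation: twice the one-way delay."""
    return 2.0 * one_way_delay_ms(components)


def visual_share(components):
    """Fraction of the one-way delay caused by capturing and displaying video."""
    visual = (components["CCD camera"]
              + components["Scaling and de-interlacing"]
              + components["Graphics card"])
    return visual / one_way_delay_ms(components)
```

Under these assumptions, camera, scaling/de-interlacing and graphics card together account for most of the one-way delay, which is the point made in the paragraph above.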

Table 12 Minimal one-way delay in the videoconference network, subdivided into the particular components. Data from (Rothlisberger, 1998).

Component                      Average Delay [ms]
CCD-Camera                     30
Digitiser                      0
JPEG-Encoder                   1
Channel Buffer outgoing        0.5
ATM Network                    <1
Channel Buffer incoming        0.5
JPEG-Decoder                   1
Scaling and De-Interlacing     24
Graphics-Card                  7
TOTAL                          65

4.2.2 Procedure

All experiments investigating HHI are conducted with the above-specified set-up, employing the best-PEST procedure. It is not possible to approach the thresholds of all interacting subjects simultaneously, since the subjects share the same delay values, which are calculated from the responses of only one subject. Therefore, in all HHI experiments using adaptive methods, one subject has to be designated as the one whose threshold is determined. The remaining subjects contribute only their corresponding ratings. In the following, the particular procedures for the HHI experiments are described.

Absolute Delay: Basic Interaction Task (AudBas and VisBas)

The aim of this task was to evaluate the absolute delay threshold for basic auditory and visual interactions. For this purpose the experimental subjects had either to count from one to ten in alternating order (auditory condition), or to give hand signs in the same way (visual condition). One of the two subjects held the relevant information required for the best-PEST calculation. If the answer of this subject was correct, the delay value for the next trial was decreased; otherwise it was increased. The subjects were instructed to react as fast as possible after recognising the partner's expression. That way, the unknown reaction time could be better controlled, in the sense that no reasoning took place about the answer to give. Applying a 2AFC paradigm, this procedure had to be accomplished twice (see Figure 24): one course with an introduced delay computed by best-PEST, and another course without any additional delay. The delay was randomly introduced in either the first or the second course, and the subject's task was to indicate in which of the two courses the delay occurred. With this task we expect to measure the lowest possible threshold for absolute delays, since the subjects communicate with a maximal degree of interactivity.
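The adaptive step described above (lower the delay after a correct answer, raise it after a wrong one, always presenting the current maximum-likelihood estimate of the threshold) can be illustrated with a minimal sketch. It is not the implementation used in the experiments: the class name, the grid of candidate thresholds, the slope value and the chance-corrected logistic form for the 2AFC case are all assumptions made for illustration.

```python
import math


class BestPestSketch:
    """Maximum-likelihood threshold tracking on a fixed grid of candidate delays."""

    def __init__(self, candidates_ms, slope_per_ms=0.05, chance=0.5):
        self.candidates = list(candidates_ms)   # hypothetical candidate thresholds
        self.slope = slope_per_ms               # assumed slope of the logistic
        self.chance = chance                    # 0.5 for a 2AFC task
        self.log_lik = [0.0] * len(self.candidates)

    def p_correct(self, delay_ms, threshold_ms):
        """Assumed 2AFC model: chance level plus a logistic rise around the threshold."""
        core = 1.0 / (1.0 + math.exp(-self.slope * (delay_ms - threshold_ms)))
        p = self.chance + (1.0 - self.chance) * core
        return min(max(p, 1e-12), 1.0 - 1e-12)  # keep the logarithm finite

    def update(self, delay_ms, correct):
        """Add the log-likelihood of one response for every candidate threshold."""
        for i, theta in enumerate(self.candidates):
            p = self.p_correct(delay_ms, theta)
            self.log_lik[i] += math.log(p if correct else 1.0 - p)

    def next_delay(self):
        """Delay for the next trial: the currently most likely threshold."""
        best = max(range(len(self.candidates)), key=lambda i: self.log_lik[i])
        return self.candidates[best]
```

In use, the experimenter would call `update(delay, correct)` after each trial of the designated subject and present `next_delay()` in the following trial. With this model a correct answer can only move the estimate downwards and a wrong answer only upwards, which matches the behaviour described in the text.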

[Figure 24: timelines of subjects A and B over t [ms] for the two courses (1) and (2) of one trial.]

Figure 24 Test sequence of one trial consisting of two courses with five stimulus expressions per subject. The stimulus s is of auditory or visual type. τ0 is the built-in delay plus reaction time. Δt is the transit delay, equal to the maximum-likelihood estimate of the threshold computed with the best-PEST algorithm. The occurrence of Δt is randomly balanced between the first and the second course. The question after each sequence was: »In which course did you perceive a delay?«

Six female and 14 male subjects (aged between 21 and 35, mean=25) completed the basic interaction task, giving a total of 640 ratings for different delay values. The experimental design was within-group, i.e. all subjects performed all conditions. The subjects were mainly recruited on campus and received 10 CHF for participation in the 20-minute experiment.

Absolute Delay: Realistic Task (AudVisReal and AudReal)

The aim of this task was to evaluate the absolute delay thresholds for a realistic communication scenario. Evoking natural conversations between subjects who are captured by video cameras and observed by experimenters is still a great challenge. Therefore, the least we can do is to select a discussion topic that is familiar to the target group acting as experimental subjects. As for the basic task, the subjects for this task were mainly university students. We consider them to be familiar with the problems of shared flats, either from experience or from hearsay, and consequently they should have a well-founded opinion to communicate. This makes the topic suitable for discussion in the experiment.

The task was structured in two parts. At first the subjects introduced themselves over the videoconference (condition AudVisReal) or over the audio channel (condition AudReal). They could do so autonomously or according to predefined questions to ask each other. During this phase the supervisor introduced either pronounced delay values or no delay and informed the subjects accordingly, in order to acquaint them with the delay issue. In the second phase the subjects were required to communicate freely according to the following scenario: one of the three subjects has rented a four-room flat and needs to find two flat-mates. The remaining two subjects play the prospective flat-mates. As discussion hints they were given keywords such as shopping and food, visits of friends, or cleaning regime. Furthermore, they had floor plans of the flat that were to be used for the room allocation. During the second phase delay was introduced according to the best-PEST algorithm, using a yes-no paradigm, i.e. after one minute of conversation the subjects were asked whether they perceived a delay or not. For these experiments we ran two interleaved best-PEST calculations: one aiming to approach the perception threshold, and another aiming to approach the acceptance threshold. After having perceived a delay, the subjects were asked whether it was disturbing or not.

In the audio-visual condition, 30 female and 47 male subjects (aged between 20 and 45, mean=24) completed the realistic task, giving a total of 954 perception and 602 acceptance ratings for different delay values. The subjects received 30 CHF for participation in the 45-minute experiment. In the audio-only condition, 8 female and 22 male subjects (aged between 19 and 32, mean=24) completed the realistic task, giving a total of 438 perception and 326 acceptance ratings for different delay values. The subjects received 10 CHF for participation in the 30-minute experiment.

4.2.3 HHI-Results

Absolute Delay: Basic Interaction Tasks (AudBas and VisBas)

Figure 25 shows the logistic psychometric functions for absolute delays. The 75 % thresholds are at 196 ms (AudBas) and 204 ms (VisBas). The thresholds obtained by the adaptive procedure best-PEST (see Figure 26) are 216 (±44) ms for AudBas and 237 (±92) ms for VisBas (numbers in brackets are the 95% confidence levels). A one-sided, paired t-test shows that the two means are not significantly different (p>0.05). Gender and age had no significant effect on the detection of basic interaction delays.

[Figure 25, legend: Visual Perception of Absolute Delay; Auditory Perception of Absolute Delay. x-axis: Absolute Delay (ms), 0 to 480; y-axis ticks from 0.25 to 1.00.]

Figure 25 Psychometric functions for absolute delay perception with auditory interaction (straight line) and visual interaction (dashed line). Arrows indicate the 75% thresholds. The experimental data are fitted with a logistic model. The data points represent rates of correct answers for particular delay values that have been obtained with 30 or more measurements.

[Figure 26, panels A (n=9, auditory interaction) and B (n=9, visual interaction): delay in ms (0 to 500) plotted against # of trials (1 to 16), with plus and minus 95% C.I. and the final threshold (auditory, visual) marked.]

Figure 26 The curves represent the mean of all subjects. Continuous lines show the progression of the threshold convergence. Dashed lines indicate the final thresholds; grey lines indicate the 95% confidence interval. A: Auditory interaction, B: Visual interaction.

Absolute Delay: Realistic Task (AudVisReal)

Figure 27 shows logistic psychometric functions for perception and acceptance when interacting audio-visually in a realistic task (condition AudVisReal) (curve fitting by means of the method of least squares). The 50 % perception threshold obtained is 1220 ms, and the 50 % acceptance threshold is 2080 ms.

[Figure 27, legend: Perception of Delay; Non-Acceptance of Delay. x-axis: Absolute Delay (ms), 0 to 2800; y-axis ticks from 0.25 to 1.00.]

Figure 27 Psychometric functions for absolute delay perception (straight line) and acceptance (dashed line) in a realistic, conversational task using both the audio and the visual channel. Arrows indicate the 50% thresholds. The experimental data are fitted with a logistic model. The data points represent rates of yes-answers for particular delay values that have been obtained with 100 or more measurements.

Absolute Delay: Realistic Task (AudReal)

The thresholds of absolute delays obtained by the adaptive procedure best-PEST in a task where only the audio channel is supported are 970 (±330) ms for perception and 1760 (±410) ms for acceptance (numbers in brackets are the 95% confidence levels). A one-sided, paired t-test shows that the mean perception threshold is significantly lower (p<0.01) than the mean acceptance threshold. Gender and age had no significant effect on the perception and acceptance of absolute delays. Figure 28 shows the particular logistic psychometric functions, obtained by curve fitting by means of the method of least squares. The 50 % thresholds obtained are 800 ms (perception) and 1690 ms (acceptance), respectively.

[Figure 28, legend: Perception of Delay; Non-Acceptance of Delay. x-axis: Absolute Delay (ms), 0 to 2800; y-axis ticks from 0.25 to 1.00.]

Figure 28 Psychometric functions for absolute delay perception (straight line) and acceptance (dashed line) in a realistic, conversational task using only the audio channel. Arrows indicate the 50% thresholds. The experimental data are fitted with a logistic model. The data points represent rates of yes-answers for particular delay values that have been obtained with 30 or more measurements.

Table 13 summarises the results of the threshold determinations conducted in the HHI mode. Note that in the AudVisReal condition, no best-PEST procedure was applied, thus no individual thresholds were obtained. As a consequence it is not possible to quote confidence levels and statements about significance.

Table 13 Summary of results of the HHI experiments (n.a. means: not available).

                                     Basic Interaction               Realistic Task
                                     AudBas          VisBas          AudVisReal         AudReal
Perception Threshold best-PEST       216 (±44) ms    237 (±92) ms    n.a.               970 (±330) ms
Perception Threshold fitted model    196 ms          204 ms          1220 ms            800 ms
Acceptance Threshold best-PEST       n.a.            n.a.            n.a.               1760 (±410) ms
Acceptance Threshold fitted model    n.a.            n.a.            2080 ms            1690 ms
Slope βs of the standardised
psychometric function                3.316           3.162           0.8889 (perc.)     1.056 (perc.)
(threshold at 0.5)                                                   1.575 (accept.)    2.324 (accept.)
Experimental design                  Within-group                    Between-group
Significance difference (p<0.05)     no                              n.a.
Significance age (p<0.05)            no              no              n.a.               no
Significance gender (p<0.05)         no              no              n.a.               no

5 Discussion and Conclusions

In this chapter we discuss the results of the experiments described in the previous chapter. The discussion is divided into relative delays, absolute delays, and a section where we discuss the task dependency of the perception and acceptance of delay.

5.1 Regarding Relative Delays

5.1.1 In Human-Computer Interaction (HCI)

The summary of the HCI experiments in Table 11 (page 74) shows that the perception threshold of a visual stimulus preceding an auditory stimulus is approximately 30 ms higher than the perception threshold of stimuli in the reverse order. This is plausible since it reflects human experience in a natural environment, where the propagation speed of light is much higher than that of sound. Thus, humans are adapted to this situation and thereby less sensitive to it. Other studies (Dixon et al., 1980; McGrath et al., 1985; Lewkowicz, 1996) found that synchronisation errors are detected more easily the more artificial the presented situation is. For our experiment, the chosen presentation is highly artificial. Thus, we consider the thresholds we found to be suitable for the most stringent conditions, as might be present e.g. in telesurgery applications. As suggested, just noticeable relative delays may serve as a decision support for content and service providers and network planners, in the sense that below these values users will not benefit from optimisation of the network with respect to relative delays.

Since the detection performance for relative delays is distributed over user populations, it is, from a service provider's point of view, a 'political' question what percentage of users may be allowed to perceive a particular relative delay. Such detailed information can be calculated from the psychometric functions of Figure 19 (page 70) (see Table 15). For this purpose, the inverse of the generic psychometric function of eq (8) (page 57) is determined:

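\[
\phi = \theta \left[ 1 - \frac{\ln\left(100\,\psi^{-1} - 1\right)}{2\beta_s} \right] \qquad \text{eq (18)}
\]

$\phi$ : delay [ms], $\{\phi \in \mathbb{R} \mid \phi \ge 0\}$

$\psi$ : user percentage [%], $\{\psi \in \mathbb{R} \mid 0 < \psi < 100\}$

$\beta_s$ : standardised slope (i.e. threshold is at stimulus intensity of 0.5)

$\theta$ : threshold [ms]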

Table 14 Parameter values obtained from the resulting psychometric function of the relative delay experiments. These values inserted in eq (18) lead to the values listed in Table 15.

             AV        VA
θ [ms]       77        98
βs           1.756     2.252

Table 15 Relative delay values perceived by particular percentages of users. Reading example: It can be expected that not more than 25 % of users will detect an AV delay of 53 ms and a VA delay of 74 ms, respectively.

Percentage of users           Extent of asynchrony when           Extent of asynchrony when
detecting asynchrony in HCI   auditory precedes visual (AV)       visual precedes auditory (VA)
ψ [%]                         φ [ms]                              φ [ms]
5                             12                                  34
10                            29                                  50
25                            53                                  74
33                            61                                  83
50                            77                                  98
67                            92                                  113
75                            101                                 122
90                            125                                 146
95                            141                                 162
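The way Table 15 follows from eq (18) and the parameters of Table 14 can be sketched in a few lines. The function and variable names are illustrative and not part of the original work; because the published parameters are rounded, computed values may deviate from the tabulated ones by about 1 ms.

```python
import math


def delay_for_user_percentage(psi_percent, theta_ms, beta_s):
    """Inverse psychometric function of eq (18).

    psi_percent: percentage of users detecting the delay (0 < psi < 100)
    theta_ms:    threshold in ms (delay detected by 50 % of users)
    beta_s:      standardised slope
    """
    if not 0.0 < psi_percent < 100.0:
        raise ValueError("psi_percent must lie strictly between 0 and 100")
    return theta_ms * (1.0 - math.log(100.0 / psi_percent - 1.0) / (2.0 * beta_s))


# Parameters from Table 14: (threshold in ms, standardised slope).
RELATIVE_DELAY_PARAMS = {"AV": (77.0, 1.756), "VA": (98.0, 2.252)}

if __name__ == "__main__":
    for psi in (5, 10, 25, 33, 50, 67, 75, 90, 95):
        row = [round(delay_for_user_percentage(psi, theta, beta))
               for theta, beta in RELATIVE_DELAY_PARAMS.values()]
        print(psi, *row)
```

Note that for shallow slopes and small percentages the expression can become negative, i.e. fall outside the admissible range of delays; this is related to the small-delay problem discussed for absolute delays.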

Note that, for reasons of consistency with the adaptive method best-PEST, we used a logistic model to fit the data, thus assuming the users' detection performance to follow a logistic function. In some respects this procedure might look unusual, since measured values are commonly modelled by Gaussian distributions. But since the notation of the logistic function is more practicable and the two distributions differ only negligibly, we consider the chosen procedure more manageable in practice. This procedure is also supported by the fact that the threshold values obtained with the model fitting procedure are close to the threshold values obtained with the adaptive procedure, i.e. they are within the 95 % confidence interval. From these results it can be concluded that the logistic model is appropriate, at least in the middle of the response range, i.e. around the threshold value.

If we take a closer look at the results, taking into consideration the different processing times of auditory and visual stimuli, it appears that the threshold differences between AV and VA are inverted: considering the point in time at which auditory and visual stimuli are available to consciousness, it becomes obvious that a bigger internal time difference τAV than τVA is needed to perceive an audio-visual event as asynchronous (see Figure 29). At the AV-threshold value of 71 ms, the perceived difference τAV, in consideration of the different processing times for auditory (15 ms) and visual (55 ms) stimuli, takes a value of approximately 110 ms, whereas the perceived difference τVA at the VA-threshold value of 105 ms is approximately 65 ms.
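The arithmetic behind these two values can be made explicit. The sketch below assumes, as the paragraph does, fixed processing times of 15 ms (auditory) and 55 ms (visual) and simply compares the moments at which the two stimuli become available to consciousness; the function name is illustrative.

```python
AUDITORY_PROCESSING_MS = 15   # processing time quoted in the text
VISUAL_PROCESSING_MS = 55     # processing time quoted in the text


def perceived_difference_ms(external_delay_ms, first_processing_ms, second_processing_ms):
    """Internal time difference between two stimuli.

    The first stimulus is presented at t = 0, the second after external_delay_ms.
    Each becomes conscious after its own processing time.
    """
    first_conscious = first_processing_ms
    second_conscious = external_delay_ms + second_processing_ms
    return second_conscious - first_conscious


# AV: audio first, video follows at the best-PEST threshold of 71 ms.
tau_av = perceived_difference_ms(71, AUDITORY_PROCESSING_MS, VISUAL_PROCESSING_MS)

# VA: video first, audio follows at the best-PEST threshold of 105 ms.
tau_va = perceived_difference_ms(105, VISUAL_PROCESSING_MS, AUDITORY_PROCESSING_MS)
```

Here `tau_av` is 71 + 55 - 15 = 111 ms (approximately 110 ms) and `tau_va` is 105 + 15 - 55 = 65 ms, so the ordering of the two thresholds is reversed once processing times are taken into account.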

[Figure 29: timelines of auditory and visual stimuli at the AV and VA thresholds; x-axis: t [ms].]

Figure 29 Perceived time differences at the thresholds obtained from the relative delay experiments, when considering the processing time differences of auditory and visual stimuli.

Recent ERP findings (Molholm et al., 2002) suggest that the auditory component of an audio-visual event prepares early visual areas in the cortex for the awaited visual component. Under the assumption that this preparation leads to an earlier conscious perception of the visual stimulus, this effect could account for an equalisation of the differences between τAV and τVA, and could thereby indicate a universal time quantum within which multimodal synchrony is perceived. Following such argumentation, it would be necessary to know whether the complementary effect is also observed, i.e. whether the visual component of an audio-visual event is found to prepare the auditory cortex. To our knowledge, such an effect is not known and remains to be investigated. Although our data do not provide sufficient evidence to elucidate such effects, they can serve as a basis for formulating well-founded hypotheses.

5.2 Regarding Absolute Delays

5.2.1 In Human-Computer Interaction (HCI)

The results of Table 11 (page 74) show that it is significantly easier to perceive an absolute delay when interacting with the computer by mouse clicks rather than vocally. Or, more precisely: voice-visual interaction delays are less likely to be detected than click-visual interaction delays. The difference between the two thresholds is approximately 30 ms. Since the voice trigger is less distinct than the mouse trigger, this difference makes sense, i.e. the sharp onsets of mouse clicks facilitate the detection of delays, compared to the blurred onsets of vocal utterances.

Table 16 Parameter values obtained from the resulting psychometric function of the absolute delay experiments in HCI. These values inserted in eq (18) (page 86) lead to the values listed in Table 17.

             VocVis    MouVis
θ [ms]       98        64.8
βs           1.111     1.133

Table 17 Absolute delay values perceived by particular percentages of users. Reading example: It can be expected that up to 75 % of users will detect an absolute delay of 146 ms when interacting by voice, and up to 75 % of users will detect an absolute delay of 96 ms when interacting by mouse clicks.

Percentage of users          Absolute delay in            Absolute delay in
detecting absolute delays    VocVis interaction mode      MouVis interaction mode
in HCI, ψ [%]                φ [ms]                       φ [ms]
25                           50                           33
33                           67                           45
50                           98                           65
67                           129                          85
75                           146                          96
90                           195                          128
95                           228                          149

Table 11 (page 74) shows that the threshold obtained with the model fitting procedure is within the 95 % confidence interval of the mean threshold obtained with best-PEST. Thus, as in the relative delay experiments, we suggest that around threshold the logistic model fits the experimental data well. However, considering the psychometric function of Figure 21 (page 72), it appears that this is no longer true for small delays: the best fit of the logistic function intersects the ordinate at a point above 50 %. Translated to user percentages, this would mean that a certain percentage of users, say 10 %, would detect a nonexistent delay. Actually, this scenario is conceivable in experiments using the yes-no mode, when subjects give yes-answers without perceiving a delay (commonly expressed by the false alarm rate). But in forced-choice experiments, as in the case at hand, this is assumed not to happen; rather, forced-choice experiments are applied precisely because one aims to avoid such response bias. The following two reasons illustrate why these response biases are unlikely to happen:

• The subjects had to indicate in which presentation the delay occurred, not whether there was a delay. That way, the subjects could not pretend to perceive a delay.

• The presentation containing the delay was randomly distributed and varied following the best-PEST procedure; thus it could not be anticipated.

Thus, since the bias is unlikely to result from methodical weaknesses, we must reconsider and possibly revise the assumption that the logistic (and also the Gaussian) distribution maps the users' detection performance. At least for absolute delays, we have to take other distributions into account. Qualitatively, it seems that right-skewed distributions (e.g. Log-Normal and Poisson; see footnote q) could better match the data. In fact, Limpert et al. (2001) suggest that the Log-Normal distribution maps multiplicative biological processes better than the popular Gaussian distribution. Furthermore, they analysed arbitrary Gaussian measuring data and found that the Log-Normal distribution matched the data at least as well as the Gaussian distribution.

q Interestingly, in Pacemaker-Switch-Accumulator models (see section 3.4.3 on page 52) the scientific discourse concerns, among other things, the question of which distribution the pacemaker frequency is likely to follow. The hypothesis that the frequency follows a Poisson distribution is increasingly supported (Gibbon, 1992; Wearden et al., 2001).

For further delay experiments, we suggest fitting the obtained data with a cumulative Log-Normal or cumulative Poisson distribution, and using one of these distributions as the underlying function of the best-PEST procedure. The drawback of this procedure is that both suggested models are computationally impractical. As a consequence, numerical approximations of the two functions must be implemented.
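One possible numerical approximation of the cumulative Log-Normal function is sketched below. It assumes that the error function of a standard mathematics library (here `math.erf` from Python's standard library) is available; the parametrisation by median and shape parameter sigma, the function names and the chance-corrected 2AFC form are assumptions for illustration, not part of the experiments reported here.

```python
import math


def lognormal_cdf(delay_ms, median_ms, sigma):
    """Cumulative Log-Normal distribution evaluated at delay_ms.

    median_ms: delay at which the function reaches 0.5
    sigma:     shape parameter (standard deviation of the log-delay)
    """
    if delay_ms <= 0.0:
        return 0.0  # no detection at zero delay, unlike the logistic model
    z = (math.log(delay_ms) - math.log(median_ms)) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def p_correct_2afc(delay_ms, median_ms, sigma):
    """Assumed 2AFC psychometric function: 50 % chance level plus detection."""
    return 0.5 + 0.5 * lognormal_cdf(delay_ms, median_ms, sigma)
```

In this form the function passes through exactly 50 % correct answers at zero delay, so no user is predicted to detect a nonexistent delay, and the 75 % threshold coincides with the median.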

For the time being, we have to be satisfied with the data at hand. However, in order to decide whether the logistic model is appropriate for small delays, we suggest a heuristic rule for its use: if the standardised slope (i.e. where the threshold is set to the stimulus intensity of 0.5) of the particular psychometric function is greater than 1.7, the model will account for small delays. When this rule is met, one can expect the logistic function to intersect the ordinate at values smaller than 3.2 %. This means that less than 3.2 % of the users would detect a nonexistent delay. As can be seen in Table 16, the standardised slopes of the MouVis and the VoiVis conditions are smaller than 1.7. That is why we refrain from stating percentages of users detecting small delays for these conditions (see Table 17 on page 89).
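The heuristic rule can be checked numerically with a short sketch. It assumes that the logistic function is parameterised such that the standardised slope equals the gradient at the threshold (hence the factor 4); this form is an assumption chosen because it reproduces the quoted intercept of about 3.2 %, not a quotation of the model equation. The function names are hypothetical.

```python
import math

def logistic_standardised(x, slope):
    """Logistic psychometric function on the normalised axis, threshold at 0.5.

    Assumed form: the factor 4 makes `slope` the gradient at the threshold.
    """
    return 1.0 / (1.0 + math.exp(-4.0 * slope * (x - 0.5)))

def accounts_for_small_delays(slope, limit=1.7):
    """Heuristic rule: the model is used for small delays only above the limit."""
    return slope > limit

# Share of users predicted to report a delay although none is present,
# evaluated at the limiting slope of 1.7.
intercept = logistic_standardised(0.0, 1.7)
print(f"intercept at slope 1.7: {100.0 * intercept:.1f} %")
```

Steeper functions intersect the ordinate lower, so every slope above 1.7 yields a smaller predicted share of false detections.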

5.2.2 In Human-Human Interaction (HHI)

In contrast to HCI, in HHI experiments it is not possible to approach the thresholds

of all interacting subjects simultaneously, since the subjects share the same delay values

calculated on the response basis of only one subject. Therefore, in all HHI experiments

using adaptive methods, one has to assign a determining subject whose threshold is finally determined. The remaining subjects contribute only with their corresponding ratings. For this reason, the statistical power is not as high as could be expected from the chosen number of recruited subjects. In order to include all available information, we

therefore applied the curve fitting procedure already applied in the HCI experiments, i.e.

we fitted an assumed logistic model to the data by means of the method of least squares.

From the obtained psychometric function, the desired thresholds can be read out.

Basic Auditory and Visual Interaction

With the curve fitting procedure we found a threshold of 196 ms for auditory, and of

204 ms for visual interaction. These values agree (i.e. are within the 95 % confidence

level) with the threshold values obtained from the best-PEST procedure including only

half of the subjects (216 ±44 ms for auditory, 237 ±92 ms for visual interaction). Recapitulating the results from the basic interaction task, we found an absolute delay threshold of about 200 ms for both auditory and visual interactions. We should bear in mind that this value is a difference threshold DL, on the basis of the built-in delay of 130 ms plus

the reaction time of the subjects, which is about 190 ms (Brebner, 1980). The suggested

value has to be understood in the following way: When confronted with an absolute de­

lay of 320 ms, 50 % of the subjects were able to detect an additional delay of 200 ms.

These results are in line with the results from the HCI experiments (see also Zuberbühler et al., 2003), where we investigated the absolute delay between vocal input and delayed visual computer-generated response (condition VoiVis). This absolute delay is 115 (±23) ms, or approximately half of the present value. This makes sense, since in HCI experiments the subjects were not confronted with human interaction partners, and therefore did not have to take the ambiguous (and fluctuating) human reaction time into account.

In contrast to the findings in HCI experiments, the logistic model accounts in this case

also for small delays (i.e. the standardised slopes are greater than 1.7). For this reason we

can quote percentages of users detecting small delays (see Table 19).

Table 18 Parameter values obtained from the resulting psychometric function of the absolute delay experiments in HHI. These values inserted in eq (18) (page 86) lead to the values listed in Table 19.

Parameter     AudBas    VisBas
θ [ms]        196       204
β_s           3.316     3.162
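As a worked example, the delay detected by a given percentage of users can be computed from the two parameters of Table 18. The sketch below assumes a logistic function whose standardised slope is the gradient at the threshold, with the threshold placed at 0.5 of the normalised axis; eq (18) itself is not reproduced here, so this inverse is an assumption that happens to reproduce the values of Table 19. The function name is hypothetical.

```python
import math

def delay_for_percentage(p, theta_ms, slope_std):
    """Delay [ms] detected by the share p of users (0 < p < 1).

    Assumed model: logistic function on a normalised axis where the
    threshold theta sits at 0.5, i.e. the axis maximum is 2 * theta.
    """
    logit = math.log(p / (1.0 - p))
    return 2.0 * theta_ms * (0.5 + logit / (4.0 * slope_std))

for label, theta, slope in (("auditory", 196.0, 3.316), ("visual", 204.0, 3.162)):
    print(label, round(delay_for_percentage(0.75, theta, slope)), "ms at 75 %")
# prints 228 ms for auditory and 239 ms for visual interaction
```

At p = 0.5 the logit term vanishes and the function returns the threshold itself.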

Table 19 Absolute delay values perceived by particular percentages of users. Reading example: It can be expected that up to 75 % of users will detect an absolute delay of 228 ms in auditory HHI. And up to 75 % of users will detect an absolute delay of 239 ms in visual HHI.

Percentages of Users Detecting Absolute Delays in HHI

ψ [%]    Absolute delay, auditory [ms]    Absolute delay, visual [ms]
 5       109                              109
10       131                              133
25       164                              169
33       175                              181
50       196                              204
67       217                              227
75       228                              239
90       261                              275
95       283                              299

Realistic Tasks

As we have seen from the basic interaction task, involving two or more people in interactive tasks almost doubles perceived absolute delays. This effect becomes even more striking when the involved persons solve a realistic task instead of a task designed to facilitate the perception of absolute delays. For the task of free discussion about a familiar topic, we found a perception threshold of over 1200 ms, and an acceptance threshold of almost 2100 ms (see Figure 27 on page 81). Since there was no evidence from other studies to support such high values, we had dimensioned the experimental set-up only for a maximal delay of 2800 ms. In the course of the experiment, we noticed that several subjects did not even detect such a high delay, and a greater number of subjects did not find it disturbing.

Such circumstances make the adaptive procedure best-PEST difficult to apply, since after a few non-detections of the highest stimulus the algorithm requires a huge number of detection trials in order to return to the testing range. Hence we did not pursue further adaptive procedures, but instead presented particular delay values in a random order and recorded the respective ratings. As a consequence of this approach we were no longer able to determine individual thresholds, and thus cannot quote a confidence interval. Instead we applied the model-fitting procedure described earlier to obtain the threshold values, and the psychometric functions depicted in Figure 27. They show two things:

• The perception and the acceptance functions are relatively flat, signifying that either there exist no sharp thresholds (in this case one might discuss whether the term threshold is appropriate in this context), or there are great slope and threshold variances, i.e. some subjects have very good time discrimination skills, while others have moderate to poor ones. Based on qualitative observation of the subjects during the test, we tend to favour the latter explanation.

• The non-standardised slopes of the perception and the acceptance functions are essentially of the same size (1.02 versus 1.06). This fact may indicate that two similar, linearly interconnected mechanisms are involved in perceiving and rating absolute delays. It is understood that this hypothesis must be confirmed by further experimentation.

Our finding that the perception threshold is much greater in the realistic than in the basic task, as well as the inconsistent threshold figures found in the literature (see footnote r), suggest three conclusions:

• Perception and acceptance of absolute delays are very much task-dependent. Therefore it is probably not helpful to recommend universal threshold values; rather, they should be suggested for different task categories.

• The choice of value ranges is not a simple task and should be kept as a business strategy of the service provider.

• The main difference between the realistic and the basic task concerns the degree of interactivity. Whereas in the basic task this variable is assumed to be at maximum, it is at a considerably lower level in the realistic task, since the subjects spend more time studying the documents. Thus, the degree of interactivity could act as the variable upon which particular tasks (and communication settings) can be classified. We consider the degree of interactivity as the sum parameter including some of the verbal interaction parameters suggested by O'Conaill et al. (1993): backchannels, interruptions, explicit handovers and number of turns.

r Bouch for instance suggests a value no greater than 400 ms (Bouch et al., 2000b), whereas Isaacs and Tang suggest a delay of between 640 ms and 840 ms to be acceptable (Isaacs et al., 1994) (the three figures refer to roundtrip delay).

In order to find reasons for the unexpectedly high perception and acceptance thresholds in the audio-visual realistic task, we ran an experiment with the same task, applying the audio channel only. With this condition we found a perception threshold of 800 ms, and an acceptance threshold of 1690 ms. These two values are within the 95 % confidence interval of the threshold means obtained with the best-PEST procedure (970 (±330) ms for perception, and 1760 (±410) ms for acceptance). The fact that thresholds in the audio-only condition are well below the thresholds in the audio-visual condition suggests two possible explanations:

• The visual channel in an audio-visual application acts as a distractor, i.e. the focus of attention is divided between the audio and the visual channel. Since the audio channel apparently suffices to execute the chosen task, the additional visual information does not yield additional clues for detecting delays; on the contrary, it hampers the detection of delays. This does not mean that the visual channel does not yield usable information. But it seems that the gain of 'media richness' in audio-visual communication has to be paid for by a loss of focussed perception.

• The use of videoconferences (VC) is still unfamiliar to the users acting as experimental subjects, whereas audio-only conversation is not: since telephony is very common for users, they are well trained to perceive and evaluate situations differing from the ones considered normal. This is not the case in VC, insofar as the subjects do not have a point of reference to compare the experimental situation with (see footnote s).

The fact that, in the audio-only condition, perception and acceptance thresholds are still above the values found in the literature suggests the following explanation:

• Three subjects participated in our experiment, whereas only two were employed in experiments described in the available literature. The higher threshold of our experiments could signify that additional conversation members act as additional distractors, i.e. the focus of attention is divided among all members. A further reason could be that, since the videoconference does not support gaze awareness, the members are not sure when they are addressed. This lowers the degree of interactivity, and thus the perceived absolute delays.

s This explanation resembles in some respects the reasoning that computer-mediated real-time communication is assumed to be compared to the reference point of natural face-to-face communication, whereas for other areas of computer-mediated communication (e.g. browsing the WWW), such a reference point does not exist (see also section 2.2, Scope of Investigation).

Table 20 Parameter values obtained from the resulting psychometric function of the absolute delay experiments executing a specific realistic task. These values inserted in eq (18) (page 86) lead to the values listed in Table 21.

              Perception                Acceptance
Parameter     AudVisReal    AudReal     AudVisReal    AudReal
θ [ms]        1220          800         2080          1690
β_s           0.889         1.06        1.58          2.32

Table 21 Absolute delay values perceived by particular percentages of users. Reading example: It can be expected that not more than 33 % of users will detect an absolute delay of 734 ms when interacting audio-visually, and 535 ms when interacting only auditorily. And not more than 33 % of users will find an absolute delay of 1610 ms disturbing when interacting audio-visually, or 1430 ms when interacting only auditorily. Note that these values apply only to the chosen task.

Percentages of Users Detecting or Accepting Absolute Delays in HHI

         Perception of Absolute Delay      Acceptance of Absolute Delay
ψ [%]    AudVisReal [ms]    AudReal [ms]   AudVisReal [ms]    AudReal [ms]
 5       n.a.               n.a.           n.a.               617
10       n.a.               n.a.           629                889
25       466                386            1350               1290
33       734                535            1610               1430
50       1220               800            2080               1690
67       1710               1070           2550               1940
75       1970               1220           2810               2090
90       2730               1640           3530               2480
95       3240               1920           4030               2750

Considering the values of the standardised slopes in Table 20, it appears that the val­

ues for only the acceptance function in the realistic audio-only task were greater than the

suggested value of 1.7. That is why in Table 21, for the other conditions, no delay values

are listed for small user percentages.

5.3 Further Research

This thesis reveals several areas where further research is needed concerning the perception and acceptance of delays in multimodal real-time communication, as well as human time perception. In the following, future research needs are divided into relative and absolute delays.


5.3.1 Relative Delay


Although figures of perception thresholds of relative delays between auditory and vis­

ual stimuli have been suggested by several authors (Dixon et al., 1980; Summerfield,

1992; Lewkowicz, 1996; Steinmetz, 1996), only a few of them adopted psychophysical

procedures in their study designs. While one might argue that this is not necessary, since

the suggested values are working well in practice, it is nevertheless of interest to verify

these values with different study paradigms, and for different contexts.

Furthermore, questions concerning the different asynchrony perception for different modality orders are predominantly answered by intuitive explanations. A consistent model of multimodal stimulus processing is still absent. In this area, the emerging medical imaging techniques present a promising means to investigate questions concerning multimodal stimulus integration in humans. They could provide deeper insight into why, for example, AV-stimuli are detected more easily than VA-stimuli.

5.3.2 Absolute Delay

Since perception and acceptance of absolute delay are strongly task-dependent, we suggested the degree of interactivity to act as the variable upon which tasks and communication settings can be assessed in terms of their delay-sensitive impact. The usefulness of this variable must be verified. If it should turn out to be appropriate, further work has to be done aiming to classify the abundance of relevant communication settings. Once the communication settings evoking the same degrees of interactivity are pooled, further experiments must be conducted with some representative communication settings. The goal of such an approach is to obtain psychometric functions for particular interactivity rates. Since interactivity is considered a parameter which can be continuously measured in networked services, it should be possible to adjust delay values according to the measured degree of interactivity. Knowing the associated psychometric function, the delay can be set according to a predefined (or negotiated) percentage of users perceiving or accepting this particular delay.

On a more theoretical side, further research is needed for modelling time perception.

Although a lot of work has already been done in different research areas, it remains to be discovered which neurological mechanisms are responsible for conscious time perception. As a consequence, for the time being, it remains an open question which distribution time perception follows. And most notably, it is not understood how contextual factors such as attention, arousal, modality, mood, age, and intelligence influence the conscious estimation of durations.


Annex

Developed Software: The best-PEST Calculator

As a methodological outcome of the threshold experiments, we developed the best-PEST method used here into a fully independent, browser-based application. The idea was to provide experimenters with a tool for measuring thresholds which can be used without any installation, compilation, or even programming effort (this is in contrast to other available software). The drawback of this premise lies in the missing interface. For security reasons the program has no access to the client computer and therefore cannot provide it with the estimated values directly. The experimenters have to insert the received threshold values in their testing environment by hand. This fact makes the best-PEST Calculator useful especially for those threshold estimations whose stimulus presentation cannot be done with the aid of common computer equipment, e.g. smell and taste thresholds. This manual and the program can be downloaded from the following internet address, also quoted in Zuberbühler (2002):

http://www.psychophysics.ethz.ch/tools/

Depending on the version used, the browser has to be updated with the Macromedia

Director plug-in version 8.5. The software recognises automatically whether an update is necessary, whereupon it can be completed within three or four mouse clicks.

In the following the best-PEST Calculator is described. This description can also be

downloaded from the above-mentioned link.


Description

In the following Figure 30, Figure 31, and Figure 32, screenshots of the three masks of

the program are shown and the input and output fields are explained where they are not

self-explanatory (indicated by numbers).

Figure 30 Screenshot of the first mask (input), where the settings for the experiment are entered. If all the fields are filled out in the requested format, pressing the 'start' button will lead to the second input mask. If not, a dialogue window pops up, indicating the missing or false input. Clicking the arrow opens the 'advanced settings' fields. By default these settings are: 'slope β' = 2, 'false negative δ' = 0, 'false positive ε' = 0, 'mean of x trials' = 3.

(1) Mode
In the drop-down menu 'mode', the users have the choice between the yes-no and the forced-choice (nAFC) paradigm. If they choose nAFC, an additional input field appears, where the number of alternatives n is to be entered. If n > 100 is entered, the program switches automatically to the yes-no calculation mode. It should be noted that experimental subjects will most likely be overstrained if they have to make repeated decisions about the presence of a stimulus from more than a hundred alternatives. In any case, if such experiments are planned, one can expect the error caused by the slightly inadequate calculation to be much smaller than the error caused by any other interference, for instance the subject's lapses.

(2) Start value k
Setting of the test interval [0, k], where k determines the highest stimulus value that can be obtained during the run. The upper limit k should be at least twice as large as the expected threshold value. Note that the start value will not be presented to the subject, assuming this value is so high that subjects will perceive it in all cases. In order to deal with comparable slope values, the algorithm uses the normalized range [0, 1] of the stimulus intensity. The stimulus intensity φ is therefore given by:

$$\varphi = \frac{\varphi^*}{k}, \qquad \{\varphi^* \in \mathbb{R} \mid \varphi^* \ge 0\}, \quad \{k \in \mathbb{R} \mid k > 0\} \qquad \text{eq (19)}$$

φ*: stimulus intensity in desired unit
k: stimulus maximum

(3) Smallest Step Size
Determines the size of the smallest stimulus change that can be obtained. Ideally this is the difference threshold of the particular stimulus. If this value is not known, as in the case where we just want to determine it, we have to estimate a suitable step size. Experimenters need to beware of step sizes that are too small or too big, since both result in a large measurement bias of the thresholds. If the ratio between 'start value' and 'smallest step size' is larger than 1000, the program will prompt a warning and ask for either a bigger step size or a smaller start value. This is a precautionary measure to prevent lengthy computing times.

(4) Termination Criterion
Users have the choice between 'Number of Trials' and 'Number of Reversals'. A reversal R is defined as a change from increasing to decreasing (or the other way around) of the presented stimulus intensities M.

$$M = \{ m_i \in \mathbb{R} \mid m_i \text{ is presented at trial } i \} \qquad \text{eq (20)}$$

$$R = \{ m_i \in M \mid (m_{i-1} > m_i < m_{i+1}) \lor (m_{i-1} < m_i > m_{i+1}) \} \qquad \text{eq (21)}$$

M: set of presented stimulus intensities
R: set of reversals
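The reversal definition of eq (21) translates directly into a few lines of code. The following is a minimal sketch with a hypothetical function name; it returns a list rather than a set so that reversals can be counted, and, like the strict inequalities of eq (21), it does not treat repeated equal intensities as reversals.

```python
def reversals(presented):
    """Return the presented intensities at which the run changes direction."""
    found = []
    for i in range(1, len(presented) - 1):
        prev, cur, nxt = presented[i - 1], presented[i], presented[i + 1]
        # Local minimum or local maximum of the presented sequence.
        if (prev > cur < nxt) or (prev < cur > nxt):
            found.append(cur)
    return found

run = [110, 88, 99, 121, 110, 104]
print(reversals(run))  # [88, 121]
```

With the termination criterion 'Number of Reversals', a run would stop as soon as `len(reversals(run))` reaches the chosen number.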

(5) Advanced Setting: Slope β
As an advanced setting, the users have the opportunity to enter the estimated or known slope of the particular psychometric function. For the definition of the slope see Figure 14 and eq (7). The slope value is calculated according to equal-scaled axes. Entering β implies knowledge about the tested cohort or subject, usually gained through pre-testing. If the slope is not known, β will be set to two by default.

(6) Advanced Setting: false negative δ
δ specifies the false negative rate (or miss rate). This rate is constituted by the observers' negative answers even though the stimulus intensity is at maximum. Entering δ implies knowledge about the tested cohort or subject, usually gained through pre-testing. By default this value is zero.

(7) Advanced Setting: false positive ε
ε specifies the false positive rate (or false alarm rate). This rate is constituted by the observers' positive answers even though the stimulus intensity is zero. In forced-choice experiments, ε does not comprise the methodical false alarm rate, which is the reciprocal value of the number of alternatives. Entering ε implies knowledge about the tested cohort or subject, usually gained through pre-testing. By default this value is zero.

(8) Advanced Setting: Mean of x Trials
x specifies the number of trials to take at the end of an experimental run for calculating the mean threshold value. As a rule of thumb, larger numbers of trials permit larger values of x. By default this value is three.

Figure 31 Screenshot of the second mask (input/output), where the computation of the actual maximum likelihood threshold is done. Pressing the 'back' button aborts the computation and returns to the first mask to modify the settings. Pressing the 'cancel' button aborts the computation and switches the program to the results mask, which displays the current status of the experiment without the termination criterion having been reached.

(9) Step 1: Output from the best-PEST algorithm
The output value m_i is to be presented to the subject. This value is the maximum likelihood estimate of the threshold, obtained from all available information. Since there is no information available from the subjects in the first trial, the initialisation is conducted assuming that 100 % of the subjects will perceive the stimulus at the start intensity k, and that at zero intensity they will be certain not to perceive the stimulus. Therefore the first output will fall somewhere in the middle of the test interval.
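The maximum likelihood step can be illustrated with a minimal sketch. It is not the Calculator's actual code: it assumes a logistic model on the normalised axis whose slope is the gradient at the threshold, a simple grid search over candidate thresholds, and an initialisation by two virtual trials (a detection at the start value and a non-detection at zero). All names are hypothetical.

```python
import math

def psi(x, theta, slope):
    """Assumed logistic model on the normalised stimulus axis [0, 1]."""
    return 1.0 / (1.0 + math.exp(-4.0 * slope * (x - theta)))

def next_stimulus(trials, slope=2.0, steps=40):
    """Maximum likelihood threshold over a grid of candidate values.

    trials: list of (intensity, detected) pairs on the normalised axis.
    steps:  start value divided by smallest step size.
    """
    # Two virtual trials encode the initial assumption: certain detection
    # at the start value (1.0), certain non-detection at zero intensity.
    data = [(1.0, True), (0.0, False)] + list(trials)
    best_theta, best_ll = 0.0, -math.inf
    for j in range(steps + 1):
        theta = j / steps
        ll = 0.0
        for x, detected in data:
            # Clamp to avoid log(0) for very steep functions.
            p = min(max(psi(x, theta, slope), 1e-12), 1.0 - 1e-12)
            ll += math.log(p if detected else 1.0 - p)
        if ll > best_ll:
            best_theta, best_ll = theta, ll
    return best_theta

print(next_stimulus([]))               # 0.5, the middle of the interval
print(next_stimulus([(0.5, True)]))    # moves below 0.5 after a detection
```

Multiplying the returned value by the start value k converts it back into the desired stimulus unit.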

(10) Step 2: Response of the subject
After the subject has been presented with the stimulus intensity obtained from step 1, the radio button corresponding to the subject's response is to be selected. In the nAFC mode the buttons are labelled 'CORRECT' and 'INCORRECT', and in the yes-no mode they are labelled 'YES' and 'NO'.

(11) Step 3: next value
Pressing the button 'calculate next value' will trigger the next calculation, whereupon a new value will appear in the output field. Steps 1 to 3 have to be repeated until the termination criterion is reached. Once it is reached, pressing this button will bring the program to the 'results' mask.

Figure 32 Screenshot of the third mask (output), where the results of the entire experimental run are displayed. Pressing the 'start again' button will return the program to the first mask, and leave the settings as they are.

(12) Threshold value
Output of the final threshold estimate, which is the mean value of the last x trials.

(13) All values
The presented stimulus intensities of the entire experimental run are displayed and marked in the field 'values' in order to copy them to the clipboard (Ctrl + C).

(14) Graph
The values of the entire experimental run as well as the final threshold are shown in a diagram with stimulus intensity as ordinate and number of trials as abscissa.

Monte-Carlo-Simulations

The following Monte-Carlo-Simulations were made to evaluate the convergence behaviour of the best-PEST algorithm. All simulations were made in the yes/no mode with equal start values. A built-in random process simulated the response behaviour of an assumed experimental subject, which we call stochastic observer. For that purpose we assumed that the stochastic observer answers in a logistic manner with a stable threshold, an assumption that is in fact made by best-PEST:

According to eq (17) (page 62), $\hat{\theta}_N$ is the N-th threshold estimate accomplished by best-PEST. For this estimate there is, according to eq (8) (page 57), a probability $\psi(\hat{\theta}_N)$ for a positive response. We obtain the particular answer of the stochastic observer by applying the following procedure: if $\psi(\hat{\theta}_N)$ is greater than a uniformly distributed random number between 0 and 1, the stochastic observer answers yes; if $\psi(\hat{\theta}_N)$ is equal to or smaller than the random number, the stochastic observer answers no. That way, after a sufficient number of runs we map the outcome of the best-PEST procedure onto the assumed psychometric function of the stochastic observer, and perhaps an empirical law of the algorithm's behaviour can be established.
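A simulation of this kind can be sketched as follows. The code is an illustrative reconstruction under stated assumptions, not the original simulation: the logistic model uses the slope as gradient at the threshold on the normalised axis, the threshold estimate is found by grid search with two virtual initial trials, and the observer answers yes with probability ψ at the presented intensity. The function names and the choice of 100 runs are hypothetical; the observer threshold 0.575 corresponds to a threshold of 1.0 at a start value of 1.7391.

```python
import math
import random

def psi(x, theta, slope):
    """Assumed logistic model; `slope` is the gradient at the threshold."""
    return 1.0 / (1.0 + math.exp(-4.0 * slope * (x - theta)))

def ml_threshold(data, slope, steps):
    """Grid search for the threshold that maximises the log-likelihood."""
    best_theta, best_ll = 0.0, -math.inf
    for j in range(steps + 1):
        theta = j / steps
        ll = 0.0
        for x, detected in data:
            p = min(max(psi(x, theta, slope), 1e-12), 1.0 - 1e-12)
            ll += math.log(p if detected else 1.0 - p)
        if ll > best_ll:
            best_theta, best_ll = theta, ll
    return best_theta

def simulate_run(true_theta, slope, n_trials=15, steps=40, mean_of=3, rng=random):
    """One threshold determination against a stochastic observer."""
    data = [(1.0, True), (0.0, False)]   # virtual initial trials
    presented = []
    for _ in range(n_trials):
        x = ml_threshold(data, slope, steps)
        presented.append(x)
        # The observer answers yes with probability psi(x).
        answers_yes = rng.random() < psi(x, true_theta, slope)
        data.append((x, answers_yes))
    # Final threshold: mean of the last `mean_of` presented intensities.
    return sum(presented[-mean_of:]) / mean_of

rng = random.Random(0)
estimates = [simulate_run(0.575, 2.0, rng=rng) for _ in range(100)]
mean = sum(estimates) / len(estimates)
variance = sum((e - mean) ** 2 for e in estimates) / len(estimates)
print(f"mean {mean:.3f}, variance {variance:.4f}")
```

Repeating such runs for different slopes and numbers of trials yields the kind of variance maps discussed for the simulations.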

In the following we show the results of three simulation runs. Table 22 lists the corre­

sponding parameter settings for the conducted simulations, whose results are displayed in

Figure 33, Figure 34, and Figure 35.

Table 22 Parameter settings used for the Monte-Carlo-Simulations separated for three conditions. For an explanation of the parameters see the previous chapter.

Parameter                                                    Figure 33     Figure 34     Figure 35
Mode                                                         yes/no        yes/no        yes/no
Start value k                                                1.7391        1.7391        1.7391
Threshold θ of the stochastic observer                       1.0000        1.0000        1.0000
Start value k / smallest step size                           40            40            40
Termination criterion: Number of Trials                      15            5 to 50       50
Slopes of best-PEST's model                                  1.0 to 3.5    0.1 to 5.0    0.1 to 5.0
Slopes of the stochastic observer's psychometric function    same steps    same steps    0.1 to 5.0
False negative δ                                             0             0             0
False positive ε                                             0             0             0
Mean of x trials                                             3             3             3
Number of threshold determinations per measuring point       3             1000          1000
Number of measuring points                                   2500          2500          2500

In order to gain an idea of the accuracy the best-PEST algorithm provides, we ran a

simulation with realistic parameter settings: As a trade-off between accuracy and practi­

cability, the simulated subject accomplishes three threshold determinations consisting of

15 stimulus presentations with corresponding decision-making. As such the whole pro­

cedure corresponds to the real time of approximately 30 minutes, which is of course de­

pendent on the duration of each stimulus presentation. With such a scenario, experi­

menters can be sure that the subjects' fatigue will play a negligible role. For the simula­

tions, we ran the above-mentioned scenario with slopes from 1.0 to 3.5, resulting in a to­

tal of 2500 threshold means. The histogram of this distribution can be seen in Figure 33.

Figure 33 Distribution of the obtained threshold values with the best-PEST algorithm (abscissa: threshold, target value = 1.0; ordinate: frequency). The stochastic observer's threshold is 1.0 (target value). Basis for the distribution are 2500 threshold determinations, each representing the mean of 3 runs.

The distribution is approximately Gaussian with a mean of 0.99755, and a variance of

0.00764.

The aim of the second simulation was to gain insight into the convergence behaviour of best-PEST for different numbers of trials until termination and for different slope values of both the stochastic observer and the best-PEST model. For that purpose, we calculated the variance of the mean threshold after 1000 runs as a function of these variables. The contour lines of equal variance in the range [0, 0.05] are shown in Figure 34.

[Figure 34: contour plot. Abscissa: number of trials, 5 to 50; ordinate: slope.]

Figure 34 Simulation of threshold determination with the best-PEST algorithm. The curves show contour lines of threshold variances up to 0.05. The number of trials until a threshold determination stops is on the abscissa; the slopes of the psychometric functions of both stochastic observer and model are on the ordinate. The variance is calculated on the basis of 1000 threshold determinations for each measuring point. The slope's increment is 0.1; the number of trials' increment is 1.

The curves of equal variance of the mean threshold are approximately exponential, which is consistent with the interpretation that the marginal utility of each additional trial diminishes as the number of trials increases. This follows from the nature of the best-PEST procedure: the information gained with every additional trial decreases relative to the information already available, and successive threshold estimates therefore approach the true threshold value. A further prediction that can be made from these data is that the number of trials plays an important role only for large slopes of the psychometric functions.
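The diminishing-returns argument can be illustrated with a back-of-the-envelope bound. The sketch below assumes a logistic psychometric function and assumes that every trial is placed where the observer answers "yes" with probability p; the function names are illustrative and not part of the best-PEST Calculator. Under these assumptions each trial contributes Fisher information slope² · p(1 − p), so the Cramér-Rao bound on the threshold variance falls as 1/n and each further trial buys less than the one before. It is an idealised bound, not a reproduction of Figure 34.

```python
def fisher_information(slope, p=0.5):
    """Information about the threshold from one yes/no trial answered 'yes' with probability p.

    Assumes a logistic psychometric function with the given slope.
    """
    return slope ** 2 * p * (1.0 - p)

def variance_bound(n_trials, slope, p=0.5):
    """Cramér-Rao lower bound on the threshold variance after n_trials identical trials."""
    return 1.0 / (n_trials * fisher_information(slope, p))

def marginal_gain(n_trials, slope, p=0.5):
    """Reduction of the bound obtained by adding one more trial."""
    return variance_bound(n_trials, slope, p) - variance_bound(n_trials + 1, slope, p)

if __name__ == "__main__":
    for n in (5, 15, 50):
        print(n, variance_bound(n, 2.0), marginal_gain(n, 2.0))
```

With a slope of 2.0 and p = 0.5 the bound is 1/n, so going from 5 to 6 trials lowers it far more than going from 50 to 51 trials.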

The third simulation was made in order to analyse the convergence behaviour of best-PEST for different, interdependent slope values of the observer's and the model's psychometric functions. For that purpose we calculated, as in the second simulation, the variance of the mean threshold after 1000 runs as a function of the two slope variables. The contour lines of equal variance in the range [0, 0.05] are shown in Figure 35.

[Figure 35: contour plot. Abscissa: slope of model, 0.1 to 5.0; ordinate: slope of observer.]

Figure 35 Simulation of threshold determination with the best-PEST algorithm. The curves show contour lines of threshold variances up to 0.05. The slope of the model is on the abscissa; the slope of the stochastic observer is on the ordinate. The variance is calculated on the basis of 1000 threshold determinations for each measuring point. The increment is 0.1 for both variables.

At first sight, the curves of equal variance do not suggest a simple, explainable model of how the two slope parameters interact. What can be read from them is that there is no reason to choose model slopes much larger than the observer slope, since they increase the variance for a given observer slope, especially in its lower range. As a rule of thumb, a model slope twice as large as the observer slope gives the best results, since each contour line appears to have a relative minimum at these points.
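The effect of a mismatch between model slope and observer slope can be explored with a small sweep. The sketch below is an assumption-laden illustration, not the thesis software: it uses a logistic psychometric function, a candidate grid from 0 to twice the start value, two fictitious start trials, and the variance of single-run estimates rather than of means of three runs. Names such as `estimate` and `variance_by_model_slope` are hypothetical. It is meant for experimenting with the rule of thumb, not as evidence for it.

```python
import math
import random
import statistics

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def estimate(rng, observer_slope, model_slope, n_trials=50,
             true_threshold=1.0, start=1.7391, steps=40):
    """One best-PEST-style run; the model slope may differ from the observer slope."""
    step = start / steps
    grid = [i * step for i in range(2 * steps + 1)]   # assumed candidate range
    loglik = [0.0] * len(grid)

    def update(x, said_yes):
        # Score the response under the model's (possibly mismatched) slope.
        for i, t in enumerate(grid):
            p = min(max(logistic(model_slope * (x - t)), 1e-9), 1.0 - 1e-9)
            loglik[i] += math.log(p if said_yes else 1.0 - p)

    update(grid[-1], True)    # fictitious "yes" at the top of the range (assumption)
    update(grid[0], False)    # fictitious "no" at the bottom of the range (assumption)
    x = start
    for _ in range(n_trials):
        # The observer responds according to its own slope.
        update(x, rng.random() < logistic(observer_slope * (x - true_threshold)))
        x = grid[max(range(len(grid)), key=loglik.__getitem__)]
    return x

def variance_by_model_slope(observer_slope, model_slopes, runs=100, seed=7):
    """Variance of the threshold estimate for each candidate model slope."""
    rng = random.Random(seed)
    return {
        m: statistics.variance([estimate(rng, observer_slope, m) for _ in range(runs)])
        for m in model_slopes
    }

if __name__ == "__main__":
    for slope, var in variance_by_model_slope(1.0, [0.5, 1.0, 2.0, 4.0]).items():
        print(f"model slope {slope}: variance {var:.4f}")
```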


References

Alfano, M. (2000). QUASIMODO - Quality of Service Methodologies and solutions within the service framework: measuring, managing and charging QoS: EURESCOM: European Institute for Research and Strategic Studies in Telecommunications.

Armitage, G. (2000). MPLS: The Magic Behind the Myths. IEEE Communications Magazine, 38(1), 124-131.

Baird, J. C., & Noma, E. (1978). Fundamentals of scaling and psychophysics. New York: Wiley.

Bales, R. F. (1955). How people interact in conferences. Scientific American, 3-7.

Bales, R. F. (1999). Social interaction systems: Theory and measurement. New Brunswick: Transaction Publishers.

Block, R. A., & Zakay, D. (2001). Internal Clocks and the Representation of Time. In C. Hoerl & T. McCormack (Eds.), Time and Memory - Issues in Philosophy and Psychology (pp. 59-76). Oxford: Oxford University Press Inc.

Boltz, M. G. (1994). Changes in internal tempo and effects on the learning and remembering of event durations. Journal of Experimental Psychology, 20, 1154-1171.

Bouch, A., Bhatti, N., & Kuchinsky, A. J. (2000a). Quality is in the eye of the beholder: Meeting users' requirements for Internet Quality of Service. CHI'2000, Hague.

Bouch, A., Sasse, M. A., & DeMeer, H. (2000b). Of Packets and People: A User-Centred Approach to Quality of Service. IWQoS 2000, Pittsburgh, PA.

Braun, A. (2003). Qualitätsaspekte multimodaler Kommunikation: Subjektive und objektive Messungen. PhD thesis, Swiss Federal Institute of Technology, Zurich.

Brebner, J. T. (1980). Reaction Time in Personality Theory. In A. T. Welford (Ed.), Reaction Times (pp. 309-320). New York: Academic Press.


Brebner, J. T., & Welford, A. T. (1980). Introduction: An Historical Background Sketch. In A. T. Welford (Ed.), Reaction Times (pp. 1-23). New York: Academic Press.

Brown, S. W. (1995). Time, change, and motion: The effects of stimulus movement on temporal perception. Perception & Psychophysics, 57, 105-116.

Buonomano, D. V., & Karmarkar, U. R. (2002). How Do We Tell Time? The Neuroscientist, 8(1), 42-51.

Carr, C. E. (1993). Processing of temporal information in the brain. Annual Review of Neuroscience, 16, 223-243.

Celesia, G. G., & Puletti, F. (1971). Auditory input to the human cortex during states of drowsiness and surgical anesthesia. Electroencephalography and Clinical Neurophysiology, 31, 603-609.

Chen, T. M., Walrand, J., & Messerschmitt, D. G. (1989). Protocols for Packet Voice. IEEE Selected Areas in Communication.

Church, R. M. (1984). Properties of the internal clock. Annals of the New York Academy of Sciences, 424, 566-582.

Church, R. M., Meck, W. H., & Gibbon, J. (1994). Application of scalar timing theory to individual trials. Journal of Experimental Psychology: Animal Behaviour, 20, 135-155.

Clark, V. P., Fan, S., & Hillyard, S. A. (1995). Identification of early visual evoked potential generators by retinotopic and topographic analyses. Human Brain Mapping, 2, 170-187.

Clark, V. P., & Hillyard, S. A. (1996). Spatial selective attention affects early extrastriate but not striate components of the visual evoked potential. Journal of Cognitive Neuroscience, 8, 387-402.

Coffman, K. G., & Odlyzko, A. M. (1998). The size and growth rate of the internet. [URL illegible]

Dixon, N. F., & Spitz, L. (1980). The detection of auditory visual desynchrony. Perception, 9, 719-721.

Fieandt, K., Huhtala, A., Kullberg, P., & Saarl, K. (1956). Personal tempo and phenomenal time at different age levels. (2). Helsinki: University of Helsinki.

Fluckiger, F. (1995). Understanding networked multimedia: applications and technology. London: Prentice Hall.


Foxe, J. J., & Simpson, G. V. (2002). Flow of activation from V1 to frontal cortex in humans: a framework for defining 'early' visual processing. Experimental Brain Research.

Fraisse, P. (1964). The psychology of time. London: Eyre and Spottiswoode.

Galambos, R., Makeig, S., & Talmachoff, P. J. (1981). A 40-Hz auditory potential recorded from the human scalp. Proceedings of the National Academy of Sciences, 78, 2643-2647.

Galton, F. (1899). On instruments for (1) testing perception of differences of tint and for (2) determining reaction time. Journal of the Anthropological Institute(19), 27-29.

Gescheider, G. A. (1997). Psychophysics: The Fundamentals (3 ed.). Mahwah, NJ: Lawrence Erlbaum Associates.

Giard, M. H., & Peronnet, F. (1999). Auditory-Visual Integration during Multimodal Object Recognition in Humans: A Behavioral and Electrophysiological Study. Journal of Cognitive Neuroscience, 11(5), 473-490.

Gibbon, J. (1992). Ubiquity of scalar timing with a Poisson clock. Journal of Mathematical Psychology, 36, 283-293.

Gibbon, J., Church, R. M., & Meck, W. H. (1984). Scalar timing in memory. Annals of the New York Academy of Sciences, 424, 52-77.

Goldstone, S., & Lhamon, W. T. (1974). Studies of auditory-visual differences in human time judgment: 1. Sounds are judged longer than lights. Perceptual and Motor Skills, 39, 63-82.

Gonsalves, T. (1989). Comparative Performance of Voice/Data Local Area Networks. IEEE Selected Areas in Communication.

Guttormsen Schär, S., Arial, M., Zuberbühler, H. J., & Krueger, H. (2002). Distributed Co-operative Design Systems: supporting Human Factors with 'Communicate-It'. 28th Annual Conference of the IEEE Industrial Electronics Society, Sevilla, Spain.

Helder, G. K. (1966). Customer Evaluation of Telephone Circuits with Delay. Bell System Technical Journal, 38(9).

Hershenson, M. (1962). Reaction time as a measure of intersensory facilitation. Journal of Experimental Psychology, 63, 289-293.

Hirsh, I. J., & Sherrick, C. E. (1961). Perceived order in different sense modalities. Journal of Experimental Psychology, 62, 423-432.


Isaac, E., & Tang, J. (1994). What video can and can't do for collaboration: a case study. Multimedia Systems, 2, 63-73.

Jokeit, H. (1990). Analysis of periodicities in human reaction times. Naturwissenschaften, 77, 289-291.

Kohfeld, D. L. (1971). Simple reaction time as a function of stimulus intensity in decibels of light and sound. Journal of Experimental Psychology, 88, 251-257.

Kouvelas, I., Hardman, V., & Watson, A. (1996). Lip Synchronisation for Use Over the Internet: Analysis and Implementation. IEEE Globecom'96, London UK.

Krueger, H. (1994). Wahrnehmung und Befindlichkeit ins richtige Licht gesetzt. 11. Gemeinschaftstagung der Lichttechnischen Gesellschaften der Schweiz, Deutschlands, der Niederlande und Österreichs, Interlaken.

Kündig, A., Zuberbühler, H. J., & Braun, A. (2001). QoS User Expectations: State of the Art - Key Parameters - their Relevance and their Determination (QED-R-2). Zürich: ETHZ / TIK, IHA.

Kurmann, H. (1997). On the Emulation of Impairments in ATM-Networks. PhD Thesis, Swiss Federal Institute of Technology, Zurich.

Lejeune, H. (1998). Switching or gating? The attentional challenge in cognitive models of psychological time. Behavioural Processes(44), 127-145.

Lewkowicz, D. J. (1996). Perception of auditory-visual temporal synchrony in human infants. Journal of Experimental Psychology: Human Perception and Performance, 22(5), 1094-1106.

Limpert, E., Stahel, W. A., & Abbt, M. (2001). Log-normal distributions across the sciences - keys and clues. BioScience, 51, 341-352.

Longuet-Higgins, H. C. (1968). Holographic model of temporal recall. Nature, 217, 104.

Madler, C., & Pöppel, E. (1987). Auditory evoked potentials indicate the loss of neuronal oscillations during general anaesthesia. Naturwissenschaften, 74, 42-43.

McDonald, J. J., & Teder-Sälejärvi, W. A. (2000). Involuntary orienting to sound improves visual perception. Nature(407), 906-908.

McGrath, M., & Summerfield, Q. (1985). Intermodal timing relations and audio-visual speech recognition by normal-hearing adults. J. Acoust. Soc. Am., 77(2), 678-685.

Miall, C. (1996). Models of neural timing. In M. A. Pastor & J. Artieda (Eds.), Time, Internal Clocks and Movement (pp. 69-94). Amsterdam: Elsevier Science B.V.


Miller, J. O., & Low, K. (2001). Motor processes in simple, go/no-go, and choice reaction time tasks: a psychophysiological analysis. Journal of Experimental Psychology: Human Perception and Performance, 27, 266.

Molholm, S., Ritter, W., Murray, M. M., Javitt, D. C., Schroeder, C. E., & Foxe, J. J. (2002). Multisensory auditory-visual interactions during early sensory processing in humans: a high-density electrical mapping study. Cognitive Brain Research, 14, 115-128.

O'Conaill, B., Wittaker, S., & Willbur, S. (1993). Conversations Over Videoconference: an Evaluation of the Spoken Aspects of Video-Mediated Communications. Human-computer interaction, 8, 389-428.

Odlyzko, A. M. (2000). Internet Growth: Myth and Reality, Use and Abuse. iMP: Information Impacts Magazine(November).

Oviatt, S., & Cohen, P. R. (2000). Designing the User Interface for Multimodal Speech and Pen-Based Gesture Applications: State-of-the-Art Systems and Future Research Directions. Human Computer Interaction, 15, 263-322.

Pandey, P. C., Kunov, H., & Abel, S. (1986). Disruptive effects of auditory signal delay on speech perception with lipreading. The Journal of Auditory Research, 26, 27-41.

Pentland, A. (1980). Maximum likelihood estimation: The best PEST. Perception & Psychophysics, 28(4), 377-379.

Pöppel, E. (1971). Oscillations as possible basis for time perception. Studium Generale, 24, 85-107.

Pöppel, E. (1978). Time Perception. In R. Held & H. Leibowitz & H.-L. Teuber (Eds.), Handbook of Sensory Physiology (Vol. VIII: Perception, pp. 713-729). Berlin: Springer.

Pöppel, E. (1986). Neuronal oscillations in the brain. Discontinuous initiations of pursuit eye movements indicate a 30-Hz temporal framework for visual information processing. Naturwissenschaften, 77, 289-291.

Pöppel, E. (1994). Temporal Mechanisms in Perception. International review of neurobiology, 37, 185-202.

Pöppel, E. (1997a). Grenzen des Bewusstseins. Frankfurt am Main: Insel Verlag.

Pöppel, E. (1997b). A hierarchical model of temporal perception. Trends in Cognitive Science, 1(2), 56-61.


Ranta-aho, M., Wilkins, M., & Egloff, P. (1998). JUPITER - Joint Usability, Performability and Interoperability Trials in Europe: EURESCOM: European Institute for Research and Strategic Studies in Telecommunications.

Röthlisberger, U. (1998). The Architecture of an Interactive Multimedia Communication System. PhD thesis, Swiss Federal Institute of Technology, Zurich.

Ruesch, J., & Bateson, G. (1951). Communication: The Social Matrix of Psychiatry. New York: W.W. Norton & Co.

Sanders, A. F. (1998). Elements of Human Performance: Reaction Processes and Attention in Human Skill. Mahwah, New Jersey: Lawrence Erlbaum Associates.

Schwender, D., et al. (1994). Anesthetic control of 40-Hz brain activity and implicit memory. Consciousness and Cognition, 3, 129-147.

Short, J., Williams, E., & Christie, B. (1976). The social psychology of telecommunication. London: Wiley.

Smith, R. L., Richetto, G. M., & Zima, J. P. (1972). Organizational behaviour: an approach to human communication. In R. W. Budd & B. D. Ruben (Eds.), Approaches to Human Communication (pp. 269-289). New York: Spartan Books.

Steinmetz, R. (1996). Human Perception of Jitter and Media Synchronization. IEEE Journal on Selected Areas in Communications, 14(1), 61-72.

Stern, L. W. (1897). Psychische Präsenzzeit. Zeitschrift für Psychologie und Physiologie der Sinnesorgane, 13, 325-349.

Sternberg, S. (1966). High-speed scanning in human memory. Science, 153, 652-654.

Stone, J. V., Hunkin, N. M., Porrill, J., Wood, R., Keeler, V., Beanland, M., Port, M., & Porter, N. R. (2001). When is now? Perception of simultaneity. Proceedings Biological Sciences: The Royal Society, 268, 31-38.

Stone, M. A., & Moore, B. C. (1999). Tolerable hearing aid delays. I. Estimation of limits imposed by the auditory path alone using simulated hearing losses. Ear and Hearing, 20(3), 182-192.

Summerfield, Q. (1992). Lipreading and audio-visual speech perception. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences, 335(1273), 71-78.

Thomas, E. A. C., & Brown, I. (1974). Time perception and the filled-duration illusion. Perception & Psychophysics, 16, 449-458.


Treisman, M., Faulkner, A., Naish, P., & Brogan, D. (1990). The internal clock: evidence for a temporal oscillator underlying time perception with some estimates of its characteristic frequency. Perception, 19(6), 705-743.

Treutwein, B. (1995). Adaptive Psychophysical Procedures. Vision Research, 35(17), 2503-2522.

Van Hoesel, R., Ramsden, R., & Odriscoll, M. (2002). Sound-direction identification, interaural time delay discrimination, and speech intelligibility advantages in noise for a bilateral cochlear implant user. Ear and Hearing, 23(2), 137-149.

Vaughan, H. G., & Arezzo, J. C. (1988). The neural basis of event-related potentials. In T. W. Picton (Ed.), Human Event-related Potentials, Handbook of Electroencephalography and Clinical Neurophysiology (Revised Series ed., Vol. 3, pp. 45-96). Amsterdam: Elsevier.

von Steinbüchel, N., Wittmann, M., & Pöppel, E. (1996). In M. A. Pastor & J. Artieda (Eds.), Time, Internal Clocks, and Movement (pp. 281-304): Elsevier.

Watzlawick, P., Bavelas, J. B., & Jackson, D. D. (1967). Pragmatics of Human Communication. New York: W.W. Norton Co.

Watzlawick, P., & Beavin, J. H. (1966). Einige formale Aspekte der Kommunikation. In B. Badura & K. Gloy (Eds.), Soziologie der Kommunikation: Eine Textauswahl zur Einführung. Stuttgart: Frommann.

Wearden, J. H., & Bray, S. (2001). Scalar timing without reference memory? Episodic temporal generalization and bisection in humans. The Quarterly Journal of Experimental Psychology, 54B(4), 289-309.

Wearden, J. H., Philpott, K., & Win, T. (1999). Speeding up and (... relatively...) slowing down an internal clock in humans. Behavioural Processes(46), 63-73.

Weidenmann, B. (1988). Psychische Prozesse beim Verstehen von Bildern. Bern: Verlag Hans Huber.

Welch, R. B., & Warren, D. H. (1986). Intersensory interactions. In K. R. Kaufman & J. P. Thomas (Eds.), Handbook of Perception and Human Performance, Sensory Processes and Perception (Vol. 1, pp. 1-36). New York: Wiley.

Welford, A. T. (1980). Choice Reaction Time: Basic Concepts. In A. T. Welford (Ed.), Reaction Times (pp. 73-128). New York: Academic Press.

Wilkins, M., & Tuominen, J. (1998). Recommended Network Parameter Values for Acceptability Tests: EURESCOM: European Institute for Research and Strategic Studies in Telecommunications.


Wilson, G., & Sasse, M. A. (2000). Do Users Always Know What's Good For Them? Utilising Physiological Responses to Assess Media Quality. In S. McDonald & Y. Waern & G. Cockton (Eds.), People and Computers XIV - Usability or Else! Proceedings of HCI 2000 (pp. 327-339). Sunderland, UK: Springer.

Witherspoon, D., & Allan, L. G. (1985). Time judgments and the repetition effects in perceptual identification. Memory and Cognition, 13, 101-111.

Yamaguchi, H., Wada, M., & Yamamoto, H. (1986). A 64 kbit/s Integrated Visual Communication System - New Communication Medium for the ISDN. IEEE Selected Areas in Communication.

Zakay, D., & Block, R. A. (1998). New Perspective on Prospective Time Estimation. In V. De Keyser & G. Ydewalle & A. Vandierendonck (Eds.), Time and the Dynamic Control of Behavior. Hogrefe & Huber.

Zuberbühler, H. J. (2002). Rapid Evaluation of Perceptual Thresholds - The Best-Pest Calculator: A web-based application for non-expert users. [URL illegible]

Zuberbühler, H. J., Krueger, H., & Kündig, A. (2003). Delay Perception Thresholds in Human-Computer Interaction: Fundamentals for CSCW-Applications. GfA - XVII International Annual Occupational Ergonomics and Safety Conference, Munich.

Zuberbühler, H. J., Ruegg, S., Krueger, H., & Kündig, A. (2002). Intermedia Synchronisation in Network Design: Using an Adaptive Psychophysical Method to Specify the Perceivable Audio-Visual Delay. WWDU 2002 - Work With Display Units: World Wide Work, Berchtesgaden.

Zwicker, E., & Feldtkeller, R. (1967). Das Ohr als Nachrichtenempfänger. Stuttgart: S. Hirzel Verlag.


Glossary

2

2AFC
2-Alternative-Forced-Choice. See Forced-Choice Procedure.

A

Absolute Threshold
Minimal detectable amount of stimulation.

Application
In our context, application describes what kind of processes the end user is trying to support when using services of a public network. This interpretation is purposely wider than the meaning of application program running on some computer, e.g. application may also mean that a phone call is made for some specific purpose.

ATM
Asynchronous Transfer Mode: High-speed packet switching technology using small packets (cells) of fixed size (48 data + 5 header = 53 bytes). ATM is also known as fast packet.

B

Bandwidth
Technically, the difference, in Hertz (Hz), between the highest and lowest frequencies of a transmission channel. However, as typically used, the amount of data that can be sent through a given communications circuit.

Best-PEST
See PEST.

Bit rate
Number of binary digits that the network is capable of accepting and delivering per unit of time.

BPS
Bits per Second: A measure of the data transfer rate of the data channel.

C

Circuit-Switched Mode
Operational mode of a telecommunication network where connections are set up from an end system A to any other end system B, with network resources reserved in the network for this connection along a fixed path. Within network nodes, a very low delay link is dedicated to each connection, and a fixed bandwidth (bit rate) is reserved on each link participating in a connection.

Client
A computer system or program that communicates with another one which provides special services (e.g. a workstation requesting the contents of a file from a file server is a client of the file server).

Codec
Beginning and end point of a videoconferencing system. Codec is an acronym for compression/decompression, compressor/decompressor, or coder/decoder. A codec compresses its video and audio input using computed algorithms. The compressed signal is adapted for transmission over a particular network.

Compression
Mapping sets of bits produced by a source into a smaller number of bits to be transmitted. With compression, the original information content may be retained (so-called lossless compression) or reduced (so-called lossy compression). At the receiving side, suitable decompression algorithms restore the original information as far as feasible.

CSCW
Computer-Supported Cooperative Work applications enable real-time collaboration among geographically distributed work group members. They typically include file transfer, chat, shared whiteboard, application sharing, voice, and video.

Ctrl
Control: A key on a terminal or computer keyboard which modifies the effect of other (letter, number and some other) keys, in a similar way that the Shift key makes letter keys generate capital letters.

D

Difference Threshold
Smallest detectable difference between two stimuli, the just noticeable difference (also called jnd or Differenz Limen, DL).

DVD
Digital Versatile Disc is an optical disc technology that holds 4.7 gigabytes of information on one of its two sides, or enough for a 133-minute movie. With two layers on each of its two sides, it will hold up to 17 gigabytes of video, audio, or other information. DVD uses the MPEG-2 file and compression standard.

F

Forced-Choice Procedure
The observer is given two or more observation intervals, one of which contains a signal. The observer is required to choose which observation interval contained the signal.

H

HCI
Human-Computer Interaction is a discipline concerned with the design, evaluation and implementation of interactive computing systems for human use and with the study of major phenomena surrounding them.

HHI
In our context, Human-Human Interaction concerns information exchange between two or more users, over an intermediary computer and/or communication network.

I

IEEE
Institute of Electrical and Electronics Engineers (US): Professional society, which sets standards.

Internet
The global collection of interconnected regional and wide-area networks, which use IP as the network layer protocol.

IP
Internet Protocol: The network layer protocol, which describes a packet format for data to pass on a TCP/IP network and on the Internet. It is a connectionless, best-effort packet switching protocol.

ISDN
Integrated Services Digital Network: A switched digital network operating in circuit-switched mode. International standard for digital phone and other services.

L

LAN
Local Area Network: A network spanning a small physical area (e.g. building or campus) and operating at high speed (typically 10 - 100 Mbit/s).

Layer
Communication networks for computers may be organized as a set of, more or less, independent protocols, each in a different layer (or level). The lowest layer governs direct host-to-host communication between the hardware at different hosts; the highest consists of user applications. For each layer, programs at different hosts use protocols appropriate to the layer to communicate with each other. TCP/IP has five layers of protocols; OSI has seven. OSI layers:

  physical: converts data bits (1s and 0s) into electrical (or optical) signals (specifying signal levels and timing) to allow transfer of data across parts of a network

  data link: frames data into packets and checks the data transferred by level 1 to correct transmission errors (or retransmit lost data), and controls the speed and direction of flow of data between end points of the network

  network: controls addressing and routing of data through the network, controlling congestion, negotiating packet sizes and protocols between networks, and accounting and billing for data transferred

  transport: provides end-to-end data transport between users or processes on different machines, interfacing with the network layer to present network connections of appropriate types to the higher layers (e.g. an error-corrected point-to-point channel, transport of messages without guaranteed delivery, or broadcasting of messages to multiple destinations)

  session: allows higher layers to establish sessions across end-to-end transport links, controlling the direction of communications, providing tokens to regulate operations carried out across the link, and synchronising operations

  presentation: performs conversion of data between end-systems' internal representations (e.g. ASCII or EBCDIC coding for characters, one's complement and two's complement representation of numbers etc.) and abstract data structures, enabling interchange of data between different systems; and data compression and encryption

  application: converts between specific characteristics of end-systems' hardware and software and virtual models, enabling applications to run between different systems (e.g. general file-transfer protocols use a model of a file system which is mapped into specific systems' representations of files' names, format, encoding etc.; similarly for email, directory lookup, remote job entry, terminal emulation etc.)

M

Method of Least Squares
Method for determining the parameters of a predefined function that best fits a set of data points, where the Y value of each point is plotted as a function of its X value. The method minimises the sum of the squared deviations of the Y values from the fitted function.

Maximum-Likelihood Methods
Adaptive procedures for measuring threshold in which the intensity of the stimulus presented on each trial is determined by a statistical estimation of the observer's threshold that is made from all of the results obtained from the beginning of the test run.

MPEG
Moving Picture Experts Group develops standards for digital video and digital audio compression. It operates under the auspices of the International Organization for Standardization (ISO). The MPEG standards are an evolving series, each designed for a different purpose. MPEG-2 images have four times the resolution of MPEG-1 images and can be delivered at 60 interlaced fields per second where two fields constitute one image frame. (MPEG-1 can deliver 30 noninterlaced frames per second.)

Monte-Carlo Simulation
A computer simulation with a built-in random process, allowing for testing different possible outcomes of a hypothesized model.

MPLS
MultiProtocol Label Switching. A data transfer mode blending the characteristics of IP and ATM. For a detailed description see e.g. (Armitage, 2000).

N

nAFC
n-Alternative-Forced Choice. Psychophysical testing paradigm, in which the experimental subject is forced to choose in which of n possibilities the stimulus lies.

Network
A set of interconnected computers, peripherals and terminals. Its purpose is to enable each computing service to be accessed from other computers and terminals. Consists of an ensemble of switching nodes and transmission links; includes for mobile services all entities supporting mobile end-systems roaming through different cells of the network or even moving from one administrative domain to some other domain.

Network service
An application available on a network, e.g. electronic mail, file transfer, job transfer or interactive terminal connection.

N-ISDN
Narrowband ISDN: Two 64 Kbps channels plus one 16 Kbps signalling channel.

O

OSI
Open Systems Interconnection: A model developed by ISO (International Organization for Standardization) to allow computer systems made by different vendors to communicate with each other. The goal of OSI is to create a worldwide open systems networking environment where all systems can interconnect.


OSI reference model ISO model for communication between equipment and networks: the famous 7-layer model.

P

Packet A block of information with a defined format containing control information and data. "Packet" is a generic term used to describe units of data at all levels of the protocol stack, but it is most correctly used to describe application data units.

Packet-Switched Mode Operational mode of a telecommunication network where information is conveyed in packets of constant or variable length, with packets undergoing temporary storage in nodes. Both the resources within nodes and on links are allocated dynamically, such that, on a statistical basis, a better resource utilization is achieved for bursty traffic. There are two variants of packet mode: (1) with connectionless operation, no network resources are reserved for a particular end-user, i.e. the network is operating in a so-called best-effort mode (no QoS guarantee); (2) with connection-oriented operation, network resources are reserved for a so-called virtual connection such that some QoS guarantees (such as sustainable bit rate or limited delay) can be given.

Perception The interpretation of sensory information to produce an internal representation of the world.

PEST Parameter Estimation by Sequential Trials: Adaptive psychophysical testing method.

Positive response rate Ψ Rate of 'YES' answers in the yes-no paradigm, or rate of correct answers in the forced-choice paradigm.

POTS Plain Old Telephone Service: The service provided by the conventional analogue telephone network, i.e. circuit-switched analogue connections with a bandwidth of 3.1 kHz. Its digital equivalent is provided by ISDN.

Pragmatics The study of language seen in relation to its users, branch of semiotics.

Protocol A formal description of message formats and the rules two computers must follow to exchange those messages. Protocols can describe low-level details of machine-to-machine interfaces (e.g. the order in which bits and bytes are sent across a wire) or high-level exchanges between allocation programs (e.g. the way in which two programs transfer a file across the Internet).


Q

QoS Quality of Service: Formal definition of quality for some specific telecommunication service, using specific parameters. A certain QoS may be agreed by a network user and the network operator at different instances and for different durations, i.e. its validity may be limited to a connection (or even only part thereof), or it may be the subject of a so-called service level agreement. For a detailed description see (Fluckiger, 1995).

R

Response Bias A tendency for the observer to favour one response over another, which is determined by factors other than the intensity of the stimulus.

Retinotopy The notion that receptor cells in the retina are mapped to points e.g. on the surface of the visual cortex.

Return trip delay The elapsed time between the emission of the first bit of a data block and its reception by the same end-system after the block has been echoed by the destination end-system.

S

Semantics The study of meanings, branch of semiotics.

Semiotics The science of signs and/or sign systems.

Sensation Process of detecting a stimulus or some aspect of it.

SMS Short Message Service: An e-mail service with very limited capabilities offered in the framework of the GSM mobile phone system.

Source Coding Bringing the raw information produced by a source into a form suitable for transmission. Usually involves A/D conversion and may involve compression.

Syntax The rules by which signs are combined to make statements, branch of semiotics.


T


Threshold In our context, the term threshold describes what elsewhere is referred to as Empirical or Statistical Threshold: The intensity of a stimulus required for a specified level of performance by an observer. Examples are the intensity of the stimulus corresponding to reporting the stimulus 50% of the time in the yes-no paradigm, or correctly detecting the stimulus 75% of the time in a 2AFC paradigm. See also Absolute Threshold, and Difference Threshold.
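The two example criteria in the Threshold entry can be illustrated with a small sketch. It assumes a logistic psychometric function with a guess rate (0 for yes-no, 0.5 for 2AFC); the function names, the parameter names alpha, beta and gamma, and the numeric values are hypothetical choices for illustration and are not taken from the thesis.

```python
import math

def psychometric(x, alpha, beta, gamma):
    """Assumed logistic psychometric function.

    x     -- stimulus intensity
    alpha -- position of the inflection point (hypothetical parameter name)
    beta  -- spread of the curve (hypothetical parameter name)
    gamma -- guess rate: 0.0 for yes-no, 0.5 for 2AFC
    """
    return gamma + (1.0 - gamma) / (1.0 + math.exp(-(x - alpha) / beta))

def threshold(p_target, alpha, beta, gamma):
    """Invert the function: intensity at which the response rate is p_target."""
    core = (p_target - gamma) / (1.0 - gamma)
    if not 0.0 < core < 1.0:
        raise ValueError("p_target must lie strictly between gamma and 1")
    return alpha + beta * math.log(core / (1.0 - core))

# Illustrative values only (e.g. a delay in milliseconds).
yes_no_threshold = threshold(0.50, alpha=100.0, beta=20.0, gamma=0.0)
two_afc_threshold = threshold(0.75, alpha=100.0, beta=20.0, gamma=0.5)
```

Under these assumptions both criteria select the inflection point of the curve, which is one way to see why 50% (yes-no) and 75% (2AFC) are treated as corresponding performance levels.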

TCP/IP TCP/IP usually refers to the suite of transport and application protocols, especially TCP, which run over IP.

Throughput see bit rate

U

UMTS Universal Mobile Telecommunications System: UMTS is one of the Third Generation mobile systems being developed within the framework which has been defined by the International Telecommunications Union (ITU) and is known as IMT-2000. It seeks to build on and extend the capability of today's mobile, cordless and satellite technologies by providing increased capacity, data capability and a greater range of services.

Underflow A condition that can occur when the result of a floating-point operation would be smaller in magnitude than the smallest quantity representable. Underflow is actually negative overflow of the exponent. For example, a result less than 10^-128 would cause underflow.
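A minimal sketch of underflow, assuming IEEE 754 double precision as used by Python floats (where the limit lies far below the 10^-128 figure quoted above, which applies to a different number format). The probability values are made up for illustration. The second computation shows a common workaround, summing logarithms instead of multiplying small numbers; it is shown as a general technique, not as the procedure used in the thesis.

```python
import math

# Eighty hypothetical small probabilities; their exact product is 1e-400.
probabilities = [1e-5] * 80

# Direct multiplication: the running product drops below the smallest
# representable double and ends up as exactly 0.0 (underflow).
product = 1.0
for p in probabilities:
    product *= p

# Working with logarithms keeps the same information in a safe range.
log_product = sum(math.log(p) for p in probabilities)
```

Here `product` is 0.0, while `log_product` remains a finite negative number that can still be compared across alternatives.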

V

VoIP Voice over IP: Sometimes called Internet telephony, IP telephony, or Voice over the Internet (VOI). A category of hardware and software that enables people to use the Internet as the transmission medium for telephone calls. For users who have free or fixed-price Internet access, Internet telephony software essentially provides free telephone calls anywhere in the world. There are many Internet telephony applications available. Some come bundled with popular Web browsers, others are stand-alone products.

W

Weber's Law Says that the size of the just noticeable difference (see also Difference Threshold) is a constant proportion of the original stimulus value.
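Written as a formula, with symbols that are introduced here only for illustration (I for the original stimulus value, ΔI for the just noticeable difference, k for the constant proportion), Weber's Law reads:

```latex
\frac{\Delta I}{I} = k
\qquad\Longleftrightarrow\qquad
\Delta I = k \cdot I
```

As a purely hypothetical numerical example, k = 0.1 and I = 200 would give ΔI = 20, and doubling I to 400 would double ΔI to 40.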


WAN Wide Area Network: Network extending over large distances/areas (typically 10-1000 km), operating at relatively slow speeds (10 kbit/s to 10 Mbit/s).

WWW World Wide Web: Hypertext-based distributed information service, created by researchers at CERN in Switzerland. WWW uses the HyperText Markup Language (HTML) for its formatting, and interfaces for various systems are available. Users may create, edit or browse hypertext documents.

Y

yes-no Psychophysical testing paradigm, in which an experimental subject has to answer after each presentation whether she/he detects the stimulus. The presentations contain a predefined percentage of stimuli.




About the Author

Hans-Jörg Zuberbühler was born on 11 February 1968 in St. Gallen, Switzerland. After primary school, he completed an apprenticeship at the Swiss Federal Laboratories for Materials Testing and Research (EMPA) in St. Gallen. After some years of industrial experience, he studied environmental sciences and ergonomics at the Swiss Federal Institute of Technology (ETH) in Zurich. In 1999 he received a master's degree with a thesis about human motion perception and its impact on the acquisition of procedural knowledge. Since then he has been employed as a research assistant at the Institute for Hygiene and Applied Physiology at the ETH. His research interests comprise the fields of cognitive ergonomics, sensory physiology and psychophysics as well as methodical issues.