
The Qualitative Experiment in HCI: Definition, Occurrences, Value and Use

PAMELA RAVASIO, SISSEL GUTTORMSEN-SCHAR, VINCENT TSCHERTER

Swiss Federal Institute of Technology, Zurich, Switzerland

From time to time, the scientific community questions the suitability of traditional research methods in HCI. Despite this, quantitative experiments are treated as the only truly representative, and thereby valid, method of research. Alternative methods, specifically qualitative ones, are generally not considered to be representative and are therefore deemed unacceptable.

This paper aims to provide evidence that the adoption of the qualitative experiment in HCI is a valid exercise. This method combines the strict design and systematic setting variation from quantitative studies with the observation, induction and near-to-the-subject insights typical of qualitative studies.

The questions that arise are: What is a qualitative experiment? What are the differences between a qualitative experiment and a quantitative experiment? How have qualitative experiments been used in the past? Of what use could this be in HCI? An experimental showcase illustrating the various facets concerned is included to help answer this final question.

Categories and Subject Descriptors: D.2.1 [Software Engineering]: Requirements/Specifications—Elicitation methods; H.5.2 [Information Interfaces and Presentation]: User Interfaces—Evaluation/methodology, Theory and methods, User-centered design; K.5 [Computing Milieux]: History of Computing—Theory

General Terms: Design, Experimentation, Human Factors, Theory

Additional Key Words and Phrases: HCI Research Methods, Qualitative Experiment, Quantitative Experiment, Qualitative Methods, Hypothesis Building, Existence Hypothesis, Cause Hypothesis.

1. INTRODUCTION

The qualitative experiment has a long history in the modern Social and Natural Sciences. Some of the works reported date as far back as the mid-19th century, e.g. [Mach 1861]. However, it was not until the 1980s that the concept of the qualitative experiment was formally defined [Kleining 1986]. The qualitative experiment is still a largely undiscovered approach in HCI despite the fact that it offers a methodical approach to investigate a range of problems frequently encountered in explorative HCI research. Accordingly, the aim of this paper is to (re-)discover the qualitative experiment as a formal method for HCI research, explicitly define its methodology (from a historic and methodological perspective), and determine the contexts in which qualitative experiments could be used in the field of HCI. Using structured characterization as well as research procedure descriptions, the authors will suggest the types of issues for which this method is suitable and how it can complement ‘standard’ methods, such as quantitative experiments or field studies. For this reason, we will describe already existing work primarily originating from HCI research. This paper is therefore not a general discussion on the value of different methods available for HCI research, but aims to establish qualitative experiments as important and useful tools alongside more traditional types of methodology.

This research was partially supported by an ETH World grant.
Corresponding Author’s present address: [email protected], Department of Frontier Informatics, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8561, Japan.
Permission to make digital/hard copy of all or part of this material without fee for personal or classroom use provided that the copies are not made or distributed for profit or commercial advantage, the ACM copyright/server notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists requires prior specific permission and/or a fee.
© 20YY ACM 1073-0516/20YY/0900-0001 $5.00

ACM Transactions on Computer-Human Interaction, Vol. V, No. N, Month 20YY, Pages 1–24.

The inclusion of different methods in a research process results in each method contributing valuable and relevant insights to the issue under investigation, which may have been neglected or overlooked had such a variety of methodologies not been applied. Often, however, research questions are such that the traditional range of methods is not readily applicable. Examples of such questions are: What kinds of phases exist in the process of learning to read in children? What types of instructional contents should be stored in a Learning Content Management System from the point of view of its users (teachers, students)? What criteria do users of community libraries utilize to look for books related to a specific topic?

These research questions are similar in that they all state the existence of factors, procedures and processes. Consequently, structures are searched for, rather than tested, and may be in the form of any kind of dependency or relationship, etc. An additional similarity between the proposed questions is that none of them currently possess a precise enough causal relationship to be formulated. Without this suggested causal relationship there is no opportunity to test a specific causal hypothesis. It should also be noted that the issues suggested by the example questions are too focused to justify research ‘out in the field’. When a scientist is presented with such questions as those proposed above, qualitative experiments can be applied to obtain answers.

Consequently, this paper’s contribution is twofold: It provides a structured definition of the qualitative experimental method while identifying research undertakings in HCI that made successful use of this method.

The remainder of this paper is organized as follows: Section 2 first shows that the qualitative experiment is a method already in use in HCI, and thereafter presents a structured theory-driven approach to the definition of the qualitative experiment. Section 3 describes HCI studies that have applied qualitative experiments (though not explicitly declared as such) and contrasts this method with three other acknowledged and widely applied qualitative study approaches. These two sections may be read in any order a reader chooses. Section 4 presents an investigational showcase that illustrates various aspects of the use and benefits of the qualitative experiment. Finally, Section 5 presents our conclusions.

2. DEFINITION OF THE QUALITATIVE EXPERIMENT: A STRUCTURED APPROACH

A qualitative experiment is:

“The intervention with relation to a (social) subject which is executed following scientific rules and towards the exploration of the subject’s structure. It is the explorative, heuristic form of an experiment.” 1

[Kleining 1986]

The term ‘qualitative experiment’ itself was first explicitly used in [Mach 1905, p. 204].

Therefore, the research process in the qualitative experiment aims to discover structures, circumstances, relations, connections and dependencies that are particular and characteristic of the subject matter under research. It is heuristics that distinguish qualitative experiments from quantitative experiments. Quantitative experiments aim at testing existing hypotheses and calculating causal, numerically ascertainable relations. In contrast, qualitative experiments serve to make observations deducible from our senses based on an experimental setting (not from instruments, numbers and calculations) and to draw conclusions on the facts actually observed.

2.1 Research Methods in Use: An Overview

With respect to research methods used in HCI, the question that immediately emerges is: “Which methods are widely known and therefore used and documented by HCI researchers and practitioners?”

To answer this question, we analyzed two complete first volumes of recent CHI conference proceedings [CHI 2002; 2004]. The analysis focused on the types and frequencies of the methods applied. Table I lists our findings. Because many papers analyze more than a single study, the total number of studies exceeds the total number of papers in the volumes.

While research methods are usually well defined with respect to procedure, the method names referred to by the authors of the analyzed articles varied, even when describing procedures of the same or very similar characteristics. In order to classify each method encountered, the definitions as listed in Tables II and III were utilized.

The analysis revealed that alongside widely known and acknowledged methods, one method used is consistent in its characteristics with a method that thus far is almost exclusively known and applied in the Social Sciences: the qualitative experiment. In the context of HCI, however, to date, this method is lacking a structured definition and usage descriptions.

Therefore, the questions that arise are:

—What is the qualitative experiment from a theory driven point of view?

—How does the qualitative experiment in HCI compare to other methods used in HCI?

—Why is the qualitative experiment well-suited to be applied in HCI?

These questions are discussed in the remainder of this and subsequent sections.

1 In the original German version: “Das qualitative Experiment ist der nach wissenschaftlichen Regeln vorgenommene Eingriff in einen (sozialen) Gegenstand zur Erforschung seiner Struktur. Es ist die explorative, heuristische Form des Experiments.”



Table I. Method types found to be applied in the CHI2002 and CHI2004 proceedings (sorted by frequencies in descending order).

Method                            Frequency

Quantitative Methods
  Performance Study                   63
  Questionnaire (all types)           25
  Model Evaluation                    15
  Quantitative Experiment             12
  Usage Study                          9

Qualitative Methods
  Interviews                          31
  User Study                          17
  User Observation                    13
  Usability Evaluation                 9
  Case Study                           4
  Public Trial                         4
  Field Study                          2

Other
  Qualitative Experiment (a)           7
  Single Occurrences                   6

Total Number of Studies              217

No Studies
  System Developments                 11
  Theory Presentations                 5

(a) As defined and described in the remainder of this article.
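The frequencies in Table I can be cross-checked with a few lines of code. The following is a minimal sketch, not part of the original analysis: the counts are transcribed from Table I, while the names `TABLE_I`, `group_totals` and `group_shares` are hypothetical helpers introduced here for illustration.

```python
# Frequencies transcribed from Table I (CHI2002 and CHI2004 proceedings).
# The dictionary layout and function names are illustrative assumptions.
TABLE_I = {
    "Quantitative Methods": {
        "Performance Study": 63,
        "Questionnaire (all types)": 25,
        "Model Evaluation": 15,
        "Quantitative Experiment": 12,
        "Usage Study": 9,
    },
    "Qualitative Methods": {
        "Interviews": 31,
        "User Study": 17,
        "User Observation": 13,
        "Usability Evaluation": 9,
        "Case Study": 4,
        "Public Trial": 4,
        "Field Study": 2,
    },
    "Other": {
        "Qualitative Experiment": 7,
        "Single Occurrences": 6,
    },
}


def group_totals(table):
    """Sum the study counts within each method group."""
    return {group: sum(methods.values()) for group, methods in table.items()}


def group_shares(table):
    """Return each group's share of all studies as a fraction of the grand total."""
    totals = group_totals(table)
    grand_total = sum(totals.values())
    return {group: total / grand_total for group, total in totals.items()}


if __name__ == "__main__":
    totals = group_totals(TABLE_I)
    print(totals)
    print("Total number of studies:", sum(totals.values()))
    for group, share in group_shares(TABLE_I).items():
        print(f"{group}: {share:.1%}")
```

Summing the three groups reproduces the total of 217 studies reported in Table I; the 7 studies classified as qualitative experiments account for roughly 3% of that total.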

2.2 The Methodical Context of Qualitative Experiments

The qualitative experiment has been formally defined for the disciplines of Sociology and Social Psychology through Kleining’s analysis of scientific methods as derived from everyday life and from the interaction of the researcher with the object of research [Kleining 1986]. While both everyday (i.e. non-scientific) and quantitative methods leave room for more active and passive means of intervention for discovery, there is relatively little appreciation for active methods (stressing experiment as opposed to observation) in qualitative research.

In order to establish qualitative experiments among other widely used and acknowledged HCI methods, Tables II and III define other frequently used reference research methods in HCI. Tables IV, V and VI hereafter summarize the characteristics of the qualitative experiment alongside HCI methods.

A part of the methodical gap left by quantitative methods has been filled more extensively over the past decade by qualitative methods such as field studies and interviews. However, while field studies provide an overwhelming variety of information on a task and its natural context, interviews fall short in that they represent a verbal description which may or may not coincide with the actual work procedure performed.

The qualitative experiment, therefore, reduces the impact of the original context (as practised by quantitative methods) and combines it with a degree of detail in the data collected normally only provided by traditional qualitative methods. However, this phenomenon is accompanied by costs as the researcher has to deal with both qualitative data which involves considerably more effort, meticulous explanations, and justification with respect to the data’s detailed treatment.

Table II. Quantitative Methods’ definitions as applied in the CHI2002 and CHI2004 proceedings analysis (in order of frequency of occurrence).

Method: Quantitative Methods: Definitions as applied in this paper

Performance Study: Studies that employ a treatment driven by the goals of a developed system or user-interface (UI), performance measures, experimental units, with or without the use of random assignment, to create the comparison from which the efficiency in user task performance due to the (novel) system in use is inferred, always in comparison to an alternative or antecedent system or UI (after [Cook and Campbell 1979]).

Questionnaire: Individuals are mailed written material to be completed, or invited to respond to a series of items about some product or topic [...]. [Neale and Liebert 1986]

Model Evaluation: Studies that employ theory driven treatment, causal quantitative outcome measures, and experimental units, in order to collect data against which a modeling algorithm for cognitive processes is compared (after [Cook and Campbell 1979]).

Quantitative Experiment: Studies that employ theory driven treatment, causal quantitative outcome measures, experimental units, and use random assignment to create the comparison from which the treatment-caused change is inferred (after [Cook and Campbell 1979]).

Quasi-Experiment: Studies that employ treatments, outcome measures and experimental units of analysis, but do not use random assignment to create the comparisons from which treatment-caused change is inferred. Instead, the comparisons depend on non-equivalent groups that differ from each other in many ways other than the presence of a treatment whose effects are being tested (after [Cook and Campbell 1979]).

Usage Study: Large-scale log-file based test-scenario in which a newly designed system is exposed to its target users in order to test its resilience under real-use conditions, and if and how it is employed for the tasks for which it was intended.

The outlines presented in Tables IV, V and VI illustrate the fact that the qualitative experiment embodies a unique combination of features otherwise spread over a range of different methods (qualitative and quantitative) and thereby fills a methodical gap: the lack of a method which, while exploratory, still offers a reasonable level of internal and external validity simultaneously, as well as a reasonable degree of control.

2.3 Comparing Qualitative and Quantitative Experiments

A quantitative study aims to test and evaluate a hypothesized theory (sometimes a phenomenon), i.e. a quantifiable theoretical model or a hypothesis in an experiment. As such, the researchers are required to know exactly which factors will have to be measured and in what way (operationalization of the variable model). The identification and isolation of variables relevant to the hypotheses are the preconditions for determining the dependencies, which are generally conceived as causal (manipulation of one or more test-independent variables in order to measure the effects on test-dependent variables). The control of variables and their testing is therefore a critical issue in quantitative experiments. The data obtained must be quantifiable, all interfering influences must ideally be controlled, and experiments are required to be replicable [Greenberg and Thimbleby 1992].

As a consequence, the following holds true for quantitative methods:


Table III. Qualitative Methods’ definitions as applied in the CHI2002 and CHI2004 proceedings analysis (in order of frequency of occurrence).

Method: Qualitative Methods: Definitions as applied in this paper

(Structured) Interviews: Personal interviews following a strict guideline of questions to be asked, and possibly also a sampling plan (after [Neale and Liebert 1986]). Unstructured interviews usually follow an associative procedure whose goal is to elicit the relevant information free from strictly pre-specified procedures or question guidelines.

User Study: Observations conducted ‘in the wild’, though involving an invasive procedure (in HCI, e.g. use of a prototype system) that alters the behavior of the subjects under study vis-à-vis a Field Study (in contrast to [APHIS 2004]).

User Observation: Observations not conducted ‘in the wild’ but in special circumstances such as a lab. The user is given a single, general goal to be accomplished using a newly designed system. The aim is to test the system’s resilience under real-use conditions, and if and how it is employed for the tasks for which it was intended.

Usability Evaluation: A procedure in which information about the usability of a system is collected with the intention of refining the system through modification [USABILITY 2005]. It is a collective term for methods such as expert or user walkthroughs, think aloud sessions, etc.

Case Study: Intensive study of a single individual (also called ‘case history’). [Neale and Liebert 1986]

Field Study: Observations conducted ‘in the wild’, i.e. in the natural habitat. This term excludes any study that involves an invasive procedure, harms, or materially alters the behavior of the subjects under study (after [APHIS 2004]).

Public Trial: Scenario in which a newly designed system is exposed to tests at public places in order to receive informal feedback of its quality as perceived subjectively by the individuals and to test its resilience under real-use conditions.

Table IV. The qualitative experiment belongs to the explorative-inductive method type.

Type   Evaluative-deductive primarily   Elements of both types (Transition Stages)   Exploratory-inductive primarily

Qualitative Methods
Public Trial   Interview   Case Study
Usability Eval.   User Study   Field Study
User Observation   Qualitative Exp

Quantitative Methods
Model Eval   Questionnaire
Lab Exp   Usage Study
Quasi-Exp
Performance Eval

Table V. The qualitative experiment’s degree of control in comparison to other standard research methods.

Degree of Control   Total   High   Medium   Low   None

Qualitative Methods
Interview   Case Study
Qualitative Exp   Field Study
Usability Eval   Public Trial
User Study
User Observation

Quantitative Methods
Model Eval   Questionnaire
Lab Exp   Usage Study
Quasi-Exp
Performance Eval



Table VI. The qualitative experiment’s validity characteristics in the context of other standard research methods.

Validity   High External, Low Internal   Medium External, Medium Internal   Low External, High Internal

Qualitative Methods
Interview   Case Study
Qualitative Exp   Field Study
Usability Eval   Public Trial
User Study
User Observation

Quantitative Methods
Model Eval   Questionnaire
Lab Exp   Usage Study
Quasi-Exp
Performance Eval

—The hypotheses formulate expected dependencies of known factors, which are each quantifiable by the variables in the experimental setup; i.e. existing structures and behaviors are tested in order to learn about the strengths of their (inter-)dependencies.

—They are well controllable but have a tendency to be highly abstracted and thereby artificial in setting.

—The setting is changed according to the criteria of scientific intervention (treatment; conditions) in order to observe systematic changes in the measuring of outcome variables.

—Subjects are assigned randomly to different conditions.

—Replicability, i.e. precise enough documentation to allow the experiment’s repetition under identical conditions with identical results, is compulsory.

These characteristics are not as strictly applicable to qualitative experiments. Qualitative experiments’ hypotheses do not formulate relationships between factors but rather the sheer existence of factors, procedures and processes themselves. Variables are not used, but structures are searched for (and ideally found), which may include any kind of dependency, relationship, etc., not only causal ones. In general, qualitative relationships are not immediately or directly quantifiable but require additional and special treatment. Furthermore, replicability, though clearly desirable, is not imperative in this context, since real life activities are only replicable if they are extensively abstracted. The level of abstraction required for replicability is normally higher than the one required for the qualitative experiment.

In summary, the following holds true for qualitative experiments:

—The hypothesis formulation does not imply quantifiable factors initially, but instead states the existence of factors, processes, procedures, etc.

—The results are only quantifiable after additional special treatment.

—It explores presently unknown structures, dependencies and behaviors.

—Its basis is as natural a setting as possible and it adapts as required in order to be sufficiently focused.

—The settings are also changed according to the criteria (conditions) of scientific intervention, in order to observe systematic changes, to find structures, dependencies or behaviors.

—Subjects may or may not be randomly assigned to the different conditions.

—Replicability, though desired, is not imperative.

The aforementioned differences imply that while quantitative experiments are best suited to test models and theories, qualitative experiments are best suited to build and complement models and theories in addition to knowledge gained from literature.

This strength makes the qualitative experiment particularly useful for the HCI discipline, i.e. a discipline where, despite being very model based, innovation is still often derived from completely new technologies with little immediate previous experience from which to draw.

Figure 1 summarizes the comparison of the qualitative and the quantitative experiment graphically.

[Figure 1: diagram with two panels, "Qualitative Experiment" and "Quantitative Experiment". Labels in the diagram: Theory / Literature / Observation; 'Existence' Hypotheses; Structured Setting for Observation; Collection of Unstructured Data; Initial Structuring of Data; (Manual) Analysis of Data; Induction; 'Cause' Hypotheses; Deduction; Operational Variables; Structured Observation; Data Collection via Operational Variables; Model Testing; Verification of Hypotheses.]

Fig. 1. Relationship between the qualitative and the quantitative experiment.

Given its inductive and exploratory characteristics, the qualitative experiment can be used most beneficially in two types of HCI settings:

—(1) In research, when quantitative studies cannot help, e.g. “for what type of tasks do individuals use a computer mouse when it was not designed for that particular task?”, or early in the research process, when models and theory are built. In these circumstances, the qualitative experiment supports the incremental construction of a knowledge-framework which is otherwise only partially attainable through literature research. Important influencing factors for a topic under research can be identified, which may not or may only marginally be found in literature to date. These factors then influence the actual design and the conduct of, e.g. quantitative studies, at a later stage.

—(2) In engineering and product design, where more detailed insights into the task specific mental model of the user are needed. It is particularly useful for systems that are completely novel, either with respect to the functionality or to the user interface offered. In this context, the qualitative experiment is a method that provides a similarity akin to field studies but without the involvement of so many contextual factors. Executed correctly, it serves to close the knowledge gaps that remain despite extensive use of traditional observational and prototyping methods.

Section 3 illustrates these two points through a summary of the studies that have been undertaken using what can be considered a qualitative experiment.

2.4 Research Strategies in Qualitative Experiments

In quantitative studies, the test-independent variables are systematically altered in order to observe the effect on the test-dependent variables in different experimental conditions. A somewhat similar process exists in the qualitative experiment. Here, the alterations applied to the experimental setting follow one of six possible strategies: separation or segmentation, combination, reduction or attenuation, adjection or intensification, substitution and, finally, transformation. Each of these strategies is explained hereafter in accordance with its original definition.

—Separation/Segmentation: The topic under research is either partitioned into various sub-topics or a designated part of the topic is isolated from the whole. The focus question is: How does the research object change through this treatment?

—Combination: Subject-matters are combined in a new way with others. The focus questions are thereby: To what extent are the different subject-matters conflicting, indifferent, or compatible with each other? What new effects arise throughout this combination?

—Reduction/Attenuation: Stepwise, individual parts or functionalities of the subject-matter under research are removed or attenuated. The focus question is: To what extent does this procedure affect the subject-matter under research as a whole?

—Adjection/Intensification: Opposite procedure of reduction/attenuation. Individual parts or functionalities of the subject-matter under research are added or intensified. The focus question is again: To what extent does this procedure affect the subject-matter under research as a whole?

—Substitution: Individual aspects (parts) of the subject-matter under research are replaced by new ones. The focus question is: When does a small substitution have a large impact, or a large substitution a small impact, respectively?

—Transformation: The subject-matter may be transformed as a whole, and only a number of selected old attributes are kept. The focus question is: How does this change affect the perceived qualities of the subject-matter under research?
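When planning a study, the six strategies and their focus questions can be kept in a small lookup structure so that every planned alteration of the setting is tied to the question it is meant to answer. The following is an illustrative sketch only: the names `Strategy`, `FOCUS_QUESTIONS` and `plan_conditions` are hypothetical, the example subject is invented, and the focus questions are condensed from the list above.

```python
from enum import Enum


class Strategy(Enum):
    """The six alteration strategies named in the text (identifiers are illustrative)."""
    SEPARATION = "separation/segmentation"
    COMBINATION = "combination"
    REDUCTION = "reduction/attenuation"
    ADJECTION = "adjection/intensification"
    SUBSTITUTION = "substitution"
    TRANSFORMATION = "transformation"


# Focus question per strategy, condensed from the definitions in the text.
FOCUS_QUESTIONS = {
    Strategy.SEPARATION: "How does the research object change through this treatment?",
    Strategy.COMBINATION: "Are the combined subject-matters conflicting, indifferent, or compatible?",
    Strategy.REDUCTION: "To what extent does the removal affect the subject-matter as a whole?",
    Strategy.ADJECTION: "To what extent does the addition affect the subject-matter as a whole?",
    Strategy.SUBSTITUTION: "When does a small substitution have a large impact, or vice versa?",
    Strategy.TRANSFORMATION: "How does the change affect the perceived qualities of the subject-matter?",
}


def plan_conditions(subject, strategies):
    """Return one planning line per chosen strategy, pairing it with its focus question."""
    return [f"{subject} | {s.value}: {FOCUS_QUESTIONS[s]}" for s in strategies]


if __name__ == "__main__":
    # "file browser prototype" is a made-up subject used purely for illustration.
    chosen = [Strategy.REDUCTION, Strategy.SUBSTITUTION]
    for line in plan_conditions("file browser prototype", chosen):
        print(line)
```

Such a structure does not replace the methodological reasoning; it merely makes explicit which strategy each experimental condition instantiates.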



2.5 The Qualitative Experiment’s Origins: Prominent Examples from the Social and Natural Sciences

The topic of this paper is focused on the qualitative experiment and its context and use in HCI. However, the earliest application examples of qualitative experiments stem from the social and natural sciences. The examples provide a good picture of the application range and therefore merit a brief look, particularly since most had an important influence on contemporary HCI research.

One of the first reported applications of a qualitative experiment was Euclid’s (300 B.C.) discovery of the rectilinearity of light beams and the equality of a light beam’s angle of incidence and its angle of reflection. This insight was merely assisted by observations and the creation of situations (i.e. experiments) that would (re-)produce the fact. However, it was not until Descartes (1596-1650) that Euclid’s discoveries and observations were quantified [Mach 1921].

The aim of [Duncker 1926] in using qualitative experiments was to investigate why and how novel insights into a problem occur against a background of previously non-existing evidence that could lead to the deduction of such an insight (the “Aha-effect”). One example of such an insight is Newton’s discovery of the planets’ circular motions as it was not deduced from any prior existing theories but fully explained widely observed phenomena. For his experiments, Duncker presented five individuals with approximately 20 problems each and asked them to solve them. The problems presented were new to the subjects but for each problem, the information necessary to work out a solution was given (though clearly not explicitly). Duncker was not specifically interested in any kind of performance in the form of grading or comparisons of solutions or reaction times, etc. He made the participants understand that his focus was on their thinking, their behavior, their trials, on whatever came into their minds.

[Katz 1953] (translation and comment by [Costall and Vedeler 1992]) used a qualitative experimental approach to investigate whether blind children’s drawings revealed the level of their intellectual development, as was true in the case of seeing children. A drawing device, an artefact, was employed, which is commonly used for teaching geometry to blind children. Strings were attached to a wooden frame which in turn was covered with a special kind of robust paper on one side. The whole construction resembled a tangible version of squared paper. With a special pen, a blind person was then able to draw or write on the paper with the string-squares on the opposite side of the paper which served to aid orientation. The special pen and paper created a deepening on the writing side, instead of a visual display of the drawing, and thereby a heightening on the other side of the paper. The heightened side is the one used in observation of the final product. In separate experimental sessions, Katz asked a total of 30 blind children, between 12 and 18 years old, to a) draw a fantasy drawing (13 subjects), b) recognize what they themselves had drawn earlier (30 subjects), c) recognize a drawing drawn by a fellow child (10 subjects), and d) draw four objects (small four-legged table, three-sided pyramid, three-sided prism, cylinder) given to them as models (3 subjects). Additionally, the drawings used in c) had been shown (in a visual or tactile manner) to seeing persons as well. However, these seeing persons were only occasionally able to recognize (first visually, then tactilely) what had been drawn by the blind persons. In his experiments, Katz was not only able to show fundamental conceptual differences in the drawings of blind children as compared to seeing children, but also that the quality of blind children’s drawings mirrored intellectual maturity. An added value of his results was the proposition of a didactic to teach blind children the projective drawing of (simple) spatial objects.

One of the most widely cited and at the same time most influential uses of qualitative experiments has been reported by Piaget, whose work was ground-breaking for developmental psychology. In his experiments, Piaget analyzed various factors of children’s development and the learning process of basic behavioral capabilities. For example, in his situation No. 8 (Chapter 1.1 of [Piaget 1959]), a breast-feeding child’s head was repeatedly taken away from its position at the mother’s breast, and re-positioned some five centimeters away from the nipple. The child’s search process for the nipple could systematically be observed through this procedure.

2.6 Preliminary Discussion and Conclusion

The methodical principles underlying the qualitative experiment presented up to this point in this section are not entirely new. Its value was discovered early in the history of modern science, although it has unfortunately been neglected over time. The qualitative experiment as a structured, investigative procedure was prominently employed in the past in the social sciences, and to a lesser extent also in the natural sciences, as the aforementioned examples demonstrate.

In HCI research, the use of qualitative experiments is a relatively new phenomenon. For instance, the authors of the works mentioned in the next section were consistently obliged to describe their undertakings in a more in-depth manner than would probably have been required if they had used another approach. Therefore, the goal of this section was to provide detail about the qualitative experiment beyond the mere dissemination of knowledge about research undertakings that have used it.

To build on this background, the subsequent section not only summarizes various research undertakings that made use of qualitative experiments, but also tries to point out the similarities and differences between qualitative experiments and user observations, field studies and case studies.

3. THE QUALITATIVE EXPERIMENT IN HCI RESEARCH

While qualitative experiments were initially known only in the natural and social sciences, they have also been used in HCI. In the literature, qualitative experiments in HCI were relatively scarce until recently. However, the qualitative experiment still remains unrecognized as a valid research method, and therefore still lacks a code of practice. The few authors who have made use of it make marked attempts to justify and legitimize its use and have therefore provided extremely detailed explanations of the concrete procedure applied within their settings. In this section, we will therefore take a chronological look at a selection of recognized, representative HCI publications that applied qualitative experiments in their research. However, none of the authors mentioned hereafter refer to their methodology explicitly as a ‘qualitative experiment’.



3.1 Qualitative Experiments as applied in Human-Computer Interaction

Alan Turing’s imitation game (later termed the “Turing Test”) was probably the first explicit use of an experiment where the decisive factor of insight was qualitative [Turing 1950] (for a detailed review of the Turing Test, see [Saygin et al. 2000]). The imitation game is traditionally played by three parties: (A) a human person, (B) a computer, and (C) a human interrogator. (A) and (B) are in a room separated from (C). By communicating through a teletype connection, the interrogator tries to determine which of his two partners in the other room is human and which is a computer. As such, the question is: Can (B) be made to play the part of (A) satisfactorily, so that (C) mistakes one for the other? Though the number of times that a computer passes or fails the Turing Test is assessable, the judgment of the interrogator (C) is not. The interrogator’s judgment is a purely qualitative result. The setting and procedure of the imitation game are strictly defined; consequently, the whole exercise must still be considered an experiment.

Another widely cited use of a qualitative experiment is [Malone 1983]’s search task. At the end of each of a series of user interviews, he gave his interviewees the task of searching their own offices for a small set of their own documents chosen beforehand by their co-workers. The purpose was to observe the different processes the subjects used to find the documents and to what extent these processes were influenced by information coded in the physical locations of the documents.

Similarly, Carroll used an approach in his investigations of electronic file naming which can be considered a qualitative experiment [Carroll 1982]: Twenty-two staff members of a scientific research center were asked to annotate file listings of their own personal files. The results were analyzed for occurring patterns and, by way of a speech analysis, for abbreviation strategies. Carroll’s results demonstrated that the outcome of a qualitative experiment may be the basis for additional quantitative analysis if needed.

Additionally, [Kwasnik 1991] made use of a qualitative experimental approach with respect to personal information systems research. Subsequent to an interview-based ‘guided tour’ around the participants’ offices and a follow-up think-out-aloud session, four of the eight subjects were asked in a third session to save a few days’ worth of mail and documents that had been recently used. Kwasnik then used the descriptions and rules discovered in the first and second sessions in an attempt to sort the pile of documents in the same way the subjects might have done. The subjects were asked to provide comments on the accuracy of the researchers’ decisions and on the reasons for any errors committed. Therefore, the experiment served to discover differences between actual classification practices and the rules that had previously been explained orally.

[Carlyle 1999] studied document representations in library catalogues to find out which descriptors users at a community library would utilize to search for works. She applied a procedure called ‘free-sort’ with the aim of discovering how users grouped documents related to a particular literary work. In her study, fifty persons were asked to sort 47 items related to Charles Dickens’ A Christmas Carol into different categories. The location of the experiment was a shopping mall and ‘anybody’ over the age of 18 could participate. The experiment was recorded in writing by the researchers in order to capture all the categories applied by the users as well as their reasons for creating them. The recordings were coded by independent researchers and the coding was analyzed through simple frequency statistics. Eventually, 11 categories were found to be relevant for the average public user of community libraries to categorize literary works and related resources. This application of a qualitative experimental procedure again shows that the results can be subject to quantitative analysis beyond the ‘mere’ discovery of user categories.

ACM Transactions on Computer-Human Interaction, Vol. V, No. N, Month 20YY.

The study on the value of paper archives for the office environment reported by [Whittaker and Hirschberg 2001] must also be considered as utilizing the qualitative experiment. In this study, an office relocation was the point of departure and thereby formed the experimental setting. The researchers’ goal was to discover both relevance values in the process of keeping or discarding physical pieces of information, and office organization criteria. Here, both qualitative and quantitative data could be collected, which again shows that, while the original idea was to discover attributes of indispensable pieces of information, parts of the results could be well suited for quantitative analysis.

According to what criteria do users rate authors of usenet newsgroups and read posted messages? This question was asked by [Fiore et al. 2002]. The user data was obtained through a lab reading session, during which the users read their habitually visited news groups. The researchers recorded in writing the procedures followed and the authors selected. The findings and topics obtained were then presented to the users, first in an informal interview and then in a series of (formal) questions about their author and message selection habits, procedures and criteria. From these results, a qualitative and quantitative framework was defined and compared against automated behavioral metrics. In this case, the results of the qualitative experimental part could be fed almost immediately into follow-up investigations. Furthermore, these results were also subjected to additional quantification.

The second of [Pirhonen et al. 2002]’s studies had for its design

“[...] two somewhat contradictory requirements: Firstly, the experiment should resemble field conditions as much as possible. Secondly, the setting should allow high quality video filming with two cameras (close-up and over-all views) for good quality data collection.”

The study’s goals were twofold: On the one hand, a UI for a mobile music player was to be tested; on the other hand, the researchers were also to focus on the participants’ conceptions of the device and the procedures they applied to actually use the device while moving. Real-life conditions were simulated in the laboratory using a mini-stepper and the actions were videotaped from several perspectives. Hence, while half of the results were intended for an evaluative, deductive analysis of the designed UI device, the other half were focused on gaining inductive insights about the metaphorical picture users had “in their minds” when using a mobile music player, and their approaches to manipulating the device while moving and unable to look at it. The results were, in any case, not as hybrid as one would expect but are relatively well separable into an evaluative deductive component and an inductive component, the latter being what can be considered a qualitative experiment. The paper therefore shows that, given the right research questions, the qualitative experiment can be combined well with other methodical research undertakings.



A chemist’s lab book is an artefact that researchers have repeatedly tried to replace with a computerized version. So far, both research and commercial efforts have failed. [Schraefel et al. 2004] nevertheless presents us with a report on efforts to design an electronic lab book. While most of the methods applied are widely acknowledged (such as scenario-based design and story telling), one approach, called “Tea Making”, should be qualified as a qualitative experiment. The “Tea Making” method consisted of moving the domain of expertise (chemical experiments) to one understood not only by the chemists, but also by non-chemistry-experts (to make tea in this case, hence the name), and applying working processes and artefacts from the chemical laboratory to this domain. This experimental procedure achieved two things: a time-scale reduction from the few days or weeks typically required for a chemical experiment to about an hour; and the usage of a vocabulary and comparative matter commonly understood (by the investigating researchers as well). Therefore, this procedure allowed both for the collection of basic habits and attributes that a lab book supported in its physical version and that would need to be supported in an electronic version, and for the understanding of the underlying procedural paradigms of chemical experiments. The resulting design of a computerized lab book did not (visually) resemble its physical counterpart but supported the type of procedures completed when used in the lab.

[Czerwinski et al. 2004]’s goal was to characterize how people interweave multiple tasks amidst interruptions. For this purpose, they created a task tracking spreadsheet for their subjects. Among other requirements, the users were to detail their tasks when they switched from one task to another. Here, the goal was to discover the different conceptual levels of task types that were important to them. This also implied eliciting the users’ definitions of what a “task” meant to them. Two researchers then coded the outcomes of the first day, and their results showed 98% agreement between them. The study results, in their entirety, (dis-)covered characteristics of task types, task shift initiators, as well as difficulties experienced when switching tasks. Based on these results, an initial prototype tool was designed and implemented.
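The source does not state how the 98% agreement figure was computed. One common reading is simple percent agreement, i.e. the share of entries to which both coders assigned the same code. The following minimal sketch illustrates that reading only; the function name and the example codes are hypothetical and are not taken from [Czerwinski et al. 2004].

```python
def percent_agreement(codes_a, codes_b):
    """Share of items that two coders labeled identically (0.0 to 1.0).

    Assumption: both lists cover the same items in the same order.
    """
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must code the same number of items")
    if not codes_a:
        raise ValueError("at least one coded item is required")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


# Hypothetical codes for five diary entries (illustration only).
coder_1 = ["email", "meeting", "email", "coding", "email"]
coder_2 = ["email", "meeting", "phone", "coding", "email"]

# 4 of the 5 entries received the same code from both coders.
print(percent_agreement(coder_1, coder_2))
```

Percent agreement does not correct for agreement that would occur by chance; chance-corrected measures exist, but the source gives no indication of whether one was used.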

[Paulos and Goodman 2004] were interested in the familiar patterns that comfort individuals within the seemingly chaotic, crowded landscape of urban strangers. The goal of their qualitative experiment was to identify the properties and phenomena of the Familiar Stranger relationship observed in public places. Their realization of the qualitative experiment consisted of two steps. In step one, the researchers photographed clusters of people in various areas of Constitution Plaza in Berkeley. Then, one week later at the same time and on the same day, those pictures were distributed to all persons present on the Plaza. They were asked to label the pictures as to whether and how they knew the people photographed. The subjects were also asked to complete a questionnaire on relationships to the plaza and attitudes towards public places in general. In step two, four quantifiable factors were derived from the categories and characteristics obtained through the labeling request, and were reinforced by the results of the questionnaires. The factors resulting from this qualitative experiment were then evaluated by accompanying users on “walking tours” in city areas familiar to them. The results of both steps were integrated into the design of a personal, wearable, wireless device that aimed at capturing and extending the essence of the Familiar Stranger relationship.

The goal of [Sillence et al. 2004]’s study was to find out about factors that influence the users’ evaluation of online information and advice, with a particular focus on online health sites. Here, the qualitative experiment consisted of four sessions for each of the selected 15 middle-aged women, who were asked to search for information on the topic of menopause. Two of these sessions were directed to specific sites while two sessions were free web searches. The verbal protocols were recorded, as was the group discussion subsequent to each search session. A content analysis was performed on the resulting transcripts in order to determine the factors that influenced the women’s rejection and mistrust or acceptance and trust, respectively. This study is an example of the fact that, while the results of the transcript analysis could have been quantified, the qualitative results accentuate the factors applied in the searches more clearly.

3.2 Difference by Example: The Qualitative Experiment in Comparison to other HCI Methods

In practical research, hybrid method forms often come into use. However, while this makes sense within the specific application context, considerations such as those in this paper require clear-cut definitions. This is the reason why we included “ideal” definitions as applied in this paper in Section 2.1. It is only in this way that individual methods can be distinguished from one another, so that the following question can be asked: What is the principal difference between the qualitative experiment and other HCI-applied qualitative methods? This section compares the qualitative experiment with three other well known and widely applied qualitative research approaches: user observation, field study and case study.

It is generally agreed that the principal, and therefore the most salient, difference between specific study methods is their underlying research question. Subsequent attributes such as the exact procedure strongly depend on this question. As a result, we will point out the differences below based on the following studies:

—User observation: [Sugimoto et al. 2004] presented a system supporting face-to-face collaboration by integrating personal and shared spaces, using PDAs for the former and a sensing board for the latter. Various groups worked on an urban planning task in order to find out whether the system met the researchers’ ideas of how interactions would occur.

—Field Study: [Heath et al. 1995] discussed the findings of a videotaped study of a London international securities dealing room. In the studied environment, stocks were handled collaboratively by dealer teams of two people.

—Case Study: [Grossman et al. 2002] presented a newly developed system that allowed a designer to construct non-planar 3D curves by drawing a series of 2D curves using a special technique called 2D tape-drawing. Over several hours, a single professional 3D modeler was studied while he tried to accomplish different tasks.



For a qualitative experiment reference study, we use the one published by [Schraefel et al. 2004] (design of an electronic lab book) and previously described in the literature overview in Section 3.1.

3.2.1 Research Questions. The research questions of the studies can be summarized as follows:

—User Observation: Do users actually use the private and shared spaces offered while working in a group on a task? Were there any particularly interesting user behaviors in the system’s use?

—Field Study: How do people coordinate their work in exigent real world settings?

—Case Study: Is this one user able to achieve the work goals he envisaged using the proposed system?

—Qualitative Experiment: What is the relation between the experimental and the recording process, and more specifically, what is recorded from a chemical experiment and how?

3.2.2 Task. The task to be completed in each of the studies can be characterized in short in the following ways:

—User Observation: Given the sketch of an existing town on the sensing board, decide on locations for public facilities - where to construct highways, railway stations and bus service routes. Take into consideration the town’s finances, the environment, and the convenience of the residential areas.

—Field Study: Given the type of study, the tasks to be executed were the habitual everyday working tasks of the videotaped persons in the dealing room.

—Case Study: Construction of the primary curves of a Dodge Viper car model.

—Qualitative Experiment: Participating chemists should ‘make tea’ using kitchen equipment initially and then laboratory equipment. The participants were to treat the procedure as if it were a real chemical experiment and therefore had to use a paper lab book in order to record their proceedings and results.

3.2.3 Data Collection and Type. Data collection is one of the important issues of each study. For the selected studies, the method of data collection and the type of data collected can be characterized in brief as follows:

—User Observation:
Data collection: The working groups were observed and their actions and statements were recorded in writing. Also, the participants’ general subjective feedback was collected.
Data type: Qualitative.



—Field Study:
Data collection: Multiple video cameras were placed to record the happenings around one particular dealing desk. In addition, the proceedings were also recorded in writing by the researchers, specifically the use of a range of artefacts and the immediate personal working environments of each person. Prior to the analysis, the video tapes were transcribed to include not only spoken text, but also video pictures, artefact protocols and timing sequences.
Data type: With the exception of the timing measured between different actions, the data, in its entirety, was qualitative and extremely detailed.

—Case Study:
Data collection: First, three 1-hour sessions were assigned to become acquainted with the system. Then, one 2-hour session took place in which the designer intended to accomplish the construction task. The information in this study was obtained by observing the designer completing the construction task, as well as through the assistance given to the designer in the initial introductory sessions.
Data type: Purely qualitative.

—Qualitative Experiment:
Data collection: The participating chemists were asked to execute the experimental task and behave as if it were a real chemical experiment. They were asked to comment on their proceedings as well as on the resemblance or dissemblance to a real-life chemical experiment.
Data type: The data obtained comprised the lab book entries by the participating chemists as well as the minutes of their comments and procedures. The data was therefore qualitative in its entirety.

3.2.4 Preliminary Discussion and Conclusion. From these examples, it is apparent how the various types of qualitative studies differ from or are akin to one another, both in their scopes and in their goals, namely:

User observations have a relatively broad focus. Any observation that accounts for whether or not a (novel) system is useful to its users and how the system is actually being used is within its margins. Field studies, in contrast, try to define the behavior and actions of everyday procedures without (further) interference with the environment. Finally, case studies look at one single user in depth in order to learn in detail how interaction with a system occurs. The qualitative experiment is different in that it is a study involving a specific task within an environment that is similar to, though adapted from, the “habitual” environment, and it is therefore focused in a manner similar to a quantitative study.

Furthermore, the type of tasks examined in each of the previously described studies shows the following: User observation, field study and case study basically rely on the “real life” tasks or goals in their entire complexity, that is, as they are encountered in reality. In contrast, the qualitative experiment sharpens the perspective by reducing the focus solely to the extract of immediate interest before an intended change is applied. This can happen - as in the present example - by reducing the complexity of the real life task or goal (chemical experiment in the lab) to a simpler though still complete version (the procedure of making tea as a chemical experiment).

However, when it comes to the type of data collected, it becomes clear why all of these methods are called “qualitative”. All of the data is collected in an unaltered manner, primarily through video, tapes and written protocols. Data evaluation and treatment are among the important challenges of these methods, even more so than in quantitative studies. In this sense, the qualitative experiment is no exception to this rule.

3.3 Qualitative Experiments and HCI: A Potent and Agreeable Combination?

HCI is about the development of optimal computerized support for humans. This might imply tackling application areas where user behavior or procedures are observable, but their decision criteria, internalized reasoning for certain procedures, and so forth, are not (yet). It is plausible that not only directly observable criteria and procedures are of immediate importance with respect to whether and how computerized solutions are widely accepted by users.

[Schraefel et al. 2004] described this situation, as well as the drawbacks encountered when using several qualitative research approaches to tackle it:

“Ethnography yielded the larger context of interactions in the experimental environment. This helped us appreciate the context of the lab book-as-artefact within the dynamic nature of the lab. Interviews gave us a sense of the culture, the rationale for what chemists do, and who the stakeholders are in the lab book life cycle. [...] Neither of these approaches, however, gave us sufficient insight into the what of the practice such that we could build a model of the process to discover either (a) what of the recording practices itself could be translated into a digital support service or (b) what the important experimental attributes of the artifact were that should also be translated from analog to digital.” [Schraefel et al. 2004]

Particularly in situations such as the one described in the above quotation, the qualitative experiment can beneficially be put to use. Here, the qualitative experiment bridges the gap between known qualitative methods and equally accepted quantitative ones.

At this point, we have looked at the qualitative experiment from two different perspectives, namely: the methodological point of view and the application point of view. The next section will further detail the latter by working its way through a showcase application of a qualitative experiment.

4. THE QUALITATIVE EXPERIMENT IN A SHOW CASE

In this section, we will present in brief the application of the qualitative experiment in a showcase study. It is our intention in this section to draft the qualitative experiment’s goals and procedure, together with the most important decisions to be taken while preparing and realizing the study. This example’s basic idea is drawn from former research of ours [Ravasio et al. 2003]. However, it has been adjusted in order to be more to the point.

4.1 Background: Situation

A new help system for common PC hardware and operating system problems is to be fundamentally improved. The system presently in use includes a very extensive collection of answers in its knowledge base. There are two ways for the users to access the problem database: (1) using a search functionality that matches textual phrases entered by the user against contents stored and indexed; (2) accessing a hierarchical classification scheme where answers are assigned to topics which, in turn, are grouped together in super-topics, etc., until forming a topic hierarchy with various levels.

4.2 Problem: Research Question

Through daily contact with (averagely skilled, non-expert) users, subjective impressions of the system are known to be bad. Typical critiques are either “answers turn up just anywhere”, i.e. answers to problems turn up under categories where, subjectively, they do not seem to belong, or “I just can’t get sensible answers”, i.e. the search functionality is completely unsuccessful no matter what search term is entered.

During efforts to improve the existing system, the following questions arose: Given a problem, what characteristics and descriptors are used for searching for the corresponding answers in categories as well as in text? Which of these characteristics and descriptors are the most important ones (i.e. should lead to an answer by themselves), and which are the most frequently used ones? Which ones are considered principal categories and how can these be organized into a hierarchy?

4.3 Methodical Options

It would be difficult if not impossible to conduct a quantitative study in order to answer the questions raised above, as they are currently too open and comprehensive, and lack any concrete causal hypothesis.

Nonetheless, field studies, interviews, and so forth will certainly be of some use in the context of improving the design of such a help system. However, for our concrete question - “which are the descriptors that a user applies initially and therefore should be the principal ones in such a system?” - these methods are only of limited use. Field studies, for instance, may reveal that there are plenty of complementary problem-solving resources other than the system in question.

Therefore, requirements to be met by the method employed are:

—In the study, the problems for which answers are sought in the system should represent the various classes of computer problems, but not necessarily the frequency with which they occur in real life.

—The search for an answer is supposed to be as natural as possible, though using only the system, and therefore without access to other external resources.

—The outcomes have to be analyzable by individual problems that have arisen with the help system, by problem class, and by characteristics employed.

4.4 Experimental Setting and Task

A qualitative experiment that meets these requirements could have the following structure:

—Task: “There are a number of computers in this room; each has a different problem. It is now your task to fix those problems. To do that, each computer allows you to access the PCHelp system, which contains all the answers to any questions you may have in order to solve the computer’s problem. You are not allowed to consult any other ‘resources’ (books, other people, etc.) in addition to the PCHelp system.”

—Setting: (For this setting, the separation strategy as described in Section 2.4 is used.) A number of problems per participant; each problem is representative of one specific computer problem class. One computer per problem; the computer is given the problem that the user is asked to solve. The computer problems presented to the users could indeed occur in real life and not only under the experimental circumstances. The computers can be located in a normal office environment, possibly with other people going about their normal working duties in the same room. Each affected computer offers flawless access to the help system under review. Actions on the screen of the experiment computers are recorded, as are the experiment room and everything that is said. In addition, the users’ actions are logged while they are using the help system as well as the local system.

4.5 Data, Analysis, Results and Transfer

—Data: One video tape per computer and one video tape of the experimental space (room) exist, each with audio recording. Furthermore, logs of the users’ actions within the help system and on the computer screens exist.
Initial data treatment: Transcription of the video and the audio material. Audio is used for tape synchronization. Timing measurements are used for synchronization with the log data. The transcripts are sorted by the problem type that was treated.

—The data analysis can be done in the following steps:

—The transcripts are read through and a code book is created simultaneously. Focus questions during the analysis are: What factors are used? How is each factor precisely (reproducibly) defined? When is a factor really occurring, and when not? How is each occurrence to be marked in the transcripts? (This could happen, e.g. by a color coding scheme when analysis happens manually, or by a defined tag when using a software package.)

—The transcripts are again meticulously examined and all occurrences of each of the previously defined factors are noted. It has to be ensured that the first few transcripts are re-examined, since these are often treated differently by the analyzing researchers.

—Special treatment for quantification: Another review of the transcripts is performed. Using the summarized factor occurrences tagged previously, the following question is answered: How often was a factor used?

—Results and Transfer: For each problem class, a list of factors employed for searching the help system, possibly with their frequencies, is compiled, as well as a description of the general search procedure by problem class. A conclusion is synthesized on the basis of these results. The identified factors could, for instance, be taken over to form metadata tags of the knowledge base. As such, the expected consequence would be that the users will be more successful at finding the information they need in the future when the metadata coincide with the descriptors they already use intuitively. Whether this hypothesis is correct or not remains uncertain, and it is a question to be tackled by applying a quantitative or quasi-experiment in upcoming projects.
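The quantification step described above (“How often was a factor used?”, listed per problem class) can be sketched as a simple tally over the tagged transcript segments. The following is a minimal illustration under stated assumptions: each tagged occurrence is assumed to be available as a (problem class, factor) pair, and all names and example values are hypothetical rather than results of the showcase.

```python
from collections import Counter, defaultdict


def tally_factors(coded_segments):
    """Count factor occurrences per problem class.

    coded_segments: iterable of (problem_class, factor) pairs, one pair
    per tagged occurrence in the transcripts (assumed input format).
    """
    counts = defaultdict(Counter)
    for problem_class, factor in coded_segments:
        counts[problem_class][factor] += 1
    return counts


def ranked_factors(counts, problem_class):
    """Return the (factor, frequency) pairs of one class, most frequent first."""
    return counts[problem_class].most_common()


# Hypothetical tagged occurrences (illustration only).
segments = [
    ("printing", "error message text"),
    ("printing", "device name"),
    ("printing", "error message text"),
    ("network", "symptom description"),
]

counts = tally_factors(segments)
for problem_class in counts:
    print(problem_class, ranked_factors(counts, problem_class))
```

The frequencies obtained this way only summarize what was tagged; the definition of each factor still rests on the code book created during the qualitative analysis.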

4.6 Show Case: Discussion and Method Critique

The qualitative experiment’s procedure offers a means to tackle research questions that are more focused than those suitable for standard qualitative research methods, but are not yet quantifiable by causal, numerically ascertainable relations. It is the inductive (instead of deductive) version of an experiment.

We further illustrated in this chapter that the application of this method goes hand in hand with the habitual efforts for processing qualitative study result data. While, in our opinion, the additional insights are worth the transcription and evaluation efforts, we do not deny that care has to be taken to guarantee that the analysis process does not invalidate or falsify the outcomes and render them futile.

As with other study methods, it is the research question that determines whether a qualitative experiment can be conducted or not. Research questions for qualitative experiments formulate the hypothesized existence of factors, procedures and processes; the study setting is a context-reduced version of the natural setting, and therefore subject to a relatively high degree of control.

5. CONCLUSION

Our intention with this paper is to introduce the qualitative experiment as a formal method into the HCI context.

In HCI, quantitative methods traditionally hold a strong position within the scientific research community (cf. Table I). Though their position is questioned now and then, discussions concerning their advantages and disadvantages have never addressed the general appropriateness of the whole range of available methods and their applications in the HCI context. Critiques voiced from within the HCI community are contradictory (e.g. [Greenberg and Thimbleby 1992; Lieberman 2003]) and address a variety of aspects which are always based on the traditional strict separation of qualitative vs. quantitative methods.

Qualitative methods have held a far more difficult position within the community, as they tended to be considered neither representative nor scientific because they only portray individual cases instead of large populations and pursue an inductive rather than deductive policy. Nevertheless, they have been applied more extensively in past decades (see e.g. the field studies in [Heath et al. 1995; Suhm et al. 2002] or the interviews in [Nardi et al. 1995; Barreau 1995; Ravasio et al. 2004]).

Experimental (i.e. hypothetico-deductive) research is based on observations and literature studies that give rise to hypotheses, which in turn are formulated as operational, causal and therefore testable conditions. There are, however, situations where neither experience nor theory provides knowledge about the existence of causal dependencies. In such situations, a structured method to gain insights is required in order to come up with ‘testable’ (i.e. quantifiable) hypotheses. This is where the qualitative experiment can reveal its true strengths. It combines aspects of the design of quantitative experimental methods (properly defined and systematically varied settings to see changes in outcome [Williams et al. 1988]) with the inductive, discovery-oriented, heuristic approaches characteristic of qualitative methods (observation and induction of new insight in potentially new settings [Kleining 1982; Kleining and Witt 2001]). Therefore, the qualitative experiment naturally complements the traditional range of research methods. In the past, the qualitative experiment has been used intuitively when more traditional methods, both quantitative and qualitative, did not satisfy the methodical needs of a specific research goal.

It is the kind of questions and hypotheses one formulates that most strongly distinguishes one method from the other. In the quantitative experiment, the hypothesis states the concrete existence of causal dependencies, relationships, etc. The hypothesis is then tested statistically and either supported or rejected. Example: Reading abilities of children are related to the reading abilities of their parents.

In the qualitative experiment, however, research factors are not to be operationalized and tested, but discovered. This means that hypotheses are more abstract or open and are directed at the discovery of hypothesized dependencies, relationships, etc. Example: Which phases exist in the process of learning to read in children? Are they different from the phases that exist in adults?

Consequently, the application of a qualitative experiment is justified when the goal is to discover (rather than to verify) structures, procedures, processes and their inter-dependencies, and when the setting should be as close as possible to real life (the field) but still requires a degree of controlled removal of context. (It could go as far as reproducing the field setting in the lab, but with the removal of a given contextual factor.)

Due to its bridging characteristics, the qualitative experiment offers the opportunity to develop a common ground between technical and cognitive scientific orientations, i.e. the two competing groups of HCI scientists: those focusing on the technical challenges involved in meeting user requirements and those intending to explain the requirements on a cognitive basis. The qualitative experiment is a method applicable to both sides: by the former, to solidify the requirements to be tackled technically, and by the latter, to start ‘hands-on’ investigations with users even earlier than was previously possible.

At this point, however, extensive general experience with the qualitative experiment is still needed. It remains to be assessed when its application is sensible, what constitutes good practice, and in which application areas it is most fruitfully employed. We therefore look forward to more scientists in the HCI community applying this method and reporting their experiences and the insights they gained through using qualitative experiments.

ACKNOWLEDGMENTS

This research was partially funded through an ETH World grant awarded by the Swiss Federal Institute of Technology at Zurich, Switzerland.

The authors would also like to thank the three anonymous reviewers for their time and efforts.

ACM Transactions on Computer-Human Interaction, Vol. V, No. N, Month 20YY.


REFERENCES

APHIS 2004. Animal care: Definition of terms. US Animal and Plant Health Inspection Service. http://www.aphis.usda.gov/ac/cfr/9cfr1.html.

Barreau, D. K. 1995. Context as a factor in personal information management systems. Journal of the American Society for Information Science 46, 5, 327–339.

Carlyle, A. 1999. User categorisation of works: Toward improved organisation of online catalogue displays. Journal of Documentation 55, 2, 184–208.

Carroll, J. M. 1982. Creative names for personal files in an interactive computing environment. International Journal on Man-Machine Studies 16, 405–438.

CHI 2002. CHI ’02: Proceedings of the SIGCHI conference on Human factors in computing systems. ACM Press. Conference Chair-Dennis Wixon.

CHI 2004. CHI ’04: Proceedings of the 2004 conference on Human factors in computing systems. ACM Press. Conference Chair-Elizabeth Dykstra-Erickson and Conference Chair-Manfred Tscheligi.

Cook, T. D. and Campbell, D. T. 1979. Quasi-Experimentation: Design & Analysis Issues for Field Settings. Houghton Mifflin Company, U.S.A.

Costall, A. and Vedeler, D. 1992. David Katz: On touch, pictures and pictures to touch. In American Psychological Association Conference. Washington.

Czerwinski, M., Horvitz, E., and Wilhite, S. 2004. A diary study of task switching and interruptions. In ACM Conference on Computer-Human Interaction (CHI) 2004. Vienna, Austria, 175–182.

Duncker, K. 1926. A qualitative (experimental and theoretical) study of productive thinking (solving of comprehensible problems). The Pedagogical Seminary and Journal of Genetic Psychology, Child Behavior, Animal Behavior and Comparative Psychology 33, 642–708.

Fiore, A. T., LeeTiernan, S., and Smith, M. A. 2002. Observed behavior and perceived value of authors in usenet newsgroups: Bridging the gap. In ACM Conference on Computer-Human Interaction (CHI) 2002. Minneapolis, 323–330.

Greenberg, S. and Thimbleby, H. 1992. The weak science of human-computer interaction. In ACM Conference on Computer-Human Interaction (CHI) 1992. Monterey, CA.

Grossman, T., Balakrishnan, R., Kurtenbach, G., Fitzmaurice, G., Khan, A., and Buxton, B. 2002. Creating principal 3D curves with digital tape drawing. In ACM Conference on Computer-Human Interaction (CHI) 2002. Minneapolis, 121–128.

Heath, C., Jirotka, M., Luff, P., and Hindmarsh, J. 1995. The individual and the collaborative: the interactional organisation of trading in a city dealing room. Journal of Computer Supported Cooperative Work 3, 1, 147–165.

Katz, D. 1953. Über zeichnungen von blinden. In Studien zur experimentellen Psychologie.

Kleining, G. 1982. Umriss zu einer methodologie qualitativer sozialforschung [an outline for a methodology of qualitative research]. Kölner Zeitschrift für Soziologie und Sozialpsychologie 34, 224–253.

Kleining, G. 1986. Das qualitative experiment [the qualitative experiment]. Kölner Zeitschrift für Soziologie und Sozialpsychologie 38, 4, 724–750.

Kleining, G. and Witt, H. 2001. Discovery as basic methodology of qualitative and quantitative research. Forum Qualitative Research [On-line Journal] 2, 1.

Kwasnik, B. H. 1991. The importance of factors that are not document attributes in the organisation of personal documents. Journal of Documentation 47, 4, 389–398.

Lieberman, H. 2003. Chi fringe: The tyranny of evaluation. In Conference on Human Factors and Computing Systems (CHI) 2003. Fort Lauderdale, Florida.

Mach, E. 1861. Über das sehen von lagen und winkeln durch die bewegung des auges. In Sitzungsbericht der Kaiserlichen Akademie der Wissenschaften, Mathematisch-Naturwissenschaftliche Classen. Vol. 43. Vienna, 215–224.

Mach, E. 1976 (1905). Knowledge and Error: Sketches on the Psychology of Enquiry.

Mach, E. 2003 (1921). The Principles of Physical Optics: An Historical and Philosophical Treatment. Dover Publications.



Malone, T. W. 1983. How do people organize their desks? Implications for the design of office information systems. ACM Transactions on Office Information Systems 1, 1, 99–112.

Nardi, B., Anderson, K., and Erickson, T. 1995. Filing and finding computer files. In East-West Conference on Human Computer Interaction. Springer, Moscow, Russia.

Neale, J. M. and Liebert, R. M. 1986. Science and Behavior: An introduction to methods of research, 3rd ed. Prentice Hall International Inc.

Paulos, E. and Goodman, E. 2004. The familiar stranger: Anxiety, comfort, and play in public places. In ACM Conference on Computer-Human Interaction (CHI) 2004. Vienna, Austria, 223–230.

Piaget, J. 1992 (1959). Origins of Intelligence in Children. International Universities Press.

Pirhonen, A., Brewster, S., and Holguin, C. 2002. Gestural and audio metaphors as means of control for mobile devices. In ACM Conference on Computer-Human Interaction (CHI) 2002. Minneapolis, 291–298.

Ravasio, P., Guttormsen Schar, S., and Krueger, H. 2004. In pursuit of desktop evolution: User problems and practices with modern desktop systems. ACM Transactions on Computer-Human Interaction (TOCHI) 11, 2 (June), 156–180.

Ravasio, P., Schluep, S., and Guttormsen Schar, S. 2003. Metadata in e-learning systems: Focus on its value for the user. In Advances in Technology-based Education: Towards a Knowledge-based Society. Proceedings of the Second International Conference on Multimedia and ICTs in Education (m-ICTE2003). Badajoz, Spain.

Saygin, T. P., Cicekli, I., and Akman, V. 2000. Turing test: 50 years later. Mind and Machines 10, 463–518.

Schraefel, M. C., Hughes, G. V., Mills, H. R., Smith, G., Payne, T. R., and Frey, J. 2004. Breaking the book: Translating the chemistry lab book into a pervasive computing lab environment. In ACM Conference on Computer-Human Interaction (CHI) 2004. Vienna, Austria, 25–32.

Sillence, E., Briggs, P., Fishwick, L., and Harris, P. 2004. Trust and mistrust of online health sites. In ACM Conference on Computer-Human Interaction (CHI) 2004. Vienna, Austria, 663–670.

Sugimoto, M., Hosoi, K., and Hashizume, H. 2004. Caretta: A system for supporting face-to-face collaboration by integrating personal and shared spaces. In ACM Conference on Computer-Human Interaction (CHI) 2004. Vienna, Austria, 41–48.

Suhm, B., Bers, J., McCarthy, D., Freeman, B., Getty, D., Godfrey, K., and Peterson, P. 2002. A comparative study of speech in the call center: Natural language call routing vs. touch-tone menus. In ACM Conference on Computer-Human Interaction (CHI) 2002. ACM, Minneapolis, USA.

Turing, A. M. 1950. Computing machinery and intelligence. Mind 59, 433–460.

USABILITY 2005. Usability terms glossary. The Usability Company. http://www.theusabilitycompany.com/resources/glossary/evaluation.html.

Whittaker, S. and Hirschberg, J. 2001. The character, value, and management of personal paper archives. ACM Transactions on Computer-Human Interaction 8, 2, 150–170.

Williams, F., Rice, R. E., and Rogers, E. M. 1988. Research Method and the New Media. Communication Technology and Society. The Free Press, Macmillan Inc., New York, NY.

