hassenzahl, wessler - 2000 - capturing design space from a user perspective the repertory grid...

Capturing Design Space From a User Perspective: The Repertory Grid Technique Revisited

Marc Hassenzahl
Usability Engineering

User Interface Design GmbH

Rainer Wessler
Department of Psychology

University of Osnabrück

The design of an artifact (e.g., software system, household appliance) requires a multitude of decisions. In the course of narrowing down the design process, good ideas have to be divided from bad ideas. To accomplish this, user perceptions and evaluations are of great value. The individual way people perceive and evaluate a set of prototypes designed in parallel may shed light on their general needs and concerns. The Repertory Grid Technique (RGT) is a method of elucidating the so-called personal constructs (e.g., friendly–hostile, bad–good, playful–expert-like) people employ when confronted with other individuals, events, or artifacts. We assume that the personal constructs (and the underlying topics) generated as a reaction to a set of artifacts mark the artifacts' design space from a user's perspective and that this information may be helpful in separating valuable ideas from the not so valuable. This article explores the practical value of the RGT in gathering design-relevant information about the design space of early artifact prototypes designed in parallel. Ways of treating the information gathered, its quality and general advantages, and limitations of the RGT are presented and discussed. In general, the RGT proved to be a valuable tool in exploring a set of artifacts' design space from a user's perspective.

1. INTRODUCTION

To design an artifact (e.g., software system, household appliance) is a constant problem-solving and decision-making process. In the course of this process, the number of possible alternatives is narrowed down until a final design is reached. A multitude of decisions have to be made, revolving around the general purpose of the artifact, its context of use (Bevan & Macleod, 1994) and connected trade-offs and arguments (see Moran & Carroll, 1994, for an overview). Taking all design-driving information together, this bundle can be thought of as an artifact's design space, thereby implying something that can be charted and explored.

INTERNATIONAL JOURNAL OF HUMAN–COMPUTER INTERACTION, 12(3&4), 441–459
Copyright © 2000, Lawrence Erlbaum Associates, Inc.

We thank the MediaPlant project group, especially Stefan Hofmann, Alard Weisscher, Jochen Klein, and Tobias Komischke for designing and implementing the prototypes used in this study. Thanks also to Florian Sarodnick for his support.

Requests for reprints should be sent to Marc Hassenzahl, User Interface Design GmbH, Dompfaffweg 10, 81827 Munich, Germany. E-mail: [email protected]

Several frameworks and methods have been proposed to capture design space. Design Space Analysis (MacLean, Young, Bellotti, & Moran, 1991), for example, provides a means for designers to make the options they have and the decisions they make explicit by an analytical effort, "an act of reflection" (Carroll & Moran, 1991, p. 199). Carroll and Rosson (1991) took a slightly different approach in the framework of Claims Analysis. They proposed extracting psychological claims from an artifact, which represent testable assumptions (i.e., empirical hypotheses) about the artifact's design rationale. What these approaches have in common is the primarily analytical perspective on design space.

A more empirical way to explore design space, especially appropriate for novel artifacts, is parallel design (Nielsen, 1993) with a subsequent evaluation phase. In parallel design, several designers are asked to work out design solutions for an artifact with a certain purpose. Each designer has to work on her or his own, to ensure maximum heterogeneity of the single solutions. The basic assumption is that by combining the valuable ideas embodied in the single solutions, a new superior solution can emerge. Whether an idea is valuable or not must be confirmed by subsequent evaluation. This evaluation yields the crucial information to guide further design.

A wealth of user-based usability evaluation techniques is available, such as questionnaires (e.g., IsoMetrics; Gediga, Hamborg, & Düntsch, 1999), interviews and/or usability testing methods (e.g., Thinking Aloud; Jørgensen, 1989; also known as Verbal Protocol Analysis; Ericsson & Simon, 1984; or Cooperative Evaluation; Wright & Monk, 1991). All these methods can be considered as varying in the amount of predetermined structure that they impose on the data acquisition and analysis process.

The major advantages of prestructured approaches (e.g., questionnaires) are their robustness, in the sense of reliability and objectivity, and their efficiency. One major drawback is their insensitivity to topics, thoughts, and feelings (in short, information) that do not fit into the predetermined structure. This is especially problematic if there is a general lack of knowledge about the topic to be researched. For example, fun has recently come to be considered an important software or product requirement (Draper, 1999; Hassenzahl, Platz, Burmester, & Lehner, 2000). However, without an a priori notion of fun as an important aspect of software acceptance and an idea of how to define it, prestructured methods will inevitably fail. They simply lack openness to new, yet unconsidered topics. Another important drawback of prestructured approaches is their tendency to produce data that is of low practical use in a design process. Carroll (1997), for example, argued that "formal experiments [a very structured approach] are fine for determining which of two designs is better on a set of a priori dimensions, but they are neither flexible nor rich enough to guide a process of continual redesign" (p. 504).

Unstructured methods (e.g., open interviews) in general have the required openness and the potential to produce design-relevant data, but this advantage is again accompanied by major drawbacks. First, a lack of predefined structure requires a lot more effort to be put into the actual analysis of the data obtained. Often, hours of interviews have to be transcribed, coded, and integrated: "it is a complex, labor-intensive and uncertain business" (Banister, Burman, Parker, Taylor, & Tindall, 1994, p. 49). The same holds true for the qualitative analysis of video protocols from usability testing sessions. Second, serious issues of objectivity and reliability arise, which touches on one of the core issues in the more or less philosophical argument between protagonists of a quantitative-oriented versus qualitative-oriented research tradition (see Buur & Bagger, 1999; Hassenzahl, 1999).

To summarize, the user-based evaluation of artifacts in a parallel design situation requires an efficient but open method that produces data rich and concrete enough to guide design. None of the traditional methods seems to satisfy all those requirements at once.

The obvious problems with popular user-based evaluation methods led us to consider the Repertory Grid Technique (RGT; Kelly, 1955) as a possible candidate method for capturing design space from a user's perspective. The RGT makes it possible to understand an individual's personal (i.e., idiosyncratic) construction of her or his environment (e.g., artifacts, other persons). It avoids some of the problems just discussed. Despite this, as a method for comparing or evaluating different artifacts, it is somewhat out of fashion. With its high point around the 1980s (with a whole issue of the International Journal of Man–Machine Studies devoted to the topic; Shaw, 1980), the RGT remains popular as a knowledge acquisition tool (e.g., Gaines & Shaw, 1997) and the results proved helpful for various purposes, such as structuring hypertexts (Dillon & McKnight, 1990).

The objective of this article is to present the RGT as a method of capturing design space (i.e., design-relevant information) from a user's perspective. First, RGT and the rationale for using it in the context of artifact design are described. Second, the RGT is applied to a set of simple prototypes designed in parallel. We present examples of how different types of information can be extracted from the data, thereby proposing a possible procedure for treating the obtained data. This procedure comprises three steps: charting the design space, exploring and understanding the design space, and abstracting. We investigate whether it is possible to abstract from the idiosyncratic perspectives to identify underlying topics relevant for the artifact to be designed. The major advantage of the latter may be a possible stimulation of theory development (Carroll, Singley, & Rosson, 1992). Furthermore, we attempt to assess the quality and usefulness of the obtained data. Third, the advantages of RGT, as well as the limitations, are discussed.

2. USING RGT TO BRING DESIGN SPACE TO LIFE

The RGT (Kelly, 1955) originally stems from the psychological study of personality (see Banister et al., 1994; Fransella & Bannister, 1977, for an overview). Kelly assumed that the meaning we attach to events or objects defines our subjective reality, and thereby the way we interact with our environment. The idiosyncratic views of individuals, that is, the different ways of seeing, and the differences from other individuals define unique personalities. It is stated that our view of the objects (persons, events) we interact with is made up of a collection of similarity–difference dimensions, referred to as personal constructs. For example, if we perceive two cars as being different, we may come up with the personal construct fancy–conservative to differentiate them. On one hand, this personal construct tells something about the person who uses it, namely his or her perceptions and concerns. On the other hand, it also reveals information about the cars, that is, their attributes.

From a design perspective, we are interested in differences between artifacts (i.e., the cars in our example) rather than differences in the individual; thus we intend to focus on what the personal constructs of a group of individuals might tell us about the artifacts they interact with. The differences between artifacts, manifest in the personal constructs a group of individuals comes up with, are the design-relevant information that should bring design space to life.

The RGT is a method of extracting personal constructs in a systematic way. In a first step, an individual is presented with a randomly drawn triad from a group of artifacts that populate design space. He or she is asked in what way two of the three are similar to each other and different from the third. This induces a search and the production of an appropriate personal construct that accounts for a perceived difference. The personal construct found is named (e.g., playful–serious, two-dimensional–three-dimensional, ugly–attractive) and the whole process is repeated until no further novel constructs arise. The result is a kind of semantic differential solely based on the idiosyncratic view of the individual.

In a second step, the individual is asked to rate all artifacts on her or his personal constructs. The result is an individual-based description of the artifacts based on differences amongst them.
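As a minimal sketch of the data these two steps produce, one participant's repertory grid can be held as a list of bipolar constructs, each carrying one rating per artifact. The class name, field names, and example values below are illustrative assumptions and are not taken from the original method description; the 1-to-5 scale is likewise assumed for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Construct:
    """One bipolar personal construct, e.g. playful vs. serious."""
    inclusive_pole: str   # pole shared by the two similar artifacts
    exclusive_pole: str   # pole describing the differing artifact
    ratings: dict = field(default_factory=dict)  # artifact name -> rating

    def rate(self, artifact: str, value: int) -> None:
        # Assumption: a 5-point scale, 1 = inclusive pole, 5 = exclusive pole.
        if not 1 <= value <= 5:
            raise ValueError("rating must be between 1 and 5")
        self.ratings[artifact] = value


# Hypothetical example data for illustration only.
grid = [Construct("playful", "serious"), Construct("ugly", "attractive")]
grid[0].rate("comic", 1)
grid[0].rate("Windows-like", 5)
```

Keeping both poles with every rating preserves the direction of the scale, which matters later when constructs from different participants are compared.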

The RGT may have several advantages. First, it is a structured approach, but nevertheless open to the idiosyncratic views of each individual. It captures the way individuals construct the design space populated by artifacts. Second, it is more efficient than comparable unstructured approaches. To focus on the personal constructs as data denotes a significant reduction in the amount of data to be analyzed (hopefully without severe reduction in meaningful content). This is especially important in the context of parallel design, where a large number of design alternatives is desirable. Third, personal constructs may have the potential to be design-relevant data. The whole approach is likely to generate different views on the artifacts, embodying various individual needs and concerns in relation to the artifact. Fourth, the basic method can be applied to almost any set of artifacts.

These (envisioned) advantages form the rationale for using the RGT in a parallel design situation. In the remainder of the article an application of the RGT is presented and the results are discussed.

3. AN APPLICATION OF THE RGT IN A PARALLEL DESIGN SITUATION

3.1. Method

Participants. A total of 11 individuals (6 women, 5 men) participated in the study. They were mainly recruited among Siemens employees; most of them had responded to a public announcement in the canteen. Their job background was heterogeneous and covered nontechnical backgrounds (e.g., sports student, designer) as well as technical backgrounds (e.g., software developer, network administrator). The sample's mean age was 34 years (Min = 22, Max = 54). Computer expertise was assessed by a five-item questionnaire and varied from moderate (3 participants) to high (8 participants).

Artifacts. In a parallel design session, we asked students of visual, industrial, and ergonomic design to design and implement seven different artifacts (i.e., prototypes). These prototypes should serve to fulfill the same simple, yet realistic work-related task: to switch off a pump in an assumed industry plant control room. This required at least the following steps: selecting the pump, switching it off, and an action confirmation (i.e., safety check). The shutting-down of the pump required some time. It was left to the students whether this process was visualized or not. The whole parallel design session was part of a larger project concerned with designing innovative control room interfaces.

The student designers were given no restrictions about prototype form and interaction style in advance. The students were encouraged to work out solutions according to what they found appropriate or interesting.

Color Plates 17 and 18, Figure 1 show the prototypes. Although each prototype allowed the user to accomplish the same task, they strongly varied in design and interaction style. Multiple design dimensions were varied (e.g., colors, metaphors). Six out of the seven prototypes had animated parts. Because the predominant design principle of parallel design is heterogeneity, flaws in the visual and ergonomic design were not corrected. A former study (Hassenzahl et al., 2000) using the same prototypes showed that they varied considerably in appealingness, perceived ergonomic quality (i.e., task-related quality aspects), and hedonic quality (non-task-related quality aspects). From these results, it can be tentatively concluded that the design principle of heterogeneity was met.

Additional measures. In addition to the personal constructs, appealingness rankings of the prototypes were obtained from each participant. The overall rank order of the prototypes was based on the sum of each prototype's individual ranks.
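The rank-sum aggregation just described can be sketched as follows. The function name, the example rankings, and the alphabetical tie-breaking are assumptions made for illustration; the article does not say how ties were handled.

```python
def overall_rank_order(individual_ranks):
    """Order prototypes by the sum of their individual ranks (lowest sum first)."""
    totals = {}
    for ranking in individual_ranks:
        for prototype, rank in ranking.items():
            totals[prototype] = totals.get(prototype, 0) + rank
    # Assumption: ties are broken alphabetically by prototype name.
    order = sorted(totals, key=lambda p: (totals[p], p))
    return order, totals


# Hypothetical rankings from three participants (1 = most appealing).
ranks = [
    {"blue": 1, "comic": 3, "real": 2},
    {"blue": 2, "comic": 3, "real": 1},
    {"blue": 1, "comic": 2, "real": 3},
]
order, totals = overall_rank_order(ranks)
```

A lower rank sum corresponds to a more appealing prototype, matching the convention that rank 1 is the most appealing.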

Procedure. Each participant was led into the laboratory separately. After a short introduction, the participant was seated in front of a 30-in. CRT that showed small pictures of the seven prototypes in a random order.

The whole procedure consisted of three parts: introduction, extraction, and assessment.

1. In the introduction part, the participant was instructed to familiarize himself or herself with the prototypes. Each prototype embodied the task of switching off a pump. To accomplish this, the participant had to select the running pump with the mouse and was then asked whether he or she really wanted the pump to be switched off. After a confirmation and a safety check, the pump was switched off. Once the participant was convinced that the pump was coming to a halt, she or he was to inform the experimenter. The interaction per prototype lasted approximately 2 min. After getting familiar with all seven prototypes, the participant was asked to rank order the prototypes according to their appealingness. This rank ordering was followed by a short break.

2. In the extraction part of the procedure, three of the seven prototypes (i.e., a prototype triad) were randomly chosen and displayed on the screen. The participant was asked to find a dimension (i.e., personal construct) on which two of the three prototypes were similar (i.e., inclusive construct-pole) but differed from the third (i.e., exclusive construct-pole). He or she was then required to label both poles in a way that expressed the intended dimension as briefly and clearly as possible. After a difference dimension had been labeled, a new triad was presented. This part of the procedure was repeated until the participant was unable to state a construct he or she had not mentioned before.

3. In the assessment part, the participant was asked to evaluate each prototype on her or his personal constructs by using a scale ranging from 1 (inclusive construct-pole) to 5 (exclusive construct-pole).
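The extraction part of the procedure above can be sketched as a simple loop. This is only an illustration under stated assumptions: `ask` is a hypothetical callback standing in for the interview, and stopping as soon as a construct is repeated is our simplification of "unable to state a construct not mentioned before."

```python
import random


def draw_triad(prototypes, rng=random):
    """Randomly choose three distinct prototypes to present together."""
    return rng.sample(prototypes, 3)


def elicit_constructs(prototypes, ask, rng=random):
    """Present triads until the participant offers no new construct.

    `ask` is a hypothetical callback: it receives a triad and returns an
    (inclusive_pole, exclusive_pole) pair, or None when the participant
    cannot name a new construct.
    """
    constructs = []
    while True:
        answer = ask(draw_triad(prototypes, rng))
        # Assumption: a repeated construct also ends the elicitation.
        if answer is None or answer in constructs:
            break
        constructs.append(answer)
    return constructs


# Scripted answers stand in for a real participant.
scripted = iter([("playful", "serious"), ("ugly", "attractive"), None])
constructs = elicit_constructs(list("ABCDEFG"), lambda triad: next(scripted))
```

Passing a seeded `random.Random` instance as `rng` makes the sequence of triads reproducible, which can be useful when documenting a session.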

Demographics and computer expertise were assessed at the end of each session. The whole session took about 1 hr and 15 min.

3.2. Results and Discussion

The following sections not only present our findings; their sequence can also be viewed as an example of the stepwise exploration of RGT data.

Three steps were suggested:

Charting the design space: The first step is to visualize the relations among the prototypes; that is, to create a map of design space.

Exploring and understanding design space: Based on the map, relations between single prototypes (i.e., pairs of prototypes) can be further explored. This exploration may yield detailed design-relevant information.

Abstraction: Underlying topics made visible: An abstraction from the results obtained can promote a deeper understanding of the underlying topics. This may prove helpful for solving future design problems. Moreover, it may stimulate the development of theories of the design of artifacts.

Charting the Design Space

The RGT yielded 170 personal constructs, with a median of 15 constructs per participant (Min = 9, Max = 29). Before we consider the obtained constructs in detail, we attempt to visualize the relations among the prototypes, to create a map of the design space. To accomplish this, we calculated Euclidean distances between the prototypes based on differences in the assessment of each prototype on the personal constructs (see Procedure section). The resulting distance matrix was then submitted to a Pathfinder Network Analysis (Knowledge Network Organizing Tool [KNOT], 1992; Schvaneveldt, 1990; Wandmacher, 1993). The Pathfinder algorithms seek to determine a two-dimensional representation of a distance matrix in space, with nodes representing objects and links representing relations (i.e., similarity) between objects.
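The distance computation can be illustrated with a short sketch. It assumes that the ratings are arranged with one row per personal construct and one column per prototype, and that the constructs of all participants are pooled into one matrix; the article does not spell out this arrangement, and the example values are hypothetical.

```python
import math


def euclidean_distances(ratings):
    """Pairwise Euclidean distances between prototypes.

    `ratings` holds one row per personal construct and one column per
    prototype (assumption: constructs from all participants are pooled).
    """
    n = len(ratings[0])
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.sqrt(sum((row[i] - row[j]) ** 2 for row in ratings))
            dist[i][j] = dist[j][i] = d
    return dist


# Hypothetical ratings: 3 constructs x 3 prototypes on the 1-5 scale.
ratings = [
    [1, 5, 2],
    [2, 4, 2],
    [5, 1, 4],
]
distances = euclidean_distances(ratings)
```

Two prototypes end up close together when participants place them near the same pole on most constructs, regardless of what the constructs are about.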

Figure 2 shows the map of design space for the seven prototypes. The figure shows that the Windows-like prototype is a central node, connecting all other prototypes. Actually, this reflects in part the way the prototypes were designed. The designer who produced the Windows-like prototype was most knowledgeable about the domain. The other designers referred to him as an important source of information during the design. In a way, his design became a blueprint of the other designs. One may argue that a strong recommendation of parallel design, namely to have the designers work separately from each other (Nielsen, 1993), was not fully taken into account. Alternatively, the Windows-like prototype's central role may simply be a product of it most purely representing the task to be accomplished by the participants. Regardless of which interpretation holds true, it is astonishing that the central role of the Windows-like prototype, which was more or less implicit, was perceived by the participants, that is, is evident in the data.

From the central Windows-like prototype three different branches extrude. Branch 1 consists of the prototypes blue and comic. Blue adapts the general layout from Windows-like, but presents it in a more visually designed way. A dialog flag extruding from the pump symbol replaced the dialog box. Moreover, the general color scheme was changed from Windows-gray to a dark and intensive blue. Comic still draws on the general layout, but introduces a surprising dialog element (i.e., the comic figure holding up a dialog sign) and fun-related design. Taken together, Branch 1 may represent a transition from a basically technology-oriented, well-known design to a more appealing, surprise- and fun-related design.

FIGURE 2 Map of design space (derived from a Pathfinder Network Analysis with r = ∞, q = n − 1). Nodes represent the prototypes. Links show the relations between prototypes.
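For readers unfamiliar with Pathfinder networks, the link selection behind a map like Figure 2 can be sketched as follows. With r = ∞ the length of a path is its largest link weight, and with q = n − 1 paths of any length are considered, so a direct link survives only if no indirect path has a smaller maximum link. This is a minimal sketch of that rule and not the KNOT implementation; it yields only the link set, not the spatial layout, and the distances are hypothetical.

```python
def pathfinder_links(dist, tol=1e-9):
    """Links of PFNET(r = infinity, q = n - 1) for a symmetric distance matrix."""
    n = len(dist)
    minimax = [row[:] for row in dist]
    # Floyd-Warshall variant: the cost of a path is its largest link weight.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(minimax[i][k], minimax[k][j])
                if via_k < minimax[i][j]:
                    minimax[i][j] = via_k
    # Keep a direct link only if no indirect path beats it.
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if dist[i][j] <= minimax[i][j] + tol
    ]


# Hypothetical distances among three prototypes (indices 0, 1, 2).
dist = [
    [0.0, 1.0, 3.0],
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
]
links = pathfinder_links(dist)
```

In this toy example the direct link between prototypes 0 and 2 is dropped because the path through prototype 1 never uses a link longer than 2, which is how a prototype can emerge as a central node connecting the others.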

Branch 2 consists of the prototypes game-like, cube, and real. Again, game-like is the prototype that draws on the general layout of the Windows-like prototype. It differs by introducing dimensionality. The representation is changed from a two-dimensional to an isometric representation, similar to the way a certain genre of computer games present themselves (e.g., StarCraft, 1998; Weisscher, 1999). This dimensionality is further supported by the use of three-dimensional rendering. Cube sacrifices the general overview provided by Windows-like and game-like. It presents itself in a close-up view. Although the representation of the pump is still an abstraction of a real pump, it looks more graspable and real than in the game-like prototype. The shiny, metal-like surface and the solid, animated, three-dimensional dialog cube deepen this impression. Real takes this impression a little further by introducing a zooming in from an overview of the plant layout to a close-up of the pump. The actual zoom is presented as a camera flight through space. The pump is modeled after a real pump (but not necessarily one used in an industrial context), with knobs to switch it on and off, a flap hiding an action confirmation, and a round indicator for its status. Furthermore, the pump visually vibrates to show that it is running. To summarize, Branch 2 may represent the transition from an abstract, two-dimensional design to a more reality-based, three-dimensional design.

Branch 3 consists of the animated prototype. This prototype's design is quite close to the Windows-like. It mainly differs in the fact that the pump icon itself is animated (it turns when it is running) and that the actual process of shutting down the pump is represented by signals running down the line connecting the dialog box with the pump icon (see Color Plate 18, Figure 1). In short, Branch 3 may represent a transition from the still to the strongly animated.

The map of design space is helpful for getting a first idea of how the participants perceive the prototypes. It visualizes similarities and dissimilarities (i.e., relations) between prototypes apparent in the data. Nonetheless, it is descriptive in nature. In other words, it will not help to distinguish good from bad ideas.

To overcome this limitation, we might combine the map of design space with the overall appealingness ranking. In the map (see Figure 2), the circles attached to the prototype nodes show the appealingness rank for each prototype (based on the sum of the individual ranks). Low numbers indicate a higher degree of appealingness and high numbers a lower degree of appealingness. Apparently, the blue and Windows-like prototypes are the most appealing, whereas the real, comic, and animated prototypes are the least appealing.

Noticeably, the end nodes of the three branches (comic, real, animated) are consistently perceived as the least appealing. A conclusion from this result could be that fun-oriented design (Branch 1), three-dimensional reality-based design (Branch 2), and animations (Branch 3) simply do not appeal to the participants (at least in a technology-oriented domain). From our perspective, this conclusion is oversimplified. It seems more likely that it is not the design elements per se (e.g., fun-oriented design, reality-based design) that are unappealing, but their extremity. For example, looking at Branch 1, the minor changes in design from the quite common Windows-like to the more unusual blue prototype are positively received. This is reflected by the higher appealingness rank of the blue prototype. Blue may introduce some novelty and good design solutions, which add value to the prototype. However, by going a step further to the comic prototype, appealingness drops dramatically. Indeed, the comic prototype presents itself in a fairly extreme, fun-oriented way, being quite different from the central Windows-like prototype, perhaps too different. This interpretation is consistent with Dreyfuss's (1955; cited in Carroll, 1997) recommendation of introducing new functions through familiar "survival forms." This idea rests on the observation that people are often bewildered by unnecessary novelty.

In fact, the interpretation of the appealingness ranks in combination with the map of design space is difficult and close to speculation. The simple knowledge of differences (relations, respectively) between prototypes and their appealingness does not seem sufficient to understand those differences. An understanding of the nature of these differences is necessary to substantially guide design. To be more precise, concrete, design-relevant information is needed to understand the design space of a set of artifacts.

In the next section, we describe how the map of design space we just created can guide the extraction of such design-relevant information.

Exploring and Understanding Design Space: Extracting Design-Relevant Information

By interpreting the map of design space based on what we as designers (or authors) know about or how we perceive the inhabitants of this space (i.e., the prototypes), we neglect the qualitative value of the personal constructs obtained. A possible way of substantiating our interpretation is to look at the actual personal constructs that differentiate two prototypes. Basically, the differences between each pair of prototypes can be further explored, resulting in a large number of comparisons to be made. At this point the map becomes valuable for guiding the process of comparing.

If, for example, we seek to substantiate our interpretation of Branch 1 as representing the transition from the technology-oriented to the fun-oriented, we may look at the actual personal constructs that differentiate between the Windows-like and the comic prototype. A construct clearly differentiates if one prototype is characterized by one pole of the construct, whereas the opposite pole characterizes the other prototype.
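The selection rule can be made concrete with a small sketch. The cut-offs used here (ratings of 2 or lower for one pole, 4 or higher for the other) are our assumption for illustration; the article does not state exact thresholds, and the ratings are hypothetical.

```python
def differentiating_constructs(grid, a, b, low=2, high=4):
    """Return constructs whose opposite poles characterize prototypes a and b.

    `grid` maps a construct label to a dict of prototype -> rating (1-5).
    Assumption: ratings <= `low` count as one pole and ratings >= `high`
    as the opposite pole.
    """
    selected = []
    for label, ratings in grid.items():
        ra, rb = ratings[a], ratings[b]
        if (ra <= low and rb >= high) or (ra >= high and rb <= low):
            selected.append(label)
    return selected


# Hypothetical ratings for illustration only.
grid = {
    "playful - expert-like": {"comic": 1, "Windows-like": 5},
    "two-dimensional - three-dimensional": {"comic": 1, "Windows-like": 2},
    "usual - unusual presentation": {"comic": 5, "Windows-like": 1},
}
result = differentiating_constructs(grid, "comic", "Windows-like")
```

Constructs on which both prototypes sit near the same pole, such as the dimensionality construct in this example, are filtered out because they say nothing about the difference between the pair.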

Table 1 shows the personal constructs that differentiate between the comic and the Windows-like prototype. The constructs are grouped by topics. Topic labels are shown in italics.

It is striking that a good part of the personal constructs in Table 1 indeed address the technology-oriented versus fun-oriented difference between the two prototypes. However, it becomes apparent that this difference is not only perceived but also evaluated. The participants express concerns about the appropriateness of so obviously fun-related a design in an industrial context. Because some of the personal constructs are evaluative in tone, it is possible to separate good from bad ideas.

The design-relevant information is that the comic prototype succeeds in inducing a sense of playfulness, but that the apparent lack of seriousness is considered inappropriate for the intended context of use. The blunt playfulness seems to contradict the need for an expert-like and competent-looking prototype (Table 1, Constructs 1–10).

Furthermore, the comic prototype's novelty is a topic (Constructs 11–13). Again, the related personal constructs vary in tone: One construct emphasizes positive feelings derived from the comic prototype's "own character" (Construct 12), whereas the "unusual presentation" construct connotes concerns about the obvious novelty (Construct 13).


Table 1: Personal Constructs That Differentiate Between the Comic and the Windows-Like Prototype

Prototype Comic | Prototype Windows-Like

Not serious (fun-oriented) | Serious (technology-oriented)
1. Does not take the problem seriously | Takes the problem seriously
2. Had been fun | Serious (good for work)
3. Non-expert-like | Technically appropriate
4. Frivolity | Points at something technical
5. Not serious | More serious
6. Playful | Expert-like
7. All show, no substance | Technology-oriented
8. Inappropriately funny | Serious

Not competent | Competent
9. Process of switching off appears incompetent | Process of switching off appears competent
10. Appears incapable | Is very capable

Novel | Usual
11. Figure is a novel interaction element | Usual interaction
12. Has its own character | Windows-interface monotony
13. Unusual presentation | Usual computer-like presentation

No impairment of process visibility | Impairment of process visibility box
14. Visibility is not impaired | Dialog box blocks important aspects of process
15. Overview remains | Dialog box covers overview
16. No impairment of visibility | Dialog box blocks overview

Low readability | High readability
17. Text is not readable | Text is readable
18. Font size is too small | Font size is appropriate

Low mouseability | High mouseability
19. Yes and no selections hard to hit with mouse | Yes and no selections easy to hit with mouse
20. Buttons hard to hit with mouse | Buttons easy to hit with mouse

Other | Other
21. Color of pump is neutral | Color helps to identify pump
22. Status of pump visible | Status of pump not visible
23. Not suitable for frequent use | Suitable for frequent use

Note. The constructs are grouped by topic. Topic labels are in italics. All construct examples were originally in German.

Besides the adequacy concerns (i.e., the comic prototype is perceived as more playful and novel than the Windows-like prototype, but it is doubtful whether its design fits into the intended context of use), more usability-related aspects are expressed. For example, the fact that the dialog box in the Windows-like prototype blocks the view of the process was negatively received (Constructs 14–16). The participants voiced a need to sustain the view onto the whole process while operating the system. This was much better solved by the comic prototype by presenting the figure with the dialog sign in the empty space between interface elements (see Color Plate 17, Figure 1).

Moreover, the small size of the font on the comic's dialog sign was perceived negatively (see Color Plate 17, Figure 1). It impaired the readability (Constructs 17–18) and the mouseability (Constructs 19–20) of the prototype. Some additional constructs addressed the neglect of colors in the comic prototype (Construct 21), the lack of feedback of the pump status in the Windows-like prototype (Construct 22), and the general judgment that the comic prototype is not suitable for frequent use (Construct 23).

Altogether, the detailed analysis added substantial, design-relevant information to the preliminary interpretation of the differences between the prototypes based on the map of design space and the appealingness ranking. The most revealing proved to be the personal constructs that are evaluative in tone. Those constructs help to understand how design elements (e.g., color, metaphors, dimensionality) are perceived and received by participants.

For the sake of brevity, we refrain from presenting further detailed analysis of differences between two prototypes. However, to provide a rough idea of the fruitfulness of the information obtained in the form of personal constructs, we performed the following additional analysis. Together with an additional rater, we independently rated each construct as belonging to one of the following categories: Type A (descriptive); Type B (evaluative, useful for artifact selection); and Type C (evaluative, useful for artifact redesign without the need for further analysis). Type A constructs point to certain differences (e.g., two-dimensional vs. three-dimensional) to which individuals are receptive. They can be used to verify whether design elements (e.g., dimensionality) used are actually perceived by the participants. Nevertheless, it remains unclear which pole of the construct is considered as good or bad, respectively. This certainly limits its use.

Type B constructs point to relevant issues without referring to concrete measures to be taken to resolve these issues. For example, the construct interesting versus boring points to an important difference between prototypes. It is obvious that being boring is not considered a positive attribute of a prototype (i.e., to be boring is bad). Such a construct is evaluative in tone and can be used for selecting a good artifact from the set of studied artifacts. Type B constructs tend to be too abstract, but they still point to important issues. However, the question of which design elements make one prototype appear boring and the other interesting cannot be answered by the personal construct alone. Again, this limits its use for guiding design, because the design elements responsible for making a prototype, for example, interesting (i.e., a good idea) cannot be identified.

Type C constructs point to relevant issues, with a clear reference to the relevant design elements. For example, the personal construct font too small versus font size appropriate clearly indicates the participants' expectations and the associated design element (i.e., the font size in some prototypes). These constructs are evaluative in tone. They are concrete and useful for guiding design, but they tend to be very detailed.

From a design perspective, it would be desirable to have a small number of the purely descriptive Type A constructs, a medium number of the more abstract Type B constructs, and a large number of the more detailed Type C constructs.

Interrater agreement (Fleiss, 1971) of the initial categorization performed independently by the three raters was satisfactory (κ = 0.64, test statistic = 11.14; p < .01, two-tailed). For the final classification, disagreement was resolved either by using the category at least two raters agreed on or by negotiation.
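To make the agreement measure concrete, the following is a minimal sketch of Fleiss' (1971) kappa for several raters assigning items to nominal categories, together with the "at least two raters agree" rule for the final classification. The function names and the toy ratings are assumptions for illustration only; they are not the study's data, and the significance test reported above is not reproduced here.

```python
from collections import Counter

def fleiss_kappa(ratings, categories):
    """Fleiss' kappa. `ratings` holds one list per item with one label per rater."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    # counts[i][c] = number of raters who put item i into category c
    counts = [Counter(item) for item in ratings]
    # Observed agreement per item, then averaged over items
    p_items = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_items) / n_items
    # Agreement expected by chance, from the overall category proportions
    p_cats = [sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories]
    p_exp = sum(p ** 2 for p in p_cats)
    return (p_bar - p_exp) / (1 - p_exp)

def majority_label(item):
    """Category chosen by at least two raters, or None (left to negotiation)."""
    label, votes = Counter(item).most_common(1)[0]
    return label if votes >= 2 else None

# Hypothetical ratings of four constructs by three raters (Types A, B, C)
toy_ratings = [["A", "A", "A"], ["A", "A", "B"], ["B", "B", "B"], ["C", "C", "A"]]
kappa = fleiss_kappa(toy_ratings, ["A", "B", "C"])
final = [majority_label(item) for item in toy_ratings]
```

In this sketch, items on which all three raters differ return None, which corresponds to the cases that were settled by negotiation.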

Table 2 shows the construct types concerning design relevancy, and the number and percentage of constructs belonging to each category. A large proportion (44%) of the personal constructs obtained are Type A constructs. Unfortunately, these constructs are of limited use when it comes to practical design. This result may reflect the fact that it is much easier to come up with descriptive constructs than with evaluative ones. Even so, in further applications of the RGT, measures should be taken to reduce the number of purely descriptive personal constructs.

The proportions of Type B and C constructs are as desired, with a large number of the detailed Type C constructs and a medium number of Type B constructs. It is encouraging that the number of Type C constructs almost equals the number of the much easier to produce Type A constructs.
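As a small worked check, the percentages reported for the three construct types can be recomputed from the counts in Table 2. The sketch below uses only those counts; the function and variable names are illustrative assumptions.

```python
def percentage_shares(counts):
    """Rounded percentage share of each category in `counts`."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}

# Counts per construct type as listed in Table 2
table_2 = {"A": 75, "B": 34, "C": 61}
shares = percentage_shares(table_2)

# Share of evaluative constructs (Types B and C together)
evaluative_share = round(
    100 * (table_2["B"] + table_2["C"]) / sum(table_2.values())
)
```

The counts sum to 170 rated constructs, and the evaluative types B and C together account for a little over half of them.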

The design relevancy of the information captured by the RGT is, despite the large proportion of purely descriptive constructs (Type A), encouraging. However, it should be noted that the analysis presented herein has a limitation. It neglects the fact that some personal constructs may address the same issue or even the same design element (e.g., text is not readable vs. text is readable, or font size too small vs. font size is appropriate); thus, the actual amount of information may be overestimated. To our mind, a more complete analysis of the actual quality of the information obtained should be postponed until a more extensive pool of personal constructs, obtained with different sets of artifacts, is available. For now, we tentatively conclude that an RGT will produce at least some design-relevant information.

Abstraction: Underlying Topics Made Visible

It is quite common to evaluate artifacts (or a set of artifacts) to improve their design. Often neglected is the additional step of abstracting and generalizing from the data obtained (Carroll et al., 1992). Such abstractions could stimulate the development of theories for the design of artifacts and could prove helpful for solving future design problems. From our perspective, it is an important asset of an (evaluation) method to support abstraction.

Table 2: Number and Percentage of Personal Constructs Belonging to Different Types Concerning Their Design Relevancy

Construct Type                                                                      Design Relevancy   No. of Constructs   % of Constructs
A  Descriptive                                                                      Low                75                  44
B  Evaluative; useful for artifact selection                                        Medium             34                  20
C  Evaluative; useful for artifact redesign without the need for further analysis   High               61                  36

A first inspection of the personal constructs by the experimenter (i.e., Wessler) revealed obvious similarities within and between participants. Based on these similarities, he defined and described construct classes. We, with one additional rater, then categorized each construct as belonging to one of the construct classes. Ambiguous constructs were removed. Interrater agreement (Fleiss, 1971) of categorization was satisfactory (κ = 0.68, test statistic = 15.49; p < .01, two-tailed).

Based on difficulties encountered during classification, the initial set of construct classes was slightly reformulated. The raters then attempted to resolve their disagreements about the constructs' class membership. Where a disagreement could not be resolved, the construct was removed from further analysis. Altogether, 153 constructs remained in the analysis.

Table 3 shows the identified construct classes, some examples, and the number of constructs summarized by each class.

Design principles. The construct class design principles refers to differences in metaphors (e.g., reality, anthropomorphism, desktop), visual design methods (e.g., animation, color), and interaction design methods (e.g., buttons, dialog boxes) used by the designers. It illustrates the participants' receptiveness to differences in the way the designers solved the design problem set by the given task. The summarized constructs are more or less descriptive in nature.

Quality of interaction. The construct class quality of interaction refers to differences in the prototypes' controllability, simplicity, and efficiency. It shows the participants' ability to express and put into focus usability problems occurring during interaction. The summarized constructs are more or less evaluative in nature.

Quality of presentation. The construct class quality of presentation refers to differences in the prototypes' self-descriptiveness (i.e., ability to communicate its functions, the way it is operated, and its current status) and clarity (e.g., readability, unambiguousness). It shows the participants' ability to infer differences in the usability of prototypes from the way information is presented to them. The summarized constructs are more or less evaluative in nature.

Hedonic quality. The construct class hedonic quality refers to differences in the prototypes' hedonic quality (Hassenzahl et al., 2000), that is, the non-task-related qualities modernity, novelty, and ability to stimulate. It illustrates the participants' responsiveness to differences beyond mere usability and utility. The summarized constructs are more or less evaluative in nature.


Adequacy concerns. The construct class adequacy concerns refers to user concerns about the prototypes' adequacy for the intended context of use (proficiency vs. playfulness) and the adequacy of employed design principles in general (e.g., animation, light effects). It illustrates the participants' belief about the extent to which the prototype is suitable for the task. The summarized constructs are evaluative in nature.

If we look at the percentage of constructs devoted to the different topics, a great deal are perceived differences in the design principles and elements employed by the designers (40%). This demonstrates the participants' receptiveness to variations in design. Differences in quality of presentation (32%) form the second strongest group. Together with quality of interaction (10%), these constructs represent perceived differences in the usability (i.e., task-oriented quality aspects) of the prototypes. Because the prototypes used in this study did not allow for extensive interaction (mean interaction time was about 2 min), the stronger focus on presentational differences can be easily explained. Hedonic quality, that is, non-task-oriented quality aspects (3%), did not receive much attention from the participants. This is astonishing, given that more quantitatively oriented analyses of the same set of prototypes showed that perceptions of ergonomic (i.e., usability) and hedonic quality contributed almost equally to the appealingness of the prototypes (Hassenzahl et al., 2000). Why was only such a small number of personal constructs devoted to the topic of hedonic quality? There are at least two possible explanations. First, if we take a closer look at the adequacy concerns (15%), it is striking that a great many constructs revolve around concerns as to whether the design principles and elements employed to induce hedonic quality are appropriate in the intended context of use. In other words, the topic adequacy concerns also addresses hedonic quality, but the major issue is the adequacy of the way hedonic quality is induced rather than a general appreciation of hedonic quality.

Table 3: Construct Classes, Examples, and Number of Constructs

Construct Class             Personal Construct Examples                                  No. of Constructs   % of Constructs
1 Design principles         Two dimensional–Three dimensional                            61                  40
                            Detail view–Total view
                            Graspable–Abstract
2 Quality of interaction    Trial and error–Unambiguous control                          16                  10
                            Dialog element inefficient–Dialog element efficient
                            Demanding interaction–Straightforward interaction
3 Quality of presentation   Presentation confusing–Presentation clear                    49                  32
                            Too much information–Appropriate amount of information
                            Structure remains vague–Structure becomes apparent
4 Hedonic quality           Boring–Interesting                                           4                   3
                            Has its own character–Windows-interface monotony
                            Novel interaction element–Conventional interaction element
5 Adequacy concerns         Inappropriately funny–Serious                                23                  15
                            Unnecessary animation–No unnecessary animation
                            Appears incapable–Is very capable

Note. See text for further descriptions of the construct classes. All construct examples were originally in German.

Second, it might simply be much harder to become aware of differences in hedonic quality, that is, to produce hedonic-oriented personal constructs. For example, becoming aware of the fact that a certain color scheme violates one's taste requires a lot more reflection than becoming aware that a certain text is not readable. Most likely both explanations hold true to some extent.

To summarize, laypersons confronted with a prototype representing a task from an unknown and technology-oriented domain are especially receptive to differences in usability, with a clear need for good usability. They are also receptive to differences in hedonic quality, but they express strong concerns about the adequacy of the way this quality aspect was induced.

Based on these results, further studies can be planned (e.g., the same set of prototypes presented to domain experts instead). By accumulating the personal constructs obtained under different conditions (e.g., sets of artifacts, user groups) and adding an abstraction step, domain-specific and general models of design space from a user's perspective could be built up. These models would certainly help designers to develop their designs in a way appreciated by the potential users. From our perspective, abstraction and generalization of the results obtained seem possible and fruitful. They certainly stimulate the adoption of new perspectives on the artifacts to be designed and their context of use.

To summarize, the RGT combined with the analysis methods proposed proved to be helpful in charting, exploring, and understanding the design space of the given set of artifacts. It lent itself to both (a) the generation of concrete, design-relevant information and (b) the abstraction from this information of material to stimulate further analysis and theory development.

Future studies should explore the utility of the RGT by varying the types of artifacts. By comparing the results from these, it may be possible to determine whether the personal constructs found are of a general nature or heavily dependent on the actual artifacts studied.

4. ADVANTAGES, LIMITATIONS, AND FURTHER IMPROVEMENTS OF THE RGT

In the following sections, advantages, limitations, and improvements of the RGT for exploring design space are presented and discussed.

4.1. Advantages

The most important advantages of the RGT are (a) its ability to gather design-relevant information, (b) its ability to illuminate important topics without the need to have a preconception of these, (c) its relative efficiency, and (d) the wide variety of types of analyses that can be applied to the gathered data.

As already emphasized, a method that aims at guiding design has to provide information of high practical value. In this study, 36% of the constructs obtained were rated to be useful for artifact redesign without the need for further analysis. Another 20% of the constructs contribute relevant information to the design process, although additional analysis is necessary. The amount of high-quality information gathered becomes even more remarkable if it is taken into account that the RGT, as an adaptive method, is likely to draw the designers' attention to topics not yet considered. Information such as the concerns dealing with the prototypes' adequacy is especially valuable in early phases of the design process. There are good reasons to doubt whether other methods would have been able to make the adequacy concerns that clear.

Interestingly, only a small number of personal constructs directly refer to hedonic quality. We argued that both hedonic quality constructs and adequacy concerns constructs are in fact tied to the same artifact properties. However, what was intended by the designers to induce hedonic quality was ill-received by the users. Many people seem to share the belief that, for professional use, a certain degree of seriousness is indispensable. This points to another strength of the RGT. Because of its openness, important user attitudes, beliefs, and needs will surface without the necessity to have a preconception of these.

Compared to other nondirective methods (e.g., open interviews), the RGT is an economical method. Especially in a parallel design situation with many prototypes to be assessed, the RGT remains efficient. Besides the relatively small amount of time needed for carrying out an RGT study, the analysis of the data gathered also seemed to require less time than the analysis of data gathered with other nondirective methods. In cases where experimenter–participant interaction is not necessary or even desirable, the RGT can also be run in totally computerized versions, which makes it even more efficient.

The most interesting feature of the RGT is the wide variety of different types of analyses that can be applied to the gathered personal constructs. It provides data that lend themselves to the identification of general needs, beliefs, and attitudes, and of specific interartifact differences, as well as interperson or intergroup differences.

4.2. Limitations

Apart from its advantages, several limitations of the RGT became obvious during its application: (a) the requirement of a set of at least four artifacts, (b) the insensitivity to good or bad attributes shared by all artifacts in the set, (c) the lack of support for the actual phrasing and labeling of constructs, (d) the high number of merely descriptive constructs, and (e) problems with determining relations among constructs.

Although highly recommended by Nielsen (1993), extensive parallel design hardly ever seems to be practiced due to small budgets and lack of time (at least according to our experience). Beyond its application in a parallel design phase, the RGT could be used to compare a prototype with similar products already available on the market. For example, employing the RGT to compare different design approaches of competing sites on the World Wide Web may be interesting, because the experimenter will find various alternative products at his or her free disposal. Especially in Web site design, not only well-known qualities (e.g., usability) but also new topics (e.g., hedonic qualities, joy of use, fun) often have to be dealt with. Such new topics can be identified and explored by means of the RGT.

Another problem of the RGT touches one of its core assumptions, namely to equate a personal construct with a similarity–dissimilarity dimension triggered by variation between artifacts. Attributes of artifacts that may be important but do not vary in the set of artifacts at hand will never appear as a personal construct. Thus, good or bad attributes shared by all artifacts in the set will go unnoticed. For example, the underlying topic (construct class) adequacy concerns would have been unlikely to emerge without having the comic prototype in the set. This also emphasizes the importance of maximizing the heterogeneity of the artifacts designed in parallel. Even artifacts that seem to comprise extreme design methods are desirable. The more differences exist and the more extreme they are, the less likely it becomes that relevant topics will go unnoticed.
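The limitation described above can be illustrated with a small sketch: given a description of each artifact as a set of attributes, only attributes that take more than one value across the set can give rise to a similarity–dissimilarity dimension. The artifact names, attribute names, and values below are purely hypothetical and do not describe the prototypes used in this study.

```python
def split_attributes(artifacts):
    """Return (varying, shared) attribute names for a set of artifact descriptions."""
    names = set().union(*(a.keys() for a in artifacts.values()))
    varying, shared = set(), set()
    for name in names:
        values = {a.get(name) for a in artifacts.values()}
        # An attribute with a single value cannot separate any artifacts
        (varying if len(values) > 1 else shared).add(name)
    return varying, shared

# Hypothetical attribute descriptions of three artifacts designed in parallel
artifacts = {
    "prototype_1": {"dimensionality": "2D", "dialog": "box", "font_size": "small"},
    "prototype_2": {"dimensionality": "2D", "dialog": "sign", "font_size": "small"},
    "prototype_3": {"dimensionality": "3D", "dialog": "knob", "font_size": "small"},
}
varying, shared = split_attributes(artifacts)
```

In this toy set, the shared attribute (a small font everywhere) would never surface as a construct, however problematic it may be, whereas adding a more heterogeneous artifact would move it into the varying group.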

Even though the RGT appears to have a structured and defined process, we would like to stress the fact that the phrasing and labeling of the personal constructs requires verbal skills of both the experimenter and the participant. The actual process of phrasing remains unstructured and can be best described as mining for the core meaning of the participant's initial statement. This obviously calls for more supportive and elaborated techniques.

The application of the RGT presented previously still produced a considerable proportion (44%) of mainly descriptive personal constructs that are of less value for guiding artifact design. For example, the construct two-dimensional versus three-dimensional leaves open whether the participant actually prefers a three-dimensional or a two-dimensional interface. Measures should be taken to reduce the number of merely descriptive constructs.

Finally, the construct pool gathered by applying the RGT is not structured in any way. Thus, relations between personal constructs remain unclear. For example, the data gathered in this study showed that the comic prototype was found to be highly playful and not suitable for professional use. Although it may appear obvious, one cannot readily conclude that the comic prototype is not suitable for professional use because it is so playful. A good deal of reasoning and interpretation is required to state relations between constructs. This surely decreases the RGT's objectivity.

4.3. Improvements and Future Work

As already discussed, constructs that are evaluative in nature (e.g., font size appropriate vs. text not readable) are more valuable than constructs that are descriptive in nature (e.g., two-dimensional vs. three-dimensional). Unfortunately, participants are likely to come up with descriptive constructs as well as evaluative constructs. To solve this problem, one might simply instruct the participants to refrain from stating descriptive constructs. Another possibility is a short follow-up interview after finishing the RGT. In this interview, experimenter and participant briefly review each construct to determine which construct pole the participant considers desirable. This procedure would convert descriptive constructs to evaluative constructs, given that the participant is able to state a clear preference. The follow-up interview could also help the experimenter to establish construct priorities. Taking into account that budget and time are limited in almost every design project, the prioritization of constructs seems inevitable (see Hassenzahl, 2000). To establish construct priorities, the experimenter could simply ask the participant to rank the constructs in terms of personal relevance or importance.
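One way to record the outcome of such a follow-up interview is sketched below. It is only an illustration under assumed names: the Construct class, its fields, and the preferences and ranks in the example are hypothetical, while the construct labels are taken from the examples discussed in this article.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Construct:
    pole_a: str
    pole_b: str
    preferred: Optional[str] = None  # pole the participant considers desirable
    rank: Optional[int] = None       # 1 = most important to the participant

    @property
    def is_evaluative(self) -> bool:
        # A construct counts as evaluative once a preferred pole is stated
        return self.preferred in (self.pole_a, self.pole_b)

def prioritized(constructs):
    """Evaluative constructs ordered by participant-assigned rank (unranked last)."""
    evaluative = [c for c in constructs if c.is_evaluative]
    return sorted(evaluative, key=lambda c: (c.rank is None, c.rank or 0))

# Hypothetical interview outcome for three constructs
constructs = [
    Construct("two-dimensional", "three-dimensional"),  # no preference stated
    Construct("font too small", "font size appropriate",
              preferred="font size appropriate", rank=2),
    Construct("boring", "interesting", preferred="interesting", rank=1),
]
ordered = prioritized(constructs)
```

A construct for which the participant cannot state a preference stays descriptive and is simply left out of the prioritized list.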

One of the most impressive features of the RGT is its sensitivity to individual perceptions, needs, beliefs, and attitudes. Further work should bring up a method that is as sensitive but applicable with only one artifact throughout the whole design process, a situation much more likely to occur in an industrial setting (e.g., Hassenzahl & Burmester, 2000). Furthermore, such a method should be able to capture relations between single constructs, for example as a hierarchical or network structure.

5. GENERAL CONCLUSION

In this article, we attempted to put into focus the RGT as a method of exploring design space in a parallel design situation. It is not a new method and obviously has its limitations, but nevertheless seems worth a try if user perceptions, needs, beliefs, and attitudes are to be taken into account by the design.

Beyond questions about the mere practicality of the RGT, the underlying personal construct approach addresses a more fundamental issue, namely the informational value of the idiosyncratic (i.e., personal) perspectives of users on the artifacts we design. Human–computer interaction research still tends to quickly generalize across users. The best design alternative is generally the one the most people agree on. This view, rooted in the quantitative research tradition, neglects the informational value of contradictions and inconsistency in idiosyncratic perspectives. To our mind, much can be learned from understanding the naive theories people have about the artifacts they live with.

REFERENCES

Banister, P., Burman, E., Parker, I., Taylor, M., & Tindall, C. (1994). Qualitative methods in psychology. Philadelphia: Open University Press.

Bevan, N., & Macleod, M. (1994). Usability measurement in context. Behaviour & Information Technology, 13, 132–145.

Buur, J., & Bagger, K. (1999). Replacing usability testing with user dialogue. Communications of the ACM, 42(5), 63–66.

Carroll, J. M. (1997). Human–computer interaction: Psychology as a science of design. International Journal of Human–Computer Studies, 46, 501–522.

Carroll, J. M., & Moran, T. P. (1991). Introduction to the special issue on design rationale. Human–Computer Interaction, 6, 197–200.

Carroll, J. M., & Rosson, M. B. (1991). Deliberated evolution: Stalking the view matcher in design space. Human–Computer Interaction, 6, 281–318.

Carroll, J. M., Singley, M. K., & Rosson, M. B. (1992). Integrating theory development with design evaluation. Behaviour & Information Technology, 11, 247–255.

Dillon, A., & McKnight, C. (1990). Towards a classification of text types: A repertory grid approach. International Journal of Man–Machine Studies, 33, 623–636.

Draper, S. W. (1999). Analysing fun as a candidate software requirement. Personal Technology, 3(1), 1–6.

Dreyfuss, H. (1955). Designing for people. New York: Simon & Schuster.

Ericsson, K. A., & Simon, H. A. (1984). Verbal reports as data. Cambridge, MA: MIT Press.

Fleiss, J. L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76, 378–382.

Fransella, F., & Bannister, D. (1977). A manual for repertory grid technique. London: Academic.

Gaines, B. R., & Shaw, M. L. G. (1997). Knowledge acquisition, modelling and inference through the World Wide Web. International Journal of Human–Computer Studies, 46, 729–759.

Gediga, G., Hamborg, K.-C., & Düntsch, I. (1999). The IsoMetrics usability inventory: An operationalization of ISO 9241-10 supporting summative and formative evaluation of software systems. Behaviour & Information Technology, 18, 151–164.

Hassenzahl, M. (1999). Usability engineers as clinicians. Common Ground, 9(3), 12–14.

Hassenzahl, M. (2000). Prioritising usability problems: Data-driven and judgement-driven severity estimates. Behaviour & Information Technology, 19, 29–42.

Hassenzahl, M., & Burmester, M. (2000). Zur Diagnose von Nutzungsproblemen: Praktikable Ansätze aus der qualitativen Forschungspraxis [The diagnosis of usability problems: Practical qualitative approaches]. In K.-P. Timpe, H.-P. Willumeit, & H. Kolrep (Eds.), Bewertung von Mensch-Maschine-Systemen 3: Berliner Werkstatt Mensch-Maschine-Systeme (pp. 171–184). Düsseldorf, Germany: VDI Verlag.

Hassenzahl, M., Platz, A., Burmester, M., & Lehner, K. (2000). Hedonic and ergonomic quality aspects determine a software's appeal. In Proceedings of the ACM CHI 2000 conference on human factors in computing (pp. 201–208). New York: ACM.

Jørgensen, A. H. (1989). Using the thinking-aloud method in system development. In G. Salvendy & M. J. Smith (Eds.), Designing and using human–computer interfaces and knowledge based systems (pp. 743–750). Amsterdam: Elsevier.

Kelly, G. A. (1955). The psychology of personal constructs (Vol. 1 & 2). New York: Norton.

KNOT: Knowledge Network Organizing Tool [computer software]. (1992). Las Cruces, NM: Interlink.

MacLean, A., Young, R. M., Bellotti, V. M. E., & Moran, T. P. (1991). Questions, options, criteria: Elements of design space analysis. Human–Computer Interaction, 6, 201–250.

Moran, T. P., & Carroll, J. M. (1994). Design rationale: Concepts, techniques, and use. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.

Nielsen, J. (1993). Usability engineering. San Diego, CA: Academic.

Schvaneveldt, R. W. (1990). Pathfinder associative networks: Studies in knowledge organization. Norwood, NJ: Ablex.

Shaw, M. L. G. (1980). Advances in personal construct technology [Editorial]. International Journal of Man–Machine Studies, 13, 1–2.

StarCraft [computer software]. (1998). Irvine, CA: Blizzard.

Wandmacher, J. (1993). Software-Ergonomie [Software-ergonomics]. Berlin: de Gruyter.

Weisscher, A. (1999). Innovative user interfaces for power distribution systems. Unpublished manuscript, Technical University Delft, Faculty of Industrial Design.

Wright, P. C., & Monk, A. F. (1991). A cost-effective evaluation method for use by designers. International Journal of Man–Machine Studies, 35, 891–912.


FIGURE 1 Artifacts (i.e., prototypes) used in the study (Color Plates 17 and 18).

a1: Overview of the industrial plant. The green disc at the left represents the pump.
a2: Detail of the Windows-like dialog box.
a3: Detail of the action confirmation.

b1: Close-up of the pump (no overview exists).
b2: Pump with a dialog cube.
b3: Details of the dialog cube turning.

c1: Overview. The green disc at the left represents the pump.
c2: Detail of the dialog flag extruding from the pump symbol.
c3: Detail of the action confirmation.

d1: Overview. The disc with arrow at the left represents the pump.
d2: Detail of a figure holding up a dialog sign.
d3: Detail of the action confirmation.

e1: Overview of the industrial plant in a 3D style. Clicking on the pump initiates zoom-in.
e2: Detail of the pump with an on/off knob.
e3: Detail of the action confirmation.

f1: Overview. The green rotating disc at the left represents the pump.
f2: Detail of the transparent dialog box. It is visually connected to the pump by a line.

g1: Overview in an isometric, rendered style. The golden shape represents the pump. The pump is selected and in on-status.
g2: Detail of action confirmation.
g3: Pump in off-status.
