
Interacting with Computers vol 7 no 2 (1995) 115-143

The case for supportive evaluation during design

Jon May and Philip Barnard

The relevance of human-computer interaction to industry is being questioned, and the emphasis is shifting away from providing generalised support to systematic evaluation methods, typified by cognitive walkthroughs (CW). The evidence suggests that CW has not proved as effective as hoped. This evidence is examined, and the authors argue that the problem lies not with CW or its underlying theory in particular, but with its limited scope and with the increasing dissociation of an evaluation method from its theoretical foundation. Evaluation methods retaining a theoretical element would provide the necessary conceptual support to enable designers to identify, comprehend and resolve usability problems, and would also be less limited than dissociated evaluation methods in their breadth and depth of application. A vision of a ‘supportive evaluation’ tool is presented, and cognitive task analysis (CTA), the methodology upon which a proof-of-concept tool has been based, is described. Three brief design scenarios are described to illustrate how CTA supports the identification and resolution of usability problems, and the role of cognitive modelling in the context of design is discussed.

Keywords: human-computer interaction, user interface, cognitive walkthrough, cognitive task analysis

Medical Research Council Applied Psychology Unit, 15 Chaucer Road, Cambridge CB2 2EF, UK. Email [email protected]

Paper submitted: 27 April 1994; revised 22 November 1994

0953-5438/95 © 1995 Elsevier Science Ltd

Despite the range of theoretical developments in human-computer interaction over the past twenty or so years, there are few areas where substantive theory has had a direct impact upon design (Carroll et al., 1991). Questions have been asked about the relevance of HCI theory to industry, and whether basic cognitive theory will ever be able to provide extensive support within the design process (e.g., Landauer, 1987). Approaches to design support such as guidelines (e.g., Smith and Mosier, 1986), design heuristics (e.g., Nielsen, 1992), or usability standards (e.g., Dickerson and Hedman, 1993), have been found to be rather too general in their prescriptions, or too limited in their scope, to be practically applicable in design (Polson et al., 1992). Wallace and Anderson (1993) describe the ‘technologist’ approach to design support that has arisen in consequence, its essential feature being the provision of an automated interface design tool or prototyping method able to ‘free the applications programmer from low-level details’, in an attempt to relieve interface designers of any need for knowledge of HCI theory.

In attempting to maintain a link between design and HCI theory, the emphasis has shifted away from generalised support and towards systematic methods of evaluation. The aim has been to develop methods that do not require human factors’ expertise or theoretical knowledge on the part of their practitioners, yet are still derived from a theoretical background. Central among these approaches is cognitive walkthrough, or CW (Lewis et al., 1990; Polson et al., 1992), which provides an evaluative process for software developers to apply to their design to check that users will be able to learn it easily.

Evidence is now accumulating that CW itself is not as effective in practice as had been hoped. In this paper we examine this evidence, and argue that the problem lies not with CW or its underlying theory in particular, but in the increasing dissociation of an evaluation method from its theoretical foundation. It is no longer a theory-based method or a heuristic method, but a hybrid. However applicable and accurate any dissociated evaluation method may be, the lack of conceptual support makes it difficult for evaluators to justify their assessments, to gain insight into the reason for problems, or to derive solutions. The dissociation also requires the theoretical analysis to be constrained to generally relevant aspects of interface design, resulting in a lack of flexibility and scope. With its emphasis on verbal aspects of an interface, for example, it is not obvious how CW can be used to support the evaluation and discussion of currently fashionable design issues such as graphical interfaces, animation, sound, and system-generated facial expressions.

We suggest that evaluation methods retaining a theoretical element would provide the necessary conceptual support to enable designers to identify, comprehend and resolve usability problems, and would also be less constrained in their breadth and depth of application, allowing generalisation to novel design issues as they arise. The dissociation that should be made is not between an evaluation method and its theoretical basis, but between the concepts used by the theory and their analytical application. Design practitioners following a cognitive evaluation method should have an understanding of the concepts and formalism of the underlying theory, but not be required to produce the analysis themselves. The skills of designers, after all, lie in design, not in cognitive modelling. The goal of cognitive engineering must be to provide designers with theoretically based methods and tools that substitute for the HCI ‘craft skills’ of cognitive modellers. The designers would then be free to use their own skills in an appropriately supported manner. For this partnership to operate, there must be communication between the designers and modellers (or the modelling tool), and so there must be some conceptual common ground that can be used by designers to describe their problems to the modellers, that the modellers or the tool can use to report the analyses, and that the designers can use to reason about the implications. The role of cognitive theory is to provide this conceptual common ground.

We briefly discuss some problems that have been reported with CW to set out the nature of the difficulties faced by users of a dissociated evaluation method. We then outline the case for supportive evaluation, and present a ‘vision’ of its use by a design team in the form of an automated reasoning tool. Although this might sound ambitious, a proof-of-concept tool has been built, using concepts drawn from cognitive theory. We describe the analyses of three brief design scenarios to show how the supportive evaluation provided by this tool would guide design practitioners towards solutions for the problems that it identifies.

Evidence from the use of cognitive walkthrough

The CW method is based upon a cognitive theory, CE+ (Polson and Lewis, 1990), that combines problem-solving heuristics and a model of exploratory learning (Lewis et al., 1987) with cognitive complexity theory, an extension of GOMS (Kieras, 1988; Bovair et al., 1990). Rather than attempt to convey the theory to designers, CW is based upon its distillation into eight ‘principles for successful guessing’. Designers are required to ‘walk through’ a task, identifying each action that a user has to carry out. They must then answer a set of questions for each action to assess whether or not the principles have been applied. When all of the tasks that an interface is intended to support have been walked through in this way, the designers will have identified any situations where CE+ would predict user errors.
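
To make the form of the procedure concrete, the following minimal sketch (ours, not part of the published method; the question wordings are paraphrased from accounts of CW rather than quoted from it) shows the essential loop: every action of every task sequence is checked against the same fixed set of questions, and each ‘no’ answer is recorded as a predicted learning problem.

```python
# Illustrative sketch of the form of a cognitive walkthrough.
# The question texts are paraphrases, not the method's own wording.

CW_QUESTIONS = [
    "Will the user be trying to achieve the right effect?",
    "Will the user notice that the correct action is available?",
    "Will the user associate the correct action with the desired effect?",
    "If the correct action is performed, will the user see progress?",
]

def walk_through(task_name, actions, answer):
    """Apply every question to every action and collect predicted problems.

    `answer(action, question)` stands in for the evaluators' judgement,
    which in practice must be backed by theory, data or experience.
    """
    problems = []
    for step, action in enumerate(actions, start=1):
        for question in CW_QUESTIONS:
            if not answer(action, question):
                problems.append((task_name, step, action, question))
    return problems

# Example: a three-action task, with one judgement flagged as a failure.
judge = lambda action, q: not (action == "open Format menu" and "notice" in q)
report = walk_through("embolden a word",
                      ["select word", "open Format menu", "click Bold"],
                      judge)
for task, step, action, question in report:
    print(f"{task}, step {step} ({action}): NO - {question}")
```

The sketch makes the method's burden visible: the structure is trivial, and all of the real work is hidden inside the judgement function, which is precisely where, as the studies below show, unsupported evaluators struggle.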

CW is seen as a valuable methodology because it appears to require no specialist cognitive science or user-interface design expertise, to require few resources in terms of time, effort and prototype building, and to be similar in structure to other forms of walkthrough with which software developers are already familiar. The findings of studies comparing CW with other evaluation techniques question these assumptions. Jeffries et al. (1991) found that CW took longer to apply than a heuristic or a guideline evaluation method, but that it missed “general and recurring problems”, and that the problems that it did identify were less severe. Where the theory, CE+, could potentially generalise, CW is restricted to a specific set of questions, and hence its answers are limited in breadth.

Despite the intention to simplify the exhaustive effort required in GOMS analyses, the systematic nature of CW has also been found to be tedious and over-detailed (Jeffries et al., 1991; Rowley and Rhoades, 1992). The method is dependent upon the selection of a representative range of tasks that users will want to carry out (Lewis et al., 1990), yet Wharton et al. (1992) found that the length and complexity of the evaluation process resulted in tasks being selected that were simplifications of what users would do, and that as a result potential problems may have been overlooked, because “tasks that mirror a simple, functional decomposition of the interface typically do not expose many problems with the interface”. Attempts to solve this problem have centred on reducing the form-filling (e.g., the HyperCard stack described by Rieman et al., 1991) and the number of task steps evaluated (as in the ‘cognitive jogthrough’; Rowley and Rhoades, 1992), but attempting to speed up the walkthrough makes it even more likely that crucial task steps will be missed. The detail inherent in CW stems from its formulaic concentration upon the meaning and relationship of low-level, atomic actions, and is inevitable in a method that avoids the problems in selection of analysis points by requiring all points to be analysed. This focus has in itself been identified as a weakness, since it neglects the meaningfulness of the overall task: “The designer also needs to know whether the task itself is sensible, non-circular, too long, or important to the user.” (Wharton et al., 1992)

The need to attend to each action individually leads to another set of difficulties in applying the technique, for as Rowley and Rhoades (1992) concluded: “The walkthrough, after all, is an interface evaluation procedure and does not lend itself to the design process.” In their summary of CW evaluations, Wharton et al. (1992) noticed that “while doing a CW, evaluators often noticed problems that were not directly relevant to the current task,” and that “they often wanted to pursue a problem beyond the limits of the walkthrough - to design a fix, followed immediately with a walkthrough on the fix”. This iterative pattern of ‘opportunistic design’ has been widely noted in studies of design practice (Terrins-Rudge and Jorgensen, 1993), and yet is impossible within the formal structure of the CW method.

Dissociation of method and theory

The CW methodology attempts to solve the problem of introducing human factors information about usability into the design process by crystallising the implications of a particular theoretical description of human-computer interaction into a checklist of guidelines and prescribing a formal appraisal of an interaction against that checklist. In dissociating the practice from the theory, however, the approach removes not just the complexity that theory brings, but also the conceptual support that it offers. A successful theory does more than prescribe a sequence of steps to be followed. It should provide a deeper understanding of the steps, and the justification for their existence and execution. It should afford practitioners some insight into what can be done when a step fails or cannot be executed because the situation does not match the assumptions upon which the theory is based. With the aim of simplifying the knowledge requirements placed upon the designer, the CW approach omits this conceptual support. Its users must turn to other sources of knowledge to help them solve their problems.

In particular, evaluators still need a background in cognitive science and usability testing to conduct a walkthrough, for although the appraisal is structured as a sequence of yes/no questions, coupled with rough estimates of the percentage of users likely to experience a problem, “claims that a given step would not cause any difficulties must be supported by theoretical arguments, empirical data, or relevant experience and common sense of the team members” (Polson et al., 1992). The software engineers who are intended to use the method are limited in their knowledge of empirical usability testing, and the method provides no support at this crucial point, where they must estimate the difficulty of their design.

CW might be expected to improve upon the other evaluation methods through its formal structure, and in the decomposition of a task sequence to emphasise the points of the design that should be checked. Unfortunately it is the very formality of the walkthrough and the reliance upon the task decomposition that causes problems in its application. While evaluators with expertise in user interface design can use their existing knowledge to identify the appropriate steps in an interaction, software engineers have much more difficulty. The theory that gives rise to the method does not claim to deal with all aspects of the cognition involved in an interaction, or to cover the whole range of usability problems (Wharton et al., 1992), and without additional theoretical support, evaluators will only be able to identify a limited range of problems.

Towards supportive evaluation

The emphasis of the CW approach is on the assessment of a specification that has already been designed, but the evidence suggests that it does not perform as well as the usability testing that it is intended to replace. Although usability testing is expensive and time-consuming, in the context of a large commercial design it will always be more cost-effective than continuing with the production of a design that has undiscovered usability problems (Grudin, 1991). No theoretically based evaluation method is likely to challenge this economic fact. However, in precisely these situations, where time and resources are available, the application of theoretical analyses to support design will more than repay the additional effort. Nielsen and Phillips (1993) found that the benefits of usability engineering were between 1,000 and 5,000 times greater than the costs, regardless of method.

Because it is essentially an evaluative technique, CW does not provide any direct support for the creation of the design specification. We do not challenge the validity or relevance of the theoretical underpinnings of the method, for its basis in problem solving, exploratory learning and cognitive complexity theory is well founded. What does not seem appropriate is the attempt to represent the depth of these theoretical approaches in eight principles, and then expect designers to be able to work back from the assessment of their interface against these principles, to produce an improved design. We have seen how the evidence indicates that CW evaluators need theoretical support in the task decomposition, and in the assessment of individual steps. They will also surely need an understanding of the theoretical issues to interpret the results of a walkthrough: to understand why the walkthrough is predicting problems with a specific part of the design, and crucially, to derive well-motivated ideas for improvements.

If any theoretically based principles, or their representations as guidelines, are valid, they should be applied early in the design process rather than afterwards as part of an evaluation methodology, to help the designers produce a usable interface (cf. Dowell and Long, 1989; Lim et al., 1992). Problems that are identified too late in the specification process may have become too deeply embedded to be modified. Indeed, it may not be apparent whether a usability problem is really caused by aspects of the interaction at which it becomes apparent, or whether the true reason for its occurrence lies deeper in the assumptions that have been made to reach the design solution. In consequence, post hoc evaluation methods may only be able to provide ‘fire-fighting’ analyses.

Taking the experience of CW evaluators into account, there are clearly some very restrictive demands upon any method that seeks to involve cognitive theory in the design process. It must be able to deal with partially specified designs, so that early commitments are not too deeply embedded to be rejected. It should be applicable by people who do not have extensive knowledge of the theoretical basis or procedures involved in the analysis, although as we have seen, they should probably have some wider experience of human factors or usability issues. It should be applicable quickly, and its results should allow the generation of alternative design solutions that compensate for, or avoid any problems it indicates. A pragmatic requirement that must be met for a method to be widely acceptable, and justifiable to the organisations within which design teams operate, is that it should not require the adoption of any particular structured method or design notation. A method able to meet these demands would provide timely evaluations of design decisions, as they are made, supporting designers as they work their way through the design space. We see this supportive evaluation as potentially more beneficial than either generalised support or dissociated evaluation methods.

Vision of supportive evaluation method

Imagine a team of interface designers who are considering possibilities for one function in a system that they are developing. They want to know how their choice would influence usability. One member of the team, who has some human factors’ expertise, writes out half a dozen task sequences in which this particular function would be of use, and turns to a computer that is running an expert system. It has a knowledge-base that models how users’ cognitive processes and mental representations will affect their performance and learning in the context of interface design.

As they have done some modelling for other aspects of this design before, they first load a database that contains descriptions of the basic interface, covering its appearance and current functionality. Once this description has been loaded, they decide to change the number of operations listed in a particular dialogue box and then query the ‘consequences’. The system immediately responds by asking them to frame the scope of the required modelling. A broad assessment of learning and performance is requested.

The team member with some HF expertise then responds to a series of questions about the envisaged tasks that would involve this function, the properties of the visual appearance of the interface that its incorporation would affect, and how it is to be named. The rest of the design team offer suggestions about the layout and appearance of the interface objects as the questions are posed, and in the process enrich their original ideas with a few details that they had not specified, such as highlighting and greying-out options, font sizes, and so on. In 15 minutes the descriptive part is essentially complete and the expert system ‘delivers’ a textual report which describes probable properties of user performance at the novice, intermediate and expert stages of use. It is about four pages long, and contains comments on effort in learning, an error likelihood assessment and some ‘clashes’ with prior decisions.

While the rest of the team read through the output, the HF expert examines the underlying ‘model’ to get some additional information about why these behavioural characteristics have been predicted. On looking at the attributes of the model that were affected by the new description, it becomes clear that there is some ambiguity in the way that a few of the operations have been grouped in the dialogue. After a short debate, the design team suggest some new groupings, and they edit the relevant part of the interface description that they have supplied to the expert system. Once again they query the ‘consequences’.

This time the system has a minimal number of questions, because only a couple of the specifications are different, and it already has the information from the previous evaluation. Three or four minutes later a new report is shown which suggests that there appear to be three major counts on which the new alternative design has an advantage in terms of error likelihood and performance. The previous alternative has a marginal advantage in novice learning, but the advantage is short-lived, and intermediate and expert users are predicted to be basically better off with the new version. The HF expert points out that according to the model, the novices’ problems are being caused by one operation in particular, which it is predicted would be difficult for novices to locate within the dialogue. Given this information, one of the design team points out that there was no particular reason to keep it where they had originally suggested, and proposes a new location for it, where it seems to ‘stand out’ better. The third evaluation is performed, and the repositioning is shown to resolve the predicted difficulty for novices. The HF expert saves the interface description for future use, while the design team mock up the favoured option in a prototyping tool to show to the full project team at that afternoon’s progress meeting.

Cognitive task analysis

The consultation with the expert system in the above scenario helped the design team in a number of ways. It helped them to enrich their design, by asking them for information that they might not have otherwise realised was relevant to the cognitive usability of their dialogue. It automated the cognitive analysis, allowing them to assess their proposals rapidly without distracting them by the analytic process. By presenting the analysis in conceptually relevant terms, it gave the team support in identifying the source of problems, and the directions that solutions needed to move towards. Finally, it allowed them to iterate towards a more cognitively usable solution.

Clearly, this ‘vision’ of an interactive modelling tool that can provide on-the-spot evaluations of alternative design options is not yet a reality. It is not too far-fetched, however, for two generations of a ‘proof-of-concept’ demonstration tool have been built (Barnard et al., 1988; May et al., 1993). We shall not pursue the details of the actual reasoning mechanics here. It is nevertheless important to distinguish the basic theory represented in the modelling tool’s rules from the actual process of model building. The basic theory is interacting cognitive subsystems, or ICS (Barnard, 1985), which provides a definition of cognitive resources and a set of constraints governing their use. Using this as a basis, cognitive task analysis (CTA) derives a model containing a large number of attributes that describe properties of mental processing activity, what the cognitive processes do in context, how they use different memory records, and how all relevant parts of the theoretical structure would be dynamically co-ordinated during learning and task execution.

While this small tool only has around a hundred ‘modelling’ rules, which were written to allow it to deal with seven specific design scenarios like the one described in the ‘vision’, the way that it has been constructed provides it with a surprising degree of generalisation. Its basic sequence of operation follows that of the visionary tool, in that an interface design is described to the system, which uses a set of ‘input’ rules to represent it as a number of structural descriptions reflecting the information that would be available to the user. These representations are then used to drive a second set of rules that model the cognitive resources required to operate upon the interface. The system then uses a third set of ‘output’ rules to interpret the model and draw inferences about the consequences in terms of the target domain, the design of human-computer interfaces. The goal of the tool is to encourage designers to think about the interface in terms of theoretically motivated concepts, and by rapidly providing them with an evaluation that is related to these concepts, it allows them to comprehend the causes of, and possible solutions to, usability problems without being distracted by the ‘nuts and bolts’ of the analytical method.
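
We do not reproduce the tool's rules here, but the staged architecture just described can be sketched as follows. Everything in the sketch - the attribute names, the rule contents, the report wording - is an invented placeholder standing in for the tool's actual content; only the three-stage shape is taken from the text.

```python
# Sketch of the three-stage rule architecture described above. Every rule
# body is an invented placeholder, not the demonstrator tool's content.

def input_rules(interface):
    """Re-represent an interface description as structural descriptions
    reflecting the information available to the user."""
    return {
        "n_groups": len(interface["groups"]),
        # Placeholder structural property: a group mixing unrelated
        # functions is treated as ambiguously grouped.
        "ambiguous_grouping": any(len(set(g["functions"])) > 1
                                  for g in interface["groups"]),
    }

def modelling_rules(description):
    """Derive attributes of the cognitive activity implied by the structure."""
    return {"propositional_load":
            "high" if description["ambiguous_grouping"] else "low"}

def output_rules(model):
    """Translate model attributes back into design-relevant statements."""
    if model["propositional_load"] == "high":
        return ["Novices are likely to mis-locate operations: regroup the "
                "dialogue so that each group serves a single function."]
    return ["No grouping problems predicted."]

interface = {"groups": [{"functions": ["print", "save"]},
                        {"functions": ["quit"]}]}
for line in output_rules(modelling_rules(input_rules(interface))):
    print(line)
```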

The rudimentary content of the demonstrator tool, particularly in the ‘input’ and ‘output’ classes of rules, inevitably makes its actual use hard to assess, but three exercises with potential users of such a system have drawn encouraging conclusions about the value of the approach in general (Buckingham Shum et al., 1994a). A group of design practitioners found that the expert system helped them to understand the role and relevance of CTA to their own situations (Aboulafia et al., 1993), and a class of ergonomics masters students were found not to need in-depth knowledge of CTA to use the system, although the particular method of ‘structural description’ required to enter information to the system was more problematic (Buckingham Shum and Hammond, 1994). A very brief overview of CTA was found to be successful in making the approach credible to a team of hypermedia designers, enabling them to comprehend and appreciate a modelling report on part of a system they were in the process of designing (Buckingham Shum et al., 1994b). A version of this overview is given in the Appendix.

The expert system in the ‘vision’ was automating the skills of a cognitive modeller, rather than those of a human factors’ specialist. It is valuable to distinguish the human factors’ expertise that must play a part in design and which cannot be automated, such as that involved in identifying users’ tasks or the semantics of groupings, from the modelling expertise, which plays a supportive role and which can be automated, given an appropriate analytical method. The modelling carried out by such an automated system could provide an evaluation of a part of a design, while the concepts used in the analysis provide a language for reasoning about the design. To illustrate the way that we believe this distinction between conceptual support and modelling can aid designers, we present three brief ‘design scenarios’ that we have dealt with in the elaboration of this approach to supportive evaluation. Although simplified for current purposes, these scenarios are all based on problems encountered in real design situations. The analyses rely upon the basic concepts of CTA that have been developed and reported elsewhere (e.g., Barnard, 1987; Barnard et al., 1988; Barnard, 1991; May et al., 1993; Barnard and May, 1993), but in Appendix 1 we provide a digest of the main concepts and terms. The first scenario concerns the control of a video window in a computer-supported co-operative work environment, and introduces some of the main elements of a CTA model (which are all described in Appendix 1). The second and third show how a conceptually based explanation of problems in the interactions can elicit novel design solutions.

Video buffer scenario

A real-time window was being implemented for a multimedia workstation. It was to have a ‘buffer’ containing the last 30 seconds or so of video, so that the user could review an event that had just happened. The hardware allowed the user to move serially through the buffer in either direction at 4 or 12 times normal speed, freeze the frame, or move forwards at normal speed, but not to jump directly to any point. The two design options being considered were one based upon scroll bar techniques, and one based upon a ‘video recorder’ analogy.

Interpretations of the two alternative designs for this scenario are shown in Figure 1. The ‘video button’ alternative is modelled on the buttons used in a VCR, and each button corresponds directly to a speed-of-playback of the video buffer. The ‘scrollbar’ is modelled on scrollbars in other windows and applications, and represents the position of the image within the buffer and speed of motion through the buffer. Changes in the velocity are effected by the user clicking to one or other side of the ‘cursor’, which cannot be directly manipulated. Although these two designs present the user with the same functionality, the sequences of action that must be performed and the information afforded by the interface about the status of the video buffer are quite different.

In particular, the video buttons offer no direct information about the status of the buffer, i.e., whether the picture being viewed is ‘live’ or from the buffer, and so there is little support for the user in terms of goal formation, or deciding to change its status. When it comes to using the buttons, once a goal has been formed, the situation is better. Most users will be familiar with the design of the video button icons and their associated functions, and so will be able to carry out action specification and identify which button needs to be pressed or clicked.


Figure 1. ‘Video button’ and ‘scrollbar’ alternatives

In CTA terms (see Appendix 1 for explanations) they will have abstracted entity property records (EPRs) that allow their propositional meaning to be derived directly from their appearance. The triple-triangle icon is not standard, and so its functions must be worked out by novice users, using implicational representations derived from the propositional meaning of the other icons. This can be done by analogy with the meaning of the double-triangle icons, and its spatial positioning within the icon array, and should not present too much difficulty. To construct the analogy the implicational processes will be able to recruit abstractions - common task records (CTRs) - about leftwardness and rightwardness to produce propositions of backward and forward motion. The increase in the number of elements in the object structure of the icon, and the increasing distance from the central ‘stationary’ icon, both correspond to changes in a single attribute of the function of the icons, i.e. increases in rewind speed, and so this should be easily abstracted out. The consequence of this is that existing knowledge can be built upon to represent the new icon as well as the known icons, making learning straightforward.

The use of ‘greying out’ and ‘reverse highlighting’ to indicate functions that are either not available or currently selected is a common direct manipulation interface technique, and so if the user has experience of such systems (in CTA, experiential task records, or ETRs) their application to these icons should not be problematic (since they will have proceduralised knowledge about the meaning of ‘greyness’), even though the video buttons that the user would be familiar with would not exhibit such behaviour. In fact, the EPR for a video control button would have a description corresponding to ‘in use’, since VCR buttons usually either light up in some way, or are mechanically depressed, to indicate system status. The EPR can thus easily be adapted to reflect the use of highlight. Only the ‘greying out’ would require formal rather than content alterations to the EPR.

Figure 2. Propositional CTR for use of buffer with ‘video buttons’

In summary, then, the compatibility of the video button icons with the user’s existing knowledge (i.e. EPRs) would be a successful use of metaphor, allowing the planning of task performance, or action specification, to proceed without major problems. The actual performance, or action execution, would also be straightforward, because each button is acted upon in the same way, and so the same CTR for performance could be abstracted for the use of each button (Figure 2). ‘Use buffer’ is the superordinate element of the two actions ‘find icon X’, which is the subject of the representation, and ‘click icon X’, which is its predicate.
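
By way of illustration (our notation, not the formalism of the theory), the CTR of Figure 2 amounts to little more than a superordinate element with a subject and a predicate:

```python
# Minimal rendering of the CTR in Figure 2 (illustrative notation only).
ctr_use_buffer = {
    "superordinate": "use buffer",
    "subject": "find icon X",    # locate the button matching the goal; its
                                 # appearance is also checked at this step
    "predicate": "click icon X", # the same execution step for every button
}
```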

This simple CTR would compensate for the low goal formation support. Goals that are inconsistent with the status of the buffer (e.g. intending to fast-forward when the buffer is empty, the system is in real-time mode, or the buffer is already fast-forwarding) would be picked up in the first step of this CTR, since the greyed out or highlighted appearance of the icons (their object representation) would not match the expected (i.e., propositional) appearance. Errors would thus tend to be noticed before the user tried to click on the icon. Overall, the analysis suggests that the ‘video buttons’ option would present little difficulty in use.

The rationale given for the use of a scrollbar is that the video window and image buffer are analogous to other classes of window and stored information that the user will be familiar with, for example in document processing and drawing applications. However, in these uses of scrollbars the stored information is constant, and the scrollbar allows the user to navigate absolutely within it, whereas in this application the buffer is constantly changing, and the scrollbar allows the user to move relative to the current position. The EPRs for scrollbar use that the user might have abstracted would therefore not be directly applicable to this application. This difference between absolute and relative navigation is reflected in the actions users are expected to perform upon this scrollbar. The usual use of scrollbars is to manipulate them directly, dragging the cursor to an absolute position within the information buffer. Here the user cannot manipulate the cursor directly, but is expected to click either side of it to increment or decrement their speed of movement through the buffer.

The main justification given for the adoption of the scrollbars, then, that their use would be “consistent” with other applications, can be seen to be misleading. The actions that a novice user would perform with the scrollbar would not lead to the results that they expect, and the real functionality of the scrollbar cannot be determined from their existing EPRs. This is because the same action will have different effects depending upon the system’s status. Clicks to the left or right of the cursor do not map directly to a result, but incrementally affect the status. A click to the left of the cursor, for example, means “go backwards faster” or “go forwards slower” rather than “go back at 4× normal speed”. There is an inconsistent mapping between the goals a user will form (a specific rate of motion) and the actions they can actually perform (increment current speed). In effect, the system is ‘modal’, and the user must take into account the state of the system to be able to carry out action specification. In contrast to the video buttons option, a single CTR for all operations is not abstractable.
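
To make the modality concrete, the following sketch models the relative scrollbar control, assuming the six speed levels that the hardware allows and an ordering of them; the same physical click yields three different outcomes depending on the current state.

```python
# Sketch of the relative ('modal') scrollbar control. The speed levels are
# those the hardware allows; their ordering here is an assumption.
SPEEDS = [-12, -4, 0, 1, 4, 12]   # multiples of normal playback speed

def click(current, side):
    """A click to the left steps the speed down a level; to the right, up."""
    i = SPEEDS.index(current) + (-1 if side == "left" else 1)
    return SPEEDS[max(0, min(i, len(SPEEDS) - 1))]

# The same 'left' click means three different things:
print(click(4, "left"))    # 1   -> 'go forwards slower'
print(click(0, "left"))    # -4  -> 'go backwards'
print(click(-4, "left"))   # -12 -> 'go backwards faster'
```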

For accurate action specification, knowledge would be required about the buffer’s status. This is in theory obtainable from the motion of the cursor, but the degrees of difference between 4× and 12× motion would be small, hence ambiguous, and in practice the user would be more likely to assess the current playback speed from the image itself. This would simplify this aspect of the action specification, but would require transitions in the visual structure of the object representation to focus alternately on the scrollbar and the image. The evaluation of the buffer status would demand a cycle of interchanges between propositional and implicational representations of the image, and the search for a ‘click’ position relative to the cursor, while the cursor is itself moving, would require interchanges between propositional and object representations to control visuomotor co-ordination. In consequence this task would have a heavy processing requirement in both action specification and action execution.

It is clear from these analyses that the video button icons should be preferred over the scrollbar both on grounds of compatibility with the user’s existing knowledge, and intrinsic ease-of-use. In reaching their conclusions about the alternatives, the designers would have to decide whether the one criticism of the video buttons, the lack of support for goal formation, was critical, and whether some feature should be added to provide more support for this phase of cognitive activity, perhaps a reduced, afunctional scrollbar. They could then remodel the improved design, iterating towards a solution.

In this scenario, the analysts were able to describe the information represented by the interface, the user’s evolving knowledge of the interface, and the availability of prior knowledge as classes of image record contents (ATRs, ETRs, EPRs and CTRs), bringing them together within a common conceptual framework. The CTA provided a vocabulary for assessing the design problem and regularising human factors’ knowledge. In a conventional setting, accessing this knowledge would be part of the craft skill of an HCI expert, but it could be represented in a modelling tool as a set of output rules, extrapolating from a model of cognition according to empirical evidence. The human factors ‘expert’ would be presented with the relevant information and be able to relate it to their own particular questions. In the next scenario, we show how CTA can assist designers in assessing the assumptions behind their design alternatives.

Small-screen map scenario

This scenario deals with the problem of presenting both detailed views and less detailed overviews of information to a user, an issue that is still being debated within the HCI community (e.g., Tani et al., 1994; Rao and Card, 1994). An information system is being developed on a PC for the general public to use a trackerball to find their way around a map of a city centre, get an overview of the general area, and to identify key detailed points. These tasks require different levels of detail, and it is not possible to show the whole map at the scale required. A method is sought that will allow users to see both low magnification overviews of the whole map without details and high magnification views with sufficient detail, but of a limited area. Three design options have been proposed and are shown in Figures 3(a)-(c).

Figure 3. Three alternative designs for the small screen map scenario: (a) split-screen; (b) multi-focal lens; (c) fish-eye lens

The first option (Figure 3(a)) splits the screen into two areas, the top two-thirds of the screen showing the low magnification overview map, and the lower third showing a high magnification view of a subsection of this map. In the other two options a region of the screen is devoted to a close-up view, and can be moved around the background low magnification view, which is compressed and stretched as the close-up box moves to ensure that no information is ever hidden. In the second option (Figure 3(b)) each of the eight surrounding areas is simply squashed or stretched uniformly to fit the space available, while in the third option (Figure 3(c)) a continuous function smoothly transforms the background to avoid discontinuities.
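
The squashing and stretching of the second option amounts to a piecewise-linear remapping of each screen axis, as the following sketch (with assumed coordinates) shows; the abrupt change of scale at each region boundary is the source of the discontinuities discussed below.

```python
# Piecewise-linear axis remapping for the 'multi-focal lens' option (sketch,
# assumed coordinates). The map interval [m0, m1] is magnified into the
# screen interval [s0, s1]; the remainder of the axis is squashed uniformly.

def remap(x, map_len, m0, m1, screen_len, s0, s1):
    """Map one axis of a map coordinate to a screen coordinate."""
    if x < m0:                                   # region before the focus
        return x * (s0 / m0)
    if x <= m1:                                  # magnified focus region
        return s0 + (x - m0) * (s1 - s0) / (m1 - m0)
    return s1 + (x - m1) * (screen_len - s1) / (map_len - m1)

# The scale jumps abruptly at the boundary x = 300:
print(remap(299, 1000, 300, 400, 600, 200, 400))  # ~199.3 (scale 0.67)
print(remap(301, 1000, 300, 400, 600, 200, 400))  # 202.0  (scale 2.0)
```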

To use any of these options users must alternately attend to information in two areas, requiring a ‘transition’ between elements of the object representation. In the split-screen option, the part of the overview with the inverted background would stand out, forming the ‘subject’ of the object representation, and so no further transitions would be required to manipulate it. Moving the trackerball while attending to the overview results in this highlighted area moving appropriately. If the close-up area is being attended to, though, there are problems, since moving the trackerball results in elements moving in the opposite direction. The user would therefore have inconsistent experiential task records (ETRs) for motion, retarding the acquisition of task-action mappings for action execution.

In the other two options the user has to cope with systematic, but unfamiliar distortions of a visual image. These distortions would prevent the use of existing well-proceduralised knowledge for the processing of visual structures, and would substantially increase the complexity of interchange between object, propositional and implicational representations. In addition, the continuously variable nature of the distortion would give rise to highly variable active task records (ATRs) in the visual representation, disrupting processing, and producing inconsistent ETRs in the object and propositional representations.

In the second option, the discontinuities that must exist at the boundaries between the nine sections would also give rise to difficulties in forming propositional representations from the object structure, and it would be difficult to comprehend the spatial inter-relationships of elements within the display. The user would therefore not use the overview area of the screen for comparison of elements at all, and would instead attend just to elements shown within the high magnification area. The only use for the overview area would be to identify new elements to examine with the close-up view. Route information could only be obtained through the close-up, and if the route extended outside the close-up box, the box would have to be moved towards the approximate location of the destination. The subject of processing would have to alternate between the target in the overview and the contents of the close-up box to register the route information, requiring transitions between the elements in the close-up box and other (unproceduralised) elements of the overall view. This would require processing to correct for the motion of the target (since it would move away from the central box as it approached, and on crossing the boundary of two compression zones, would move in a different direction). This processing would interfere with the processing required to move and recall the motion of the box, hindering an accurate assessment of the geographical dimensions of the route.

The third option has all the disadvantages of the second concerning the proceduralisation of processing, although the more continuous nature of the motion of elements within the overview would make the motion towards a target easier. The circular zones remove the discontinuities that would occur at the edges of the square zones, and elements would now always move towards the centre of the detailed view as it approached them, but rather than aid processing, this could allow extensively proceduralised mappings from visual to implicational representations to trigger feelings of disorientation and a sensation of motion at odds with other sensory inputs (a situation similar to those that induce motion-sickness).

All three of these options attempt to allow the user to make use of the close-up and overview simultaneously. Two of them, however, introduce distortions into the low-level view that make it impossible for the user to extract the relevant information from it. On these grounds alone the split-screen view can be identified as the preferred option. The other two options can be seen from the CTAs to actually hinder or prevent the user carrying out the tasks that they are supposed to support. The analyses have identified a recurring problem for the user in relating the motion of the close-up to the overview, and in making the transitions between the content of the overview and the content of the close-up window. Both of these aspects of the problem involve transitions between different elements of the visual structure of the screen, and transitions into the substructure of these elements. Even the preferred alternative is not ideal in this regard.

Figure 4. Four successive screen-shots from a possible re-design of the ‘small-screen map’ scenario, showing the user ‘zooming-in’ on a particular place

Having identified this as the key to the usability of the system, it is possible to ask how the nature of the transitions around the visual structure could be used in the design. An assumption in the design requirements is that it is necessary and advantageous for the user to have both the close-up view and the overview visible simultaneously, yet the cognitive models point out that the user must make transitions between the two views, and will not be able to extract information from them both at the same time. The designer could therefore question this earlier design decision, and think about designs that do not attempt to show differently-scaled maps within one screen display.

One possibility is to replace the transitions that the user would need to make between different areas of the screen by changes in the screen display, so instead of the user having to attend to a different part of the screen, the whole screen would be taken up by the new scale view. This could be managed by, for example, using the trackerball to move a set of cross hairs around the screen to locate a point of interest within the map, and then ‘zooming in’ so that the point under the cross hairs becomes the centre of a new screen display, but at a higher magnification (see Figure 4). A different button would similarly allow ‘zooming out’. Scrolling of an image could be allowed if the cross hairs are pushed against an edge of the map.
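
The geometric core of the re-design is small: the map point under the cross hairs becomes the centre of the new, more magnified view. A sketch follows, with invented parameter names.

```python
# Sketch of the 'zoom in on the cross hairs' operation (invented names).
# A view is (centre_x, centre_y, scale), scale in screen pixels per map unit.

def zoom(view, cross_px, cross_py, screen_w, screen_h, factor=2.0):
    cx, cy, scale = view
    # The map point currently under the cross hairs...
    mx = cx + (cross_px - screen_w / 2) / scale
    my = cy + (cross_py - screen_h / 2) / scale
    # ...becomes the centre of the new, more magnified view.
    return (mx, my, scale * factor)

view = (500.0, 500.0, 1.0)             # whole-map overview
view = zoom(view, 480, 200, 640, 480)  # zoom in where the cross hairs point
print(view)                            # (660.0, 460.0, 2.0)
```

‘Zooming out’ is the same operation with a factor below one, and scrolling simply moves the view centre at a fixed scale.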


Because it requires additional user interactions (to zoom in and out), the new design would be more complex than the three proposed designs, but by matching these actions and changes in the screen display to the transitions in the user’s visual structures (in this case, between superstructures and substructures) it could actually be easier to use, since it removes superfluous visual elements that could distract the user. There are additional questions raised by this design, such as how easily users will be able to maintain a sense of their general location if they are allowed to ‘scroll’ a high magnification view, but such questions could be answered by successive iterations of model building and design.

The point is not that the model itself generates design solutions, but that the modelling can alert the designers to weaknesses in their requirements and point to the areas of the design space that they should concentrate upon. This can also be helpful where the design space is not yet well-elaborated, as the next scenario illustrates.

Rapid cash machine scenario

This scenario considers software changes to a bank’s cash machine, to let customers just withdraw money rapidly, without requiring new hardware. The aim of the design was to maximise the ease and speed of use and ‘customer satisfaction’.

The strategy followed in the previous scenarios was to model the cognition required to use particular, reasonably well specified parts of interfaces. This design had not progressed as far as a set of interface objects, but was still at an early level. Decisions taken at this level form a large constraint upon subsequent design work, and so it is crucial that the right directions are taken. In consequence, we considered properties of the task domain, and concluded that an understanding of the issues leads to a family of solutions.

In this scenario the task for the user is to extract their required amount of cash as quickly as possible. Although the required amount can vary between customers and for any customer on different occasions, a significant proportion of usage would involve individual customers withdrawing the same amount from one occasion to the next. This suggests that the between-customer variance can be overcome by giving the machine knowledge about each user’s typical withdrawals (perhaps encoded on the user’s card), using this as a personal default amount. Use of the machine would in this case correspond directly to the user’s immediate goal, to ‘get some cash’. However, the user may on occasion form more detailed goals, requiring a specific amount of cash, and so some means must be provided to allow the user to alter the default amount suggested by the machine. The design path has already begun to form: the primary aim of the solution is to allow the user to perform a goal where the amount of cash is not explicitly specified, with a requirement for action specification and execution to be simple if the user’s goal is more explicit.

The task sequence should therefore have a branching point at which the user either selects the default amount and proceeds with the transaction, or selects an option leading to an additional embedded dialogue to change the default amount, before returning to the default track and proceeding with the transaction. This would lead to the user developing two sets of CTR, one for ‘get usual amount’ and another for ‘get novel amount’ (see Figure 5).

Figure 5. Two possible CTRs required to use a rapid cash machine

Even this low degree of complexity can lead to the possibility of error, however, because once the user has started to use a CTR in the specification of their actions, the sequence it prescribes will tend to be performed unless there is an explicit check upon some aspect of the environment to halt them. In this case, if the machine has knowledge about the user’s personal default amount, the most frequent course of action would be the adoption of the first CTR, to ‘get usual amount’. However, because this CTR does not force the user to check what the default amount is, they might accept it and then find that the machine had, for some reason, not presented the amount they had expected.

In general, designing any task that requires multiple CTRs raises problems, for if one course of action is predominant, the others may not be encountered often enough for the user to abstract relevant CTRs. While becoming expert at one task sequence, they may remain novice with the others, leading to delays and errors (Lee, 1993). The implication is that the task should be designed in such a way that the CTR that is abstracted for the most frequent course of action is itself applicable to less frequent situations.

Here the actions that correspond to the predominant ‘accept default’ should encompass those for the ‘reject default’ and ‘alter amount’ sequences. In practice this means making the user do something in the ‘accept default’ case that requires them to attend to the same element of the display that they would have to attend to in the other cases, and then designing the layout so that the objects involved in altering the amount are perceptually associated with it, simplifying the specification and execution of subsequent actions. A general CTR for this is shown in Figure 6.

Figure 6. Combined CTR for both rapid cash machine operations

The problem of providing facilities for amending the amount is not as difficult as it might seem, for the nature of money and cash machine technology means that only fairly large discrete units of change can be used. A simple ‘up’ and ‘down’ button combination that increased or decreased the amount by a set amount would provide all of the functionality required, and the number of steps (i.e., button clicks) required would not be problematic, given the range of most transactions.
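
A sketch of the resulting dialogue logic (the default amount, increment and limit are invented for illustration) shows how the frequent ‘accept default’ path and the rarer amendment path pass through the same attended display element:

```python
# Sketch of the rapid-cash dialogue: a single displayed amount serves both
# the frequent 'accept default' path and the rarer amendment path.
# The increment, limit and amounts are invented for illustration.

STEP = 10   # discrete unit of change, in whole currency units

def rapid_cash(default_amount, presses, limit=250):
    """presses: sequence of 'up'/'down' clicks made before confirming.

    The user attends to (and confirms) the displayed amount even when
    accepting the default, so the amendment actions are encountered in
    the same context and a single CTR can cover both paths.
    """
    amount = default_amount
    for press in presses:
        amount += STEP if press == "up" else -STEP
        amount = max(STEP, min(amount, limit))
    return amount

print(rapid_cash(50, []))            # usual amount: 50
print(rapid_cash(50, ["up", "up"]))  # novel amount: 70
```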

The relevant screen for an interface that follows from these suggestions, using existing hardware, is illustrated in Figure 7. The use of screen-side buttons is not ideal, since it means that, following the location of the up/down icons or the amount, there must be a thematic transition to the appropriate button. Since the hardware cannot be modified, the screen symbols have been linked to the buttons in an attempt to simplify these transitions by making the button likely to be an element in the same structural group.

Figure 7. Screen of a redesigned rapid cash machine

CTA in the design context

Applying CTA to a variety of design scenarios has shown that a modelling tool based upon it can help to organise constructive criticism of partial design proposals, expressed at a variety of levels, by drawing the analyst’s attention towards the psychologically salient aspects of the design problem.

The idea of characterising the psychological aspects of the design problem has, of course, numerous familiar rings to it. It is a core aspect of the use of traditional behavioural science ‘guidelines’; it forms a central plank of design rationale (MacLean et al., 1989), as well as of the task-artifact cycle of Carroll et al. (1991). Last, but not least, the psychological aspects are presumed to be implicitly derivable by the designer using a programmable user model, or PUM (Young et al., 1989). However, our current argument does have a novel twist.

Traditional guidelines have been generally formulated as ‘rules of thumb’ that link some feature of tasks, users or interfaces more or less directly with a design solution or class of design choice. Design rationale links questions through options to criteria. Except perhaps through the possibility of a library of previous cases, it does not suggest what the psychologically motivated questions should actually be. Similarly, the task-artifact cycle and Carroll’s associated claims analysis are a description of a methodology, not a means of inspiring the content of specific claims about usability - that requires a ‘theory of tasks’. In the case of PUMs, the psychologically relevant concerns are not posed: they implicitly emerge through the activities involved in specifying the knowledge analysis required to use the instruction language and running it through the PUM architecture. In none of these approaches is the nature of the deeper psychological issues directly suggested by the analytic representation.

Our argument is that a cognitive task model does make the psychological issues apparent. It is not restricted to 'tasks', but concerns cognition in a wider sense. It is a representation that describes properties of cognitive activity at some level of approximation. It incorporates components that describe attributes both of cognitive processing and of the knowledge used in the processing. It therefore provides an abstract, but explicit, representation of what is going on in the head of the user. From this, it is possible to gain a direct understanding of the psychological issues associated with a design in a particular domain. A part of this might be to provide the interface designer with specific targets in terms of the underlying psychology:

• Ensure that the transitions between subjects and predicates of mental representations are simple, by conforming the structure of the interface to the user's sequence of actions and tasks.

• Ensure that thematic transitions between the subject of a representation and its superstructure or substructure are simple, by minimising the ambiguity and complexity of interface structures.

• Allow common task records and entity property records to be easily abstractable, by providing consistent experiential task records.

• Minimise complexity of representational interchange, by taking advantage of, and supporting the development of, proceduralised knowledge.

These broad 'principles for proceduralised processing' are on their own no more help to a designer than any other list of principles, but they are the constraints that the human factors specialist in a design team would have to bear in mind when looking at the results of a modelling exercise, whether they had carried it out themselves or had been provided with it by an automated modelling tool. An understanding of these constraints would be the key to identifying the relevant aspects of the design, or the assumptions behind it, that needed the designers' attention.
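To make the idea concrete, the sketch below shows one hypothetical way in which such a list of constraints might be encoded so that the findings of a modelling exercise can be filed against them. Neither the data structure nor the example finding is a feature of the proof-of-concept tool described here; both are invented for illustration.

```python
# A hypothetical encoding of the four principles, so that observations
# from a modelling exercise can be filed against the constraint they
# bear upon. Entirely illustrative.

from dataclasses import dataclass, field

@dataclass
class Principle:
    name: str
    constraint: str
    findings: list = field(default_factory=list)  # modelling observations

PRINCIPLES = [
    Principle("subject-predicate transitions",
              "conform interface structure to the user's actions and tasks"),
    Principle("thematic transitions",
              "minimise ambiguity and complexity of interface structures"),
    Principle("record abstraction",
              "provide consistent experiential task records"),
    Principle("representational interchange",
              "exploit and support proceduralised knowledge"),
]

# e.g., a finding from the cash machine scenario might be filed as:
PRINCIPLES[1].findings.append(
    "screen-side buttons are not structurally grouped with the up/down icons")
```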

We do not want to make any strong claims about the actual utility of CTA, merely that it is available and could plausibly be used as a basis for formulating problems within the design space that must be addressed from the psychological perspective. The information provided should be structured to feed the more creative craft skills that come into play in the production of a design. In certain respects, the cognitive task model should be directly suggestive of a substantive design question and possible means of answering it.

The shape of this particular formulation is closely allied to Landauer's (1987) description of psychology as "a mother of invention". Landauer's claim is that behavioural methodologies can provide a deep understanding of the issues that surround problematic use of interfaces. That understanding is then used to support a process of creating new types of interface that get round the problems identified with the previous ones. Within Landauer's scheme, theory has at best a modest role - confined to the more mundane aspects of design, of the type best typified by evaluative GOMS, Fitts' Law and Hick's Law analyses. For Landauer, the key understanding arises primarily from empirical observation, experimentation and simulation. A cognitive task model could fulfil this supportive role in the creative process of developing design solutions.
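For readers unfamiliar with these analyses, the sketch below gives illustrative forms of the two laws. The coefficients are placeholders: in practice they are fitted to empirical data for a particular device and task, so the absolute times produced here mean nothing.

```python
# Illustrative forms of Fitts' Law and Hick's Law - the kind of
# 'mundane' evaluative analysis to which Landauer confines theory.
# The coefficients a and b below are placeholder values, not measurements.

from math import log2

def fitts_movement_time(distance: float, width: float,
                        a: float = 0.1, b: float = 0.1) -> float:
    """Predicted pointing time (s) for a target of given width at a
    given distance, using the Shannon formulation of Fitts' Law."""
    return a + b * log2(distance / width + 1)

def hick_choice_time(n: int, a: float = 0.2, b: float = 0.15) -> float:
    """Predicted time (s) to choose among n equally likely options."""
    return a + b * log2(n + 1)

# Halving the number of on-screen options yields a predicted saving:
print(hick_choice_time(8) - hick_choice_time(4))
```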

It has been generally assumed that the role of modelling techniques is to support the evaluation and understanding of options, going no further in terms of design problem specification. Here, we have been arguing that modelling techniques can have further value and be explicitly directed toward problem specification and resolution.

However, it should be emphasised that no expert system currently exists that can produce an appropriate CTA for any conceivable design. Indeed, the methodology is still being developed to derive principles linking the contents of cognitive task models to the probable properties of user behaviour. In the intermediate term, it might be productive to consider how some familiarity with the basic cognitive theory, such as ICS, and its associated analyses, such as CTA, might form a basis for a methodology and a set of practical heuristics for ‘driving’ design discussion.

Using key aspects of approximate modelling for the specification of cognitive issues associated with a design can be viewed as an attempt to re-establish classic information processing concepts on the same kind of ground that Carroll is arguing requires a broader ‘theory of tasks’. We are trying to establish ‘upward connections’ from CTA to broader design issues, and seeking to refine ‘downward connections’ to provide more precision and scope for our behavioural predictions.

It is worth concluding this discussion of preliminary attempts to connect CTA to design scenario material with a reminder about the scientific status of all the different concepts. ICS is the underlying theoretical framework we are using to specify CTA, a form of approximate modelling of cognitive activity. The models that result are representations of our scientific understanding, specifically intended for the purposes of providing support in design decision making or behavioural prediction. The objective 'truth' of the particular collection of principles involved in deriving an output is not really an issue, although there is evidential support, from empirical studies, that can be used to justify the modelling. The issue is whether or not this kind of technique provides sufficiently robust approximations to be of value in assessing practical design options. The question is not whether the theory and all of its attendant analyses are 'true' in any abstract sense, but whether the output is useful and more economical of effort than other means of arriving at the decision.

Similarly, if we are to develop the techniques and methods to provide conceptual support for elaborating design options through the specification and communication of psychological issues in problem specification, we are again not primarily concerned with 'truth' per se. What we are talking about is the development of an 'application representation' whose purpose is to help design teams arrive at an understanding of the potential psychological issues at stake in the design space (Barnard, 1991). Its potential for success or failure should be weighed, not simply in relation to truth, accuracy or scope, but in terms of the extent to which it might lead to design options whose subsequent assessment is more positive (by whatever criteria) than would have been the case otherwise.

Acknowledgements

This work was carried out as part of the ESPRIT project BRA 7040 AMODEUS-2, supported by the Commission of the European Communities. Further information on AMODEUS can be obtained via the World Wide Web: http://www.mrc-apu.cam.ac.uk/amodeus/

References

Aboulafia, A., Nielsen, J. and Jorgensen, A. (1993) 'Evaluation report on the 'EnEx 1' design workshop' Amodeus-2 Project Working Paper TA/WP2, MRC-APU, Cambridge, UK

Barnard, P.J. (1985) 'Interacting cognitive subsystems: a psycholinguistic approach to short term memory' in Ellis, A. (ed.) Progress in the Psychology of Language, Lawrence Erlbaum Associates, 197-258

Barnard, P.J. (1987) 'Cognitive resources and the learning of human-computer dialogs' in Carroll, J.M. (ed.) Interfacing Thought: Cognitive Aspects of Human-Computer Interaction, MIT Press, 112-158

Barnard, P.J. (1991) 'Bridging between basic theories and the artifacts of human-computer interaction' in Carroll, J.M. (ed.) Designing Interaction: Psychology at the Human-Computer Interface, Cambridge University Press, 103-127

Barnard, P.J., Grudin, J. and MacLean, A. (1989) 'Developing a science base for the naming of computer commands' in Long, J.B. and Whitefield, A. (eds.) Cognitive Ergonomics and Human-Computer Interaction, Cambridge University Press, 95-113

Barnard, P.J. and May, J. (1993) 'Cognitive modelling for user requirements' in Byerley, P.F., Barnard, P.J. and May, J. (eds.) Computers, Communication and Usability: Design Issues, Research and Methods for Integrated Services, Elsevier, 101-146

Barnard, P.J., Wilson, M. and MacLean, A. (1988) 'Approximate modelling of cognitive activity with an expert system: a theory based strategy for developing an interactive design tool' Comput. J. 31, 445-456

Bovair, S., Kieras, D.E. and Polson, P.G. (1990) 'The acquisition and performance of text-editing skill: a cognitive complexity analysis' Human-Computer Interaction 5, 1-48

Buckingham Shum, S. and Hammond, N.V. (1994) 'Delivering HCI modelling to designers: a framework and case study of cognitive modelling' Interacting with Computers 6, 311-341

Buckingham Shum, S., Hammond, N., Jorgensen, A.H. and Aboulafia, A. (1994a) 'Communicating HCI modelling to practitioners' in Proc. CHI'94 ACM Press, 271-272

Buckingham Shum, S., Jorgensen, A.H., Hammond, N. and Aboulafia, A. (1994b) 'Communicating and evaluating HCI modelling' Amodeus-2 Project Working Paper TA/WP22, MRC-APU, Cambridge, UK

Carroll, J.M., Kellogg, W.A. and Rosson, M.B. (1991) 'The task-artifact cycle' in Carroll, J.M. (ed.) Designing Interaction: Psychology at the Human-Computer Interface, Cambridge University Press

Dickerson, K.R. and Hedman, L.R. (1993) 'Usability and standards' in Byerley, P.F., Barnard, P.J. and May, J. (eds.) Computers, Communication and Usability: Design Issues, Research and Methods for Integrated Services, Elsevier, 413-452

Dowell, J. and Long, J.B. (1989) 'Towards a conception for an engineering discipline of human factors' Ergonomics 32, 1513-1535

Grudin, J. (1991) 'Systematic sources of suboptimal interface design in large product development organisations' Human-Computer Interaction 6, 147-196

Jeffries, R., Miller, J.R., Wharton, C. and Uyeda, K. (1991) 'User interface evaluation in the real world: a comparison of four techniques' in Proc. CHI'91 ACM Press, 119-124

Kieras, D.E. (1988) 'Towards a practical GOMS model methodology for user interface design' in Helander, M. (ed.) The Handbook of Human-Computer Interaction, North-Holland

Landauer, T.K. (1987) 'Relations between cognitive psychology and computer systems design' in Carroll, J.M. (ed.) Interfacing Thought, MIT Press, 1-25

Lee, W-O. (1993) 'Adapting to interface resources and circumventing interface problems: knowledge development in a menu search task' in Alty, J.L., Diaper, D. and Guest, S. (eds.) People and Computers VIII, Cambridge University Press, 61-77

Lewis, C.H., Casner, S., Schoenberg, V. and Blake, M. (1987) 'Generalization, consistency, and control' in Proc. CHI'89 ACM Press, 1-5

Lewis, C., Polson, P., Wharton, C. and Rieman, J. (1990) 'Testing a walkthrough methodology for theory-based design of walk-up and use interfaces' in Proc. CHI'90 ACM Press, 235-241

Lim, K.Y., Long, J.B. and Silcock, N. (1991) 'Integrating human factors with the Jackson system development method: an illustrated overview' Ergonomics 36, 1135-1161

MacLean, A., Young, R.M. and Moran, T.P. (1989) 'Design rationale: the argument behind the artefact' in Proc. CHI'89 ACM Press, 247-252

May, J., Barnard, P.J. and Blandford, A. (1993) 'Using structural descriptions of interfaces to automate the modelling of user cognition' User Modeling and User-Adapted Interaction 3, 27-64

Nielsen, J. and Phillips, V. (1993) 'Estimating the relative usability of two interfaces: heuristic, formal and empirical methods compared' in Proc. InterCHI'93 ACM Press, 214-221

Nielsen, J. (1992) 'Finding usability problems through heuristic evaluation' in Proc. CHI'92 ACM Press, 373-380

Polson, P.G. and Lewis, C. (1990) 'Theory based design for easily learned interfaces' Human-Computer Interaction 5, 191-220

Polson, P.G., Lewis, C., Rieman, J. and Wharton, C. (1992) 'Cognitive walkthroughs: a methodology for theory-based evaluation of user interfaces' Int. J. Man-Machine Studies 36, 741-773

Rao, R. and Card, S.K. (1994) 'The Table Lens: merging graphical and symbolic representations in an interactive focus+context visualization for tabular information' in Proc. CHI'94 ACM Press, 318-322

Rieman, J., Davies, S., Hair, D.C., Esemplare, M., Polson, P. and Lewis, C. (1991) 'An automated cognitive walkthrough' in Proc. CHI'91 ACM Press, 427-428

Rowley, D.E. and Rhoades, D.G. (1992) 'The cognitive jogthrough: a fast-paced user interface evaluation procedure' in Proc. CHI'92 ACM Press, 389-395

Smith, S.L. and Mosier, J.N. (1986) 'Guidelines for designing user interface software' Report MTR-20090, The MITRE Corporation, Bedford, MA, USA

Tani, M., Horita, M., Yamaashi, K., Tanikoshi, K. and Futakawa, M. (1994) 'Courtyard: integrating shared overview on a large screen and per-user detail on individual screens' in Proc. CHI'94 ACM Press, 44-50

Terrins-Rudge, D. and Jorgensen, A.H. (1993) 'Supporting the designers: reaching the users' in Byerley, P.F., Barnard, P.J. and May, J. (eds.) Computers, Communication and Usability: Design Issues, Research and Methods for Integrated Services, Elsevier, 87-98

Wallace, M.D. and Anderson, T.J. (1993) 'Approaches to interface design' Interacting with Computers 5, 259-278

Wharton, C., Bradford, J., Jeffries, R. and Franzke, M. (1992) 'Applying cognitive walkthroughs to more complex interfaces: experiences, issues and recommendations' in Proc. CHI'92 ACM Press, 381-388

Young, R.M., Green, T.R.G. and Simon, T. (1989) 'Programmable user models for predictive evaluation of interface designs' in Proc. CHI'89 ACM Press, 15-19


Appendix 1: Cognitive Task Analysis

Cognitive task analysis (CTA) looks at a system from the point of view of the user, and identifies aspects of the design that place heavy demands on the user's cognitive resources - their memory, attention, and so on. This information allows the designer to focus upon the features that users will find hardest to learn, and where they are most likely to make errors. By highlighting the sources of ambiguity, CTA helps designers to iterate towards design specifications that are cognitively straightforward, leaving users freer to concentrate on performing the tasks that the system supports rather than on using the interface itself.

Overview of CTA

CTA is based on a unified architecture of human cognition called ‘Interacting Cognitive Subsystems’, or ICS. This architecture represents the complete sequence of information processing as three phases of cognition:

• Goal formation: realising that a particular goal needs to be reached.
• Action specification: determining how to reach the goal.
• Action execution: carrying out the actions to reach the goal.

The cognitive modelling is represented in a set of 'identifiers' for each of these three phases (see Figure A1):

Figure A1. Cognitive task models are approximated from the theoretical framework


• Process configuration - the cognitive processes required to operate upon information.
• Procedural knowledge - how automatic each part of the processing has become.
• Record contents - memory records that can be used in processing.
• Dynamic control - the requirements for the focus of processing to shift between subsystems.

This ‘identifier space description’ is an approximation of the detailed modelling that provides an assessment of the ease with which each phase of an interaction can be carried out, where errors and the need for additional support (e.g., help systems) are likely to arise, and how the user’s behaviour will develop over the course of repeated experience with the system.

Cognitive architecture

The ICS architecture represents human mental activity as occurring in nine independent cognitive subsystems, all acting in parallel (see Table A1). Each subsystem is specialised to deal with a particular form of mental code, or representation.

Configuration of processes

Each subsystem processes the representations it receives in its own code, producing representations in other mental codes, which can subsequently be processed by the other subsystems. The more often a particular transformation operates upon a representation, the easier it becomes to transform it in the future - the subsystem can develop proceduralised knowledge within the transformation process.

Table A1: Nine cognitive subsystems divided into three groupings

Sensory subsystems:
  Visual           Information from the eyes, e.g., hue, contour, brightness
  Acoustic         Information from the ears, e.g., pitch, rhythm, timbre
  Body state       Information from the body, e.g., proprioceptive feedback, arousal

Central subsystems:
  Object           Mental imagery, e.g., spatial patterns, shapes
  Morphonolexical  Words and lexical forms, e.g., command names
  Propositional    Semantic relationships between entities, e.g., task sequences
  Implicational    Meaning and comprehension, e.g., schematic models

Effector subsystems:
  Articulatory     Subvocal rehearsal and speech
  Limb             Motion of limbs, eyes, fingers, etc.
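For illustration, the nine subsystems of Table A1 can be written down as a simple enumeration, grouped as in the table. This is a naming convenience for the sketches that follow, and says nothing about the internal operation of the subsystems themselves.

```python
# The nine ICS subsystems of Table A1 as a simple enumeration,
# grouped as in the table. A naming convenience only.

from enum import Enum

class Subsystem(Enum):
    VISUAL = "visual"                    # sensory
    ACOUSTIC = "acoustic"
    BODY_STATE = "body state"
    OBJECT = "object"                    # central
    MORPHONOLEXICAL = "morphonolexical"
    PROPOSITIONAL = "propositional"
    IMPLICATIONAL = "implicational"
    ARTICULATORY = "articulatory"        # effector
    LIMB = "limb"

SENSORY = {Subsystem.VISUAL, Subsystem.ACOUSTIC, Subsystem.BODY_STATE}
CENTRAL = {Subsystem.OBJECT, Subsystem.MORPHONOLEXICAL,
           Subsystem.PROPOSITIONAL, Subsystem.IMPLICATIONAL}
EFFECTOR = {Subsystem.ARTICULATORY, Subsystem.LIMB}
```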


An example of the flow of information, or process configuration, for the recognition and naming of an object in the world might be:

• Sensory data from the eyes forms a visual representation of colours and shapes.

• The visual representation is transformed to an object representation.

• The object representation is transformed to a propositional representation, and the object can be identified and related to other objects.

• The propositional representation is transformed to an implicational representation, and the meaning of the object (and its actions and relationships) can be understood.

• The implicational representation is transformed to a new propositional representation, putting the meaning of the object into the context of the individual's experience of the world.

• This propositional representation is blended with the propositional representation derived from the object representation, emphasising the aspects of the visual world that fit the individual's current interpretation.

• The combined propositional representation is transformed to a morphonolexical representation, producing a verbal label for the object.

• The morphonolexical representation is transformed to an articulatory representation, which controls the production of speech, and the object's name is spoken.

• The speech is heard, and forms an acoustic representation of the sound of the object's name, that can be blended with the internally derived form to check the accuracy of the spoken form.

As this example shows (see Figure A2), subsystems can blend representations from several sources (the propositional subsystem receives information from the implicational and object subsystems) and can produce output in different representations at the same time (the propositional subsystem produced implicational and morphonolexical output). This allows a degree of feedback between the central subsystems.

Figure A2. Example of flow of information between subsystems
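The naming configuration just described can also be written out as an ordered list of transformations between subsystems, which makes the blending points explicit. This is a descriptive data structure, not a simulation of ICS.

```python
# The naming configuration as an ordered list of transformations
# between subsystems. Comments mark where blending occurs.

NAMING_CONFIGURATION = [
    ("visual", "object"),                 # sensory data -> shapes, colours
    ("object", "propositional"),          # identify and relate the object
    ("propositional", "implicational"),   # understand its meaning
    ("implicational", "propositional"),   # contextualise; blended with the
                                          # object-derived representation
    ("propositional", "morphonolexical"), # produce a verbal label
    ("morphonolexical", "articulatory"),  # speak the name
    ("acoustic", "morphonolexical"),      # hear it back; blended as a check
]

def subsystems_involved(configuration):
    """The set of subsystems a configuration engages - one rough index
    of how widely the focus of processing must range."""
    return {subsystem for pair in configuration for subsystem in pair}

print(sorted(subsystems_involved(NAMING_CONFIGURATION)))
```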

Image record contents

At the same time as transforming representations, each subsystem is able to copy the representations it has received to its image record. These act as local memories for each subsystem, and allow regularities in records over time to be abstracted. There are consequently a number of different records available to each subsystem:

• ATRs: active task records - the representation that the subsystem has just processed, and which is available for immediate re-use.

• ETRs: experiential task records - representations that have been experienced in the past, and which can be revived if the subsystem receives a sufficiently similar input representation.


• CTRs: common task records - abstractions of the commonality of a number of ETRs, formed by their simultaneous revival (similarities being combined, discrepancies discarded).



• EPRs: entity property records - the 'completion' of the input representation by its combination with ETRs and CTRs allows entities within a subsystem's representational domain to afford information that is not necessarily present in the original input representation.

The feedback between the central subsystems allows the cognitive architecture to use record contents to elaborate, interpret, and enrich the representations of information arriving from the sensory subsystems.
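As a toy illustration of the abstraction underlying CTRs, a common task record can be thought of as what remains when the discrepancies between simultaneously revived ETRs are discarded. Representing records as dictionaries of features is an assumption made purely for this sketch.

```python
# Toy illustration of record abstraction: a CTR keeps only the
# feature-value pairs shared by every revived ETR; discrepancies
# are discarded. The dictionary representation is an assumption.

def abstract_ctr(etrs):
    """Keep only the feature-value pairs common to all revived ETRs."""
    common = dict(etrs[0])
    for etr in etrs[1:]:
        common = {k: v for k, v in common.items() if etr.get(k) == v}
    return common

withdrawals = [
    {"insert card": True, "type PIN": True, "amount": 20},
    {"insert card": True, "type PIN": True, "amount": 50},
]
print(abstract_ctr(withdrawals))
# -> {'insert card': True, 'type PIN': True}: the variable amount is
#    discarded, while the invariant steps survive as the abstraction
```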

Structure of mental representations

CTA is not concerned with the detailed content of representations, but it does consider their structure (see Figure A3). Representations in all mental codes are modelled as consisting of a set of 'basic units' of information, which join into a common 'superordinate unit' in a 'superstructure', and which can each be decomposed into a 'substructure' of 'subordinate units'. At any given point in processing, one of the basic units will be the focus of processing, forming the 'subject' of the transformations operating on that representation. The other basic units form a 'predicate structure' that can be used to obtain disambiguating information about the subject. Transitions can be made across and between the various levels of the representation to make different elements the subject, and hence the focus of transformations, although this may result in extra complexity in cognitive processing.

Figure A3. General structure of mental representation

Figure A4. Transitions through object representations of an icon array

Figure A4 shows two successive transitions in an Object level representation that might occur when a computer user is searching for a particular icon on their screen. To begin with, the whole icon array is represented as a single element in the representation, with other screen objects making up its predicate structure. The first transition goes into the substructure of the array to make the ‘can’ icon at the lower right corner the new subject. Its predicate structure is made up of nearby objects - the ‘box’ icon and, clustered as a single element, the row of sacks above it. To evaluate the identity of the can icon more fully, the user would need to make a further transition to examine its substructure. The can itself would become the subject, and the triangle and the row of dots its predicate. The user could examine each of these elements in turn by successive transitions making them each the subject of the representation.
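The transitions of Figure A4 can be mimicked with a toy tree walk, in which making one unit of the substructure the subject leaves its siblings as the predicate structure. The class and node names below are illustrative only.

```python
# The transitions of Figure A4 as a toy tree walk. Each unit's children
# form its substructure; making one child the subject leaves its
# siblings as the predicate structure.

from dataclasses import dataclass, field

@dataclass
class Unit:
    name: str
    children: list = field(default_factory=list)  # substructure

    def transition_to(self, child_name):
        """Descend one level: the named child becomes the subject and
        its siblings become the predicate structure."""
        subject = next(c for c in self.children if c.name == child_name)
        predicate = [c for c in self.children if c.name != child_name]
        return subject, predicate

icon_array = Unit("icon array", [
    Unit("can", [Unit("triangle"), Unit("three dots")]),
    Unit("box"),
    Unit("row of sacks"),
])

subject, predicate = icon_array.transition_to("can")    # first transition
subject, predicate = subject.transition_to("triangle")  # second transition
print(subject.name, [u.name for u in predicate])        # triangle ['three dots']
# Each further transition adds cognitive processing, which is why deep
# or ambiguous display structures make elements harder to evaluate.
```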
