

Information Processing & Management Vol. 29, No. 5, pp. 569-585, 1993. 0306-4573/93 $6.00 + .00

Printed in Great Britain. Copyright © 1993 Pergamon Press Ltd.

TOWARDS A COGNITIVE THEORY OF INFORMATION ACCESSING:

AN EMPIRICAL STUDY

NIGEL FORD and ROSALIND FORD Department of Information Studies, University of Sheffield, U.K.

(Received 18 March 1992; accepted in final form 31 July 1992)

Abstract - How would users access an ‘ideal’ computer-based information retrieval system? What strategies would they use in seeking information if they had access to a truly expert knowledge-base which could respond effectively to any kind of questioning, phrased in any way? No such system exists. But this project provided the next best thing - a computer system which allowed unlimited access to genuinely expert knowledge. Unbeknown to the 30 volunteer users at the time they accessed the system in order to learn, the knowledge-base included 2 human experts, communicating with them from a different building, via the computer screen. The interactions between users and the system were logged and analysed. The results reveal a number of different information accessing strategies linked to individual user characteristics and retrieval effectiveness. Implications for the design of improved information retrieval systems are discussed.

1. KNOWLEDGE REPRESENTATION AND RETRIEVAL

An increasing number of studies within information science focus on relatively narrow subject domains. This is not surprising where information retrieval (IR) systems are being developed that take advantage of techniques derived from research into artificial intelligence.

The deeper and more structured are the knowledge representation formalisms adopted, the more difficult it is to develop systems able to accommodate wide-ranging subject content. Intelligent intermediary systems can attempt to provide expert assistance in the searching of large and varied databases by encoding aspects of the knowledge of trained human intermediaries. However, attempts at encoding more subject-specific knowledge (as opposed to more general knowledge of searching techniques) encounter problems of scale. TomeSearcher, for example, was customized for particular subject areas, and although its linguistic and conceptual knowledge was limited (basically to thesaural and classification information), adaptation to new subject areas was a nontrivial task. Using a domain-independent base of some 40,000 terms, a further 10,000 to 20,000 specialist terms would be required in areas such as petrochemicals or medicine. It has been estimated that an experienced information scientist could add some 1,000 new terms per week (Durham, 1989).

Systems able to reason more intelligently about a subject area generally relate to a much more narrowly defined area, for example, RESEARCHER and technical patents, or SCISOR and corporate takeovers. Work is ongoing to develop systems that can handle larger domains via the automatic processing of the natural language of information sources (e.g., IOTA), and the limitations mentioned above do not relate to every AI-derived technique. For example, spreading activation, similar to that used in neural networks, can be applied to retrieval systems using relatively large document bases (e.g., I3R) (Ford, 1991).

Ford (1983) has argued that the concerns of IR and computer assisted learning (CAL) are to some extent moving closer, partly because of the developments previously mentioned. Indeed, the authors of this paper take the view that CAL is a special class of IR, a notion explored further in section 4.2. There may be much to be gained from cross-fertilization, notably the potential use of the greater cognitive theoretical base existing in the CAL area. Just as we should not necessarily equate IR with large databases and relatively shallow knowledge representation formalisms, we should not equate CAL with prescriptive modes of information presentation. The sort of free searching characteristic of IR systems, and levels of freedom between this and more prescriptive information presentation, is of increasing interest to those engaged in developing CAL systems. Equally, the relatively sophisticated information processing characteristics of systems that address relatively narrow subject domains (such as expert systems and CAL systems) are increasingly of interest to those concerned with developing systems that address relatively large volumes of information.

Correspondence and requests for reprints should be addressed to Nigel Ford, Department of Information Studies, Sheffield University, Western Bank, Sheffield S10 2TN, South Yorkshire, U.K.

2. IMPROVING INFORMATION RETRIEVAL SYSTEMS

We lack a well developed cognitive theory of how people might optimally interrogate databases in order to satisfy particular information needs (Green, 1991). Such knowledge is required to help us design more effective IR systems.

Various means have been adopted in attempts to acquire such knowledge. Interactions between (a) information professionals and their clients; and (b) IR systems (computerized and manual) and people (information professionals or end users) have been both obtrusively and unobtrusively observed, recorded, and analysed, using techniques such as transaction logging, in-search interactive questionnaires, and talk-aloud recording (e.g., Burton, 1990; Elzy et al., 1991; Hancock-Beaulieu et al., 1991; Paskoff, 1991; Payne, 1990; Whitlatch, 1989). These and other studies have resulted in some interesting findings relating to individual differences in searching for, evaluating, and using information (e.g., Davidson, 1977; Logan, 1990; Rholes & Droessler, 1984).

However, if we observe people’s use of some existing computer-based system, then progress towards an ideal is arguably hampered by having adopted an approach that is locked into existing technological limitations. These limitations are great if we consider an ‘ideal’ computer-based information retrieval system as one capable of the sort of flexible response to queries characteristic of a human subject expert.

Observing interactions between humans (information professionals and their clients) overcomes the problem of flexible response, but introduces a number of important confounding interpersonal variables (described in section 3). If we are aiming to learn more about how people interact with a machine-based system, then such studies arguably lack ‘ecological validity’.

It would be useful if we had access to a computer-based IR system that would display none of the limitations inherent in current computer-based systems. Ideally, it would be able to communicate with users freely, in natural language, and to respond intelligently and helpfully to any request for information, no matter how phrased. It would also be expert in the subject domain of the database. If we could observe human interaction with such a system, we could possibly generate theory/models nearer to the ‘ideal’, in that they would not be shackled by existing technological limitations, but would preserve relatively high ecological validity in relation to computer-based IR.

3. TOWARDS AN “IDEAL” SYSTEM

Unfortunately, such a system does not exist. So the researchers built the next best thing - ‘Diogenes’.

‘Diogenes’ was a computer-based system designed to observe and record users’ interactions with an ‘ideal’ knowledge base. As well as being expert, the knowledge base was able to permit totally unrestricted access and interrogation in terms of users’ language and strategies.

The knowledge base consisted of two human experts, supported by appropriate documentation and computer files, communicating with users via computer terminals, this fact not being revealed to the users.

It was considered desirable, for a number of reasons, that the users should think that they were interacting with a machine system. As noted above, the raison d’être of the research was to produce data of use to the development of computer-based information systems entailing person-to-machine interactions. Person-to-person interactions in which both parties know that they are interacting with another human being introduce a number of extraneous variables, arguably thus reducing ecological validity. A range of interpersonal factors could come into play, including the following:

• Possible embarrassment and/or unwillingness to expose self-perceived ignorance or slowness when interacting with a human expert. It was considered less likely that such factors would significantly affect interactions between users and what they perceived to be a machine system.

• Information-seeking strategies adopted when knowingly interacting with one or more human experts might differ significantly from strategies adopted in machine-based interactions. Expectations and assumptions affecting the adoption of strategies might differ significantly between the two contexts.

• In the case of face-to-face interactions (as opposed to interacting with a human via a computer terminal), nonverbal cues (facial expressions, gestures, etc.) could communicate a variety of uncontrolled and unrecorded messages.

However, in relation to the last factor, face-to-face interaction would not have been a possible option in these experiments. The need for the researchers to engage in urgent (and in some cases noisy!) discussions and document consultation during the interactions would have been, to say the least, distracting to the users. Such discussions and consultations were needed in order to ensure that non-leading replies were given (composed on the spot or selected from pre-set frames, described in section 4.1) in response to user queries. Discussions were also required in attempts to ensure that the information given by the system in response to queries was helpful, yet just unforthcoming enough to force users to display a coherent strategy: to seek out information as opposed to having it handed ‘on a plate’ with minimal questioning.

Records of the interactions were obtained and the content analysed.

4. THE EXPERIMENT

4.1 The system

An interactive, computer-based retrieval system was set up using a Prime minicomputer. This was an apparently fully automated system with a natural language interface. In fact, however, it was a system mediated by two researchers (who were also the subject experts forming the ‘knowledge base’), in a different building.

The system was very user-friendly, to set users at ease, introducing itself to the students as Diogenes, and eliciting their names. The system and the users interacted via natural language typed in at the keyboard. Apart from the initial screens introducing the system and asking for each user’s name, Diogenes was passive, reacting to user queries and deliberately avoiding proactivity, in order to force users to devise a strategy to extract information for themselves. Sample interactions are shown in Appendices D, E, and F.

Facilities were incorporated for the researchers to have a choice of modes of response. They could create customized messages on the spot in response to user queries, and/or draw on a series of over 100 pre-prepared frames of information. However, pre-prepared frames were to be used only when strictly appropriate as a reply to a query. In all other cases, screens of information were prepared on the spot, edited, checked, and sent to the user. Care was taken to ensure the rapid generation and despatch of messages, so that response times were acceptable.

4.2 Test instructions and conditions

Thirty volunteer postgraduate librarianship and information science students were told that their objective was freely to retrieve information in order to learn how to index using the PRECIS system (described in Appendix C). They were told that they could interrogate the knowledge base in whatever way they wished and that the system could respond to natural language requests.


Learning is one of the many tasks requiring the retrieval of information. Indeed, the authors of this paper take the view that learning needs are a special class of information needs, and that CAL systems are a special class of IR systems. Diogenes, being used here within a learning context, displays characteristics more commonly associated with IR than CAL systems, in that (a) it adopts a reactive as opposed to proactive role, waiting for users’ requests rather than proffering any suggestions as to what the user should do next, or information that it considers the user should know; and (b) it allows almost complete freedom of route through its subject matter, basing each response on each user’s immediately preceding statement of need, as opposed to more prescriptive assessments of user needs derived from some inbuilt model. The relationship between IR and CAL systems is again taken up in section 7.

Every effort was made to hold constant environmental and test conditions across all the sessions, to eliminate as many confounding variables as possible. The test situation was standardized to the extent that each student was given precisely the same preamble, by the same researcher, and the same explanatory sheet outlining what was required (see Appendix A). Physical environment was made as uniform as possible, users being allocated to a terminal in a small, quiet laboratory, located in a different building from the researchers’ terminal.

During the briefing, the students were informed exactly what their objective was in learning about PRECIS: that is, to make a start on learning how to index in PRECIS, imagining that this was the first of seven such interactions. This latter condition was imposed for the following reason. The researchers wanted to explore the use of a relatively large knowledge base, large enough to allow users to choose alternative routes and approaches. The subject matter selected (how to index using PRECIS) was therefore not realistically achievable within one session. Ideally, user behavior would have been observed over a number of sessions. However, this was not feasible within the resource constraints of the project. The artificiality of asking users to behave as though this was the first of a number of sessions (when in fact the experimental session was the only session) was considered to be more desirable than the alternative, namely, restricting the amount of information and the task to fit one session. The average amount of time spent at the terminal was 55.57 minutes.

4.3 The subject matter

The subject selected was the PRECIS document indexing system (Austin, 1984; Ramsden, 1981). An overview summary of PRECIS is given in Appendix C. A database of information was created anticipating possible questions at all levels from the very general to the specific.

Importantly, as previously noted, the pre-prepared frames of information were only to be used if strictly appropriate to a user’s query. Otherwise, queries would be answered directly by the human experts. These frames were examined and their phraseology discussed, so that the information they contained could be presented in as neutral and non-leading a way as possible. The idea was that users should be encouraged to dictate their own route through the available information.

Approximately 100 such frames were prepared, and coded for retrieval under their appropriate headings, for quick access during the interactions.

4.4 Evaluation questionnaire

A questionnaire, designed to assess prior knowledge of PRECIS, ‘retrieval effectiveness’ (as defined in section 5.7), and reactions to the system, was given to each user after the retrieval session. This questionnaire is reproduced in Appendix B.

5. ANALYSIS

Data were analysed using SPSS. Analysis was conducted using (a) Pearson correlation coefficients, and (b) Chi-square. Since the latter requires categorical as opposed to continuous data, a number of different transformations of the data were made to produce a number of variables as detailed below.
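For readers who want to see what the two statistics compute, the following is a minimal sketch in plain Python. It is not the SPSS procedure used in the study: the function names are hypothetical, the chi-square version omits any continuity correction, and no p-values are derived.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cross / (dev_x * dev_y)


def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows).

    Assumption: no continuity correction is applied.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic
```

The chi-square function takes counts of users in each cell (for example, Tactics category by Sex), which is why the continuous measures described below also had to be converted into categories.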


5.1 Sex and background

In addition to being grouped into Male or Female, users were classified depending upon their first degree as having Background: Science or Background: Arts.

5.2 Forms of question

Four basic categories of question (distilled from an initial list of 21 types) were established. These basic categories were as follows:

1. Description: This type consisted of questions that invited straightforward factual answers describing some concept(s) or aspect(s) of the PRECIS system.

2. Focussing: This type included questions that sought, in essence, to set limits, or explore the bounds of some concept or function.

3. Concrete: These questions were down-to-earth, practical requests to see some aspect of the PRECIS system in action.

4. Analysis: These questions reflected a degree of analysis of the material being retrieved on the part of the questioner (for example, to compare aspects of the PRECIS system, or test the user’s understanding of the system).

Examples of these different forms of question are given in Appendix F.

5.2.1 Forms of question: continuous data. The users varied widely in the total number of questions asked. Because of this, it was necessary, in order to compare relative question usage, to convert the raw scores to percentages. Each user was therefore assigned a percentage figure for each form of question. This figure represented the percentage (of the user’s total question number) accounted for by that particular form of question.

The names of the values (of the variable Forms of question) representing these percentage figures are Descriptive, Concrete, Focussing, and Analytic.

5.2.2 Forms of question: categorical data. For those analyses requiring categorical data, each student was also categorized as having asked a relatively high, or relatively low number of each type of question (relative, that is, to the other students). The percentage scores referred to in the previous paragraph were split above and below the mean to form the following: Descriptive: High, Descriptive: Low, Concrete: High, Concrete: Low, Focussing: High, Focussing: Low, Analytic: High, and Analytic: Low.
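The two transformations in sections 5.2.1 and 5.2.2 can be sketched as follows. This is an illustration only: the user identifiers and counts are invented, the function names are hypothetical, and the treatment of a score exactly equal to the mean (here labelled Low) is an assumption, since the paper does not say how ties were handled.

```python
from statistics import mean

FORMS = ("Descriptive", "Focussing", "Concrete", "Analytic")


def to_percentages(counts):
    """Express one user's raw counts per form as percentages of that user's total."""
    total = sum(counts.get(form, 0) for form in FORMS)
    if total == 0:
        return {form: 0.0 for form in FORMS}
    return {form: 100.0 * counts.get(form, 0) / total for form in FORMS}


def split_on_mean(values):
    """Label each user High if above the group mean, otherwise Low.

    Assumption: a value exactly equal to the mean is labelled Low.
    """
    group_mean = mean(values.values())
    return {user: ("High" if value > group_mean else "Low")
            for user, value in values.items()}


# Invented example data: raw question counts for three hypothetical users.
raw = {
    "user_a": {"Descriptive": 6, "Focussing": 2, "Concrete": 1, "Analytic": 1},
    "user_b": {"Descriptive": 2, "Focussing": 2, "Concrete": 4, "Analytic": 2},
    "user_c": {"Descriptive": 3, "Focussing": 0, "Concrete": 0, "Analytic": 3},
}
percentages = {user: to_percentages(counts) for user, counts in raw.items()}
descriptive = {user: pct["Descriptive"] for user, pct in percentages.items()}
descriptive_split = split_on_mean(descriptive)
```

Working in percentages rather than raw counts means that a user who asked 10 questions and a user who asked 40 can be compared on the mix of question forms rather than on volume.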

5.3 Levels of question

Questions were also categorised according to their level in the hierarchical structure of the PRECIS subject matter. Level 1 consisted of what were essentially system-testing questions (pleasantries, questions about the weather, general questions asking what Diogenes was), as opposed to questions relating to PRECIS. Level 2 questions related to PRECIS generalities (e.g., what PRECIS is, its background, etc.). Level 3 related to an overview of how PRECIS works and how it is applied. Level 4 related to a slightly more detailed consideration of PRECIS operations, including general questions about syntax and the generation of index entries. Level 5 questions related to subdivisions of these (e.g., operators and codes, and the division between human and machine processing in the generation of index entries). Level 6 related to subdivisions of the level 5 topics. Operators were subdivided into primary and secondary; codes were subdivided into primary, secondary, and typographic, etc. Levels 7 and 8 entailed similar subdivisions: for example, of primary operators into core and ex-core (level 7), and of core primary operators into their various types (level 8). Examples of different levels of question are given in Appendix D.

As previously noted, Level 1 was so general as not to relate to PRECIS at all, consisting of questions aimed at getting used to the computer system itself rather than learning about PRECIS. This level was discarded for the purposes of the statistical analysis.

For this analysis, levels were grouped into 3 broad categories: Levels 2-3, Levels 4-5, and Levels 6-8.

5.3.1 Levels of question: continuous data. The raw figures for the number of questions asked by each user at each level were converted to percentages, so that relative questioning at different levels could be compared between users. Percentages were grouped into Levels 2-3, Levels 4-5, and Levels 6-8.


5.3.2 Levels of question: categorical data. For those analyses that required categorical data, for each person the levels associated with the user’s highest percentage were categorized as his or her Principal Levels. These consisted of Principal levels 2-3, Principal levels 4-5, and Principal levels 6-8.

As an additional measure, each user was categorised dichotomously as having scored above or below the mean percentage figure for all the users at each level. This procedure generated six new variable values:

Levels 2-3: High, Levels 2-3: Low, Levels 4-5: High, Levels 4-5: Low, Levels 6-8: High, and Levels 6-8: Low.
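As an illustration of the Principal Levels categorisation, the sketch below groups one user's question levels into the three bands and picks the band with the highest percentage. The function names and the example sequence are hypothetical, and the tie-breaking rule (the shallower band wins) is an assumption, as the paper does not describe how ties were resolved.

```python
# Level bands used in the analysis; Level 1 is discarded, as in the paper.
BANDS = {"2-3": (2, 3), "4-5": (4, 5), "6-8": (6, 7, 8)}


def band_percentages(question_levels):
    """Percentage of a user's questions (Level 1 excluded) falling in each band."""
    relevant = [level for level in question_levels if level != 1]
    if not relevant:
        return {name: 0.0 for name in BANDS}
    return {
        name: 100.0 * sum(level in members for level in relevant) / len(relevant)
        for name, members in BANDS.items()
    }


def principal_levels(question_levels):
    """Return the band holding the user's highest percentage of questions.

    Assumption: on a tie, the first (shallowest) band in BANDS is returned.
    """
    percentages = band_percentages(question_levels)
    return "Principal levels " + max(percentages, key=percentages.get)


# Invented example: the level of each question one hypothetical user asked, in order.
example_levels = [1, 2, 3, 4, 6, 6, 7, 8]
example_principal = principal_levels(example_levels)
```

The High/Low values for each band would then follow from splitting each band's percentages about the mean across all users.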

5.4 Speed

The system automatically logged the time spent by each user on each question. These times formed the variable Speed. For those analyses that required categorical data, each user’s speed was split above and below the mean to form the values Speed: High and Speed: Low.

5.5 Number of questions

The variable Number of questions consisted of the total number of questions asked by each user, excluding questions at Level 1, as previously explained. For those analyses that required categorical data, this variable was dichotomized above and below the mean into Number of questions asked: High and Number of questions asked: Low.

5.6 Tactics

A graph was prepared for each user, showing the route taken in terms of the hierarchical levels of the subject matter. The graphs were then categorised, first qualitatively, then quantitatively, as follows:

5.6.1 Qualitative analysis.

1. One group of users went straight “in at the deep end” to the most detailed levels of information (Levels 6-8) within the first few questions. These were labelled Deependers.

2. A second group took almost exclusive interest from the start in a level of detail that avoided extremes of either generality or fine detail (Levels 4-5). They were termed Midpoolers.

3. A third group made an early examination of ‘overview’ material (Levels 2-3), and were called Shallow-enders.

4. A fourth group worked back and forth methodically across the levels, apparently placing details within their broader context. These users were labelled Consolidators.

Category 2 (comprising only 10% of the sample) was combined with category 3, for the purposes of the statistical analysis, into a new group labelled Mid-shallowenders.

5.6.2 Quantitative analysis. Using quantitative criteria, users were classified as follows:

1. Deependers reached levels 6-8 in nine questions or fewer;

2. Mid-poolers reached levels 6-8 in 18 or more questions;

3. Shallow-enders did not reach levels 6-8 at all;

4. Consolidators reached levels 6-8 in more than nine and fewer than 18 questions.

Categories 2 and 3 were combined, for the statistical analyses, to form the category Mid-shallowenders.
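The quantitative criteria can be written as a small decision rule. The sketch below is one reading of them, not the authors' procedure: it assumes that "reached levels 6-8 in N questions" means the first question at level 6 or deeper was the Nth in the sequence passed in, and it leaves the choice of whether Level 1 questions are counted to the caller. The function name is hypothetical.

```python
def classify_tactics(question_levels, combine_mid_shallow=True):
    """Classify a user from the ordered list of levels of the questions asked."""
    # 1-based position of the first question at levels 6-8, or None if never reached.
    first_deep = next(
        (position for position, level in enumerate(question_levels, start=1)
         if level >= 6),
        None,
    )
    if first_deep is None:
        label = "Shallow-ender"
    elif first_deep <= 9:
        label = "Deepender"
    elif first_deep >= 18:
        label = "Mid-pooler"
    else:
        label = "Consolidator"
    # Categories 2 and 3 are merged for the statistical analyses.
    if combine_mid_shallow and label in ("Mid-pooler", "Shallow-ender"):
        return "Mid-shallowender"
    return label
```

For example, a hypothetical user whose tenth question was the first at level 6 would be classed as a Consolidator under this reading.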

Applying these quantitative criteria resulted in three reclassifications from the original qualitative categorisation. One student previously classified as a Deepender was reclassified as a Consolidator; one Consolidator was reclassified as a Deepender; and one Consolidator was reclassified as a Mid-shallowender. Statistical analyses were performed using both qualitative and quantitative classifications in turn, as detailed in section 6.


These analyses resulted in the creation of two variables, Tactics (qualitative) and Tactics (quantitative). The values of these two variables are Deependers (qualitative), Deependers (quantitative), Consolidators (qualitative), Consolidators (quantitative), Mid-shallowenders (qualitative), and Mid-shallowenders (quantitative).

5.7 Results

Users were then given an open-ended questionnaire (reproduced in Appendix B) designed to assess retrieval effectiveness, defined in terms of their understanding (measured by free recall) of what had been retrieved. This represents a somewhat more extended definition of retrieval effectiveness than is normally used in IR experiments.

It centers on the extent to which information retrieval is successful in enabling the users to perform the required task: to learn about PRECIS. Maximum retrieval effectiveness in this context would result in maximum learning efficiency. This measure was felt to be necessary because all information in the knowledge base was potentially relevant to the subject and task in which the users were engaged.

In other words, it was not a question of judging information to be more or less relevant to the subject and task. Information was only retrieved more or less effectively insofar as it was more or less appropriate to helping each individual user successfully achieve understanding.

This measure is more akin to measures of the effective use of CAL systems. This is not surprising if (as discussed in section 4.2) we view learning needs as a special case of information needs, and CAL as a special case of IR. In these experiments, users were retrieving information in order to learn.

Each transcript in which users recalled what they had learned was given a mark out of 10, based on the number of correctly recalled items of information, by one of the two PRECIS experts. This was done before any other data analysis.

The raw scores for this test were contained in the variable Results. For the analyses requiring categorical data, the score for each user was split above and below the mean to form the dichotomous values Results: High and Results: Low.

5.8 Like/Dislike of the system

The questionnaire (Appendix B) included an open-ended request to the users to make a note of any comments on, or reactions to, using the system. In the majority of cases, it was easy to classify responses as positive or negative. The responses were therefore classified dichotomously into Like or Dislike. A Neutral classification was made in those cases where no opinion was expressed, or a user’s opinion was unclear.

6. RESULTS

For the purpose of this study it was decided to adopt a minimum significance level of p < 0.05.

6.1 Pearson correlation coefficients

Using the Pearson correlation coefficient, statistically significant correlations were found as shown in Table 1.

6.2 Chi-square analysis

Using Chi-square analysis, statistically significant correlations were found as shown in Tables 2, 3, and 4.

6.2.1 Using the quantitative definition of Tactics. Deependers tended to be Female (92.86%); to concentrate on Principal Levels 6-8 of the subject matter (64.28%); and to be characterised as Descriptive: High in their questioning (64.28%).

Consolidators tended to be Female (71.43%); to concentrate mainly on Principal Levels 4-5 (71.43%); and to be Descriptive: High in their questioning (71.43%).

Mid-shallowenders tended to be Male (66.67%); to concentrate mainly on Principal Levels 2-3 (77.78%); and to be Descriptive: Low in their questioning (88.89%).


Table 1. Pearson correlation coefficients

Correlating variables          Correlation (Pearson)   Significance
Levels 4-5 and Results         r = -0.3196             p < 0.052
Concrete and Results           r = -0.3301             p < 0.046
Concrete and Levels 4-5        r = 0.3899              p < 0.017
Concrete and Analytic          r = -0.2986             p < 0.055
Levels 4-5 and Descriptive     r = 0.5428              p < 0.001
Levels 6-8 and Descriptive     r = 0.6158              p < 0.000
Levels 2-3 and Focussing       r = 0.4633              p < 0.005
No. of questions and Speed     r = 0.6281              p < 0.000

6.2.2 Using the qualitative definition of Tactics. Of the Mid-shallowenders, 75% were Male, 85.75% were categorized as Principal Levels 2-3, 75% were Background: Arts, and 100% were Descriptive: Low.

Of the Deependers, 85.71% were Female, 64.28% were categorized as Principal Levels 6-8, 78.57% were Background: Arts, and 64.28% were Descriptive: High.

Of the Consolidators, 87.5% were Female, 75% were categorized as Principal Levels 4-5, 75% were Background: Science, and 75% were Descriptive: High.

6.2.3 For variables excluding Tactics. Of those users whose Principal Levels were 6-8, 90.91% were categorized as Results: High. Of the Principal Levels 2-3 users for whom test results were available, 66.67% were also categorized as Results: High. Of those Principal Levels 4-5 users for whom test results were available, 60% were categorized as Results: Low.

Of those male users whose responses were not classified as Neutral in relation to their attitude towards the system, 100% liked the system. Of those female users whose responses were not Neutral, 61.11% disliked the system.

Of those users who liked the system, 76.92% asked a low number of questions. Of those users who disliked the system, 81.82% asked a high number of questions.
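As a worked check on two of these relationships, the sketch below computes a 2x2 Chi-square statistic. Two things in it are assumptions rather than statements from the analysis: the cell counts are inferred from the reported percentages (for example, 61.11% is taken as 11 of 18 female users, and 76.92% as 10 of 13 users who liked the system), and Yates' continuity correction is applied. Under those assumptions the results agree with the corresponding values in Table 4.

```python
def chi_square_2x2(table, yates=True):
    """Pearson Chi-square statistic for a 2x2 table of observed counts.

    With yates=True, Yates' continuity correction is applied
    (an assumption about how the published values were computed).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                diff = max(diff - 0.5, 0.0)
            chi2 += diff ** 2 / expected
    return chi2


# Counts inferred from the reported percentages (non-Neutral users only).
# Rows: Male, Female; columns: Like, Dislike.
sex_by_liking = [[6, 0], [7, 11]]
# Rows: Like, Dislike; columns: low no. of questions, high no. of questions.
liking_by_questions = [[10, 3], [2, 9]]

print(round(chi_square_2x2(sex_by_liking), 5))        # 4.53147
print(round(chi_square_2x2(liking_by_questions), 5))  # 6.04196
```

The sketch does not guard against a row or column total of zero, which would make an expected count zero; that case does not arise in these two tables.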

Of the Levels 2-3: High users, 76.92% were categorized as Descriptive: Low.

Of the Levels 6-8: High users, 81.25% were categorized as Analytic: Low. Of the Levels 6-8: Low users, 64.28% were categorized as Analytic: High.

Of the male users, 88.89% were categorized as Levels 6-8: Low. Of the female users, 71.43% were categorized as Levels 6-8: High.

Of the males, 77.78% were categorized as Levels 2-3: High. Of the females, 71.43% were categorized as Levels 2-3: Low.

Of those users with Principal Levels 2-3, 87.5% asked a low number of questions. Of those with Principal Levels 4-5, 72.73% asked a high number of questions. Of those with Principal Levels 6-8, 54.54% asked a low number of questions.

Table 2. Chi-square correlations using the quantitative definition of Tactics

Correlating variables                   Correlation (Chi-square)   Significance
Tactics and Sex                         χ² = 9.25170               p < 0.0098
Tactics and Principal Levels            χ² = 22.09145              p < 0.0002
Tactics and Descriptive: High/Low       χ² = 7.87302               p < 0.0195

Table 3. Chi-square correlations using the qualitative definition of Tactics

Correlating variables                   Correlation (Chi-square)   Significance
Tactics and Sex                         χ² = 10.52721              p < 0.0052
Tactics and Principal Levels            χ² = 26.10998              p < 0.0000
Tactics and Descriptive: High/Low       χ² = 11.14286              p < 0.0038
Tactics and Background: Arts/Science    χ² = 6.93097               p < 0.0313

6.3 Partial correlations

In an attempt to illuminate further the nature of the relationships between the variables, partial correlations were used. This entailed three-way analyses of categorical data. Since the total sample comprised 30 people, it must be noted that the three-way partial correlations entailed very small groups.

The group of eleven users whose Principal Levels were 6-8 consisted mainly (nine of the eleven: 81.82%) of Deependers. Eight of these nine Deependers (88.89%) were classified as Results: High, seven of the nine (77.78%) were characterised as Descriptive: High, and the same proportion (seven of the nine: 77.78%) as Analytic: Low. Two of the eleven users (18.18%) whose Principal Levels were 6-8 were Consolidators. Both of these users were classified as Results: High, and like the Deependers, as Analytic: Low.

The eight users whose Principal Levels were 2-3 consisted mainly of Mid-shallowenders (seven of the eight: 87.5%). Discounting the two Mid-shallowenders for whom test results were not available, three of the five (60%) were classified as Results: High, four of the seven (57.14%) as Analytic: High, six of the seven (85.71%) as Concrete: Low, and all seven as Descriptive: Low. The single user in this group who was not a Mid-shallowender was a Deepender, who also displayed the same ‘Analytical/Descriptive/Concrete’ balance.

Of the six users whose Principal Levels were 4-5, and who performed relatively ineffectively, five (83.33%) were classified as Concrete: High.

7. CONCLUSIONS

In this experiment we have seen how a small set of people, given a particular task, use an information retrieval system that is in many ways much less constrained than present computer-based systems. A number of interesting issues emerged from the study, detailed below.

7.1 Accessing strategies and cognitive styles

Two strategies seem to lead to relative success, the first more successful than the second. Interestingly, the first is most strongly associated with female students (the majority of the sample), the second with male students. A third approach seems to lead to relative failure.

On the one hand, a concentration on the most detailed levels of the subject matter represents overall the most successful strategy. This particular route to success seems characterised by a relatively passive intake of information (high Descriptive and/or low Analytic), coupled with an attention to detail (concentration on Levels 6-8).

A second route seems characterised as relatively active (high Analytic, low Descriptive questioning). This route also entails concentrating on higher level, overview material that is conceptual rather than procedurally detailed (relatively few Concrete questions that relate to procedural detail; relatively high Analytic questions that relate to relatively ‘conceptual’ issues, including comparing and testing of hypotheses; and a concentration on Levels 2-3).

Table 4. Chi-square correlations for variables excluding Tactics

Correlating variables                                        Correlation (Chi-square)   Significance
Principal Levels and Results: High/Low                       χ² = 6.10909               p < 0.0471
Sex and Like/Dislike of the system                           χ² = 4.53147               p < 0.0333
No. of Questions: High/Low and Like/Dislike of the system    χ² = 6.04196               p < 0.014
Levels 2-3: High/Low and Descriptive: High/Low               χ² = 4.88688               p < 0.0271
Levels 6-8: High/Low and Analytic: High/Low                  χ² = 4.69308               p < 0.0303
Levels 6-8: High/Low and Sex                                 χ² = 6.94515               p < 0.0084
Levels 2-3: High/Low and Sex                                 χ² = 4.36975               p < 0.0366
Principal Levels and No. of Questions: High/Low              χ² = 6.76035               p < 0.0340


The principal route to failure would seem to be characterised by mixed male and female users, focussing mainly on middle-level subject matter (Levels 4-5), concentrating on procedural (Concrete) detail.

The first two approaches bring to mind distinct cognitive styles identified by Pask (Pask, 1976a; 1976b; 1979) and subsequently investigated in relatively large-scale studies by Entwistle (Entwistle, 1981; Entwistle et al., 1979). Pask identified ‘operation learners’, who concentrated primarily on procedural details when processing information in a learning context, and who were linked with a serialistic, relatively passive learning approach. ‘Comprehension learners’, on the other hand, primarily concentrated less on procedural detail than on broad relational (overview) information, and were linked with a holistic, relatively active learning strategy. The pathology of the comprehension learner is ‘globetrotting’: basically, obtaining an overview that is not supported by valid detail. That of the operation learner is ‘improvidence’, in many ways like ‘not seeing the wood for the trees’, representing an over-emphasis on detail at the expense of an overall picture.

Pask also identified ‘versatile learners’, able to combine overview with detail, to achieve effective learning. It would seem mistaken, however, to link Level 4-5 learners, who concentrated on the ‘middle ground’ between overview and detail, with such a versatile approach. Indeed, students concentrating on Levels 4-5 were the least successful overall. It may be that they fall between two stools, gaining neither overview nor procedural detail.

7.2 The effectiveness of information seeking strategies

Diogenes allowed users to adopt information accessing strategies of their choice. Yet although it is interesting from a cognitive science viewpoint to have found that this group adopted strategies very similar to cognitive styles identified by other researchers, how effective really was the Diogenes approach? To what extent might other existing systems, not offering the freedom and facilities of Diogenes, have been equally, more, or less effective? The resources available to this project did not allow the use of control groups learning about PRECIS using other methods, including more ‘traditional’ IR, CAL, or human teaching systems. Use of such controls might have shed light on this question.

However, as noted above, there are strong similarities between the information seeking strategies recorded in this project and the cognitive styles discovered by Pask. In this context it is interesting to note that Pask and Scott (1972) also found evidence that when information was presented in a way mismatched with individuals’ preferred styles, their assimilation and recall of information was severely disrupted. Processing information in matched mode resulted in significantly enhanced performance. Ford (1985) found supportive evidence of the same effects with postgraduate librarianship and information science students. If such findings are generalisable, then it may arguably be desirable to allow individuals, in their accessing of information in a learning context, the levels of freedom more characteristic of IR than of CAL systems.

However, this contention is only valid if individuals, when given such freedom, are in fact likely to choose effective strategies. Bearing in mind the limitations of a dichotomous classification of effective and ineffective performance, 66.67% of all users (excluding those for whom test results were not available) performed well. Of these successful users, 64.28% liked using the system. Of those who did not perform well, 66.67% did not like using the system. In the majority of cases, then, freedom to access information in an idiosyncratic way led to relatively effective performance.

These figures arguably support the notion that the majority of users possessed sufficient ‘meta-cognitive’ skills (self awareness and control of their own information seeking and using processes) to justify devising systems that allow the user control over the system (rather than assessing ‘behind the scenes’ what is and is not good for them). This control, if based on effective ‘meta-cognitive’ skills, is likely to result in successful performance. For these users, it may be that giving them a great deal of control over the system is relatively effective. A user-controlled system would also be much easier to implement than, say, one that assesses performance against a more prescriptive user model, and decides on optimal information presentation strategies.


However, we cannot therefore say that more user-controlled approaches are likely to benefit all individuals. In the Diogenes experiment, this was by no means the case. Of those who performed badly, 33.33% liked using the system, and of those who performed well, 35.71% did not like using the system. It may be that some individuals require help to be able to adopt an effective information seeking strategy. Indeed, there is some evidence from other work on cognitive styles of the preference and need on the part of some individuals to work in an environment that provides considerable external structuring. There is evidence that Witkin’s ‘field-dependent’ individuals learn generally more effectively with external structuring than their relatively field-independent counterparts (Witkin et al., 1977). Also, Pask’s serialists would seem to rely on external (as opposed to their own) structuring of subject matter they are processing (Pask & Scott, 1972).

It is interesting to note in this context that the individuals identified in the present study who showed most similarity to serialists (concentrating on procedural detail) were as a group much less happy using Diogenes than those individuals most similar to Pask’s holists (those concentrating primarily on higher-level conceptual material).

7.3 Implications for IR and CAL

Many CAL systems impose a relatively fixed, teacher-prescribed route through their subject matter. This may or may not correspond with the preferred strategies of those who use the systems. Much research and development has concentrated on more heavily structured systems that do not allow relatively autonomous approaches. As O’Shea and Self (1983, p. 120) note: “The ‘learner as bucket’ philosophy ... still dominates the computer-assisted learning field. Approaches derived from programmed learning are unfortunately too easy to implement on a computer.” However, an increasingly important focus of research is the development of teaching/learning systems designed to allow learners choice in how they access complex information.

It is an axiom of modern learning theory that effective complex learning involves the learner’s active reconstruction of knowledge, rather than passive receipt of the informational components of understanding (e.g., Piaget, 1970; Bruner, 1974; Pask, 1979; Craik & Lockhart, 1972; Ford, 1979a). There is also some evidence of the effectiveness of autonomous learning, in which the student perceives him or herself to be in control of the learning process, in terms of resultant intrinsic motivation and, as a possible consequence of such motivation, more ‘meaningful’ learning (Deci, 1975; Maehr, 1976; Fishbein & Ajzen, 1975).

Reviewing such research, Ford (1979a) concluded that it was highly likely that the quality and durability of learning would be enhanced for certain students if they were to learn in an environment facilitating autonomous active learning relatively free from external control and constraints. Further empirical work (Ford, 1980a) provided evidence in support of this general contention. Yet prevalent teaching and learning environments have been relatively unprepared for such learning (Ford, 1979b).

Whereas it has not been characteristic of CAL systems, relative freedom of access has been a central feature of IR systems. This has derived not from a realisation of needs based on cognitive models of users, but rather from the reality of having to provide access to large volumes of data. Such volume has necessitated the use of semantically relatively shallow knowledge representations (relative, that is, to many CAL systems operating on narrow subject domains) (Ford, 1983).

Nevertheless, the contribution of IR systems and techniques to the process of effective student learning has not been widely exploited within the educational sector. It is likely that both the integrated use of IR techniques within CAL systems, and the increasing use of ‘stand alone’ IR systems as important learning resources, have much to offer. Indeed, there is increasing interest, in the field of teaching and learning, in providing freer access to information, including incorporating the use of databases in courses (Davies & Allison, 1989).

7.4 Towards the ‘ideal’ information system

Ideally it would seem that an automated system should know when to allow freedom and when to impose various levels and types of structure. In this experiment, most people, when given freedom in the way they accessed information, adopted relatively effective, though stylistically very different, strategies. A smaller number, however, adopted relatively ineffective ways of interrogating the knowledge base.

If we could further develop models of what constitutes effective and ineffective information processing strategies, for different types of information need and for different types of individuals, it might be possible to build into retrieval systems some form of intelligent assistance that might be able to assess the stylistic preferences of individual users, attempt to assess the strategy they are using, and suggest strategic ‘next moves’, more generalised strategic advice, or at least some informed feedback relating to their database searching. Models of information processing approaches emerging from studies such as the present one can be of help here.

We must also try to develop and improve the ability of users to exercise freedom effectively. This will entail the development and enhancement of their ‘meta-cognitive’ information skills: their ability to exploit their information environment effectively to seek and make use of information in an independent fashion. We are some way from the first goal, held back by our lack of appropriate user models rather than by a lack of software techniques to support them. The second goal, helping the development of meta-cognitive information skills, has been the object of concern and research for a number of years (Ford, 1979b; 1980b; 1981). However, there is still much progress to be made. Together, the two approaches may result in an effective symbiotic relationship between human and machine that minimises the limitations and maximises the strengths of each.

Acknowledgements: This research was made possible by a grant from the Learning Technology Unit of the UK Department of Employment’s Training, Enterprise and Education Directorate (formerly the Training Agency). Grateful acknowledgement is made to the volunteer students who took part in this study, and to Alan Griffiths (of Open Windows Information Solutions) who programmed the DIOGENES shell.

REFERENCES

Austin, D. (1984). PRECIS: A manual of concept analysis and subject indexing, 2nd edition. London: British Library Bibliographic Services Division.

Bruner, J.S. (1974). Beyond the information given. London: George Allen & Unwin.

Burton, P.F. (1990). Accuracy of information provision: The need for client-centered service. Journal of Librarianship, 22(4), 201-215.

Craik, F.I.M., & Lockhart, R.S. (1972). Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behaviour, 11, 671-684.

Davidson, D. (1977). The effect of individual differences of cognitive style on judgements of document relevance. Journal of the American Society for Information Science, 28(5), 273-284.

Davies, P., & Allison, R. (1989). Computer assisted learning feature: Using databases in economics and business studies. Economics, 25, 73-77.

Deci, E.L. (1975). Intrinsic motivation. New York: Plenum.

Durham, A. (1989). Having the last word on information retrieval. Computing, 18 May 1989, 32-3.

Elzy, C., Nourie, A., Lancaster, F.W., & Joseph, K.M. (1991). Evaluating reference service in a large academic library. College & Research Libraries, 52(5), 454-465.

Entwistle, N.J. (1981). Styles of learning and teaching. Chichester: Wiley.

Entwistle, N.J., Hanley, M., & Hounsell, D.J. (1979). Identifying distinctive approaches to studying. Higher Education, 8, 365-380.

Fishbein, M., & Ajzen, I. (1975). Belief, attitude, intention and behaviour: An introduction to theory and research. Reading, MA: Addison-Wesley.

Ford, N. (1979a). Study strategies, orientations and ‘personal meaningfulness’ in higher education. British Journal of Educational Technology, 10(2), 143-160.

Ford, N. (1979b). Relating ‘Information Needs’ to learner characteristics in higher education. Journal of Documentation, 36(2), 99-114.

Ford, N. (1980a). Levels of understanding and the personal acceptance of information in higher education. Studies in Higher Education, 5(1), 63-70.

Ford, N. (1980b). Teaching study skills to teachers: A reappraisal. British Journal of Teacher Education, 6(1), 71-78.

Ford, N. (1981). Recent approaches to the study and teaching of ‘effective learning’ in higher education. Review of Educational Research, 51, 345-377.

Ford, N. (1983). Knowledge structures in human and machine information processing: Their representation and interaction. Social Science Information Studies, 3(4), 209-222.

Ford, N. (1985). Learning styles and strategies of postgraduate students. British Journal of Educational Technology, 16(1), 65-79.

Ford, N. (1991). Expert systems and artificial intelligence: An information manager’s guide. London: Library Association Publishing.

Green, R. (1991). The profession’s models of information: A cognitive linguistic analysis. Journal of Documentation, 47(2), 130-148.

Hancock-Beaulieu, M., Robertson, S., & Neilson, C. (1991). Evaluation of online catalogues: Eliciting information from the user. Information Processing & Management, 27(5), 523-532.

Logan, E. (1990). Cognitive styles and online behavior of novice searchers. Information Processing & Management, 26(4), 503-510.

Maehr, M.L. (1976). Continuing motivation: An analysis of seldom considered educational outcomes. Review of Educational Research, 46(3), 443-462.

O’Shea, T., & Self, J. (1983). Learning and teaching with computers. Brighton: Harvester.

Pask, G. (1976a). Conversational techniques in the study and practice of education. British Journal of Educational Psychology, 46, 12-25.

Pask, G. (1976b). Styles and strategies of learning. British Journal of Educational Psychology, 46, 128-148.

Pask, G. (1979). Final report of S.S.R.C. Research Programme HR 2708. Richmond (Surrey): System Research Ltd.

Pask, G., & Scott, B.C.E. (1972). Learning strategies and individual competence. International Journal of Man-Machine Studies, 4, 217-253.

Paskoff, B.M. (1991). Accuracy of telephone reference service in health sciences libraries. The Bulletin of the Medical Library Association, 79(2), 182-188.

Payne, C. (1990). The use of public reference libraries. Library Management, 11(1), 4-24.

Piaget, J. (1970). The science of education and the psychology of the child. New York: Orion.

Ramsden, M.J. (1981). PRECIS: A workbook for students of librarianship. London: Bingley.

Rholes, J.M., & Droessler, J.B. (1984, April). Online database searchers: Cognitive style. Paper presented at the 1984 National Online Meeting, New York, NY.

Whitlatch, J.B. (1989). Unobtrusive studies and the quality of academic library reference services. College & Research Libraries, 50(2), 181-194.

Witkin, H.A., Moore, C.A., Goodenough, D.R., & Cox, P.W. (1977). Field-dependent and field-independent cognitive styles and their educational implications. Review of Educational Research, 47, 1-64.

APPENDIX A. PRE-SESSION INFORMATION

PRECIS teaching system

1. Imagine that this is the first of a series of seven sessions designed to teach you how to index using PRECIS, a system for indexing documents. Treat today’s session as you would the first in such a series (i.e., you are not required to learn everything in this one session!).

2. The system is ‘intelligent’: it ‘understands’ natural language, so should be able to respond without the need to phrase your questions in any special way (i.e., no need to use key words or other artificial formats). It also has a limited ability to answer questions of a more general nature, beyond the scope of the main subject matter.

3. After the session, you will be asked to fill in a short questionnaire. Although you will be asked to remember as much as you can of what you have learnt, this will not be used as a test of memory or of how much you have learnt. It is the system which is being tested, not you!

4. In order to get the feel of the system, you may wish to begin by asking it some general questions (e.g., about the weather). Then go into asking about PRECIS.

APPENDIX B. POST-SESSION QUESTIONNAIRE

Questionnaire: response to PRECIS teaching system

1. Did you previously have any knowledge of PRECIS? If so, please give as much detail as you can.

[space for reply]

2. Would you now try to remember as much as you can about what you have learnt about PRECIS during this session. Please write it out, as though you were teaching someone just like yourself, with the same level of knowledge as you had before you started. Be as detailed as you can.

[space for reply]

3. Please make a note of any comments on or reactions to using this teaching system. Please be as detailed as you can.

[space for reply]

Thanks very much for your help.


APPENDIX C. A BRIEF DESCRIPTION OF PRECIS FOR READERS OF IPM NOT FAMILIAR WITH THE SYSTEM

PRECIS is a technique for subject indexing of documents. The term ‘PRECIS’ is an acronym for PREserved Context Index System. It was developed in 1969-70, originally for the BNB, as part of a UK MARC project relating to the production of machine-readable bibliographic records.

A PRECIS entry is generated by a two-stage procedure: first, the document to be indexed is processed by a human indexer, according to a system of syntactical rules, and second, the resultant machine-readable input string is processed by a computer, which produces a set of appropriate entries, after consulting its machine-held thesaurus.

From the user’s point of view, PRECIS has the advantage of preserving a full description of the document at every index point, none of the more specific terms being ‘dropped’ as the points of user access become more general. From an organizational point of view, PRECIS has the advantage of being a two-stage system, which enables both human indexer and machine to perform the types of task they do best, the one exercising skill and judgement, and the other doing the routine tasks of formatting the entries and generating cross references. The Indexer examines the document, and, using a schema of Operators and Codes, compiles an input string, in accordance with the rules of PRECIS’ own ‘grammar’. The resultant instructions, expressed in standard conventions, are finally converted into a nine-digit machine-readable manipulation string.

In PRECIS there is a scheme of Role Operators: a system designed to ensure that different indexers (and the same indexer on different occasions) achieve a high degree of uniformity, and that the terms within the Input String consistently appear in context-dependent order. An Operator is a symbol indicating the grammatical function of the term within the String. There are 15 Operators in all. They are conventionally placed in brackets before their related term. There are Primary (or Main Line) Operators, and Secondary (or Interposed) Operators.

There are also three types of Code used in PRECIS strings: Primary (Theme Interlinks and Term Codes, which precede the whole string), Secondary (Differences and Connectives, which precede their terms within the string), and Typographic codes, all of which are at the disposal of the indexer when giving precise instructions to the computer as to the format of each individual entry.

The vocabulary of the machine-held thesaural network is open-ended, accepting new terms as soon as they are encountered, and the whole of the subject matter of a document is stated in summary at each point of access. A natural word order is preserved throughout. In listing subject terms, a general-to-specific order is also maintained at each entry by a technique known as ‘shunting’, this providing information as to the relationship between topics covered by a document.
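The idea that every access point carries the whole subject statement can be illustrated with a deliberately simplified sketch. It is not the PRECIS algorithm: it ignores operators, codes, connectives, and lead-term selection, and the lead/qualifier/display layout it uses is an assumption about the general form of an entry rather than something specified in this appendix. The function names are hypothetical; for the full rules see Austin (1984).

```python
def shunt(terms):
    """Rotate each term of a general-to-specific string into lead position.

    Returns one (lead, qualifier, display) triple per term. In this
    simplified model the qualifier holds the wider-context terms (nearest
    first) and the display holds the narrower terms in string order.
    """
    entries = []
    for i, lead in enumerate(terms):
        qualifier = terms[:i][::-1]
        display = terms[i + 1:]
        entries.append((lead, qualifier, display))
    return entries


def format_entry(lead, qualifier, display):
    """Render an entry as a heading line plus an indented display line."""
    heading = ". ".join([lead] + qualifier)
    if display:
        return heading + "\n  " + ". ".join(display)
    return heading


# Terms in general-to-specific order (illustrative only).
terms = ["Great Britain", "Computers", "Manufacture", "Robots"]
for entry in shunt(terms):
    print(format_entry(*entry))
    print()
```

Whichever term is used as the access point, all four terms remain present in the entry, which is the property described above as preserving a full description of the document at every index point.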

For full details of the PRECIS system, the reader is referred to Austin (1984) and Ramsden (1981).

APPENDIX D

Log of the beginnings of an interaction with a Deepender. The level of each question has been added to the transcript.

14:10:24 Question 1 [level 1]
it is sunny - what shall i do?
>The sun has harmful rays. Protect yourself from it.

14:12:45 Question 2 [level 1]
it is raining - what shall i do?
>if it is raining, you should take an umbrella.

14:14:08 Question 3 [level 2]
what is precis?
>PRECIS is a technique for subject indexing of documents. The term ‘PRECIS’ is an acronym for PREserved Context Index System.

14:15:15 Question 4 [level 2]
how does precis work?
>A PRECIS entry is generated by a two-stage procedure: FIRSTLY, the document to be indexed is processed by a human indexer, according to a system of syntactical rules, and SECONDLY, the resultant machine-readable input string is processed by a computer, which produces a set of appropriate entries, after consulting its machine-held thesaurus.

14:16:42 Question 5 [level 4]
what are the syntactical rules for indexing documents with precis?
>There are PRIMARY (or MAIN LINE) OPERATORS and SECONDARY (or INTERPOSED) OPERATORS.

14:19:16 Question 6 [level 6]
explain primary operators
>PRIMARY (or MAIN LINE) OPERATORS are divided into CORE CONCEPTS and EXTRA-CORE CONCEPTS. Every PRECIS string of terms describing a document MUST contain a PRIMARY OPERATOR.

14:20:19 Question 7 [level 6]
give an example of primary operators
>PRIMARY OPERATOR (0) signifies LOCATION: where something is located, or where something happened.

14:22:17 Question 8 [level 7]
how is the primary operator (0) divided into core and extra-core operators?
>The EXTRA-CORE CONCEPT PRIMARY (or MAIN LINE) OPERATORS are:
4 Viewpoint-as-form; Aspect. Viewpoint of author influencing treatment of subject.
5 Selected Instance. Sample population/study region.
6 Form of document/Target user. Physical or intellectual form: ‘tables’, ‘bibliographies’. Class of user (eg. ‘for parents’).

14:24:03 Question 9 [level 8]
what are the core concept primary operators?
>There are 4 main (or CORE) PRIMARY (or MAIN LINE) OPERATORS (There are also 3 ‘extra-core’ primary operators):
0 Location
1 Key system
2 Action/effect
3 Agent
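The levels annotated in the log above can be tallied to see where this user’s questioning was concentrated. The sketch below is illustrative only: it assumes that a user’s principal band is simply the band (Levels 2-3, 4-5, or 6-8) containing the most questions, and that level 1 (general questions such as those about the weather) is ignored. The function name and the tie-breaking behaviour are hypothetical.

```python
from collections import Counter

# Level bands used in the analysis; level 1 falls outside all of them.
BANDS = {"2-3": range(2, 4), "4-5": range(4, 6), "6-8": range(6, 9)}


def principal_band(levels):
    """Return the band containing the most questions, or None if no
    question falls inside any band. Ties are broken arbitrarily (the band
    encountered first wins), which is an assumption of this sketch."""
    counts = Counter()
    for level in levels:
        for name, band in BANDS.items():
            if level in band:
                counts[name] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Question levels taken from the nine questions in the log above.
deepender_levels = [1, 1, 2, 2, 4, 6, 6, 7, 8]
print(principal_band(deepender_levels))  # 6-8
```

For this opening stretch of the interaction, four of the seven subject-matter questions fall in Levels 6-8, which is consistent with the concentration on detailed levels reported for Deependers.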

APPENDIX E

Log of sections of an interaction to show different types of response. The type of re- sponse (pre-set frame or on-the-spot message) has been added to the transcript.

10:21:01 Question 1 what is precis?

[preset frame/ )PRECIS is an technique for subject indexing of documents. The term ‘PRECIS’ is an acronym for PREserved Context Index System.

10:22:20 Question 2 what is diogenes?

>I am a system here to help you with PRECIS. [message typed on the spot]

10:24:12 Question 3 help me with precis

)Please ask whatever you wish to know.

[message typed on the spot/

Page 16: Towards a cognitive theory of information accessing: An empirical study

584

10:24:51 Question 4

N. FORD and R. FORD

how do you index documents using precis? [preset frame]

)Steps in indexing a document in PRECIS:

1. Scan the document to extract relevant descriptive terms (or make up suitable ones). 2. Decide which are to be Lead Items (i.e., to appear as index access points for the

user). 3. Assign operators to each term, and insert codes as necessary, thus forming a string.

This string is eventually converted into a machine-readable manipulation string for process- ing by computer.

10:25:58 Question 5 what counts as a relevant descriptive term?

[message typed on the spot]
>The indexer uses his judgement to decide upon suitable descriptive terms.

10:29:25 Question 6 how does the indexer decide what is a relevant term?

[message typed on the spot]
>The indexer scans the document and may use the title, contents list, index and text to understand the subject matter. The indexer then judges what are the most appropriate terms which will make sense to the user of the index system.

10:38:46 Question 10 what is an operator and what are its functions?

[preset frame]
>In PRECIS there is a scheme of ROLE OPERATORS: a system designed to ensure that different indexers (and the same indexer on different occasions) achieve a high degree of uniformity, and that the terms within the INPUT STRING consistently appear in CONTEXT DEPENDENT order. An OPERATOR is a symbol indicating the grammatical function of the term which it precedes, and determines the order of appearance of the term within a STRING. For example:

Document: ‘The manufacture of computers by robots in Great Britain’
Operator: (0) Great Britain
          (1) Computers
          (2) Manufacture $v by $w of
          (3) Robots

There are 15 OPERATORS in all. They are conventionally placed in brackets before their related term. There are PRIMARY (or MAIN LINE) OPERATORS and SECONDARY (or INTERPOSED) OPERATORS.

10:39:39 Question 11 why is Great Britain given the operator (0), Computers the operator (1), and so forth, in the above example you have just given me?

[preset frame]
>CORE CONCEPTS are the KEY SYSTEM, or central concept, its environment, actions of which it is the performer or recipient, and performers of intransitive actions upon it. There are 4 main (or CORE) PRIMARY (or MAIN LINE) OPERATORS (there are also 3 “extra-core” primary operators):

0 Location
1 Key system
2 Action/effect
3 Agent

10:45:55 Question 12 what is meant by context dependent order?

[preset frame]
>CONTEXT DEPENDENCY is the maintaining of a general-to-specific order in listing subject terms. This is achieved by a particular PRECIS technique.
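
The ordering role of the core primary operators described in the responses to Questions 10 and 11 can be illustrated with a short sketch. It assumes, as a simplification, that a string contains only the four core operators (0) to (3) and that terms are arranged by operator number alone. The names `CORE_OPERATORS` and `order_string` are hypothetical; they are not part of PRECIS or of the system used in the study.

```python
# Illustrative sketch only: arranges coded terms by core primary operator.
# Assumption: only operators 0-3 occur; secondary operators and the
# $v/$w connectives of a real PRECIS string are ignored here.
CORE_OPERATORS = {0: "Location", 1: "Key system", 2: "Action/effect", 3: "Agent"}


def order_string(coded_terms):
    """Return (operator, term) pairs sorted by operator number."""
    for op, term in coded_terms:
        if op not in CORE_OPERATORS:
            raise ValueError(f"not a core primary operator: ({op}) {term}")
    return sorted(coded_terms, key=lambda pair: pair[0])


# Terms from the example document in Question 10, entered in arbitrary order.
terms = [(3, "Robots"), (1, "Computers"), (0, "Great Britain"), (2, "Manufacture")]
for op, term in order_string(terms):
    print(f"({op}) {term}  [{CORE_OPERATORS[op]}]")
```

Whatever order the indexer happens to note the terms in, the operator codes place Great Britain first and Robots last, which is the arrangement shown in the preset frame.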

APPENDIX F

Following are partial logs illustrating different types of question. The type of question has been added to the transcript.

10:31:24 Question 14 [Descriptive] special classes of action?

>SECONDARY OPERATORS ‘s’, ‘t’ and ‘u’ denote SPECIAL CLASSES OF ACTION. They are used by the indexer to indicate the ways in which concepts are interlinked within a single theme.

The three SECONDARY OPERATORS dealing with special classes of action are:

s Role definer: Directional property

t Author-attributed association

u two-way interaction

10:32:43 Question 15 [Concrete] give me an example

>An example of the use of the SECONDARY OPERATOR ‘s’:

Document: ‘The participation of Greeks in PRECIS teaching’

String: (1) PRECIS (tick) (2) teaching (tick) $w of (s) participation $v of $w in (3) Greeks (tick)

Entries: PRECIS Teaching. Participation of Greeks
         Teaching. PRECIS Participation of Greeks
         Greeks. Participation in teaching of PRECIS

16:06:18 Question 21 [Focussing] what can precis do that a manual indexer could not achieve?

>Human indexing of this scope would require great effort and time.

16:16:19 Question 28 [Analytic] what is the difference between primary and secondary operators?

>Primaries give syntactic information. Secondaries show relationships.

16:24:27 Question 32 [Focussing] does the indexer have to read through the whole document to extract relevant index terms?

>Not necessarily. A thorough scan might do.

16:27:26 Question 34 [Descriptive] if a short sentence is used to describe a document how does precis get all the index terms?

>The indexer must ensure terms cover all index entries required.

16:29:32 Question 35 [Analytic] so, the ‘short’ sentence may have to contain several hundred index terms?

>A typical index string would contain approximately 6 index terms. Each of these terms would form an index entry (qualified by all the other 5). How many index terms are chosen depends, of course, on the indexer’s judgement.
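
The final response, that each term in a string forms an index entry qualified by all the others, can be sketched as follows. This is a deliberate simplification: it ignores the printed layout of PRECIS entries and the connective codes, and it treats every term as a Lead Item. The function name `entries_for` is hypothetical, and the four terms are borrowed from the example document in Appendix E.

```python
# Illustrative sketch only: one entry per term, each qualified by the rest.
# Assumption: every term is a Lead Item; real PRECIS entry formatting
# (lead, qualifier, display) is not reproduced.
def entries_for(terms):
    """Return a list of (lead, qualifiers) pairs, one pair per term."""
    result = []
    for i, lead in enumerate(terms):
        qualifiers = terms[:i] + terms[i + 1:]
        result.append((lead, qualifiers))
    return result


terms = ["Great Britain", "Computers", "Manufacture", "Robots"]
for lead, qualifiers in entries_for(terms):
    print(f"{lead}: {', '.join(qualifiers)}")
```

A string of n terms therefore yields n entries, each carrying the other n - 1 terms as context, which is why a string of about 6 terms gives 6 entries rather than several hundred.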