
Knowledge engineering as cross-examination

Joseph S Fulda

A simple yet powerful technique is presented for interviewing domain specialists in an expert-systems context, on the basis of the legal process of cross-examination. The technique involves seven heuristics, which collectively assure the integrity of the knowledge base. In addition, combinatorially explosive interviewing problems are rendered tractable.

Keywords: interviewing, knowledge acquisition, knowledge elicitation

A common task for expert systems is identification [1], for example the identification of a fault in a mechanical or computer system, or the diagnosis of a disease in a hospital patient or in a species of plant. The usual technique for building the knowledge base that underpins such systems is interviewing [2], an unstructured process that has drawn criticism from several quarters as being difficult, tiring, and unscientific, yet critical (see Reference 3 for a summary). We present here a methodology for interviewing experts that reduces the work involved, can be regarded as scientific, and has been tested on a large medical case study [4]. In addition, the technique is simple to understand and use, and neither grids nor technological aids are required.

PROBLEM

Consider the normal identification task with I possible outcomes, e evidentiary items that are used to make the identification, and a maximum of v values per evidentiary item. In order to perform the identification task successfully, every possible combination of evidence must be associated with zero or more outcomes. (The evidence may be indicative of no identification or of multiple identifications. For example, a patient may have no disease or several diseases. For this reason, every threshold or none of them may be met.) Examining every possible combination of evidence is a combinatorially intractable problem, specifically involving I × v^e determinations. Clearly the brute-force approach will not work. Nor does this depend on the limitations of computers (which could arguably be overcome if, in almost all cases, e were properly limited to, say, a few hundred). Rather, since this is a human-factors process involving a domain specialist and a knowledge engineer, or a man-machine process in which the machine learns directly from the human expert, it is human limitations that control the process. This observation is not new: it is what Professor Edward Feigenbaum termed 'the knowledge acquisition bottleneck' in 1969 [5].

Program in Computer Technology and Applications, Columbia University, NY, USA (correspondence to 701 West 177th Street, 21, New York, NY 10033, USA). Paper received 17 July 1993. Accepted 18 August 1993.
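The I × v^e count above can be made concrete with a one-line sketch. (The figures 14 outcomes and 75 evidentiary items are taken from the case study's 75 × 14 matrix mentioned later; v = 3 is an assumed value.)

```python
# Size of the brute-force identification task: every combination of
# e evidentiary items, each taking up to v values, checked against
# all I possible outcomes -> I * v**e determinations.
def brute_force_determinations(I: int, e: int, v: int) -> int:
    return I * v ** e

# Even modest parameters explode: 14 outcomes, 75 items, 3 values each
# yields a number on the order of 10**36 determinations.
print(brute_force_determinations(14, 75, 3))
```

No computer, and certainly no human interview process, can enumerate a space of this size, which is what motivates the covering approach that follows.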

COVERING

The general method of treating such combinatorial problems is to examine select combinations of evidence which obviate the need to examine a large number of others. In other words, we examine patterns of evidence, each of which covers many other patterns. For automated knowledge acquisition, this has been done in a most intriguing way [6]. For interviewing, however (our subject here), there has not to our knowledge been any successful attempt to provide a clear methodology for covering evidence-patterns [5]. Our contribution here is to provide just such a methodology. The methodology has been implemented in the medical domain, in which it was used to diagnose the etiology of tiredness in a select sample of the patient population. This study has been described in detail elsewhere [4].

CROSS-EXAMINATION

The method involves an analogy with the legal technique of cross-examination used in courtrooms throughout the Anglophone world. Contrary to popular belief, cross-examination is not a form of interrogation. Rather, it is the putting of questions to call into question the answers to previous questions. In the courtroom, this can be done in two ways: by impeaching the credibility of the witness (e.g. by bringing up his/her criminal record), and by impeaching the testimony that he/she gives by demonstrating an inconsistency with earlier testimony. It is this latter form of cross-examination that animates our methodology.

0950-7051/94/010052-03 © 1994 Butterworth-Heinemann Ltd. Knowledge-Based Systems Volume 7 Number 1 March 1994

ADDITIVE MODEL

In many domains, the evidence will be fairly independent and thus cumulative. It will therefore be possible, in just those domains, to add the values assigned to the evidentiary items, producing a sum which can be compared against a threshold for each of the I outcome-possibilities. This has the following technical justification. From a Bayesian decision-making perspective, a trial is indicated if and only if the probability of a given crime exceeds a certain threshold. When a series of evidentiary items has been noted, the probability of the crime differs from its prior probability (i.e. its frequency in that part of the base population meeting the profile), in accordance with Bayes' theorem. Thus, if the evidentiary items are independent, the posterior odds of the crime are the product of the likelihood ratios and the prior odds. Taking logarithms, the likelihood ratios' logarithms are additive measures of evidentiary weight.
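The Bayesian justification can be verified numerically. The prior odds and likelihood ratios below are hypothetical illustrative figures, not values from the case study; the sketch shows only that the multiplicative odds form and the additive log form agree.

```python
import math

# Posterior odds = prior odds x product of likelihood ratios
# (valid when the evidentiary items are independent).
prior_odds = 0.05                      # assumed prior odds of the outcome
likelihood_ratios = [4.0, 2.5, 1.2]    # one per observed evidentiary item

posterior_odds = prior_odds
for lr in likelihood_ratios:
    posterior_odds *= lr

# Equivalent additive form: log prior odds plus the sum of log-LRs.
# This additivity is what licenses summing evidentiary weights.
log_posterior = math.log(prior_odds) + sum(math.log(lr) for lr in likelihood_ratios)

print(round(posterior_odds, 6), round(math.exp(log_posterior), 6))
```

Both forms give the same posterior odds (here 0.6, up to floating-point rounding), so a weight matrix of log-likelihood ratios may legitimately be summed column-wise.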

The domain specialist is first asked (direct examination) to provide numerical (or algebraic or Boolean (see Reference 4)) data for each possible value of the evidence. This will result in a matrix with e × I entries. Each column must then be summed and compared against each of the I thresholds. If the threshold is met or exceeded, an identification is made; otherwise, it is not. It is important to note that, even in those domains where the evidence is fully independent and additive, the mere assignment of numerical weights does not confer any particular reliability on the resultant matrix. Indeed, for the matrix to be considered a product of science rather than data from a social-science interview, a technique guaranteeing the reproducibility of the matrix is necessary. In the extensive medical case study in which this methodology was applied [4], we were able to obtain a reproducible matrix with 75 × 14 entries. This having been said, it is necessary to admit that, even with a reproducible methodology, a particular set of interviews may not result in a reproducible matrix if the domain specialist is incapable of making consistent judgments about the domain.
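The matrix-and-threshold mechanics can be sketched in a few lines. The weights and thresholds below are hypothetical (a tiny 3 × 2 matrix rather than the case study's 75 × 14); the point is the column-sum comparison.

```python
# Direct-examination matrix: rows are evidentiary items (e of them),
# columns are outcomes (I of them). weights[i][j] is the weight the
# expert assigned to item i's observed value for outcome j.
weights = [
    [3, 0],   # item 1 contributes to outcome A only (zero weight for B)
    [2, 4],
    [1, 2],
]
thresholds = [5, 7]   # one threshold per outcome

# Sum each column and compare against that outcome's threshold;
# an identification is made only when the threshold is met or exceeded.
column_sums = [sum(row[j] for row in weights) for j in range(len(thresholds))]
identified = [s >= t for s, t in zip(column_sums, thresholds)]

print(column_sums, identified)  # [6, 6] [True, False]
```

Note that the same evidence total (6) identifies outcome A but not outcome B, because each outcome carries its own threshold.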

METHOD

The idea is to work at upsetting the matrix until successive and continued assaults on its integrity consistently and repeatedly fail. The method is thus not dissimilar in its structure to mathematical proof by refutation. The matrix is assumed, a priori, to be not at all reproducible, and adjustment of its values is continued until our further attempts (perhaps the twentieth pass) to upset the matrix fail. This entails the conclusion that our assumption at the beginning of the pass, that the matrix is not reproducible, was wrong, and that the matrix is now indeed reproducible. While the idea as a whole resembles proof by refutation, the actual mechanics that go into the attempts to upset the matrix resemble legal cross-examination. While it is rare that the sole purpose of asking a question in ordinary conversation is to call into question a previous answer to a previous question, that is exactly what attorneys in the Anglophone world do in court every day. The more answers that are impeached during cross-examination, the better the cross-examination. By analogy, a superior knowledge engineer must continue to assault the matrix until it proves to be impregnable. How is this done? In order to produce an inconsistency, we can pose any question consisting of evidentiary items and their values and check whether the domain specialist's direct assessment of the evidence agrees with the assessment implicit in the sum of the evidentiary values as compared against the threshold. This is the key to the entire cross-examination methodology: the comparison of explicit judgments made by the expert during cross-examination with the judgments implicit in his/her responses to queries for evidentiary values and thresholds during direct examination.
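The core check, comparing the verdict implicit in the matrix with the expert's explicit verdict on the same evidence pattern, is simple enough to state directly. Function names here are our own; in practice `expert_says_identified` is the answer elicited from the domain specialist.

```python
# Verdict implicit in the direct-examination data: do the weights of
# the posed evidence pattern meet the outcome's threshold?
def implicit_verdict(pattern_weights, threshold):
    return sum(pattern_weights) >= threshold

# A question "impeaches" the matrix when the expert's explicit answer
# disagrees with the implicit verdict; the values must then be adjusted.
def impeached(pattern_weights, threshold, expert_says_identified):
    return implicit_verdict(pattern_weights, threshold) != expert_says_identified

print(impeached([2, 3], 5, True))   # matrix and expert agree -> False
print(impeached([2, 2], 5, True))   # expert identifies, matrix does not -> True
```

Every heuristic in the next section is a strategy for choosing which `pattern_weights` to pose so that a disagreement, if one exists, is likely to surface.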

CONSISTENCY

'Consistency', as used here, is a far broader term than that used in logic, meaning the consistency of an expert's specific choices with his/her best general judgments. Such consistency entails two global conditions: (a) every combination of values that meets or exceeds a threshold must trigger a positive identification from the expert, and (b) no combination of values that sums to less than the threshold may trigger a positive identification from the expert. (Note that we do not ask for a negative identification, any more than the law requires proof of innocence. The fault, disease, or crime may be present, present, or have been perpetrated, respectively; however, the evidence at hand simply does not allow such a finding. There is nothing to prevent a further accumulation of evidence from changing this situation. (In the law, of course, this is not completely so, because of the protection against double jeopardy.)) These global consistency requirements suggest the following seven local heuristics, all centering around the threshold.

HEURISTICS

• Heuristic 1: Select matrix values that sum to exactly the threshold and query the expert.

• Heuristic 2: Select matrix values that sum to just below the threshold and query the expert.

• Heuristic 3: Repeat either heuristic 1 or heuristic 2, varying exactly one matrix item at a time, where each variation yields the same sum, and query the expert.

• Heuristic 4: After unsuccessfully attempting to upset the matrix using heuristic 2, add a single evidentiary item, with a zero weight, and query the expert. (A zero-weight item occurs if the item contributes to some of the I outcomes but not others.)

• Heuristic 5: After successfully upsetting the matrix using heuristic 2, and adjusting the values accordingly (this is akin to re-direct examination), try a combination of values that sum to further below the threshold, and query the expert.

• Heuristic 6: After successfully upsetting the matrix using heuristic 1, and adjusting the values accordingly, try a combination of values that slightly exceeds the threshold, and query the expert.

• Heuristic 7: After unsuccessfully attempting to upset the matrix using any of the techniques above, try the same combination of matrix values in a different order. This has been shown to work (produce an inconsistency) during police interrogation [7], and it is our experience that it is a useful technique in cross-examination as well.

When none of these heuristics can be used to upset the matrix, it may be presumed that the global consistency conditions are satisfied, because local perturbations in the critical area do not occur. Hence the matrix is now robust and reproducible. In addition, nothing approaching I × v^e evidence combinations has been examined. As well as producing a reproducible matrix without combinatorial explosion, the methodology just presented is simple enough to gain widespread use, and it requires neither facility with advanced mathematics nor the use of expensive technological aids.
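The pass structure the heuristics imply can be sketched as follows. This is a minimal illustration of heuristics 1, 2 and 7 only; the fixed pattern size, the function names, and the programmatic `expert_oracle` standing in for the domain specialist are all our own assumptions, not part of the published method.

```python
import itertools

def patterns_near_threshold(values, threshold, slack=0, size=3):
    """Heuristic 1 (slack=0) and heuristic 2 (slack>0): evidence
    patterns whose weights sum to exactly threshold - slack."""
    for combo in itertools.combinations(values, size):
        if sum(combo) == threshold - slack:
            yield list(combo)

def cross_examine(values, threshold, expert_oracle, max_passes=20):
    """Repeat passes until an entire pass produces no impeachment."""
    for _ in range(max_passes):
        upset = False
        for slack in (0, 1):                      # heuristic 1, then heuristic 2
            for pattern in patterns_near_threshold(values, threshold, slack):
                implicit = sum(pattern) >= threshold
                if expert_oracle(pattern) != implicit:
                    upset = True                  # impeached; values would be adjusted
                # Heuristic 7: re-pose the same pattern in a different order
                if expert_oracle(list(reversed(pattern))) != implicit:
                    upset = True
        if not upset:
            return True    # pass survived every assault: presume reproducible
    return False           # did not converge within the allotted passes
```

The twenty-pass bound mirrors the 'perhaps the twentieth pass' remark in the Method section; a real session would interleave value adjustment (re-direct examination) between passes rather than merely flagging the upset.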

PRIOR WORK

In Reference 5, an extensive survey of knowledge-elicitation techniques was undertaken. Most of the work that has been done on structuring interviews has not succeeded (see Reference 3 for a clear statement of this and a literature review), but one prior technique does resemble the beginnings of what we have put forward. In that technique [5], the knowledge engineer selects a test sample for the expert to work on. Such a sample is selected with a view to choosing representative cases, difficult cases, borderline cases, salient cases, etc. from archival data. The expert is then observed as he/she carries out the preselected sample of tasks. What is missing from this technique is that no algorithm or heuristics are given for selecting cases that are difficult, borderline, or salient, leaving the sample to be selected by intuition and ad hoc methods. In the present methodology, these cases centre around the threshold, and their selection for the cross-examination of the expert is built in, using the closeness of the sum of the weights associated with a given evidence-pattern to the threshold.

FUTURE RESEARCH

The above suggests several avenues for future research. First, it would be helpful to generalize the technique given here to cases where lines of identification are cumulative in some way, though not strictly additive, and where the evidence for each such line is additive: the mathematical concept of a partial ordering is relevant here. Second, it would be helpful to design domain-specific heuristics for cross-examination, as was done for the medical domain [4]. Third, it would be interesting to check reproducible matrices against the actual practices of the domain specialists from whom the knowledge base was constructed. Discrepancies could then be attributed to the experts' use of more or fewer evidentiary items than have been elicited at the opening of the direct examination. Fourth, it would be rewarding to develop an automated version of the cross-examination methodology; such a product would save expensive knowledge-engineering sessions between the expert and the knowledge engineer for those evidentiary areas where the system did not converge even after repeated passes.

ACKNOWLEDGEMENTS

I would like to acknowledge the institutional support of the Mount Sinai School of Medicine, USA, which was made possible by Professor Craig Benham. It is my pleasure to acknowledge the perceptive comments of Professors Michael Anshel, Craig Benham and Daniel Cohen. Inasmuch as this work is a redaction of my doctoral dissertation, it is my honor and pleasure to dedicate this paper to my mentor of many years, Professor Michael Anshel.

REFERENCES

1 Winston, P Artificial Intelligence Addison-Wesley (1984)
2 Weiss, S and Kulikowski, C A Practical Guide to Designing Expert Systems Rowman and Allanheld (1984)
3 Fulda, J 'Review of SIGART Newsletter special issue on knowledge acquisition, No 108 (1989)' Computing Reviews Vol 32 (Apr 1991): 225-226
4 Fulda, J 'Knowledge engineering in a combinatorial setting as a quasi-legal process: a medical case study - tiredness induced by malignancies in certain populations of patients' Proc. Annual Conference of the International Association of Knowledge Engineers (1989): 79-104
5 Hoffman, R 'A brief survey of methods for extracting the knowledge of experts' SIGART Newsletter No 108 (1989): 19-27
6 Burzesi, T 'TECREK: a software tool that helps ensure the completeness of rule-based expert-system rule sets' MS Thesis, Hofstra University (1989)
7 New York Times (15 Nov 1988) (Science Section)
