The role of induction in knowledge elicitation

Abstract: Knowledge elicitation from experts is a major problem in the development of expert systems. This paper describes some of the difficulties inherent in the process and suggests that in certain situations induction can help. The algorithm considered is Quinlan's ID3, because this is used in available software.

ANNA HART, School of Computing, Lancashire Polytechnic, Preston PR1 2TQ, England

Background

It is commonly acknowledged that knowledge elicitation is a, or even the, major bottleneck in the development of expert systems [1]. Despite our years of experience in related activities in psychology and systems analysis there is no accepted methodology or aid for this complex, but essential, process. Areas can be vastly different with respect to their types of knowledge, reasoning and special difficulties. A knowledge engineer may require expertise in several multi-disciplinary areas, and the techniques he uses will depend to a large extent on the area he is studying. Another important factor is the size of the problem tackled: small problems can be relatively easy to deal with, but increasing the scope of an expert system can have a drastic effect on the problem of formulating the hitherto undocumented knowledge of the expert. For example, a rule-based system with 400 rules will almost certainly be more complex than '4 times a 100-rule system'.

The problem of interviewing

If no aid is available then it is usually necessary to resort to lengthy interviews, which may well be tape-recorded and then converted to transcript form. These take hours to type, and need extensive editing before they can be used constructively. People do not speak in complete sentences or paragraphs, and the dialogue is marked by "um"s and "er"s, asides, part sentences, irrelevancies, repetitions etc.

"We had a psychologist there to help us, but now he's gone. We're left with thirteen hours of dialogue on a word-processor. A lot of it's rubbish. What do we do next?" was the comment of one harassed knowledge engineer. His experience cannot be unique. The original must be kept as it is the nearest to the expert's own representation, but the text needs tidying up, in a process of removing, reorganising and highlighting different material. The text should be broken down into sub-areas, identifying questions and answers, facts, rules, preconditions and assumptions, conclusions, heuristics, goals, causal dependencies, etc. Some terms will be ill-defined, e.g. 'big', 'usually', and others may be synonyms. Some rules may be locally true, and others globally true. It is a good idea to create the equivalent of the systems analysts' data dictionary, or to build up a card index of related facts, terms etc., and to cross-reference all related material. In this ill-defined and time-consuming way the knowledge is eventually abstracted and coded.
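As a rough illustration, the data dictionary or card index described above might be kept as a small structure like the following. This is only a sketch; the term 'big', its working definition, and the cross-references are invented examples:

```python
# Sketch of a data dictionary for elicited terms (all entries invented):
# each term records a definition, known synonyms, and cross-references.
data_dictionary = {}

def define_term(term, definition, synonyms=(), related=()):
    """Record a term together with its synonyms and related material."""
    data_dictionary[term] = {
        "definition": definition,
        "synonyms": list(synonyms),
        "related": list(related),
    }

def lookup(term):
    """Find an entry by its own name or by any recorded synonym."""
    if term in data_dictionary:
        return data_dictionary[term]
    for entry in data_dictionary.values():
        if term in entry["synonyms"]:
            return entry
    return None

define_term("big", "diameter above 10 cm (expert's working definition)",
            synonyms=["large"], related=["pipe size"])
```

Looking up either "big" or its synonym "large" then yields the same entry, which is the point of cross-referencing related material.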

Problems with the expert

In fact, the problem is complicated further, because the expert often does not know how he makes decisions; he has not thought about it in this way before. He will describe what he thinks he does, or thinks he ought to do, and what he actually does may be different. This applies even in cases where the expert is very willing to co-operate. There is the danger that asking him about his methods can change his view of them: you can only ask him for the first time once. Human beings have difficulty in describing what is typical or representative [2], and the expert may describe interesting, complicated, or recent cases and omit mundane and straightforward ones. There may also be knowledge that the expert might not care to see written down, even though he uses it. Also, the process of elicitation may force him to think about consequences of his beliefs which he had not previously appreciated.

Ideally you should ask the expert to represent his view of his decision-making in the form that is most natural to him. This may be a graph, set of rules, decision table, doodle, diagram, etc. It means that you should not force him to produce something like flow charts, which are in any case difficult to check for completeness and correctness. He may have documented tests or case notes. All this is a help, but even in the case where rules are already fully documented it can still be difficult to code it in an expert system [3].

There are different types of knowledge, including concepts, relations, facts, heuristics, procedures and classificatory knowledge [4]. These can be classified further as factual or strategic, declarative or procedural, domain or reasoning, surface or deep. In general it is the factual knowledge which is relatively easy to extract (or teach, for that matter) [5]. The expert may use backward reasoning or forward reasoning, and he may use the equivalent of subjective estimates or degrees of belief which he has not explicitly evaluated.

Alternatives

Some companies have found that the problem is alleviated by the expert actually building the expert system himself. An expert can experiment

24 Expert Systems, January 1985. Vol. 2, No. 1.


or 'play' with a shell which has a reasonable interface. He does not need to learn programming, and concentrates instead on building a model which he considers reasonable. Similar success is reported with SYNICS, which is conceptually relatively simple, and allows the expert to build a network. The resulting systems are sometimes considered too 'simple' to warrant the label of artificial intelligence or expert system, although their usefulness is surely more important than their name. However, the main deficiency with both of these alternatives is that they can be too restrictive. Many problems would need an unacceptable level of contortion, making such an approach impractical. In these cases we are left with the knowledge engineer, who needs an understanding of the application area, programming, probability theory and logic.

Induction

Against this background any aid is potentially useful, provided it is used sensibly in an appropriate context. Inductive aids currently on the market fall into this category. The only other alternative would seem to be statistics packages, which unfortunately rely on a large databank of examples, and produce output which can be difficult to understand. (In fact, I recently met a group of statisticians who were interested in using inductive rule generation instead of statistics because the output was easy for the layman to understand.) This is important, because the legal and moral responsibilities still lie with the human beings, who should be able to follow the 'reasoning' used by the algorithm. Nevertheless in medical applications statistical techniques have been used with a measure of success [6,7].

The principle of induction is that the expert provides a set of examples of different types of decisions, called the training set. He also supplies the relevant factors, often called attributes, influencing that decision. The algorithm uses the training set to induce general principles, thereby formulating the decision process, and enabling prediction of decisions for examples not contained in the training set. A big advantage of this method is that the expert often finds it easier to provide examples of decision-cases than to describe the decision-making process itself [8]. In other words he can describe 'what' rather than 'how'. Sometimes the documented examples are readily available.
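This principle can be sketched in miniature. The fault-diagnosis cases below are invented, and for brevity the inducer is a simple single-attribute ('1R'-style) scheme rather than ID3 itself, but the shape is the same: the expert supplies attribute-value cases with decisions, and general rules are induced that also predict unseen cases:

```python
from collections import Counter

# An invented training set: each case pairs the attribute values the
# expert supplied with the decision he made for that case.
training_set = [
    ({"symptom": "no_power", "fuse_ok": "no"},  "replace_fuse"),
    ({"symptom": "no_power", "fuse_ok": "yes"}, "check_wiring"),
    ({"symptom": "noisy",    "fuse_ok": "yes"}, "service_motor"),
    ({"symptom": "noisy",    "fuse_ok": "no"},  "replace_fuse"),
]

def induce_one_rule(examples):
    """Induce a single-attribute rule set: for each attribute, map each of
    its values to the commonest decision, and keep the attribute that
    misclassifies the fewest training examples."""
    best_attr, best_rule, best_errors = None, None, len(examples) + 1
    for attr in examples[0][0]:
        by_value = {}
        for attrs, decision in examples:
            by_value.setdefault(attrs[attr], []).append(decision)
        rule = {v: Counter(ds).most_common(1)[0][0] for v, ds in by_value.items()}
        errors = sum(1 for attrs, d in examples if rule[attrs[attr]] != d)
        if errors < best_errors:
            best_attr, best_rule, best_errors = attr, rule, errors
    return best_attr, best_rule

attr, rule = induce_one_rule(training_set)
# The induced rule now predicts decisions for cases outside the training set.
```

Here the induced rule keys on the fuse check, so a new case with a blown fuse is predicted as "replace_fuse" even if its other attribute values were never seen together in training.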

There have been evaluations of the algorithms available [9,10,11] and criticism of induction in general [12]. However, when embarking on a real project it is necessary to consider any tools which are commercially available.

The ID3 Algorithm

There are several algorithms in use, but one which is used in many aids or shells is Quinlan's ID3 (see Figure 1) [13]. This uses backward reasoning to induce rules, and uses the information statistic. (For further details see [14].) The main danger is that the algorithm could be used blindly with inadequate understanding of what it does, and possible problems. The skill lies in the selection of attributes and examples in the training set. The quality of information coming out of the algorithm depends on that presented to it. It is necessary to provide different cases of the rules. Here we must distinguish between the rules and the problem space. The algorithm requires different cases of decision-types, including both the common or mundane, and the rare or special. A random sample from the problem space will not give this: it will reflect the types of problems as they are expected to occur, not the types of different cases which the rules must deal with.

For example, if in diagnosis of electrical faults 65% of cases are solved by simply mending a fuse, then 65% of the sample space essentially describes only one example for the rules! This should also be taken into account when assessing the success of the ensuing system. In this case a one-rule system could be described as having a success rate of 65%, but in no way could it be deemed 'expert'! A better solution would be to group examples into classes depending on the ultimate decision or classification, and if required to take random samples from each of these subclasses.
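The grouping-and-sampling remedy suggested above can be sketched as follows; the caseload, class names and sample size are invented for illustration:

```python
import random
from collections import defaultdict

def stratified_sample(cases, n_per_class, seed=0):
    """Group cases by their final decision or classification, then sample
    each class separately, so that rare decision types still reach the
    training set instead of being swamped by the common ones."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for attributes, decision in cases:
        by_class[decision].append((attributes, decision))
    sample = []
    for group in by_class.values():
        sample.extend(rng.sample(group, min(n_per_class, len(group))))
    return sample

# Invented caseload: 65% of raw cases are solved by mending a fuse.
caseload = ([({"case": i}, "mend_fuse") for i in range(65)] +
            [({"case": i}, "rewire") for i in range(35)])
training_set = stratified_sample(caseload, n_per_class=5)
```

A plain random sample of ten cases from this caseload would be dominated by fuse-mending; the stratified sample gives each decision class equal representation.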

People often ask how many examples are needed. This depends on the complexity and number of rules. In practice you will not know this, which is why the algorithm is being used in the first place. It should be realised that the algorithm will not be used just once. It is useful to indicate gaps, contradictions, or special cases: it gives pointers to questions which the knowledge engineer should ask the expert in order to provoke further discussion. The induced rules can predict more results. If the expert disagrees with these then he will need to refine the training set to cover details which he previously omitted. The knowledge engineer can experiment with the algorithm away from the expert, thereby reducing the amount of time wasted in consultation.
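As background for these practical points, the core of ID3's tree growing with the information statistic can be sketched as follows. The fault-diagnosis examples are invented, and real implementations add much more (e.g. windowing over large example sets, handling of unseen attribute values):

```python
import math
from collections import Counter

def entropy(decisions):
    """Information (in bits) needed to identify an example's decision class."""
    n = len(decisions)
    return -sum((c / n) * math.log2(c / n) for c in Counter(decisions).values())

def id3(examples, attributes):
    """examples is a list of (attribute-dict, decision) pairs. Pick the
    attribute whose split leaves the least residual entropy, then recurse
    on each subset; stop at a pure subset or when attributes run out."""
    decisions = [d for _, d in examples]
    if len(set(decisions)) == 1 or not attributes:
        return Counter(decisions).most_common(1)[0][0]   # a leaf decision
    def remainder(attr):
        total = 0.0
        for value in {e[attr] for e, _ in examples}:
            subset = [d for e, d in examples if e[attr] == value]
            total += len(subset) / len(examples) * entropy(subset)
        return total
    best = min(attributes, key=remainder)
    rest = [a for a in attributes if a != best]
    return (best, {value: id3([(e, d) for e, d in examples if e[best] == value], rest)
                   for value in {e[best] for e, _ in examples}})

def classify(tree, example):
    """Follow branches until a leaf decision is reached."""
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[example[attr]]
    return tree

# Invented fault-diagnosis training set.
training_set = [
    ({"symptom": "no_power", "fuse_ok": "no"},  "replace_fuse"),
    ({"symptom": "no_power", "fuse_ok": "yes"}, "check_wiring"),
    ({"symptom": "noisy",    "fuse_ok": "yes"}, "service_motor"),
    ({"symptom": "noisy",    "fuse_ok": "no"},  "replace_fuse"),
]
tree = id3(training_set, ["symptom", "fuse_ok"])
```

On this training set the fuse check carries the most information, so it becomes the root of the induced tree; rerunning after the expert refines the training set, as described above, simply rebuilds the tree.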

It is a difficult process to select the attributes, similar to that faced by taxonomists [15]. For example, labels which play no part in the decision-making but which identify the examples must not be used. Unique labels produce excellent rules for describing the training set, but which are useless for anything else. Similar, but less obvious, mistakes can occur, giving rules which are caused by idiosyncrasies in the training set and which will be highly sensitive to changes in that set. This requires careful discussion with the expert who, through years of experience, has learnt to recognise which factors are important. There is a danger that he will supply too many attributes, many of which will be highly correlated. In general some attributes will play a large part in the decision-making and others may be irrelevant, or confirmatory. The attributes need to be refined in discussions with the expert. Careful note should be taken of the way in which attributes are used, e.g. if the expert considers X > Y as an attribute then the single attribute (X > Y)


Figure 1. Part of a training set, and the induced rules


Table 1

should be used, and not the two attributes X and Y. ID3 considers only one attribute at a time, and compound attributes may be missed. In general, the output should be examined by the expert to check for peculiarities or genuine ‘discovery’. The training set may be refined many times.
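The point about compound attributes can be illustrated with a small helper that derives the single attribute the expert actually uses before induction is run; the attribute names X, Y and colour are hypothetical:

```python
def with_compound_attribute(example):
    """Replace the raw attributes X and Y with the single derived
    attribute the expert actually uses (the comparison X > Y), so the
    induction algorithm need not discover the comparison itself."""
    derived = {k: v for k, v in example.items() if k not in ("X", "Y")}
    derived["X_gt_Y"] = "yes" if example["X"] > example["Y"] else "no"
    return derived

case = with_compound_attribute({"X": 12, "Y": 7, "colour": "red"})
```

Since ID3 considers one attribute at a time, presenting X and Y separately would force it to approximate the comparison with arbitrary splits on each; the derived attribute makes the expert's real criterion visible.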

The algorithm was originally designed for categorical attributes rather than real values, e.g. colour with possible values red, blue, green is categorical, but height in metres is real. Most implementations of the algorithm have a facility for dealing with real numbers, but this usually results in rules like 'IF X > 51 THEN ...', where 51 is an almost arbitrary cut-off point. Such induced rules warrant careful investigation with the expert, as they will probably only serve as a first approximation.
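One common way implementations pick such a cut-off point is to try candidate thresholds between observed values and keep the one that leaves the least residual entropy. A minimal sketch under that assumption (the values and labels are invented):

```python
import math
from collections import Counter

def entropy(labels):
    """Information (in bits) needed to identify an example's class."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_threshold(values, labels):
    """Try the midpoint between each adjacent pair of sorted values as a
    candidate cut-off, and keep the one leaving the least residual entropy
    across the two sides of the split."""
    pairs = sorted(zip(values, labels))
    best, best_rem = None, float("inf")
    for i in range(1, len(pairs)):
        t = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [l for v, l in pairs if v <= t]
        right = [l for v, l in pairs if v > t]
        rem = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if rem < best_rem:
            best, best_rem = t, rem
    return best
```

Note that the winning threshold falls at a midpoint between two training values, which is why the cut-off is 'almost arbitrary': any value in that gap would classify the training set equally well.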

Another shortcoming is its inability to deal with uncertain or contradictory data. It continues until it has either fully classified the examples or exhausted the training set. This means that the later rules may be tenuously based on little evidence and very dependent on the particular training set. However, given these reservations some people have used ID3 very successfully. Some relative merits are shown in Table 1. Michie [16] is still convinced that induction is the best answer.


Conclusion

The knowledge engineer needs to be aware of aids currently available to him. At present these seem to be:
(a) Interview with expert, or discussion between several experts.
(b) Software aids based on word-processors, for processing transcripts.
(c) Statistical packages.
(d) Psychological methods [4].
(e) Shells.
(f) Induction.
Much more help is needed in this area. In general, the more useful aids will be those which are interactive, and describe how conclusions are reached and how strong the evidence is for them. They will probably be user-driven and not program-driven. An interactive, explanatory inductive aid could prove very useful for problems where it is relatively easy to generate examples, and the process is like classification or decision-making, and is deterministic not probabilistic. For other applications different types of aids are required.


References

[1] M. Welbank, 'A review of knowledge acquisition techniques for expert systems,' Martlesham Consultancy Services, British Telecommunications, UK, 1983.

[2] A. Tversky & D. Kahneman, 'Judgement under Uncertainty: Heuristics and Biases.' In P.N. Johnson-Laird and P.C. Wason (eds.), Thinking: Readings in Cognitive Science, Cambridge University Press, 1977, pp. 324-340.

[3] W.P. Sharpe, 'Logic Programming for the Law.' In M.A. Bramer, 1984, pp. 217-228.

[4] J.G. Gammack & R.M. Young, 'Psychological Techniques for Eliciting Expert Knowledge.' In M.A. Bramer, 1984, pp. 101-112.

[5] M.J. Cookson, J.G. Holman & D.G. Thompson, 'Knowledge Acquisition for Medical Expert Systems: a system for eliciting diagnostic decision making histories.' In M.A. Bramer, 1984, pp. 113-116.

[6] J. Fox, D. Barber and K.D. Bardhan, 'Alternative to Bayes? A quantitative comparison with rule-based diagnosis,' Methods of Information in Medicine.

[7] D.J. Spiegelhalter & R.P. Knill-Jones, 'Statistical and knowledge-based approaches to decision support systems with an application in Gastroenterology,' J.R.S.S. (A), 147, 1, 1984, pp. 35-77.

[8] R.S. Michalski & R.L. Chilausky, 'Knowledge Acquisition by Encoding Expert Rules versus Computer Induction from Examples: A Case Study Involving Soybean Pathology.' Int. J. Man-Machine Studies, 12, 1980, pp. 63-87.

[9] R.S. Michalski, J.G. Carbonell & T.M. Mitchell (eds.), Machine Learning, Tioga, 1983.

[10] T.G. Dietterich & R.S. Michalski, 'Inductive Learning of Structural Descriptions,' Artificial Intelligence, 16, 1981, pp. 257-294.

[11] P.R. Cohen & E.A. Feigenbaum (eds.), The Handbook of Artificial Intelligence, 3, Pitman, 1982.

[12] J. Fox, 'Doubts about Induction,' Bulletin of SPL Insight, 2, 2, SPL International, Abingdon, Oxford, UK, 1984.

[13] J.R. Quinlan, 'Discovering rules by induction from large collections of examples.' In D. Michie, Expert Systems in the Micro-electronic Age, Edinburgh University Press, 1979, pp. 168-201.

[14] A. Hart, 'Experience in the Use of an Inductive System in Knowledge Engineering.' In M.A. Bramer, 1984, pp. 117-126.

[15] P.H. Sneath & R.R. Sokal, Numerical Taxonomy, W.H. Freeman and Co., San Francisco, 1973.

[16] D. Michie, 'Towards a knowledge accelerator.' In Proceedings of Impact-84, SPL-Insight, Abingdon, Oxfordshire, UK, 1984.

Further reading

M.A. Bramer (ed.), Research and Development in Expert Systems, Proceedings of Expert Systems '84, Cambridge University Press, 1984.

About the author

Anna E. Hart read mathematics at Cambridge University and then worked for a few years in software engineering. She is currently a senior lecturer at the School of Computing, Lancashire Polytechnic.
