adaptive intelligent systems || introduction to that rule-based systems are an evolutionary dead end...

Debate III

That Rule-Based Systems are an Evolutionary Dead End in the Development of Intelligent Systems

Chairperson Vee Khong

- The first speaker will speak for the motion.

- Rule-based systems are an evolutionary dead end in the realm of intelligent systems. There are four notions there. That rules can implement intelligent systems, and we will see what an intelligent system is. Evolution for me means one thing, something that can be extended. It can grow, it can adapt to new things. It can be maintained. A dead end - that means that there is a facet that we can no longer exploit. Intelligent systems means something that shows intelligent behaviour. Let us look at intelligence. What is intelligent behaviour? Everybody here is trying to simulate it, but do we really know what it is? I think one of the first things that was done with intelligent systems was decision systems. We pretend that intelligence is some form of decision. Decisions simulated in a machine. We might go a bit further and say that intelligence is something that understands. We might say that something that understands must also be able to explain. We might say that an intelligent system is something that can carry out some sort of cognitive, perceptive process. By that I mean vision, perceptual senses, pattern recognition type of capabilities. There is also another way of looking at intelligent systems - learning. An intelligent system must be able to learn. I think those are the four major characteristics associated with intelligence. The opposite of this motion says that we would be able to carry out all these intelligent processes using 'if... then...' type rules, and my duty today is to say that it is impossible. I think it is quite simple to see that it is impossible to use 'if... then...' statements to carry out all these processes. I would grant that some sort of decision system could be written using 'if... then...' rules, but when you start getting into more intelligent processes such as understanding, explaining, cognitive perception, they cannot really do that type of thing.
How would you write, for example, a rule-based system to see, to understand pictures? Impossible. Do you write a rule for each pixel in your image? Nobody would ever dream of that. Learning - what learning techniques do you use today? They are not rule-based but algorithmic processes, that perhaps produce rules, but are not developed as rule-based systems. So I think that from an evolutionary point of view, if we say that we are at the first step of trying to simulate intelligence, rule-based systems are able in certain circumstances to implement decision type systems, decision type intelligence extracted from experts' know-how, in certain cases for very limited domains. But for any extension to other intelligent applications, they are a dead end. Further, there is no way of maintaining a system based on 'if... then...' rules. When you start to get into very complex applications of decision systems today, decision systems start to show intelligent behaviour around a hundred rules; one hundred, two hundred, three hundred rules, then you can say that it is perhaps a limited sort of intelligent behaviour. But when you start getting into cognitive processes, what is the number of rules that you need to implement in order to simulate a cognitive process? A thousand rules? Two thousand rules? Will you be able to maintain such a system? No. It is a dead end.

- So what you are saying is that intelligence is a cognitive process and rules cannot best represent these processes?

- I think that is the major failure, and I would say that another way of saying this is that we have invented all these other mechanisms, such as neural networks. People invented neural networks because they think that rule-based systems are a dead end.

- Do you say that expert systems are a dead end?

- Let us take regular expert systems, such as the diagnostic expert system, MYCIN, to start with the first one, or maybe the first Schlumberger type of decision system on geological data surveys. If you tell me that you have to couple the technique with something else in order for it to be viable, then you are actually saying that by itself a rule-based system is not sufficient.

- The next speaker will speak against the motion.

- Perception by rule-based systems has been worked on. For example by David Waltz and David Marr in vision. They are classical AI systems and they work fine. My second point is that understanding and explaining can be done by conventional rule-based systems. But explaining or understanding means in relation to a model. If you have a very weak model, what can you get? It is a very weak explanation, or shallow understanding. You mentioned complexity but the complexity of quantitative models is very high. The point I want to make is that systems like MYCIN or Dipmeter are nearly twenty years old, and they are what we call shallow expert systems with very crude rules like 'if... condition, then... solution'. But this is not what we are doing today. We have moved from the quantitative to qualitative models, what we call deep models. Models with a strong theory of the domain. It is not a criticism, but what we call deep models based on qualitative reasoning are very slow, very inefficient. It takes a very long time to get a solution, but there are means of getting more knowledge. When you have one problem solved, you make a compiled rule and you get better performance. It is always mentioned that using this kind of expert system there is no way of learning, but one approach to learning today is called explanation-based generalisation, and this also uses deep models. Today all the systems we have are based on non-monotonic logic, but that is not exactly rule-based systems. This non-monotonic logic, for example preference logic, makes a spectrum of models, some called preferred models according to a set of rules. It gives you all possible models, and interpretation models give you the preferred ones. As knowledge changes, the preferred model changes as well. That is a real adaptation: what can be true at one time is no longer true at another. Also, we can do analogical reasoning.
This is a means for generating new rules, because analogical reasoning is to take a set of source rules, for example A -> B and if there is a C similar to A, the analogy between the A and C derives some kind of new rule. This is a new knowledge, you can test it in your theory, meaning a new model. And one of the main advantages of qualitative models is that most of the time they are quite simple and easy to produce while quantitative models need a lot of tuning to get the right numbers.
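
The analogical step just described (a source rule A -> B, a C similar to A, and a new candidate rule to be tested) can be sketched as follows. This is a minimal sketch under stated assumptions, not the speaker's method: similarity is taken to be Jaccard overlap between hypothetical feature sets, and the names `Rule`, `similarity`, `propose_by_analogy` and the threshold of 0.5 are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A source rule of the form condition -> conclusion."""
    condition: str
    conclusion: str


def similarity(a: set, b: set) -> float:
    """Jaccard overlap between two feature sets (an assumed similarity measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def propose_by_analogy(rules, features, target, threshold=0.5):
    """For each source rule A -> B, propose target -> B when target resembles A.

    The result is only a list of candidate rules with their similarity
    scores; each candidate still has to be tested against the domain theory.
    """
    proposals = []
    for rule in rules:
        if rule.condition == target:
            continue  # the rule already covers the target
        score = similarity(features[rule.condition], features[target])
        if score >= threshold:
            proposals.append((Rule(target, rule.conclusion), score))
    return proposals


# Hypothetical feature descriptions of three concepts.
features = {
    "A": {"liquid", "flows", "pressure"},
    "C": {"flows", "pressure", "charge"},
    "D": {"solid"},
}
rules = [Rule("A", "B")]

candidates = propose_by_analogy(rules, features, "C")
print(candidates)
```

Here C shares two of four distinct features with A, so the candidate rule C -> B is proposed, while D shares none and yields no proposal.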

- The next speaker will put views both for and against.

- I interpret the motion as referring to what we know as first generation expert systems. Indeed that is an evolutionary dead end. In order to illustrate this, I will say a little on a distinction made by numerous papers some years ago. We distinguish between the knowledge level and symbol level. At the knowledge level we talk about knowledge in terms of hypotheses and solutions, causal knowledge, rules of thumb and things like that, whereas at the symbol level we talk about the structures we use to represent this type of model. For example logic, production rules, or whatever. I think it is very useful to make this distinction because abstracting knowledge at the knowledge level gives an insight into what kinds of knowledge there are, and the different uses those kinds of knowledge can have. Obvious benefits of making this distinction are that it may lead to a modelling methodology for expert systems, and eventually even to tools which present the various knowledge types available and the way they may interact in a particular kind of problem solving. You may identify particular pieces of knowledge and how they are used in different settings so that you come to knowledge re-use. It allows for a separation of knowledge in terms of knowledge which directs the problem solving, and knowledge which is describing the domain. To put it very simply this distinction is of content versus form. It is about the idea to separate knowledge from the way used to represent the knowledge. I don't mean to say that rule languages are no good, because I think a rich rule language is a very powerful modelling technique, and it should not be discarded. Even if new things pop up in AI, there is no reason to throw the baby away with the bath water. However, first generation expert systems typically equate rules of thumb with production rules. 
So, the only kind of knowledge is the rule of thumb, and that makes their behaviour not very satisfactory, because when starting from an expert system shell, which only uses production rules, and which is used only to represent rules of thumb, you will forget about all the other types of knowledge, very useful types of knowledge, and you do not try to build a model of the problem solving. That leads to a system which contains private knowledge only, which is one reason why they will be hard to accept for other users. Systems which are brittle because the types of rules built in are meant to solve a very particular type of problem, but no problems which are slightly beyond the scope of the system, and which are very hard to maintain because there was never an intent to describe the knowledge on the right level of abstraction. And the systems will be very hard to adapt. You have no idea where the rules come from and how they cooperate, and what the behaviour will be if you add another rule. And so there are various reasons why this kind of system is unsatisfactory. Now, in high end applications, you cannot present a user with a system like that because a user wants to change the system according to his own beliefs and wants to be able to create a system in part himself rather than being presented with a system which is finished and which he can only turn on and turn off. So that is why I feel that the typical first generation expert system is a dead end, notwithstanding the fact that rule languages will be in use for much longer.

- Those were remarks for the motion, but I believe rule-based systems can be very useful as long as they are carefully developed and applied. I will try to represent the point of view of somebody using these ideas and techniques. From the usage point of view this concept of rule-based systems has been a big disappointment compared with the expectations that people clearly had a few years ago. I want to make one distinction between rule-based as a programming tool and applications to automate rules, guidelines, procedures etc. As a programming tool there is not that much distinction between the rule-based tool and other programming tools. If you try hard enough with them you can probably manage to programme pretty much anything you want, even including some examples of MYCIN etc. However, once you get into the realm of programming complex rules, you can very often get into serious trouble. If I look at the market, the things that seem to have worked well for people are invariably small scale systems, with few rules, fewer than one hundred, and especially things where the author is the main user. On the other hand, we have heard of many spectacular failures, such as products for banks and insurance for example. The consensus with large and complex systems seems to be that they are giving a lot of trouble. Even the ones that are still managing to survive do so at great expense and with some difficulty. They are worth what they are costing, but they are certainly not trivial things to maintain. Consider the experience of trying to build a large system. At the beginning we go and sit with somebody for an interview. One of the earliest things you have to do is get the information available, which is often not the case. We have to access a data base somewhere. You have hardly started to develop rules but you do have the information. Just doing this already has a big impact on the user. You are already making information available to the users.
This already could be of big value to them. And then you go and put rules together. At some point you reach the stage where you feel the system is viable and should go by itself. Then you see that people tend to trust that machine blindly because it appears to be intelligent and to know what it is doing. You often observe that the efficiency of the person starts to degrade compared to how it was when you were working with them. By the time you have finished your system very often you have made this information something that the expert system is using, not the end-user. I think the underlying principle is that with big systems you hide the underlying information and are removing the context. While, on the other hand, with smaller systems, when needed you can access the wider context and can compensate for whatever weaknesses there are in your system. One message that I get out of this is that if you want to do something viable, you must make sure that you have some way to access the underlying information so that the users can resolve the discrepancies. There is another issue on a wider scale which matches the developments we see in the management field. You see a very big difference when you interview a worker and when you interview a manager. The manager will easily give you lots of great big sized and clear rules. The workers will have a lot fewer of those. And in practice it is a worker who will have to apply and put things together and when you start to look at this view, there are many things that are very important for him that have been missed from the managers' rules, because the relationship between the workers and the managers shows the same characteristics as between having only rules and having all the richer context. The manager himself is somebody who has been removed from the context that is needed to do a job and tends to abstract the situation and end up with rules. So in conclusion, I see rules as more dangerous than helpful. But if they are used as a tool in a wider context then they are proven to be viable, and if used in that way then we can certainly do a good job.

- Am I right in saying that a rule-based system works only if you have the right knowledge elicitation process to complement it?

- No, I would rather say that they work as long as you leave the person access to the information you yourself have used and don't try to do a complete job.

- We need to make a big distinction between rules as a computational paradigm and as a modelling tool for intelligence. You can prove that anything that you can write in any other way, you can also write with rules. Of course, you might argue every computational paradigm has its own properties, and for rules these are things like modularity and not having to specify the control flow explicitly in advance. These are all good properties and I think rules as a computational paradigm will always be with us because if not, then there is something missing in our computer science education. I think the more important thing is the criticism of rules in knowledge engineering projects. If you take the whole list of properties that an intelligent system should have, that is one point of view. The second point of view is looking at concrete knowledge engineering projects; what is the success of using rules? I will take these two points of view in turn. If you evaluate from the viewpoint of knowledge engineering then I think there are a number of things to say. First of all the failures: these have little to do with using rules or not using rules. Most of the problems that have been encountered are either management issues or in terms of the style in which applications are embedded in a certain existing context. So this is a problem not just for expert systems, but for all sorts of software today. The idea is that you can design in your office, analyse your requirements, make this thing and then go back to the workplace and put it there. It is an idea that does not work. There is a lot of work going on in software engineering with design methodologies that incorporate the workers and the process, and all those things apply to expert systems as well; but they have nothing to do in principle with whether you use rules or not. They just have to do with how you can build applications that are really used. 
In my view ninety percent of the failures in knowledge engineering have to do with management and are not technical failures. Now the second thing is that we have the first generation of systems and from around 1982 there were papers by a number of people on second generation expert systems pointing out the limitations here. The basic insight is that you need to incorporate deeper models, more qualitative models, different types of reasoning. But again, this has nothing to do with whether you use rules or not. This has to do with the depth of the models that you make; whether you think about the methods or do not think about the methods. You can express all those things in a rule-based computational way. Another thing which was brought up was the lack of a good design methodology at a high level. I agree that knowledge level design tools are needed and in fact there is a lot of very interesting work on knowledge level modelling which is sweeping through Europe at the moment. So, although this is still in an early phase, we can already see very concrete tools. It will allow you to make a knowledge level description and couple it in systematic ways with an implementation, and this level could be object oriented programming, it could be logic programming, it could be rule-based. So again, the thesis of an evolutionary dead end sounds very silly in view of all these things. From the viewpoint of intelligence, I think there are indeed big issues that have not been confronted by the classical AI paradigm. Although I don't at all agree with previous comments in the area of vision: almost all the sophisticated attempts that I know to build real computer vision systems have been rule-based. Also it is not a problem of size, there are rule-based systems now with ten thousand rules. All these interesting works of learning, and not just explanation based learning but inductive learning, have been done with rules. I don't know where you get that idea that all those things are impossible.

- I am not saying they are impossible, I am saying that they are a dead end from the maintenance point of view.

- As part of the rule-based paradigm, you can have modules, you can have object oriented structures for your data. Nothing denies this. There is fascinating research going on which goes beyond this in the sense that much more fundamental issues are being considered. For example issues like autonomy, which was not considered at all in classical AI. Autonomy meaning how can a system that has continued interaction with the world, but is confronted continuously with new situations, be adaptive and build up representations that can still cope with changes? That is one issue which I don't think has to do with rule-based systems, but I would agree that it is not being addressed by any of the work that has been happening for twenty years in the classical symbolic paradigm. There is also the issue of evolution in the sense of where do capabilities come from? Where does knowledge come from? If you build an inductive learning programme then you still assume that the person that is using the programme is going to supply the conceptual framework and is going to supply all the examples. And then an inductive process starts, which basically generalises, but there is no new information being created. So, issues like autonomy, evolution, adaptivity and so on, have not been addressed, and I think they need to be, to build things that are situated in the work place and can really fit with the rest of the environment. Some believed in the past you could just go to an expert and say give me a couple of rules, put that in your system and then have an application. This is not the case. But who has ever seriously said that? Nobody who has been seriously working in AI has ever proposed that. There is always a danger of putting up a straw man that doesn't work.

- What is the alternative to first generation expert systems? It looks like people are looking for an alternative in the direction of neural networks.
I would like to make a comparison between neural networks and first generation expert systems. First generation expert systems were usually defined as "you watch a black box, you see it behave in the real world and then you extract some very simple rules out of it." The disadvantage was that the scope of your expert system was very limited and it could only cover the things that you had defined it to cover. But that was also an advantage in that once you applied it to something which it couldn't cover, it didn't work and you didn't expect it to work. Now it looks like people are trying to implement neural networks where they do exactly the same. They observe a black box, they try to feed it through a neural network and then it sometimes works. But then there is this insecurity that they think it might work for something similar, but they don't know why and they cannot explain why it works or why it doesn't work. It seems to me that neural networks will have more disadvantages than the first generation expert systems.

- I agree with you, but the first thing is that I don't see any difference, technically speaking, between a neural network and a rule-based system where you have weights for each of the conditions; you have a computation, you propagate the value of the weights. In terms of a running system there is no difference whatsoever. So I think it should be clear that computationally there is no difference. The only thing that is different is that there is a mechanism for building up, for finding the weights, based on giving a large number of examples. In that mechanism there are many limitations. Perhaps we shouldn't go into it, but the result will be something, as you said, that is basically first generation in the sense that there is no deep model. To put this as rules versus neural networks comes from trivialising the underlying concepts and I think that is a wrong debate.
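
The claim that a rule with weighted conditions and a network unit compute the same thing at run time can be illustrated with a minimal sketch. It assumes Boolean conditions, integer weights and a single unit with a step activation; the names `weighted_rule` and `threshold_unit` and all numbers are hypothetical and not taken from any system mentioned in the debate. The sketch covers only the running system, not the mechanism for finding the weights.

```python
from itertools import product


def weighted_rule(conditions, weights, threshold):
    """IF the summed weight of the satisfied conditions reaches threshold THEN fire."""
    evidence = sum(w for c, w in zip(conditions, weights) if c)
    return evidence >= threshold


def threshold_unit(inputs, weights, bias):
    """A single unit with a step activation: fire when w . x + bias >= 0."""
    activation = sum(w * x for x, w in zip(inputs, weights)) + bias
    return activation >= 0


# Hypothetical weights for three conditions, and a firing threshold.
WEIGHTS = [3, 2, 4]
THRESHOLD = 5

# Compare the two formulations on every combination of condition values.
agree = all(
    weighted_rule(conds, WEIGHTS, THRESHOLD)
    == threshold_unit([int(c) for c in conds], WEIGHTS, -THRESHOLD)
    for conds in product([False, True], repeat=3)
)
print(agree)
```

With the bias set to the negated threshold, the two functions give the same answer on all eight input combinations, which is the sense in which the running systems do not differ.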

- About first generation rule-based systems, I think the main problem is that it is a Mickey Mouse system which started with a nice paradigm, the production rules idea. We use "if... then..." rules and write a linear set of rules and there is no engine behind it except to go through that set of rules and figure out the most competent one to fire at the time. It doesn't really help my programming; in fact, it makes it more unmaintainable. As for their use in vision, my question is: isn't the real challenge in vision extraction of knowledge, edge detection, figure-background differentiation, textual analysis, groupings? You cannot do that in a rule-based system. You apply rule bases only after that point. And after that point you might as well use any other paradigms. You could develop input factors for a neural network if you wanted to.
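
The engine described in this turn, one that goes through a flat set of rules and picks one to fire, can be sketched as a simple match, select and act loop. This is a minimal sketch under stated assumptions: rules are plain dictionaries, "most competent" is approximated by a hypothetical numeric `priority`, and the rule and fact names are invented; it does not reproduce any particular expert system shell.

```python
def run(rules, initial_facts, max_cycles=100):
    """Forward-chain over a flat rule list until no rule can add a new fact."""
    facts = set(initial_facts)
    fired = []
    for _ in range(max_cycles):
        # Match: rules whose conditions all hold and whose conclusion is new.
        conflict_set = [
            r for r in rules
            if r["if"] <= facts and r["then"] not in facts
        ]
        if not conflict_set:
            break
        # Select: highest priority wins; max() keeps the earliest rule on ties.
        chosen = max(conflict_set, key=lambda r: r["priority"])
        # Act: assert the conclusion as a new fact.
        facts.add(chosen["then"])
        fired.append(chosen["name"])
    return facts, fired


# Hypothetical rules over abstract facts a, b, c, d.
rules = [
    {"name": "r1", "if": {"a"}, "then": "b", "priority": 1},
    {"name": "r2", "if": {"a"}, "then": "c", "priority": 2},
    {"name": "r3", "if": {"b", "c"}, "then": "d", "priority": 1},
]

facts, fired = run(rules, {"a"})
print(fired)
```

In this example r2 fires before r1 because of its higher priority, and r3 only becomes applicable once both have fired. The firing order depends on how the whole rule set interacts rather than on any one rule, which is the maintenance concern raised above.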

- I see rules as part of a tool set. We do the pre-processing to get our data in the shape that we need it for propagation to our system. So the question is basically: Are rules a computational paradigm that is mature, that you can use in certain contexts but is limited, and that we can combine with many different other paradigms in order to make technological advances, or are rules as a paradigm still capable of evolving? I take the position they are capable of evolving as long as you don't stick to one paradigm of rules, one kind of rules that was invented at some point in time. The idea of technology is that it changes.

- When I hear about more advanced generations of rule-based systems, that tends to be even more worrying, because it becomes an even bigger challenge to retain contact with reality once you have to maintain all of the mechanics.

- A final comment. Rules are very easy to use to express certain forms of knowledge. They apply to some situations in a very natural way. So you can't escape from rules.
