
Heterogeneous knowledge representation: integrating connectionist and symbolic computation

Danilo Montesi*

Department of Computing, Imperial College, 180 Queen’s Gate, London SW7 2BZ, UK

Abstract

Heterogeneous knowledge representation allows the combination of several knowledge representation techniques. For instance, connectionist and symbolic systems are two different computational paradigms and knowledge representations. Unfortunately, the integration of different paradigms and knowledge representations is not easy and is very often informal. In this paper we propose a formal approach to integrating these two paradigms, where as the symbolic system we consider a (logic) rule-based system. The integration is operated at the language level between neural networks and rule languages. The formal model that allows the integration is based on constraint logic programming and provides an integrated framework to represent and process heterogeneous knowledge. To achieve this we define a new language that allows these issues to be expressed and modelled in a natural and intuitive way, together with its operational semantics. © 1997 Elsevier Science B.V.

Keywords: Artificial neural networks; Logic programming; Integration

1. Introduction

Knowledge bases are designed to solve problems in a specific domain by acquiring, representing and processing knowledge expressed in symbolic form. In general, computer science is confronted by two types of problems, which can be classified as low-entropy and high-entropy problems. By low-entropy problems we mean problems which are clearly and completely defined through a programming language. This class deals with structured problems such as sorting, searching, data processing and deduction systems. Indeed, many knowledge bases use rule languages to infer new knowledge from previous knowledge. For instance, from the fact that Socrates is a human (human(socrates)) and the knowledge that any human is mortal (mortal(X) ← human(X)) we can derive that Socrates is mortal (mortal(socrates)). High-entropy problems are those that do not allow a complete description of the problem, or for which such a description, were it available, would be very complex and not appropriate for encoding uncertainty. For instance, object and speech recognition are high-entropy problems for which a complete description of the problem is not realistic [1].

Symbolic systems have been widely used to express and solve low-entropy problems [2]. The main advantages of these systems are related to the fact that they rely on a logical framework and thus are declarative and have a formal semantics [3]. In addition, program optimization based on partial evaluation and semantics-preserving transformations has solved many inefficiencies of these systems [4]. Finally, they allow construction of an explanation of the result in terms of premises/consequences. In short, symbolic systems rely on a well-established knowledge representation and processing framework.
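To make the low-entropy case concrete, the Socrates deduction above can be sketched as one naive forward-chaining step. The encoding below (facts as predicate–argument tuples, the single rule hard-coded) is purely illustrative and not part of the original formulation:

```python
# Illustrative sketch: naive forward chaining for mortal(X) <- human(X).
facts = {("human", "socrates")}

def forward_chain(facts):
    """Apply the rule to every matching fact and return the enlarged fact set."""
    derived = set(facts)
    for pred, arg in facts:
        if pred == "human":              # body human(X) matches
            derived.add(("mortal", arg)) # derive head mortal(X)
    return derived

print(forward_chain(facts))
# {('human', 'socrates'), ('mortal', 'socrates')}
```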

However, symbolic systems are not appropriate for handling high-entropy problems, where describing the problem (or its solution) through a programming language is very difficult, time consuming and thus expensive. For instance, object recognition is hard to express in any programming language (imperative, logic or functional). Imagine how difficult it would be to write a logic program that would recognize a chair among other objects in a room. The resulting program may not cover all the possible configurations, due to the lack of a full description of the object to recognize, or may fail to recognize a folding chair as the target object. Even if such a program were available, it would be extremely complex and not appropriate for recognizing objects that are not encoded in the program. Therefore, there is the need for a computational paradigm which can learn and adapt to the source of objects it needs to recognize, and – most important – does not need a complete description of the problem through a programming language.

Knowledge-Based Systems 9 (1996) 501–507

0950-7051/97/$17.00 © 1997 Published by Elsevier Science B.V. All rights reserved. PII: S0950-7051(96)00001-9

* Present affiliations: Department of Computer Science, University of Milano, Via Comelico 39/41, 20135 Milano, Italy; e-mail: [email protected] and School of Information Systems, University of East Anglia, Norwich NR4 7JT, UK; e-mail: [email protected]


The above drawbacks of symbolic systems have led to a new paradigm for solving high-entropy problems. Such models have been called (artificial) neural networks (NNs), since they present an analogy with the principles of biological entities [5,6]. Basically, a neural network consists of many processing neurons linked together through connections. Each neuron receives an input signal via weighted incoming connections, and responds by sending a signal to all of the neurons to which it has outgoing connections. NNs are superior to rule-based systems in dealing with problems involving a massive amount of uncertain and noisy data where it is important to extract the relevant items quickly. Their learning capability, together with the distributed representation (over the neurons), makes them candidates for such problems. Learning capability is not an exclusive feature of neural networks; knowledge bases too have learning capability [7–9]. However, such capability is not enough in the context of uncertain and noisy data such as object and speech recognition, so inductive logic programming is not appropriate for those problems. Another feature of NNs is their ability to generalize from examples: they will always give reasonable answers by interpolating in some way from the examples provided during training. An important feature of the NN approach is its powerful learning algorithms. Training instead of programming will help to reduce the cost and time of software development in many cases where high-entropy problems are involved. The distributed representation of data is very robust, insensitive to inaccuracies, noise, incompleteness and even physical damage.

At this point we should clarify the relationship between NNs and rule languages. Both approaches have the same expressive power, since they are Turing complete. Thus any problem that can be solved with one approach can also be solved with the other. Obviously, the difference lies not in the expressive power, since they compute the same class of functions, but in using the most appropriate paradigm to represent and solve a problem. Since the above two paradigms seem to complement each other, they should be combined to solve complex problems [10]. Indeed, very often we need to handle both high- and low-entropy problems, and this calls for integrated paradigms to acquire, represent and process heterogeneous knowledge. Consider for instance the classic AI problem where a monkey, a chair and a banana are in a room in different positions. In order to take the banana, which is hanging from the ceiling, the monkey must recognize the banana and the chair, infer that it has to walk to the chair, take it, push it below the banana and jump on the chair to take the fruit. This implies an interleaving between high-entropy (connectionist) computation, such as object recognition (i.e. of the banana and the chair), and low-entropy (symbolic) computation, such as deductive reasoning (i.e. if it is a banana, walk to the chair, push the chair below the banana, jump on the chair and then take the banana).

The contribution of this paper is to provide a formal model to integrate NNs and rule languages, following [11].

This integration allows the cooperation between connectionist and symbolic paradigms to be expressed in a clear and sound way while preserving all the features of both paradigms. Thus the rule language relies on the logical framework, which needs to be extended to express NNs. The derivation of new knowledge through rules is expressed by extending the logical framework to cope with NNs while preserving the nice features of logical systems, in order to take full advantage of the body of results already developed in this area (see [12] for a survey of the state of the art). On the NN side we do not make any restriction on topology or on computational or learning properties. We do not aim to transform one paradigm into the other, or to claim that one of them can be avoided [13]. Indeed, there are several approaches in the literature that follow this route. For instance, some of them transform NNs into propositional logic [14] and show that they are equivalent. Others transform propositional logic into an NN [15].

The basic idea of our approach is to use constraint logic programming (CLP(X)) to model the integration of rule languages (expressed as a Horn clause language) with NNs (expressed as constraints). The importance of a clear and sound integration between the two paradigms is obvious: such an integration can succeed in solving complex problems that require heterogeneous knowledge bases [16]. There are two important points in this plan. The first is to use the well-known paradigm of constraint logic programming and previously developed results [17]. The second is that an NN can be trained and tested as a stand-alone component and then ‘‘plugged’’ into a rule system to cooperate with it; thus the NN does not need any change to cooperate with the rule-based system. The formal approach to integration relies on the fact that NNs can be expressed as a set of equations, and thus as constraints over an appropriate domain. The original purpose of CLP(X) was to integrate, in a clear and sound way, constraint and logic programming. In the general scheme, X stands for a constraint domain. The integration of NNs with rule languages is inherited from the CLP(X) schema, where the constraint domain X will be specialized to express the NN and the logic programming side expresses the rule language.

The rest of the paper is organized as follows. Section 2 introduces the basic idea as well as the general architecture of the resulting heterogeneous knowledge base. Sections 3 and 4 recall the relevant concepts of NNs and constraint logic programming respectively. Section 5 provides the formal model to integrate the above two paradigms, defining the resulting language and its operational semantics. Finally, Section 6 concludes the paper and sketches some future applications.

2. An overview of the approach

There are several ways to combine two systems: for instance, serial or parallel composition. Unfortunately, however, these forms of composition do not allow integration of the reasoning models of different paradigms. Ideally, an integrated model should allow transformation of an input (K_1) into an output (K_t) through a derivation relation that can model both logic (or symbolic) and connectionist reasoning. According to this vision we can define such a computation as a finite derivation of the input into the output (K_1 ⇒ … ⇒ K_t) according to some derivation relation ⇒. Since our system allows combination of heterogeneous knowledge, all the elements of the above computation have two components: the logic part is denoted with G and the connectionist one is denoted with N. Thus the above computation can be rewritten as N_1 | G_1 ⇒ … ⇒ N_n | G_m. The symbol ‘‘|’’ has no meaning; it is introduced just to separate the two components and will be used only in the examples. The derivation relation ⇒ can be denoted more precisely by ⇒_{L,C} to highlight the heterogeneous reasoning model. The above derivation shows the logic and connectionist reasoning expressed through the derivation relation. Under this approach any element of the derivation is derived from the previous one. For instance, N_k | G_r is derived from N_i | G_j by means of a simple logic step (denoted ⇒_L) if k = i and r = j + 1, or a simple connectionist step (denoted ⇒_C) if k = i + 1 and r = j, or a combined logic and connectionist step (denoted ⇒_{L,C}) if k = i + 1 and r = j + 1. The case of ⇒_{L,C} is the most general, and the others (⇒_L and ⇒_C) can easily be derived from it. In the following, ⇒ stands for any derivation relation (logic, connectionist or both).

The general architecture of the resulting system is depicted in Fig. 1. The above vision raises several questions. How can we integrate the logic and the connectionist paradigms? What is a formal model for the resulting system? How can we preserve all the features of both paradigms? These questions will be answered in the rest of the paper, after recalling the concepts of NNs and constraint logic programming. For the time being, we note that constraint logic programming was developed to integrate constraint and logic programming in a clear and sound framework. Therefore it relies on a logical framework while preserving all the nice features of logic programming, such as declarativeness and formal semantics [17]. Since NNs are expressed through equations, we can express a whole network as a constraint program. Informally, this corresponds to modelling the NN as a set of equations. These equations relate the inputs of a network to its outputs. The network should be trained to perform its task as a stand-alone component and then plugged into the knowledge base. This is exemplified next, omitting some details.

Example 2.1. Consider the monkey–banana problem described in Section 1 and depicted in Fig. 2. The monkey, the chair and the banana have different positions, denoted by A, B and C respectively. Consider the following functions describing the state transformations (on the left) and the relations describing the problem (on the right):

walk: position × position × state → state        monkey(position, state)
push: position × position × state → state        chair(position, state)
jump: state → state                              onthechair(state)

The objects to recognize are the banana and the chair. In addition we need to locate them, assigning a position to each of them together with the monkey. This is a suitable task for NNs. Assume that we already have a network that recognizes bananas, chairs and monkeys and locates them, providing their positions [18]. Such a network takes as input an image, and provides the positions of the recognized objects. For instance, assume that this can be expressed by NN(Image, monkey, A) in the case of the monkey. The following knowledge base then allows solution of the monkey–banana problem through cooperation of the NN and the rule-based system.

f1  monkey(A, s0) ← NN(Image, monkey, A)
f2  chair(B, s0) ← NN(Image, chair, B)
r1  taken(S) ← NN(Image, banana, C) | onthechair(C, S)
r2  onthechair(X, jump(S)) ← monkey(X, S), chair(X, S)
r3  monkey(Y, walk(X, Y, S)) ← monkey(X, S)
r4  monkey(Y, push(X, Y, S)) ← monkey(X, S), chair(X, S)
r5  chair(Y, walk(X, Y, S)) ← chair(X, S)
r6  chair(Y, push(X, Y, S)) ← monkey(X, S), chair(X, S)

Note that fact f1 asserts that the monkey has position A in state s0. The position is computed by the network, which takes as input the image of Fig. 2 and computes as output the recognized object and its position (using NN(Image, monkey, A)). Similarly, in fact f2, the chair and its position are computed. Rule r1 states that taken(S) is true if the network recognizes the banana, provides its position and this makes onthechair(C, S) true. Rule r2 says that the monkey can jump on the chair if it has the same position as the chair. The remaining rules just allow the monkey and the chair to move under the banana.

Fig. 1. Coupling neural networks with rule-based systems.

Fig. 2. Monkey and banana problem.
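The interplay in Example 2.1 can also be sketched in ordinary code. The fragment below is a loose illustration, not the paper's formal model: the function nn stands in for the trained recogniser NN(Image, Object, Position), and the symbolic side is collapsed into a fixed plan built from the walk/push/jump state transformations.

```python
# Hypothetical sketch of the NN/rule cooperation in Example 2.1.
def nn(image):
    """Stand-in for the trained network: maps an image to object positions."""
    return {"monkey": "A", "chair": "B", "banana": "C"}

def take_banana(image):
    pos = nn(image)                                   # connectionist step
    s = "s0"                                          # initial state
    s = f"walk({pos['monkey']},{pos['chair']},{s})"   # r3: monkey walks to chair
    s = f"push({pos['chair']},{pos['banana']},{s})"   # r4/r6: push chair below banana
    s = f"jump({s})"                                  # r2: jump on the chair
    return f"taken({s})"                              # r1: banana taken

print(take_banana("room-image"))
```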

In the next two sections we introduce the necessary concepts of NNs and constraint logic programming.

3. Neural networks

An artificial NN consists of a network of neurons that can be expressed as a weighted directed graph G = (V, A, W), where V is the set of vertices, A a set of directed edges and W a set of edge weights [5]. Each vertex in the graph represents an artificial neuron. An important topological feature of a network is its layer structure. In layered networks V can be partitioned into a number of pairwise disjoint subsets V = V_0 ∪ V_1 ∪ … ∪ V_L in such a way that the neurons in layer l only have edges to neurons in layers l + 1 and l − 1. We denote neuron number i within layer l by v_i^{(l)} and the corresponding state by x_i^{(l)}. Since multilayered networks are the most general, in the rest of the paper we will consider layered networks.

Each neuron v_i, as shown in Fig. 3, has two attributes: the state x_i and the threshold θ_i. These attributes can be discrete or continuous valued. The state vector defined by all the neurons is denoted x; the corresponding state space is denoted Q_S. The threshold is a measure of the input strength from neighbouring neurons needed to activate neuron i. The edges of the graph are referred to as connections, as they connect the neurons. With each connection in the graph there is associated an attribute called the weight, representing the strength of the connection. Connection weights are usually real valued. A connection with a positive weight is called excitatory; a connection with a negative weight is called inhibitory. We denote the connection from neuron j in layer l − 1 to neuron i in layer l by w_{ij}^{(l)}. The matrix W, where w_{ij} is in row i, column j, is called the connectivity matrix; the corresponding weight space is denoted Q_W. Layered networks have a separate matrix for each layer. The network’s interface to the outside world is defined by partitioning the neurons into input neurons, output neurons and hidden neurons: V = V_I ∪ V_O ∪ V_H. This is reflected in the state vector, where x_I, x_O and x_H denote the corresponding states of the layers. The union of input and output neurons is called the visible units, denoted V_{I/O}. The input and output neurons (V_I and V_O) are accessible to the outside world. The remaining neurons V_H are hidden from the outside world, as shown in Fig. 4. NNs use the linear sum of the states of neighbouring neurons and the connection weights to determine the next state, as defined by

$$u_i^{(l)}(t) = \sum_{j=1}^{n} w_{ij}^{(l)}\, x_j^{(l-1)}(t) - \theta_i^{(l)}(t) \qquad (1)$$

where l denotes the layer and x_j^{(l-1)}(t) the state of neuron j in layer l − 1 at time t. The weight between neurons j and i (in layer l) is denoted by w_{ij}^{(l)}. The threshold of neuron i is θ_i^{(l)}(t). NNs can be described as dynamic systems and have two different operational modalities: learning and computation. As dynamical systems, both learning and computation are described as processes of moving in a space. Learning in NNs entails training the network by presenting training patterns to its visible neurons. The aim of training is to set the connection weights and neuron thresholds to proper values, so that the desired network computation is obtained. The learning process is therefore a search in the weight space Q_W. A discussion of learning methods and algorithms is beyond the scope of this paper. Once the weights are computed, the network is ready to provide its expected behaviour. Indeed, in computation mode the network computes a function (a mapping) from input space to output space. Given an input x_I, the network computes an output vector x_O, as determined by its connections W. The expected computation (or dynamic behaviour) of the network is that every neuron applies some activation rule to enter a new state.1 The activation rule is a function of the states of the neurons that have an outgoing connection to the neuron in question and of the connection weights. A neuron applies its activation rule g_i to formula (1) to determine its next internal state, as defined by

$$x_i^{(l)}(t+1) = g_i\bigl(u_i^{(l)}(t)\bigr) \qquad (2)$$

Fig. 3. An artificial neuron.

1 This can be done synchronously or asynchronously. There are also differences related to the type of network – feed-forward or recurrent – but these issues are not relevant to our objective.
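As a concrete reading of equations (1) and (2), the following sketch computes one synchronous layer update; the sigmoid chosen for g_i is just one illustrative activation rule, and all weights and thresholds are invented:

```python
# Minimal numeric sketch of equations (1) and (2) for a single layer.
import numpy as np

def layer_step(W, x_prev, theta, g=lambda u: 1.0 / (1.0 + np.exp(-u))):
    """One synchronous update of layer l.

    W      -- connectivity matrix w_ij for layer l
    x_prev -- states x_j of layer l-1 at time t
    theta  -- thresholds theta_i of layer l
    g      -- activation rule (illustrative sigmoid by default)
    """
    u = W @ x_prev - theta   # equation (1): weighted sum minus threshold
    return g(u)              # equation (2): next states of layer l

x0 = np.array([1.0, 0.0, 1.0])          # illustrative input states
W1 = np.array([[0.2, -0.5, 0.7],
               [0.4,  0.1, -0.3]])
theta1 = np.array([0.1, -0.2])
print(layer_step(W1, x0, theta1))
```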

Let us now turn our attention to constraint logic programming.

4. Constraint logic programming

Constraint logic programming (CLP) is a merge of two declarative paradigms: constraint solving and logic programming [17]. Viewed rather broadly, constraint logic programming involves the incorporation of constraints and constraint-solving methods in a logic-based language. This characterization suggests the possibility of many interesting languages, based on different constraints and different logics. However, work on CLP has almost exclusively been devoted to languages based on Horn clauses. We briefly introduce the general CLP(X) schema, where X suggests that many different constraints can be accommodated over a generic domain X. Specifically, for any signature Σ, X is a pair: the domain of computation D and the constraints L. A constraint logic program is a collection of rules of the form

$$h \leftarrow c, b_1, \dots, b_n \qquad (3)$$

where c is the conjunction of constraints and h, b_1, …, b_n are ordinary atoms as in logic programming [3]. Sometimes we represent the rule by h ← B, where B is the collection of atoms and constraints in the body. A fact is a rule h ← c. A goal (or query) is a rule ← c, b_1, …, b_n. The informal meaning of a rule is: ‘‘if b_1, …, b_n is true and c is solvable, then h is true’’. The notion of solvability is related to the domain X. For instance, CLP(R) [17] has arithmetic constraints and computes over the real numbers; the signature Σ has the function symbols + and *, and the binary predicate symbols =, < and ≤. If D is the set of real numbers, then D interprets the symbols of Σ as usual (i.e. + is interpreted as addition, etc.). Then L are the constraints, and R = (D, L) is the constraint domain of arithmetic over the real numbers. If we omit from Σ the symbol *, then the corresponding constraint domain R_Lin is the constraint domain of linear arithmetic over the real numbers. Similarly, if we restrict Σ to +, =, we obtain the constraint domain R_LinEqn, whose only constraints are linear equations. Thus the constraint domain is built by selecting the relevant elements of Σ and providing an appropriate interpretation. Finally, we will identify conjunction and multiset union.
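Solvability over R_LinEqn amounts to deciding whether a system of linear equations admits a solution. A rough numerical sketch of such a test (not how a CLP engine is actually implemented) is to attempt a least-squares solution and check the residual:

```python
# Sketch: satisfiability of linear-equation constraints A x = b.
import numpy as np

def satisfiable(A, b, tol=1e-9):
    """True iff the linear system A x = b has a solution (up to tol)."""
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return bool(np.allclose(A @ x, b, atol=tol))

A = np.array([[1.0,  1.0],
              [1.0, -1.0]])
b = np.array([10.0, 2.0])
print(satisfiable(A, b))   # True: x = (6, 4)
```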

5. The Logic(NN) language

To integrate NNs with rule languages, we note that equations (1) and (2) allow expression of a single layer of an NN. Thus a set of such equations allows expression of a whole NN. To this purpose we abstract from the details concerning network topology, learning algorithms and the type of activation rule (continuous/discrete). This will allow us to integrate any NN, regardless of its specific features, into a rule language. According to this view an NN is expressed as a set of equations (or constraints) over a real or Boolean domain. Thus the set of constraints can be seen as a constraint program where the input variables are x_I and the output variables are x_O. The constraint program of a network can be expressed as:

Program NEURAL NETWORK
Input: x_I
Output: x_O
begin
    u_i^{(L)}(t) = Σ_{j=1}^{n} w_{ij}^{(L)} x_j^{(L−1)}(t) − θ_i^{(L)}(t)
    ⋮
    u_i^{(1)}(t) = Σ_{j=1}^{n} w_{ij}^{(1)} x_j^{(0)}(t) − θ_i^{(1)}(t)
    x_i^{(L)}(t+1) = g_i(u_i^{(L)}(t))
    ⋮
    x_i^{(1)}(t+1) = g_i(u_i^{(1)}(t))
end.
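To illustrate the idea of the network as a constraint program, the sketch below encodes equation (1) for a single neuron as a symbolic equation and lets a general-purpose solver bind the shared variables; sympy is used here only as a stand-in constraint solver, and the weights are invented:

```python
# Sketch: a one-neuron 'constraint program' solved symbolically.
import sympy as sp

x1, x2, u = sp.symbols('x1 x2 u')   # input states and weighted sum
w1, w2 = 0.5, -0.3                  # illustrative weights
theta = 0.1                         # illustrative threshold

c1 = sp.Eq(u, w1 * x1 + w2 * x2 - theta)   # equation (1) as a constraint

# Solvability once the logic side binds the inputs x1, x2:
sol = sp.solve([c1, sp.Eq(x1, 1.0), sp.Eq(x2, 0.0)], [u, x1, x2], dict=True)
print(sol)   # [{u: 0.4, x1: 1.0, x2: 0.0}]
```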

Fig. 4. Multilayer network.

Once the network is interpreted as a constraint program, it is simple to model its integration with a rule language. We need just recall that constraint logic programming allows the definition of constraints in rule bodies. Thus the resulting rules have the form

h(x) ← NN(x_I, x_O), b_1(x), …, b_n(x)

where b_1(x), …, b_n(x) and h(x) are atoms and NN(x_I, x_O) is a conjunction of constraints (the constraint program). The variables shared between the constraint part and the logic part of the above rule represent the communication channels between the connectionist and symbolic systems. In the monkey and banana example, the input of the network is the image, which does not come from the logic part (in this specific case) but from the environment. The output variables of the network are the objects monkey, chair and banana, and their positions A, B and C. In this example the recognized object and its position are used in the logic part to deduce new information such as taken(S). Thus the variables shared between the constraint and the logic parts (or the connectionist and the symbolic systems), namely A, B and C, express the communication channels. Note that a fact is a rule with an empty body and a query has an empty head.

In the NEURAL NETWORK program we have not defined the activation function g_i. This is because the activation function can be non-linear. For instance, in the Boltzmann machine the state of a neuron can only be specified as a probability. The probability that a neuron is active is then

$$P(x_i = 1) = \frac{1}{1 + \exp(-2u_i/T)} \qquad (4)$$

where T is the temperature. T may be viewed as a measure of the noise in the system, and affects the probability distribution of states. Other examples of activation rules are McCulloch–Pitts’ rule, sigmoid-type rules, the simple linear rule and the stepwise linear rule. Since we want to be parametric with respect to activation, we will consider the activation function built in within the constraint domain.
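Equation (4) can be coded directly; the sampling step below, which turns the probability into a stochastic 0/1 state, is an assumption about how such an activation would typically be used:

```python
# Sketch of the Boltzmann activation probability, equation (4).
import math
import random

def p_active(u, T):
    """Probability that a neuron with weighted input u is active at temperature T."""
    return 1.0 / (1.0 + math.exp(-2.0 * u / T))

def boltzmann_state(u, T):
    """Sample a 0/1 state from the probability above (illustrative usage)."""
    return 1 if random.random() < p_active(u, T) else 0

print(p_active(0.4, 1.0))        # ~0.69
print(boltzmann_state(0.4, 1.0))
```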

Thus the resulting constraint domain is defined as a triple (D, g_i, L_NN), where R_LinEqn = (D, L_NN) and g_i is an activation function. Finally, note that a test for the satisfiability of constraints is required. The constraint c is said to be satisfiable iff D ⊨ ∃c (where ∃ denotes the existential closure of a formula). The constraint c is a conjunction of well-formed formulae built over the language L_NN and represents an NN. Thus the resulting language expressing the integration of NNs and rule languages is called Logic(NN): it is a set of rules of the above form, where the constraint domain is R_LinEqn extended with an uninterpreted function g_i. Therefore a heterogeneous knowledge base (HKB) is a set of rules expressed in Logic(NN). In order to understand the computational model of the HKB we now turn our attention to the operational semantics of Logic(NN).

5.1. Operational semantics

We present the operational semantics as a transition system on states. Consider the tuple ⟨A, C⟩, where A is a multiset of atoms and constraints and C is a multiset of constraints. There is one other state, denoted by fail. We assume as given a computation rule which selects a transition type and, if necessary, an appropriate element of A for each state. The transition system is also parameterized by a predicate consistent, which we will discuss later. An initial goal G for execution is represented as a state by ⟨G, ∅⟩. The transitions in the transition system are:

T1  ⟨A ∪ a, C⟩ ⇒_L ⟨A ∪ B, C ∪ (a = h)⟩

if a is selected by the computation rule, a is an atom, h ← B is a rule of the HKB, renamed to new variables, and h and a have the same predicate. The expression a = h is an abbreviation for the conjunction of equations between corresponding arguments of a and h. We say a is rewritten in this transition.

T2  ⟨A ∪ a, C⟩ ⇒_L fail

if a is selected by the computation rule, a is an atom and, for every rule h ← B of the HKB, h and a have different predicates.

T3  ⟨A ∪ c, C⟩ ⇒_coop ⟨A, C ∪ c⟩

if c is selected by the computation rule and c is a constraint.

T4  ⟨A, C⟩ ⇒_C ⟨A, C⟩

if consistent(C).

T5  ⟨A, C⟩ ⇒_C fail

if ¬consistent(C).

The transitions T1 and T2 arise from resolution, denoting the logic steps. T3 introduces constraints into the constraint solver (which corresponds to invoking the NN), denoting the cooperation of the two systems. T4 and T5 test whether the constraints are consistent, that is, whether the NN computes suitable outputs for the given inputs, denoting connectionist steps. We write ⇒ to refer to a transition of arbitrary type. The predicate consistent(C) expresses a test for the consistency of C, that is, whether the outputs of the network are computed for some inputs. Usually it is defined by: consistent(C) iff D ⊨ ∃C, that is, a consistency test over the extended R_LinEqn domain that models the neural computation.
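The transition system can be read as a simple interpreter loop. The skeleton below is only a schematic rendering of T1–T5 under strong simplifying assumptions: atoms and constraints are tagged tuples, unification is reduced to a predicate-name check that returns argument equations, and consistent is a stub standing in for the constraint solver (i.e. the NN).

```python
# Schematic interpreter for the transition system of Section 5.1.
def is_constraint(item):
    return item[0] == "nn"                    # assumption: constraints tagged 'nn'

def unify(atom, head):
    """Toy resolution test: same predicate -> equations between arguments."""
    if atom[0] != head[0] or len(atom) != len(head):
        return None
    return [("eq", a, h) for a, h in zip(atom[1:], head[1:])]

def consistent(C):
    return True                               # stub for the NN/constraint solver

def run(goal, rules):
    A, C = list(goal), []
    while A:
        sel = A.pop(0)                        # computation rule: leftmost selection
        if is_constraint(sel):
            C.append(sel)                     # T3: hand constraint to the solver
            continue
        for head, body in rules:              # T1: rewrite sel with a program rule
            eqs = unify(sel, head)
            if eqs is not None:
                A = list(body) + A
                C += eqs
                break
        else:
            return "fail"                     # T2: no rule with this predicate
    return (A, C) if consistent(C) else "fail"  # T4 / T5

rules = [(("taken", "S"), [("nn", "Image", "banana", "C")])]
print(run([("taken", "S")], rules))
```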

6. Conclusion

We have presented a formal technique that allows expression of heterogeneous knowledge bases integrating NNs and rule languages. This allows the smooth integration of knowledge representation through logic and connectionist models. The resulting model is defined in the well-known setting of constraint logic programming and has a formal model both in terms of language and of operational semantics. The latter also provides a simple top-down execution model.

We believe that this approach is very promising for further investigation. From a practical point of view, we are studying how to use the combined logic and connectionist paradigms for image indexing and retrieval in multimedia database systems [19,20]. From a theoretical point of view, we intend to extend the approach to allow modular rule bases and several NNs, in order to express more complex problems using structured modular systems in both the connectionist and logic paradigms [21,22].

Acknowledgements

The work has been partially supported by the EU under the HCM and ERCIM fellowships.

References

[1] D. Jones, S.P. Franklin, Choosing a network: matching the architecture to the application, in: A. Maren et al. (Eds.), Handbook of Neural Computing Applications, Academic Press, New York, 1990, pp. 219–232.

[2] A. Barr, E.A. Feigenbaum, Handbook of Artificial Intelligence, Addison-Wesley, Reading, MA, 1982.

[3] J.W. Lloyd, Foundations of Logic Programming, 2nd ed., Springer-Verlag, Berlin, 1987.

[4] A. Pettorossi, M. Proietti, Transformation of logic programs: a foundation and techniques, Journal of Logic Programming 19/20 (1994) 261–320.

[5] J. Hertz, A. Krogh, R.G. Palmer, Introduction to the Theory of Neural Computation, Addison-Wesley, Reading, MA, 1991.

[6] E. Masson, Y.J. Wang, Introduction to computation and learning in artificial neural networks, European Journal of Operational Research 47 (1990) 1–28.

[7] J.G. Carbonell, Machine Learning: Paradigms and Methods, MIT Press, Cambridge, MA, 1990.

[8] S. Muggleton, Inductive Acquisition of Expert Knowledge, Addison-Wesley, Reading, MA, 1990.

[9] N. Lavrac, S. Dzeroski, Inductive Logic Programming: Techniques and Applications, Ellis Horwood, Chichester, UK, 1994.

[10] S.Y. Nof, Information and Collaboration Models of Integration, NATO ARW Series, Kluwer, Dordrecht, 1994.

[11] D. Montesi, A formal model to integrate rule languages and artificial neural networks, in: International Symposium on Knowledge Acquisition, Representation and Processing, IEEE Press, 1995.

[12] Ten years of logic programming, Journal of Logic Programming (Special Issue) 19/20 (1994).

[13] Y. Pao, D.J. Sobajic, Neural networks and knowledge engineering, IEEE Transactions on Knowledge and Data Engineering 3 (2) (1991) 185–192.

[14] A. Aiello, E. Burattini, G. Tamburrini, Purely Neural, Rule-based Diagnostic Systems, Technical Report 100/93/IC, Istituto di Cibernetica CNR, 1993.

[15] G.G. Towell, J.W. Shavlik, Knowledge-based artificial neural networks, Artificial Intelligence 70 (1994) 119–165.

[16] P. Smolensky, G. Legendre, Y. Miyata, Integrating connectionist and symbolic computation for the theory of language, in: V. Honavar, L. Uhr (Eds.), Artificial Intelligence and Neural Networks: Steps Toward Principled Integration, Academic Press, New York, 1994, pp. 509–530.

[17] J. Jaffar, M.J. Maher, Constraint logic programming: a survey, Journal of Logic Programming 19/20 (1994) 503–581.

[18] J. Koh, M. Suk, S.M. Bhandarkar, A multilayer self-organizing feature map for range image segmentation, Neural Networks 8 (1) (1995) 67–86.

[19] P. Costantinopoulos, J. Drakopoulos, Y. Yeorgaroudakis, Retrieval of multimedia documents by pictorial content: a prototype system, in: Proceedings of the International Conference on Multimedia Information Systems, McGraw-Hill, New York, 1991, pp. 35–48.

[20] A.E. Gunhan, Neural network based retrieval of information from database systems: state of the art and future trend, Technical Report TR 25/94, Department of Information Science, University of Bergen, 1994.

[21] M. Bugliesi, E. Lamma, P. Mello, Modular logic programming, Journal of Logic Programming 19/20 (1994) 443–502.

[22] A. Maren, M. Broynooghe, Hybrid and complex networks, in: A. Maren et al. (Eds.), Handbook of Neural Computing Applications, Academic Press, New York, 1990, pp. 203–218.
