equivalence of propositional prolog programs



Journal of Automated Reasoning 6: 319-335, 1990. © 1990 Kluwer Academic Publishers. Printed in the Netherlands.

Equivalence of Propositional Prolog Programs

HANS KLEINE BÜNING, ULRICH LÖWEN* and STEFAN SCHMITGEN

FB II - Praktische Informatik, Universität - GH - Duisburg, Postfach 101629, D-4100 Duisburg 1, West Germany

(Received: 22 April 1988)

Abstract. We show that the equivalence problem for propositional Prolog programs is coNP-complete. Considering yes-no answers only, the modified equivalence problem is solvable in polynomial time. Furthermore, the problem whether a program does not terminate for some question is NP-complete. For a fixed question the loop problem can be decided in linear time.

Key words. Propositional logic programs, Prolog inference strategy, complexity of inference, loop detection, equivalence of Prolog programs, complexity of equivalence

1. Introduction

We are interested in deciding whether two Prolog programs have the same behaviour. Obviously a program without Built-in predicates can be considered as a logical formula. For first-order logic and Prolog programs the equivalence problem is known to be undecidable [3]. For propositional calculus the coNP-completeness of equivalence follows immediately from the NP-completeness of SAT. Restricting ourselves to Horn formulas the problem can be solved in quadratic time [6].

If we investigate propositional Prolog programs and allow definite clauses as questions, the situation is much more complex because the inference strategy of Prolog can run into a loop. For this reason we cannot conclude that two Prolog programs are equivalent if they are equivalent as logical formulas, and vice versa. As we will see, the equivalence problem for propositional Prolog programs is coNP-complete. To prove this we first introduce an algorithm which has the same behaviour as the Prolog inference strategy, but does not need exponential time. Furthermore, the results of this algorithm are not restricted to yes and no: if the Prolog inference strategy would loop, we obtain the answer loop instead of a looping program. Based on this algorithm and a reduction of the 1-in-3SAT problem we can show that the problem whether there exists a question for which a given Prolog program is looping is NP-complete. This leads to the coNP-completeness of the equivalence problem for propositional Prolog programs.

2. Equivalence of Formulas

In this section we introduce some definitions and notations concerning propositional formulas. More details can be found in standard textbooks like, e.g., [2, 10].

* The work of this author was supported by the Studienstiftung des Deutschen Volkes.


Further, we briefly summarize some results concerning the complexity of these formulas.

2.1. DEFINITIONS AND NOTATIONS

A propositional formula $\alpha$ is built from atoms and the Boolean connectives $\wedge$ (conjunction), $\vee$ (disjunction), and $\neg$ (negation). The length of such a formula is the number of atoms occurring in it, where multiple occurrences are counted. A literal is a negated or unnegated atom. In the former case the literal is called negative, in the other case positive. A clause is a disjunction of literals. A Horn clause is a clause having at most one positive literal and a definite clause is a clause having exactly one positive literal. We write $A \leftarrow A_1, \ldots, A_n$ for a definite clause $A \vee \neg A_1 \vee \cdots \vee \neg A_n$ and use $\gamma$ as abbreviation for definite clauses. $A$ is called the head of the definite clause and $A_1, \ldots, A_n$ is its body. In the case of $n = 0$ a clause is also called a fact, and in the case of $n > 0$ we speak of a rule. A definite clause $\gamma'$ is called a subclause of a definite clause $\gamma$ if each literal occurring in $\gamma'$ occurs in $\gamma$ as well. Obviously, the clauses $\gamma'$ and $\gamma$ have the same head because both are definite clauses.

A formula is said to be in conjunctive normal form if it is a conjunction of clauses. CNF is the set of all formulas in conjunctive normal form and kCNF is the subset containing only those formulas in which each clause contains at most $k$ literals. A Horn formula is a conjunction of Horn clauses and the set of all Horn formulas is denoted by HORN. Finally, a logic program is a conjunction of definite clauses. A logic program $\pi$ is called binary if each clause of $\pi$ contains at most two literals. It is called deterministic if there are not any two clauses in $\pi$ having the same head.

A truth assignment $\mathcal{T}$ is a mapping from a set of atoms to the set $\{0, 1\}$ of truth values. If the set of atoms occurring in a formula $\alpha$ is a subset of the domain of $\mathcal{T}$, then $\mathcal{T}(\alpha)$ can be defined canonically. Two formulas $\alpha_1$ and $\alpha_2$ are called equivalent if each truth assignment $\mathcal{T}$ having a domain being a superset of the set of atoms occurring in $\alpha_1$ and $\alpha_2$ satisfies $\mathcal{T}(\alpha_1) = 1$ if and only if $\mathcal{T}(\alpha_2) = 1$. If $\alpha_1$ and $\alpha_2$ are equivalent we write $\alpha_1 \approx \alpha_2$. A formula is called a tautology if it is equivalent to the truth value 1. It is called satisfiable if it is not equivalent to the truth value 0. A formula $\beta$ is called a consequence of a formula $\alpha$ if each truth assignment $\mathcal{T}$ having a domain being a superset of the set of atoms occurring in $\alpha$ and $\beta$ satisfies: if $\mathcal{T}(\alpha) = 1$ then $\mathcal{T}(\beta) = 1$. We use the notation $\alpha$ implies $\beta$ as synonym for $\beta$ is a consequence of $\alpha$ and we write $\alpha \models \beta$ in this case.

Kowalski [8] has introduced the resolution strategy of SLD-resolution to decide whether a logic program $\pi$ implies an atom $A$: A multiset $A_1 \cdots A_n$ of atoms is called a goal and the empty goal is denoted by $\Box$. For a logic program $\pi$ a goal $A_1 \cdots A_{i-1} B_1 \cdots B_m A_{i+1} \cdots A_n$ is called an SLD-resolvent of the goal $A_1 \cdots A_n$ if the logic program $\pi$ contains the definite clause $A_i \leftarrow B_1, \ldots, B_m$. Let us write $A_1 \cdots A_n \vdash_{\pi,SLD} A_1 \cdots A_{i-1} B_1 \cdots B_m A_{i+1} \cdots A_n$ in this case. $\vdash^{*}_{\pi,SLD}$ denotes the reflexive and transitive closure of $\vdash_{\pi,SLD}$. Moreover, we say that $A \leftarrow A_1, \ldots, A_m$ matches successfully with the atom $A$ if $A \vdash_{\pi,SLD} A_1 \cdots A_m \vdash^{*}_{\pi,SLD} \Box$. An atom $A$ can be refuted by $\pi$ if $A \vdash^{*}_{\pi,SLD} \Box$, and a derivation $A \vdash^{*}_{\pi,SLD} \Box$ is called a refutation of $A$. It is well known that a logic program $\pi$ implies an atom $A$ if and only if $A$ can be refuted by $\pi$.

Thus, the problem $\pi \models \gamma$ for $\gamma := A \leftarrow A_1, \ldots, A_n$ can be decided using SLD-resolution by trying to refute $A$ by $A_1 \wedge \cdots \wedge A_n \wedge \pi$. We say that $\gamma$ is a question to $\pi$. We stipulate that we do not query for tautologies $\gamma$. Obviously, a clause $\gamma$ is a tautology if and only if the head of $\gamma$ occurs in its body, too. Further, we assume that a query $\gamma$ contains atoms of the underlying logic program only.

Considering a definite Horn formula $\pi$ and a clause $\gamma$ for which $\pi \models \gamma$, we obtain two cases. Either $\gamma$ is a definite clause or there is a definite clause $\gamma'$ which is a consequence of $\pi$ and $\gamma'$ is a subclause of $\gamma$. Thus, definite clauses are the most important kind of consequences when considering definite Horn formulas. Moreover, dealing with logic programs as knowledge-based systems, we can reflect user inputs using a definite clause $A \leftarrow A_1, \ldots, A_n$ as question. We query for $A$ after stipulating that there are the facts $A_1, \ldots, A_n$.

2.2. COMPLEXITY OF EQUIVALENCE

Now we briefly sketch some results concerning the complexity of deciding whether two propositional formulas $\alpha_1$ and $\alpha_2$ are equivalent. It is well known that the satisfiability problem of propositional formulas is NP-complete [5]. This result remains valid if we consider formulas of 3CNF only. Therefore we immediately conclude

LEMMA 2.1. The equivalence problem $\{(\alpha_1, \alpha_2) \mid \alpha_1, \alpha_2 \in \mathrm{3CNF},\ \alpha_1 \approx \alpha_2\}$ is coNP-complete. □

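It is well known that two formulas $\alpha_1$ and $\alpha_2$ are equivalent if we have proved $\alpha_1 \models \alpha_2 \models \alpha_1$. Moreover, we have $\alpha_1 \models \alpha_2$ if and only if $\alpha_1 \wedge \neg\alpha_2$ is not satisfiable. Finally, $\alpha \models \alpha_1 \wedge \cdots \wedge \alpha_n$ if and only if $\alpha \models \alpha_i$ for each $1 \le i \le n$. This leads to the following observations: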

LEMMA 2.2. $\alpha_1 \approx \alpha_2$ for $\alpha_1, \alpha_2 \in \mathrm{2CNF}$ can be decided deterministically in quadratic time with linear space in the length of $\alpha_1 \wedge \alpha_2$.

Proof. From [1] it follows that $\alpha \models L_1 \vee L_2$ can be decided deterministically in linear time for a formula $\alpha \in \mathrm{2CNF}$ and a clause $L_1 \vee L_2$. □

LEMMA 2.3. $\alpha_1 \approx \alpha_2$ for $\alpha_1, \alpha_2 \in \mathrm{HORN}$ can be decided deterministically in quadratic time with linear space in the length of $\alpha_1 \wedge \alpha_2$.

Proof. From [6] it follows that $\alpha \models L_1 \vee \cdots \vee L_n$ can be decided deterministically in linear time for a formula $\alpha \in \mathrm{HORN}$ and a clause $L_1 \vee \cdots \vee L_n$. □
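The clause-by-clause test behind Lemma 2.3 can be illustrated for the special case of logic programs, i.e., conjunctions of definite clauses. The following Python sketch is not the linear-time procedure of [6]; it uses a simple fixpoint iteration, and the names `implies` and `equivalent` as well as the representation of a clause as a pair `(head, body)` are assumptions made for this illustration only.

```python
def implies(program, head, body=()):
    """Decide whether the logic program implies the definite clause head <- body.

    program: iterable of (head, body) pairs, body being a sequence of atoms.
    The body atoms are assumed as facts and closed under the clauses
    (naive forward chaining, not the linear-time algorithm of [6]).
    """
    known = set(body)
    changed = True
    while changed:
        changed = False
        for h, b in program:
            if h not in known and all(atom in known for atom in b):
                known.add(h)
                changed = True
    return head in known


def equivalent(p1, p2):
    """Two logic programs are equivalent iff each implies every clause of the other."""
    return (all(implies(p1, h, b) for h, b in p2)
            and all(implies(p2, h, b) for h, b in p1))


if __name__ == "__main__":
    p = [("A", ["B"]), ("B", ["C"])]
    q = p + [("A", ["C"])]          # the added clause is already a consequence of p
    print(equivalent(p, q))
    print(equivalent(p, [("A", ["B"])]))
```

Each clause of one program is tested against the other program, which mirrors the observation that $\alpha \models \alpha_1 \wedge \cdots \wedge \alpha_n$ holds if and only if $\alpha \models \alpha_i$ for each $i$.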


3. Equivalence of Prolog Programs

3.1. THE PROLOG INFERENCE STRATEGY

SLD-resolution has been combined in Prolog with a depth-first-search control strategy with backtracking, see [4, 7]. To give a brief description of this inference strategy we introduce the notion of a refutation tree. A logic program $\pi$ is called a Prolog program if the clauses of $\pi$ are ordered and if the atoms in the body of a clause are ordered. Considering Prolog programs a goal is an ordered sequence of atoms. The ordering of an SLD-resolvent $G'$ of $G$ results in a canonical way from the ordering of $G$ and the ordering in the body of the clause used for resolution. Thus, let $\pi$ be a Prolog program and let $G$ be a goal. The refutation tree $T_\pi(G)$ is an ordered labelled tree. A node labelled by $A_1 \cdots A_n$ has a descendant for every definite clause $A_1 \leftarrow A'_1, \ldots, A'_m$ in $\pi$. The descendants are labelled by $A'_1 \cdots A'_m A_2 \cdots A_n$ and they are ordered according to the ordering of the clauses $A_1 \leftarrow A'_1, \ldots, A'_m$ in $\pi$. Obviously, each edge in a refutation tree corresponds to an SLD-resolution step.

The Prolog inference strategy traverses a refutation tree $T_\pi(A)$ in preorder from left to right when trying to refute $A$ and terminates if it reaches a node labelled by $\Box$. Thus, a path from the root (labelled $A$) to a leaf with label $\Box$ corresponds to a refutation of $A$. If we use the Prolog inference strategy to decide whether a Prolog program $\pi$ implies a definite clause $A \leftarrow A_1, \ldots, A_n$, we add the facts $A_1, \ldots, A_n$ in front of $\pi$ and query afterwards for the atom $A$. There are three different possibilities for the behaviour of the Prolog inference strategy: Either it reaches a node labelled by $\Box$. In this case we say that the Prolog inference strategy succeeds. Or all nodes have been examined and a node labelled by $\Box$ has not been reached. In this case we say that the Prolog inference strategy fails. Or, finally, we are in an infinite loop. Then we say that the Prolog inference strategy is looping. Therefore we define the following function prolog($\pi$, $\gamma$) for a Prolog program $\pi$ and a definite clause $\gamma$:

$$\mathrm{prolog}(\pi, \gamma) := \begin{cases} \text{yes} & \text{if the Prolog inference strategy succeeds} \\ \text{no} & \text{if the Prolog inference strategy fails} \\ \text{loop} & \text{if the Prolog inference strategy is looping.} \end{cases}$$

The following observations are obvious because of the soundness and completeness of SLD-resolution for a conjunction of definite clauses:

$$\mathrm{prolog}(\pi, \gamma) = \text{yes} \Rightarrow \pi \models \gamma, \qquad \mathrm{prolog}(\pi, \gamma) = \text{no} \Rightarrow \pi \not\models \gamma.$$

In the case of prolog($\pi$, $\gamma$) = loop we do not know whether the clause $\gamma$ is a consequence of $\pi$. Consider the following example:

$$\pi := (A \leftarrow B) \wedge (B \leftarrow A) \wedge (C \leftarrow A) \wedge C.$$

We have $\pi \not\models A$ and $\pi \models C$, but we conclude prolog($\pi$, $A$) = prolog($\pi$, $C$) = loop.
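The preorder traversal of the refutation tree can be sketched directly. The Python function below is only an illustration: it visits goals depth-first from left to right and gives up after a fixed number of steps, so the answer `"unknown"` merely suggests a loop and does not prove one. The name `naive_prolog`, the step budget `max_steps`, and the clause representation `(head, body)` are assumptions of this sketch.

```python
def naive_prolog(program, head, facts=(), max_steps=10_000):
    """Depth-first, left-to-right traversal of the refutation tree of `head`.

    program: ordered list of (head, body) pairs; the question is head <- facts,
    so the facts are placed in front of the program.
    Returns "yes", "no", or "unknown" when the step budget is exhausted.
    """
    clauses = [(f, []) for f in facts] + list(program)
    stack = [[head]]                      # goals still to visit, next one on top
    for _ in range(max_steps):
        if not stack:
            return "no"                   # every node examined, no empty goal found
        goal = stack.pop()
        if not goal:
            return "yes"                  # reached the empty goal
        first, rest = goal[0], goal[1:]
        children = [list(body) + rest for h, body in clauses if h == first]
        stack.extend(reversed(children))  # first matching clause is tried first
    return "no" if not stack else "unknown"


if __name__ == "__main__":
    pi = [("A", ["B"]), ("B", ["A"]), ("C", ["A"]), ("C", [])]
    print(naive_prolog(pi, "A"))   # budget exhausted
    print(naive_prolog(pi, "C"))   # budget exhausted although pi implies C
```

Moving the fact $C$ in front of the clause $C \leftarrow A$ makes the same query succeed immediately, which shows how strongly the result depends on the clause ordering.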


3.2. COMPLEXITY OF prolog($\pi$, $\gamma$)

In this section we analyse the complexity of the function prolog($\pi$, $\gamma$) for a Prolog program $\pi$ and a definite clause $\gamma$. We consider random access machines and use a PASCAL-like notation for our data structures and formulation of the algorithms. We use a global variable

    VAR formula: ARRAY [1..MaxAtoms] OF ^atom;

where each atom of $\pi$ is represented by the following record:

    TYPE status = (new, loop, yes, no, inf);
         atom   = RECORD
                    state   : status;
                    clauses : LIST OF (LIST OF ^atom)
                  END; { atom }

We store in formula[i]^.clauses the list of those bodies of clauses in $\pi$ which have the atom $A_i$ as head. Given a string encoding a Prolog program $\pi$, we can construct the just mentioned representation of $\pi$ in time linear in length($\pi$), if we assume that we can access formula[i] in time independent of $i$ and if we obtain the index $i$ in time $O(1)$ from its encoding $A_i$.

Now we will present a function refute for simulating the Prolog inference strategy. Therefore we define five different states for an atom $A$ occurring in a Prolog program $\pi$.

state(A) = yes: There are finitely many refutations for the goal $A$ and there is not an infinite branch in $T_\pi(A)$.

state(A) = no: The Prolog inference strategy fails when trying to refute the goal $A$.

state(A) = loop: The Prolog inference strategy is looping when trying to refute the goal $A$.

state(A) = inf: After finitely many refutations for the goal $A$ the Prolog inference strategy is looping or there are infinitely many refutations for $A$.

state(A) = new: The goal $A$ has not been considered yet.

It is necessary to consider more than the first possibility of refutation for a goal and to distinguish between finitely many solutions and the state inf to reflect backtracking in a correct way. This can be shown by the following Prolog program:

$$\pi := (A \leftarrow B, C) \wedge B \wedge (B \leftarrow B).$$

We obtain prolog($\pi$, $A$) = loop, because there is no refutation for $C$ and infinitely many refutations for $B$, i.e., state(B) = inf.


Now let us consider the refutation of an atom $A$ with a single definite clause $\gamma := A \leftarrow C_1, \ldots, C_n$. We obtain the following results for state(A, $\gamma$):

state(A, $\gamma$) = yes iff $n = 0$ or state($C_i$) = yes for $1 \le i \le n$.

state(A, $\gamma$) = inf iff state($C_i$) $\in$ {yes, inf} for $1 \le i \le n$ and there is a $j$ with state($C_j$) = inf.

state(A, $\gamma$) = loop iff the refutation of $A$ is recursively called within the refutation of a $C_j$, $1 \le j \le n$, and state($C_k$) $\in$ {yes, inf} for $1 \le k < j$,
    or there is a $C_j$, $1 \le j \le n$, with state($C_j$) = loop and state($C_k$) $\in$ {yes, inf} for $1 \le k < j$,
    or there is a $C_j$, $2 \le j \le n$, with state($C_j$) = no, there is a $C_k$, $1 \le k < j$, with state($C_k$) = inf and state($C_l$) $\in$ {yes, inf} for $1 \le l < j$.

state(A, $\gamma$) = no iff there is a $C_j$, $1 \le j \le n$, with state($C_j$) = no and state($C_k$) = yes for $1 \le k < j$.

Thus, we obtain the following results for the refutation of a goal $A$ with a Prolog program $\pi$. (Let the clauses with head $A$ in $\pi$ be numbered $\gamma_1, \ldots, \gamma_n$.)

state(A) = yes iff state(A, $\gamma_i$) = no for $1 \le i < k$, state(A, $\gamma_k$) = yes and state(A, $\gamma_j$) $\notin$ {loop, inf}, $k < j \le n$.

state(A) = inf iff state(A, $\gamma_i$) = no for $1 \le i < k$ and state(A, $\gamma_k$) = inf,
    or state(A, $\gamma_i$) = no for $1 \le i < k$, state(A, $\gamma_k$) = yes and state(A, $\gamma_j$) $\in$ {inf, loop} for a $k < j \le n$.

state(A) = loop iff state(A, $\gamma_i$) = no for $1 \le i < k$ and state(A, $\gamma_k$) = loop.

state(A) = no iff state(A, $\gamma_i$) = no for $1 \le i \le n$.

Obviously we have to test clauses with head $A$ until we get a clause $\gamma_k$ for which state(A, $\gamma_k$) $\ne$ no or there is no clause left with head $A$. Only in the case of state(A, $\gamma_k$) = yes do we have to consider the remaining clauses for $A$. Otherwise the result of the last clause considered is the result of state(A).

Note that the states are not static. Before we start any refutation we initialize the state of each atom with new. Considering an atom $A$ for refutation the first time there is a first reinitialization with loop. Thus we can reflect a recursive call of $A$ in a correct way (state(A, $\gamma$) = loop case 1). There is a second reinitialization with inf after having found a clause $\gamma_k$ with state(A, $\gamma_k$) = yes to reflect the case that we obtain a recursive call of $A$ when we already know that there is a refutation for $A$ (state(A) = inf case 2). These remarks should be sufficient to confirm that we have

$$\mathrm{prolog}(\pi, A_i) = \begin{cases} \text{yes} & \text{if refute(formula[i], formula[i]\textasciicircum.clauses)} \in \{\text{yes}, \text{inf}\} \\ \text{refute(formula[i], formula[i]\textasciicircum.clauses)} & \text{otherwise} \end{cases}$$

with the following functions refute and testclause, if we assume that the values formula[i]^.state are initialized with new. (The function FIRST projects the head of a list and the function REST projects the tail of a list.)

    FUNCTION refute (A: ^atom; C: LIST OF (LIST OF ^atom)): status;
    VAR stop, infsol: boolean;
        result: status;
    BEGIN
      IF A^.state = new                              { A has not been considered }
      THEN IF C = NIL
           THEN A^.state := no                       { no clauses for A }
           ELSE A^.state := loop;                    { 1. reinitialization }
                stop := false;
                WHILE (NOT stop) AND (C <> NIL) DO
                  result := testclause(FIRST(C));    { test next clause }
                  IF result = no
                  THEN C := REST(C)
                  ELSE stop := true                  { state(A,cl) <> no }
                  ENDIF
                ENDWHILE;
                IF result <> yes
                THEN A^.state := result              { state of last clause is valid }
                ELSE infsol := false;
                     A^.state := inf;                { 2. reinitialization }
                     C := REST(C);                   { test remaining clauses }
                     WHILE (C <> NIL) AND (NOT infsol) DO
                       result := testclause(FIRST(C));
                       infsol := (result = loop) OR (result = inf);
                       C := REST(C)
                     ENDWHILE;
                     IF infsol THEN A^.state := inf  { loop or inf has been found }
                     ELSE A^.state := yes
                     ENDIF
                ENDIF
           ENDIF
      ENDIF;
      refute := A^.state
    END; { refute }

where testclause is the following function:

    FUNCTION testclause (body: LIST OF ^atom): status;
    VAR success, infiflag: boolean;
        result: status;
    BEGIN
      success := true; infiflag := false;
      result := yes;                   { initialization (could be a fact) }
      WHILE success AND (body <> NIL) DO
        result := refute(FIRST(body), FIRST(body)^.clauses);
                                       { consider next atom in the body of the clause }
        success := (result = yes) OR (result = inf);
        IF result = inf THEN infiflag := true ENDIF;
        body := REST(body)
      ENDWHILE;
      CASE result OF
        yes, inf: IF infiflag THEN testclause := inf    { state(A,cl)=inf }
                  ELSE testclause := yes                { state(A,cl)=yes }
                  ENDIF;
        loop:     testclause := loop;                   { state(A,cl)=loop (case 1 and 2) }
        no:       IF infiflag THEN testclause := loop   { state(A,cl)=loop (case 3) }
                  ELSE testclause := no                 { state(A,cl)=no }
                  ENDIF
      ENDCASE
    END; { testclause }

Thus for computing prolog($\pi$, $\gamma$) for a definite clause $\gamma := A_i \leftarrow A_{i_1}, \ldots, A_{i_n}$ we construct $\tilde{\pi} := A_{i_1} \wedge \cdots \wedge A_{i_n} \wedge \pi$. We store $\tilde{\pi}$ in the global variable formula. Afterwards we compute refute(formula[i], formula[i]^.clauses). The complexity of any call of the function is linear in length($\pi$) because we resolve with each clause of $\pi$ at most once. Further examination is not necessary because we store the state for each atom $A_i$ after trying to refute $A_i$ for the first time. Hence we have proved the following result:

THEOREM 3.1. prolog($\pi$, $\gamma$) can be computed for a Prolog program $\pi$ and a definite clause $\gamma$ in time linear in length($\pi$). □
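For experimentation, the functions refute and testclause can be transcribed into Python. The sketch below is a direct port under assumptions of its own: atoms are strings, a program is an ordered list of `(head, body)` pairs, states are kept in a dictionary instead of atom records, and the wrapper `prolog(program, head, facts)` is a name chosen for this illustration. It is meant to mirror the pseudocode, not to replace it.

```python
NEW, LOOP, YES, NO, INF = "new", "loop", "yes", "no", "inf"


def prolog(program, head, facts=()):
    """Compute prolog(program, head <- facts) following refute/testclause."""
    clauses = {}                                  # atom -> ordered list of bodies
    for h, body in [(f, []) for f in facts] + list(program):
        clauses.setdefault(h, []).append(list(body))
    state = {}                                    # missing entry means NEW

    def refute(atom):
        if state.get(atom, NEW) != NEW:
            return state[atom]                    # stored (possibly provisional) state
        bodies = clauses.get(atom, [])
        if not bodies:
            state[atom] = NO                      # no clauses for this atom
            return NO
        state[atom] = LOOP                        # 1. reinitialization
        result, k = NO, 0
        while k < len(bodies):
            result = test_clause(bodies[k])
            if result != NO:
                break
            k += 1
        if result != YES:
            state[atom] = result                  # state of last clause is valid
            return result
        state[atom] = INF                         # 2. reinitialization
        infsol = False
        for body in bodies[k + 1:]:               # test remaining clauses
            if test_clause(body) in (LOOP, INF):
                infsol = True
                break
        state[atom] = INF if infsol else YES
        return state[atom]

    def test_clause(body):
        result, inf_seen = YES, False             # an empty body is a fact
        for atom in body:
            result = refute(atom)
            if result == INF:
                inf_seen = True
            if result not in (YES, INF):
                break
        if result in (YES, INF):
            return INF if inf_seen else YES
        if result == LOOP:
            return LOOP
        return LOOP if inf_seen else NO           # result == NO

    r = refute(head)
    return YES if r in (YES, INF) else r


if __name__ == "__main__":
    print(prolog([("A", ["B"]), ("B", ["A"]), ("C", ["A"]), ("C", [])], "C"))
    print(prolog([("A", ["B", "C"]), ("B", []), ("B", ["B"])], "A"))
```

Applied to the two example programs of Sections 3.1 and 3.2, the port reports loop in both cases, in line with the text. The port uses Python recursion, so very deep clause chains would need an explicit stack.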

Since loops are one of the main differences between deduction in logic programs and Prolog programs it is interesting to know whether there exists a question $\gamma$ for a given Prolog program $\pi$ such that prolog($\pi$, $\gamma$) = loop.

DEFINITION. loop-test is the set of Prolog programs $\pi$ for which there exists a question $\gamma$ with prolog($\pi$, $\gamma$) = loop.


THEOREM 3.2. loop-test is NP-complete.

Proof. We know from Theorem 3.1 that prolog($\pi$, $\gamma$) can be computed in linear time for a Prolog program $\pi$ and a given question $\gamma$. Thus $\pi \in$ loop-test can be decided nondeterministically in polynomial time by choosing a question $\gamma$ nondeterministically and verifying prolog($\pi$, $\gamma$) = loop.

In order to prove that loop-test is NP-hard we reduce the 1-in-3SAT problem of propositional calculus formulas:

Given a propositional calculus formula $\alpha$ in conjunctive normal form with three literals per clause, where no clause contains negated literals; then it is NP-hard to decide whether there is a truth assignment $\mathcal{T}$ such that each clause has exactly one true literal, see [9].

We associate a Prolog program $\pi_\alpha$ with a given propositional calculus formula

$$\alpha := \bigwedge_{1 \le i \le n} (A_{i1} \vee A_{i2} \vee A_{i3}),$$

$A_{ij}$ positive literals, in the following way:

$$\pi_\alpha := \bigwedge_{1 \le i \le n} \pi_i \;\wedge\; (Loop_n \leftarrow Loop_0)$$

where

$$\begin{aligned} \pi_i := {} & (Loop_{i-1} \leftarrow A_{i1}, A_{i2}) && (1) \\ \wedge\; & (Loop_{i-1} \leftarrow A_{i2}, A_{i3}) && (2) \\ \wedge\; & (Loop_{i-1} \leftarrow A_{i1}, A_{i3}) && (3) \\ \wedge\; & (Loop_{i-1} \leftarrow A_{i1}, Loop_i) && (4) \\ \wedge\; & (Loop_{i-1} \leftarrow A_{i2}, Loop_i) && (5) \\ \wedge\; & (Loop_{i-1} \leftarrow A_{i3}, Loop_i) && (6) \\ \wedge\; & Loop_{i-1} && (7) \end{aligned}$$

This construction can be done in time polynomial in the length of $\alpha$. We prove

$$\alpha \in \text{1-in-3SAT} \iff \pi_\alpha \in \text{loop-test}.$$

• Assume that there is a truth assignment $\mathcal{T}$ for $\alpha$ such that each clause of $\alpha$ has exactly one true literal $A_i^*$, $1 \le i \le n$. Let us consider the question

$$\gamma_{\mathcal{T}} := Loop_0 \leftarrow A_1^*, \ldots, A_n^*.$$

Asking this question to $\pi_\alpha$ means asserting the facts $A_i^*$, $1 \le i \le n$, in front of $\pi_\alpha$ and then trying to refute the goal $Loop_0$.

1. Let us consider the refutation of a goal $Loop_j$, $0 \le j \le n - 1$. There are seven clauses in $\pi_\alpha$ which can be matched with the goal $Loop_j$. Each of the clauses (1), (2) or (3) fails because by definition of $A^*$ exactly one of the three facts $A_{(j+1)1}$, $A_{(j+1)2}$, $A_{(j+1)3}$ is asserted to $\pi_\alpha$ and there is no other clause with head $A_{(j+1)k}$, $1 \le k \le 3$. Next we have to consider the clauses (4), (5) and (6). One of these clauses can be matched obtaining the new goal $Loop_{j+1}$ because exactly one of the three facts $A_{(j+1)k}$, $1 \le k \le 3$, is asserted to $\pi_\alpha$. Hence we obtain $Loop_{j+1}$ as a new goal which has to be refuted.

2. Let us consider the refutation of the goal $Loop_n$. The only clause of $\pi_\alpha$ which can be matched is $Loop_n \leftarrow Loop_0$ and we obtain the new goal $Loop_0$ for refutation.

Thus we conclude prolog($\pi_\alpha$, $\gamma_{\mathcal{T}}$) = loop.

• Assume that there is a question $\gamma$ with prolog($\pi_\alpha$, $\gamma$) = loop. The result for any question $\gamma$ with a head different from $Loop_i$ would be prolog($\pi_\alpha$, $\gamma$) = no, because there is no clause in $\pi_\alpha$ which can be matched. Thus $\gamma$ has to be equal to $Loop_i \leftarrow B_1, \ldots, B_m$ for some $1 \le i \le n$. With $\mathcal{B} := \{B_1, \ldots, B_m\}$ we construct a truth assignment $\mathcal{T}_\gamma$ for $\alpha$ in the following way:

$$\mathcal{T}_\gamma(A_{ij}) := \begin{cases} 1 & \text{if } A_{ij} \in \mathcal{B} \\ 0 & \text{otherwise.} \end{cases}$$

We have prolog($\pi_\alpha$, $\gamma$) = prolog($\tilde{\pi}$, $Loop_i$) = loop with $\tilde{\pi} := B_1 \wedge \cdots \wedge B_m \wedge \pi_\alpha$. Consider a goal $Loop_j$ in a looping derivation of $Loop_i$ with $\tilde{\pi}$:

1. In the case of $0 \le j \le n - 1$ the only possibility to get such a looping derivation is using one of the clauses (4), (5), (6) to refute $Loop_j$. Thus exactly one of the three facts $A_{(j+1)1}$, $A_{(j+1)2}$, $A_{(j+1)3}$ must be a clause of $\tilde{\pi}$, that is, it has to be in $\mathcal{B}$. Otherwise we would obtain prolog($\tilde{\pi}$, $Loop_i$) = yes by using one of the clauses (1), (2), (3) if more than one of the three facts are in $\mathcal{B}$, or by using clause (7) if none of the three facts is in $\mathcal{B}$. Thus we obtain $\mathcal{T}_\gamma(A_{(j+1)k}) = 1$ for exactly one $1 \le k \le 3$.

2. In the case of $j = n$ the only clause that can be used for refutation is $Loop_n \leftarrow Loop_0$ and we obtain the new goal $Loop_0$.

Thus starting with goal $Loop_i$ we obtain the goals $Loop_j$, $0 \le j \le n$, in the looping derivation. Hence we obtain $\mathcal{T}_\gamma(A_{jk}) = 1$ for exactly one $1 \le k \le 3$ for each $1 \le j \le n$. We conclude that $\alpha$ is in 1-in-3SAT.

This completes the proof of the theorem. □
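The reduction used in the proof is mechanical, and a short Python sketch may make the shape of $\pi_\alpha$ concrete. The function names `loop_program` and `question`, the atom names `Loop0`, `Loop1`, ..., and the representation of $\alpha$ as a list of triples of atom names are assumptions of this illustration; the clause numbering (1) to (7) follows the proof.

```python
def loop_program(alpha):
    """Build the Prolog program pi_alpha for a positive 3CNF formula.

    alpha: list of triples (A_i1, A_i2, A_i3) of atom names.
    Returns an ordered list of (head, body) clauses.
    """
    n = len(alpha)
    program = []
    for i, (a1, a2, a3) in enumerate(alpha, start=1):
        prev, cur = f"Loop{i - 1}", f"Loop{i}"
        program += [
            (prev, [a1, a2]),    # (1) succeeds if two literals are true
            (prev, [a2, a3]),    # (2)
            (prev, [a1, a3]),    # (3)
            (prev, [a1, cur]),   # (4) continues if this literal is true
            (prev, [a2, cur]),   # (5)
            (prev, [a3, cur]),   # (6)
            (prev, []),          # (7) fact, reached if no literal is true
        ]
    program.append((f"Loop{n}", ["Loop0"]))   # closes the cycle
    return program


def question(true_atoms):
    """The question Loop0 <- A*_1, ..., A*_n for the atoms chosen to be true."""
    return ("Loop0", list(true_atoms))


if __name__ == "__main__":
    alpha = [("x", "y", "z"), ("y", "z", "w")]
    for head, body in loop_program(alpha):
        print(head, "<-", ", ".join(body))
    print(question(["x", "w"]))
```

For a formula with $n$ clauses the program has $7n + 1$ clauses, so the construction is clearly polynomial in the length of $\alpha$.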

REMARKS. 1. It is not necessary to have facts in the Prolog program $\pi_\alpha$. In the construction of $\pi_\alpha$ the facts $Loop_{i-1}$ can be omitted. The only difference is that now prolog($\pi_\alpha$, $\gamma$) = no, if none of the facts $A_{i1}$, $A_{i2}$ or $A_{i3}$ is asserted.

2. Further, it is not necessary to have more than two different clauses with the same head. With an easy construction we get a Prolog program $\pi'_\alpha$ from $\pi_\alpha$ which satisfies this condition. Assume that we have $r \ge 3$ clauses with the same head

$$\bigwedge_{1 \le l \le r} (Head \leftarrow B_{l1}, \ldots, B_{l n_l}).$$

These clauses can be substituted by:

$$(Head \leftarrow Head_0) \wedge \bigwedge_{1 \le l \le r} \big( (Head_{l-1} \leftarrow Head_l) \wedge (Head_l \leftarrow B_{l1}, \ldots, B_{l n_l}) \big).$$

Thus, we obtain that NP-completeness of loop-test is not affected if $\pi$

• is fact-free

• has at most two clauses with the same head

• has only clauses with at most three literals.

Finally we want to prove that $\pi \in$ loop-test can be decided deterministically in polynomial time, if we either consider binary Prolog programs or if we consider deterministic Prolog programs:

LEMMA 3.3. $\pi \in$ loop-test can be decided deterministically in polynomial time for binary Prolog programs $\pi$.

Proof. Let $\pi$ be a binary Prolog program and let $A \vdash^{*}_{A_1 \wedge \cdots \wedge A_n \wedge \pi, SLD} G$ be arbitrary. Since each clause of $\pi$ contains at most two literals, we conclude that $G$ contains at most one atom. Thus, we get prolog($\pi$, $\gamma$) $\in$ {loop, no} if and only if there has never been resolved with a fact of $A_1 \wedge \cdots \wedge A_n \wedge \pi$. Hence we conclude that there exists a clause $\gamma := A \leftarrow A_1, \ldots, A_n$ with prolog($\pi$, $\gamma$) = loop if and only if prolog($\pi$, $A$) = loop. We need to test only whether there exists an atom $A$ occurring in $\pi$ with prolog($\pi$, $A$) = loop to decide $\pi \in$ loop-test for binary propositional Prolog programs $\pi$, and therefore Theorem 3.1 yields the assertion. □

LEMMA 3.4. $\pi \in$ loop-test can be decided deterministically in polynomial time for deterministic Prolog programs $\pi$.

Proof. We can construct a directed graph $\mathcal{G}(\pi)$ for a Prolog program $\pi$ in the following way:

• Each atom of $\pi$ is represented by a vertex of $\mathcal{G}(\pi)$.

• For each clause $\gamma := A \leftarrow B_1, \ldots, B_n$ of $\pi$ we stipulate that $B_1, \ldots, B_n$ are successors of $A$, i.e., there is an edge from $A$ to $B_i$ for $1 \le i \le n$. If $A$ is a fact, the successor of $A$ is a special vertex labelled by $\Box$.

This construction can be done in time polynomial in the length of $\pi$. With this construction we prove for deterministic Prolog programs that $\pi \in$ loop-test if and only if there is a cycle in $\mathcal{G}(\pi)$.

• Assume there is a cycle in $\mathcal{G}(\pi)$ including vertex $V_1$. The path of this cycle is $(V_1, \ldots, V_n, V_1)$, $n \ge 1$. This means there are the following clauses in $\pi$, assuming $n + 1 := 1$:

$$V_i \leftarrow A^i_1, \ldots, A^i_{l_i}, V_{i+1}, B^i_1, \ldots, B^i_{k_i}, \quad \text{for } 1 \le i \le n,\ l_i \ge 0,\ k_i \ge 0.$$

Considering the question $\gamma := V_1 \leftarrow C_1, \ldots, C_m$ with $\{C_1, \ldots, C_m\} = \{A^i_j \mid 1 \le i \le n,\ 1 \le j \le l_i \text{ for each } i\} \setminus \{V_1, \ldots, V_n\}$ we obtain prolog($\pi$, $\gamma$) = loop because we are considering deterministic Prolog programs and as such there are no other clauses with head $V_i$ in $\pi$.

• Assume there is a question $\gamma := A \leftarrow B_1, \ldots, B_n$ with prolog($\pi$, $\gamma$) = loop, then there is an infinite branch in the refutation tree for this question. We conclude that there must be a branch containing a goal $V^* Y_1 \cdots Y_n$ and $V^* Z_1 \cdots Z_m$ for refutation. Hence we obtain that there must be a path in $\mathcal{G}(\pi)$ from $V^*$ to $V^*$, that is, there is a cycle in $\mathcal{G}(\pi)$.

It is well known that the test for cycles in a directed graph can be done deterministically in polynomial time, which completes the proof of the lemma. □
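The criterion of Lemma 3.4 amounts to a cycle test on the graph $\mathcal{G}(\pi)$. The Python sketch below builds the successor relation from a deterministic program and searches for a cycle by depth-first search with the usual three-colour marking. The function name `in_loop_test_deterministic` and the `(head, body)` clause representation are assumptions of this illustration; the special vertex for facts is omitted because it has no outgoing edges and cannot lie on a cycle. The search is recursive, so very long clause chains would need an explicit stack.

```python
def in_loop_test_deterministic(program):
    """Cycle test on G(pi) for a deterministic Prolog program.

    program: list of (head, body) pairs with pairwise distinct heads.
    Returns True iff the graph with an edge head -> b for each body atom b
    contains a cycle.
    """
    graph = {}
    for head, body in program:
        if head in graph:
            raise ValueError("program is not deterministic: " + head)
        graph[head] = list(body)

    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, finished
    colour = {}

    def visit(v):
        colour[v] = GREY
        for w in graph.get(v, []):
            c = colour.get(w, WHITE)
            if c == GREY:                 # back edge closes a cycle
                return True
            if c == WHITE and visit(w):
                return True
        colour[v] = BLACK
        return False

    return any(colour.get(v, WHITE) == WHITE and visit(v) for v in graph)


if __name__ == "__main__":
    print(in_loop_test_deterministic([("A", ["B"]), ("B", ["A"])]))
    print(in_loop_test_deterministic([("A", ["B"]), ("B", [])]))
```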

3.3. YES-NO EQUIVALENCE OF PROLOG PROGRAMS

Since loops are one of the main distinctions between deduction in logic programs and Prolog programs we want to consider a kind of equivalence of Prolog programs first which does not consider loops in some fact.

DEFINITION. Two Prolog programs $\pi_0$ and $\pi_1$ are yes-no equivalent ($\pi_0 \overset{y-n}{\approx} \pi_1$) if and only if there do not exist a question $\gamma$ and an $i \in \{0, 1\}$ such that

prolog($\pi_i$, $\gamma$) = yes and prolog($\pi_{1-i}$, $\gamma$) = no.

In theoretical computer science this yes-no equivalence is often called 'weak equivalence'. To show how we can test yes-no equivalence we first have to introduce some lemmas.

LEMMA 3.5. Given a Prolog program $\pi$ and a question $\gamma := X \leftarrow Y_1, \ldots, Y_n$ with $\mathrm{prolog}(\pi, \gamma) \in \{\mathrm{yes}, \mathrm{loop}\}$. Then it holds that $\mathrm{prolog}(\pi, \gamma') \in \{\mathrm{yes}, \mathrm{loop}\}$ for any $\gamma' := X \leftarrow Z_1, \ldots, Z_m$ with $\{Z_1, \ldots, Z_m\} \supseteq \{Y_1, \ldots, Y_n\}$.

Proof.

• Let us assume that $\mathrm{prolog}(\pi, \gamma) = \mathrm{yes}$. Because of the soundness of SLD-resolution for definite Horn formulas we obtain $\pi \models \gamma$. Thus we have $\pi \models \gamma'$ as well and we can conclude $\mathrm{prolog}(\pi, \gamma') \neq \mathrm{no}$, which means $\mathrm{prolog}(\pi, \gamma') \in \{\mathrm{yes}, \mathrm{loop}\}$.

• Let us assume that $\mathrm{prolog}(\pi, \gamma) = \mathrm{loop}$. With $\hat\pi := Y_1 \wedge \cdots \wedge Y_n \wedge \pi$ we have $\mathrm{prolog}(\hat\pi, X) = \mathrm{loop}$ and we know that there is an infinite branch in the refutation tree $T_{\hat\pi}(X)$. With $\hat\pi' := Z_1 \wedge \cdots \wedge Z_m \wedge \pi$ we obtain that $T_{\hat\pi}(X)$ is a subtree of $T_{\hat\pi'}(X)$ because adding the facts $\{Z_1, \ldots, Z_m\} \setminus \{Y_1, \ldots, Y_n\}$ to $\hat\pi$ means adding some branches to $T_{\hat\pi}(X)$. Thus we have an infinite branch in $T_{\hat\pi'}(X)$ and therefore it holds that $\mathrm{prolog}(\hat\pi', X) \neq \mathrm{no}$. Since $\mathrm{prolog}(\hat\pi', X) = \mathrm{prolog}(\pi, \gamma')$ we obtain $\mathrm{prolog}(\pi, \gamma') \in \{\mathrm{yes}, \mathrm{loop}\}$.

This completes the proof of the lemma. □
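The monotonicity stated in Lemma 3.5 can be checked exhaustively on a small instance. The sketch below is only an illustration: it uses a naive interpreter `prolog` (an assumption of this example, not the paper's procedure) that asserts the body atoms of a question as facts in front of the program and reports loop when the selected atom reappears among its ancestors. The sample program is hypothetical.

```python
from itertools import combinations

YES, NO, LOOP = "yes", "no", "loop"

def prolog(program, head, body=()):
    """Naive depth-first, left-to-right evaluation with an ancestor check."""
    clauses = [(b, ()) for b in body] + [(h, tuple(bs)) for h, bs in program]

    def solve(goals):
        if not goals:
            return YES
        (atom, ancestors), rest = goals[0], goals[1:]
        if atom in ancestors:
            return LOOP
        for h, bs in clauses:
            if h == atom:
                subgoals = tuple((b, ancestors | {atom}) for b in bs)
                result = solve(subgoals + rest)
                if result != NO:
                    return result
        return NO

    return solve(((head, frozenset()),))


# Hypothetical sample program: A <- B, B <- A, B <- C, D <- C.
program = [("A", ["B"]), ("B", ["A"]), ("B", ["C"]), ("D", ["C"])]
atoms = ["A", "B", "C", "D"]

def subsets(items):
    return [c for k in range(len(items) + 1) for c in combinations(items, k)]

violations = []
for x in atoms:
    others = [a for a in atoms if a != x]  # tautological questions are excluded
    for ys in subsets(others):
        for zs in subsets(others):
            smaller_not_no = prolog(program, x, ys) != NO
            larger_is_no = prolog(program, x, zs) == NO
            if set(ys) <= set(zs) and smaller_not_no and larger_is_no:
                violations.append((x, ys, zs))

print(violations)  # [] : no question contradicts Lemma 3.5 on this instance
```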

REMARK. With Lemma 3.5 we conclude for a Prolog program $\pi$ and a question $\gamma' := X \leftarrow Z_1, \ldots, Z_m$ with $\mathrm{prolog}(\pi, \gamma') = \mathrm{no}$ that $\mathrm{prolog}(\pi, \gamma) = \mathrm{no}$ for any question $\gamma := X \leftarrow Y_1, \ldots, Y_n$ with $\{Y_1, \ldots, Y_n\} \subseteq \{Z_1, \ldots, Z_m\}$.

LEMMA 3.6. Given a Prolog program $\pi$ and let $\delta := X \leftarrow B_1, \ldots, B_r$, $r \geq 0$, be a clause of $\pi$. Assume that we query for $\gamma := X \leftarrow Y_1, \ldots, Y_n$ and that we obtain

$\mathrm{prolog}(\pi, \gamma) = \mathrm{yes}$ because $X$ and $\delta$ have been matched successfully. Then it holds that $\mathrm{prolog}(\pi, \delta) = \mathrm{yes}$.

Proof. Assume that all clauses of $\pi$ are numbered from (1) to (n) and that the number of $\delta$ is (j). Moreover, we assume $\mathrm{prolog}(\pi, \gamma) = \mathrm{yes}$ because $X$ and $\delta$ have been matched successfully. This means that $X$ could not be matched successfully with the clauses (1), ..., (j - 1).

Let $\tilde\pi$ be the Prolog program consisting of the same clauses as $\pi$ in the same order, but without the clauses with head $X$ and number (k) $\geq$ (j). Obviously we obtain $\mathrm{prolog}(\tilde\pi, \gamma) = \mathrm{no}$.

We know $\mathrm{prolog}(\pi, \gamma_i) = \mathrm{yes}$ for all $\gamma_i := B_i \leftarrow Y_1, \ldots, Y_n$ with $1 \leq i \leq r$ because $\mathrm{prolog}(\pi, \gamma) = \mathrm{yes}$ and $X$ and $\delta$ have been matched successfully. Thus we obtain $\mathrm{prolog}(\tilde\pi, \gamma_i) = \mathrm{yes}$ for all $1 \leq i \leq r$ because only clauses with head $X$ were removed from $\pi$ to get $\tilde\pi$. Note that if we had to use a clause with head $X$ for the refutation of a $\gamma_i$, this would lead to $\mathrm{prolog}(\pi, \gamma) = \mathrm{loop}$.

We conclude that adding the facts $B_1, \ldots, B_r$ to $\pi$ or $\tilde\pi$ does not change the results of $\mathrm{prolog}(\pi, \gamma)$ or $\mathrm{prolog}(\tilde\pi, \gamma)$. We obtain $\mathrm{prolog}(B_1 \wedge \cdots \wedge B_r \wedge \tilde\pi, \gamma) = \mathrm{no}$ and thus $\mathrm{prolog}(\tilde\pi, X \leftarrow Y_1, \ldots, Y_n, B_1, \ldots, B_r) = \mathrm{no}$. With the remark to Lemma 3.5 we conclude $\mathrm{prolog}(\tilde\pi, \delta) = \mathrm{no}$ as well. Hence we obtain $\mathrm{prolog}(\pi, \delta) = \mathrm{yes}$ because $X$ cannot be matched successfully with the clauses (1) to (j - 1). Using clause (j) (which is the program clause $\delta$) leads to success, which yields the assertion. □

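LEMMA 3.7. Given a Prolog program $\pi$, a question $\gamma := X \leftarrow Y_1, \ldots, Y_n$ and a question $\gamma' := Y' \leftarrow Z_1, \ldots, Z_r$ with $Y' \in \{Y_1, \ldots, Y_n\}$, $\mathrm{prolog}(\pi, \gamma) \in \{\mathrm{yes}, \mathrm{loop}\}$ and $\mathrm{prolog}(\pi, \gamma') \in \{\mathrm{yes}, \mathrm{loop}\}$. Then it holds that $\mathrm{prolog}(\pi, X \leftarrow U_1, \ldots, U_m) \in \{\mathrm{yes}, \mathrm{loop}\}$ with $\{U_1, \ldots, U_m\} = (\{Y_1, \ldots, Y_n\} \cup \{Z_1, \ldots, Z_r\}) \setminus \{Y'\}$.

Proof. With $\mathrm{prolog}(\pi, \gamma) \in \{\mathrm{yes}, \mathrm{loop}\}$ we have $\mathrm{prolog}(\pi, \delta) \in \{\mathrm{yes}, \mathrm{loop}\}$ for $\delta := X \leftarrow V_1, \ldots, V_l$ and $\{V_1, \ldots, V_l\} = \{Y_1, \ldots, Y_n\} \cup \{Z_1, \ldots, Z_r\}$. With $\mathrm{prolog}(\pi, \gamma') \in \{\mathrm{yes}, \mathrm{loop}\}$ we obtain $\mathrm{prolog}(\pi, \delta') \in \{\mathrm{yes}, \mathrm{loop}\}$ for the clause $\delta' := Y' \leftarrow U_1, \ldots, U_m$ and $\{U_1, \ldots, U_m\} = \{V_1, \ldots, V_l\} \setminus \{Y'\}$, because of Lemma 3.5. Since $\mathrm{prolog}(\pi, \delta) = \mathrm{prolog}(V_1 \wedge \cdots \wedge V_l \wedge \pi, X) \in \{\mathrm{yes}, \mathrm{loop}\}$ we obtain $\mathrm{prolog}(U_1 \wedge \cdots \wedge U_m \wedge \pi, X) \in \{\mathrm{yes}, \mathrm{loop}\}$ because, instead of having the fact $Y'$ asserted to $\pi$, it holds that $\mathrm{prolog}(U_1 \wedge \cdots \wedge U_m \wedge \pi, Y') \in \{\mathrm{yes}, \mathrm{loop}\}$. This completes the proof. □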

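THEOREM 3.8. $\pi_0 \overset{y-n}{\approx} \pi_1$ if and only if there does not exist an $i \in \{0, 1\}$ and a clause $\gamma$ of $\pi_0$ or $\pi_1$ with

$$\mathrm{prolog}(\pi_i, \gamma) = \mathrm{yes} \quad\text{and}\quad \mathrm{prolog}(\pi_{1-i}, \gamma) = \mathrm{no}. \qquad (8)$$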

Proof.

• If $\pi_0 \overset{y-n}{\approx} \pi_1$ it is obvious that there is no clause $\gamma'$ in $\pi_0 \wedge \pi_1$ for which (8) does not hold, by definition of yes-no equivalence.

• Assume that (8) holds. Further let us assume that there is a question $\gamma' := X \leftarrow Y_1, \ldots, Y_n$ with

$$\mathrm{prolog}(\pi_0, \gamma') = \mathrm{yes} \quad\text{and}\quad \mathrm{prolog}(\pi_1, \gamma') = \mathrm{no}. \qquad (9)$$

We will show that this is a contradiction by induction over the length $t$ of the refutation of goal $X$ with $Y_1 \wedge \cdots \wedge Y_n \wedge \pi_0$.

$t = 1$: This means the result yes stems from the refutation of goal $X$ with the fact $X$, which has to be a clause of $\pi_0$ because we do not query for tautologies. Thus we obtain $\mathrm{prolog}(\pi_0, X) = \mathrm{yes}$ by Lemma 3.6. With (8) we obtain that $\mathrm{prolog}(\pi_1, X) \neq \mathrm{no}$ and with Lemma 3.5 $\mathrm{prolog}(\pi_1, X \leftarrow Y_1, \ldots, Y_n) \neq \mathrm{no}$. This is a contradiction to (9).

$t \to t + 1$: The answer $\mathrm{prolog}(\pi_0, \gamma') = \mathrm{yes}$ is the result of the refutation of goal $X$ with a rule $\delta := X \leftarrow B_1, \ldots, B_r$ which has to be a clause of $\pi_0$. This implies that there is a refutation for each of the subgoals $B_1, \ldots, B_r$. Thus we obtain $\mathrm{prolog}(\pi_0, \gamma_i) = \mathrm{yes}$ for all $\gamma_i := B_i \leftarrow Y_1, \ldots, Y_n$, $1 \leq i \leq r$, and by induction hypothesis we obtain $\mathrm{prolog}(\pi_1, \gamma_i) \in \{\mathrm{yes}, \mathrm{loop}\}$. Because of Lemma 3.6 we know $\mathrm{prolog}(\pi_0, \delta) = \mathrm{yes}$ and with (8) we obtain $\mathrm{prolog}(\pi_1, \delta) \in \{\mathrm{yes}, \mathrm{loop}\}$. Hence we have

$$\mathrm{prolog}(\pi_1, X \leftarrow B_1, \ldots, B_r) \in \{\mathrm{yes}, \mathrm{loop}\}$$

$$\mathrm{prolog}(\pi_1, B_i \leftarrow Y_1, \ldots, Y_n) \in \{\mathrm{yes}, \mathrm{loop}\}, \quad 1 \leq i \leq r,$$

and with Lemma 3.7 we obtain

$$\mathrm{prolog}(\pi_1, X \leftarrow Y_1, \ldots, Y_n) \in \{\mathrm{yes}, \mathrm{loop}\}$$

which is a contradiction to (9).

This completes the proof of the theorem. □

Thus we have the result that for testing yes-no equivalence it is enough to test all clauses of both programs. Hence we have proved the following lemma:

LEMMA 3.9. Yes-no equivalence for Prolog programs $\pi_1$ and $\pi_2$ can be decided in quadratic time with linear space in the length of $\pi_1 \wedge \pi_2$. □
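The clause test of Theorem 3.8 can be sketched directly: each clause of either program is posed as a question to both programs, and the programs are rejected as soon as one answers yes and the other no. The Python sketch below is a hedged illustration only. Its interpreter `prolog` is a naive simulation (an assumption of this example) that may take exponential time, so the sketch does not attain the quadratic bound of Lemma 3.9. The function name `yes_no_equivalent` and the sample programs are hypothetical.

```python
YES, NO, LOOP = "yes", "no", "loop"

def prolog(program, head, body=()):
    """Naive depth-first, left-to-right evaluation with an ancestor check."""
    clauses = [(b, ()) for b in body] + [(h, tuple(bs)) for h, bs in program]

    def solve(goals):
        if not goals:
            return YES
        (atom, ancestors), rest = goals[0], goals[1:]
        if atom in ancestors:
            return LOOP
        for h, bs in clauses:
            if h == atom:
                subgoals = tuple((b, ancestors | {atom}) for b in bs)
                result = solve(subgoals + rest)
                if result != NO:
                    return result
        return NO

    return solve(((head, frozenset()),))


def yes_no_equivalent(p0, p1):
    """Clause test: only the clauses of both programs are used as questions."""
    for head, body in list(p0) + list(p1):
        answers = {prolog(p0, head, body), prolog(p1, head, body)}
        if answers == {YES, NO}:
            return False
    return True


# The fact A separates the first pair: one program answers yes, the other no.
print(yes_no_equivalent([("A", []), ("B", ["A"])], [("B", ["A"])]))  # False
# A <- A loops on the question A, while the fact A answers yes: still yes-no equivalent.
print(yes_no_equivalent([("A", ["A"])], [("A", [])]))                # True
```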

REMARK. Because of the soundness and completeness of SLD-resolution for a conjunction of definite clauses and the definition of yes-no equivalence it is obvious that $\pi_1 \approx \pi_2$ implies $\pi_1 \overset{y-n}{\approx} \pi_2$, but not vice versa, as shown in the following example.

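EXAMPLE. Consider the programs $\pi_1$ and $\pi_2$:

$$\pi_1 := (A \leftarrow B) \wedge (B \leftarrow A)$$

$$\pi_2 := A \wedge B.$$

It holds that $\pi_1 \overset{y-n}{\approx} \pi_2$ but $\pi_1 \not\approx \pi_2$ because of $\pi_1 \not\models A$ and $\pi_2 \models A$. □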

3.4. COMPLEXITY OF EQUIVALENCE OF PROLOG PROGRAMS

So far we have not considered loops when testing the yes-no equivalence of Prolog programs. Now we analyse the 'strict' equivalence of Prolog programs.

DEFINITION. Two Prolog programs $\pi_1$ and $\pi_2$ are equivalent ($\pi_1 \overset{P}{\approx} \pi_2$) if and only if $\mathrm{prolog}(\pi_1, \gamma) = \mathrm{prolog}(\pi_2, \gamma)$ for all questions $\gamma$.

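Obviously there is a relationship between the yes-no equivalence and the equivalence of Prolog programs. If we know that $\pi_1 \overset{y-n}{\approx} \pi_2$ and $\mathrm{prolog}(\pi_i, \gamma) \neq \mathrm{loop}$ for $i \in \{1, 2\}$ and all questions $\gamma$, then it holds that $\pi_1 \overset{P}{\approx} \pi_2$. Further, it is obvious that if $\pi_1 \overset{P}{\approx} \pi_2$ then $\pi_1 \overset{y-n}{\approx} \pi_2$, but we can construct Prolog programs with $\pi_1 \overset{y-n}{\approx} \pi_2$ and $\pi_1 \overset{P}{\not\approx} \pi_2$.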

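EXAMPLE.

$$\pi_1 := (A \leftarrow B) \wedge (B \leftarrow A) \wedge A$$

$$\pi_2 := A \wedge (A \leftarrow B) \wedge (B \leftarrow A).$$

It holds that $\pi_1 \overset{y-n}{\approx} \pi_2$ but $\mathrm{prolog}(\pi_1, A) = \mathrm{loop}$ and $\mathrm{prolog}(\pi_2, A) = \mathrm{yes}$. Hence we obtain $\pi_1 \overset{P}{\not\approx} \pi_2$. □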
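The example can be replayed mechanically. The following Python sketch is an illustration under stated assumptions: `prolog` is a naive interpreter (not the paper's procedure) that asserts the body of a question as facts in front of the program and reports loop when the selected atom reappears among its ancestors. It shows how the position of the fact A alone changes the answer to the question A.

```python
YES, NO, LOOP = "yes", "no", "loop"

def prolog(program, head, body=()):
    """Naive depth-first, left-to-right evaluation with an ancestor check."""
    clauses = [(b, ()) for b in body] + [(h, tuple(bs)) for h, bs in program]

    def solve(goals):
        if not goals:
            return YES
        (atom, ancestors), rest = goals[0], goals[1:]
        if atom in ancestors:
            return LOOP
        for h, bs in clauses:
            if h == atom:
                subgoals = tuple((b, ancestors | {atom}) for b in bs)
                result = solve(subgoals + rest)
                if result != NO:
                    return result
        return NO

    return solve(((head, frozenset()),))


# Same clauses, different order: the fact A comes last in pi1 and first in pi2.
pi1 = [("A", ["B"]), ("B", ["A"]), ("A", [])]
pi2 = [("A", []), ("A", ["B"]), ("B", ["A"])]

print(prolog(pi1, "A"), prolog(pi2, "A"))  # loop yes

# Clause test in the sense of Theorem 3.8: no clause yields yes on one side and no on the other.
answers = [(prolog(pi1, h, b), prolog(pi2, h, b)) for h, b in pi1]
print(answers)
```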

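Considering logical equivalence $\pi_1 \approx \pi_2$, it is obvious that we do not have such a relationship between $\pi_1 \overset{P}{\approx} \pi_2$ and $\pi_1 \approx \pi_2$, since there are examples with $\pi_1 \approx \pi_2$ and $\pi_1 \overset{P}{\not\approx} \pi_2$ and vice versa. For the example above we have $\pi_1 \approx \pi_2$ and $\pi_1 \overset{P}{\not\approx} \pi_2$. The following example shows the opposite case.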

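EXAMPLE.

$$\pi_1 := (A \leftarrow B) \wedge (B \leftarrow A) \wedge A$$

$$\pi_2 := (A \leftarrow B) \wedge (B \leftarrow A).$$

Obviously it holds that $\pi_1 \overset{P}{\approx} \pi_2$ but $\pi_1 \models A$ and $\pi_2 \not\models A$, which yields $\pi_1 \not\approx \pi_2$. □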
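Both claims of this example can be checked by brute force on the two atoms involved. The Python sketch below is an illustration only. It assumes a naive interpreter `prolog` (not the paper's procedure), enumerates all non-tautological questions over the atoms occurring in the programs (questions mentioning other atoms behave identically on both programs here), and compares logical consequences via the least model obtained by forward chaining. The helper names are hypothetical, and the enumeration is exponential in the number of atoms.

```python
from itertools import combinations

YES, NO, LOOP = "yes", "no", "loop"

def prolog(program, head, body=()):
    """Naive depth-first, left-to-right evaluation with an ancestor check."""
    clauses = [(b, ()) for b in body] + [(h, tuple(bs)) for h, bs in program]

    def solve(goals):
        if not goals:
            return YES
        (atom, ancestors), rest = goals[0], goals[1:]
        if atom in ancestors:
            return LOOP
        for h, bs in clauses:
            if h == atom:
                subgoals = tuple((b, ancestors | {atom}) for b in bs)
                result = solve(subgoals + rest)
                if result != NO:
                    return result
        return NO

    return solve(((head, frozenset()),))


def prolog_equivalent(p1, p2):
    """Compare the answers to every question over the atoms of both programs."""
    atoms = sorted({a for p in (p1, p2) for h, bs in p for a in [h, *bs]})
    for x in atoms:
        others = [a for a in atoms if a != x]
        for k in range(len(others) + 1):
            for ys in combinations(others, k):
                if prolog(p1, x, ys) != prolog(p2, x, ys):
                    return False
    return True


def least_model(program):
    """Atoms entailed by a set of definite clauses (forward chaining)."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, body in program:
            if head not in model and all(b in model for b in body):
                model.add(head)
                changed = True
    return model


pi1 = [("A", ["B"]), ("B", ["A"]), ("A", [])]
pi2 = [("A", ["B"]), ("B", ["A"])]

print(prolog_equivalent(pi1, pi2))                   # True
print("A" in least_model(pi1), "A" in least_model(pi2))  # True False
```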

When testing the yes-no equivalence of $\pi_1$ and $\pi_2$ we do not have to consider loops. For testing $\pi_1 \overset{P}{\approx} \pi_2$ loops are relevant. This leads to the following theorem.

THEOREM 3.10. The equivalence problem $\{(\pi_0, \pi_1) \mid \pi_0 \overset{P}{\approx} \pi_1 \text{ for Prolog programs } \pi_0 \text{ and } \pi_1\}$ is coNP-complete.

Proof. Since yes-no equivalence is necessary for equivalence we can do the test of yes-no equivalence first. This test can be done in time polynomial in the length of $\pi_0 \wedge \pi_1$. Thus we have to consider only Prolog programs $\pi_0$ and $\pi_1$ for which it holds that $\pi_0 \overset{y-n}{\approx} \pi_1$. For these Prolog programs we conclude because of the definition of yes-no equivalence: $\pi_0 \overset{P}{\approx} \pi_1$ if and only if there do not exist a question $\gamma$ and an $i \in \{0, 1\}$ with $\mathrm{prolog}(\pi_i, \gamma) = \mathrm{loop}$ and $\mathrm{prolog}(\pi_{1-i}, \gamma) \in \{\mathrm{yes}, \mathrm{no}\}$. To prove that this problem is coNP-complete we will use the same reduction $\pi_\alpha$ of the 1-in-3SAT problem as in the proof of Theorem 3.2:

$$\pi_0(\alpha) := \pi_\alpha$$

$$\pi_1(\alpha) := \bigwedge_{0 \leq i \leq n} \mathrm{Loop}_i.$$

This construction can be done in time polynomial in the length of $\alpha$. Since $\mathrm{prolog}(\pi_0(\alpha), \gamma) = \mathrm{prolog}(\pi_1(\alpha), \gamma) = \mathrm{yes}$ for each clause $\gamma$ occurring in $\pi_0(\alpha)$ or $\pi_1(\alpha)$ we obtain $\pi_0(\alpha) \overset{y-n}{\approx} \pi_1(\alpha)$ using Theorem 3.8.

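Next we prove that $\pi_0(\alpha) \overset{P}{\approx} \pi_1(\alpha)$ if and only if $\mathrm{prolog}(\pi_0(\alpha), \gamma) \neq \mathrm{loop}$ for all questions $\gamma$.

• Assume that $\pi_0(\alpha) \overset{P}{\approx} \pi_1(\alpha)$, which means $\mathrm{prolog}(\pi_0(\alpha), \gamma) = \mathrm{prolog}(\pi_1(\alpha), \gamma)$ for all questions $\gamma$. Since $\mathrm{prolog}(\pi_1(\alpha), \gamma) = \mathrm{yes}$ for all questions with head $\mathrm{Loop}_i$, $0 \leq i \leq n$, and $\mathrm{prolog}(\pi_1(\alpha), \gamma) = \mathrm{no}$ otherwise, that is $\mathrm{prolog}(\pi_1(\alpha), \gamma) \neq \mathrm{loop}$ for all questions $\gamma$, we obtain that $\mathrm{prolog}(\pi_0(\alpha), \gamma) \neq \mathrm{loop}$ for all questions $\gamma$ as well.

• Assume that $\mathrm{prolog}(\pi_0(\alpha), \gamma) \neq \mathrm{loop}$ for all questions $\gamma$. Since we know that $\mathrm{prolog}(\pi_1(\alpha), \gamma) \in \{\mathrm{yes}, \mathrm{no}\}$ for all questions $\gamma$, we obtain $\pi_0(\alpha) \overset{P}{\approx} \pi_1(\alpha)$ because $\pi_0(\alpha) \overset{y-n}{\approx} \pi_1(\alpha)$.

Theorem 3.2 yields $\mathrm{prolog}(\pi_0(\alpha), \gamma) \neq \mathrm{loop}$ for all questions $\gamma$ if and only if $\alpha$ is not in 1-in-3SAT. This problem is coNP-complete, which completes the proof of the theorem. □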

REMARK. With minor changes in the construction of $\pi_0(\alpha)$ and $\pi_1(\alpha)$ it is obvious that the same proof applies for Prolog programs that

• are fact-free

• have at most two clauses with the same head

• have at most three literals per clause.

Instead of $\pi_0(\alpha)$ and $\pi_1(\alpha)$ we construct $\tilde\pi_0(\alpha)$ and $\tilde\pi_1(\alpha)$ as follows:

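$$\tilde\pi_0(\alpha) := (\mathrm{Loop}_n \leftarrow \mathrm{Loop}_0) \wedge \bigwedge_{1 \leq i \leq n} \left( \psi_i(\alpha) \wedge (\mathrm{Loop}_{i-1} \leftarrow B, X^{(1)}_{i-1}) \right)$$

$$\tilde\pi_1(\alpha) := (\mathrm{Loop}_n \leftarrow \mathrm{Loop}_0) \wedge \bigwedge_{1 \leq i \leq n} \left( \psi_i(\alpha) \wedge (\mathrm{Loop}_{i-1} \leftarrow B) \right)$$

with

$$\begin{aligned}
\psi_i(\alpha) := {} & (X^{(1)}_{i-1} \leftarrow A_{i1}, A_{i2}) \wedge (X^{(1)}_{i-1} \leftarrow X^{(2)}_{i-1}) \\
& \wedge (X^{(2)}_{i-1} \leftarrow A_{i2}, A_{i3}) \wedge (X^{(2)}_{i-1} \leftarrow X^{(3)}_{i-1}) \\
& \wedge (X^{(3)}_{i-1} \leftarrow A_{i1}, A_{i3}) \wedge (X^{(3)}_{i-1} \leftarrow X^{(4)}_{i-1}) \\
& \wedge (X^{(4)}_{i-1} \leftarrow A_{i1}, \mathrm{Loop}_i) \wedge (X^{(4)}_{i-1} \leftarrow X^{(5)}_{i-1}) \\
& \wedge (X^{(5)}_{i-1} \leftarrow A_{i2}, \mathrm{Loop}_i) \wedge (X^{(5)}_{i-1} \leftarrow X^{(6)}_{i-1}) \\
& \wedge (X^{(6)}_{i-1} \leftarrow A_{i3}, \mathrm{Loop}_i) \wedge (X^{(6)}_{i-1} \leftarrow X^{(7)}_{i-1}) \\
& \wedge (X^{(7)}_{i-1} \leftarrow B).
\end{aligned}$$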

Thus we obtain that the coNP-completeness of the test of the equivalence of Prolog programs is not affected for Prolog programs with the above restrictions.

LEMMA 3.11. $\pi_1 \overset{P}{\approx} \pi_2$ can be decided deterministically in polynomial time for binary Prolog programs $\pi_1$ and $\pi_2$.

Proof. We prove that for binary Prolog programs $\pi_1 \overset{P}{\approx} \pi_2$ follows from the fact that we know $\mathrm{prolog}(\pi_1, \gamma) = \mathrm{prolog}(\pi_2, \gamma)$ for each clause $\gamma := A \leftarrow B$ and $\gamma := A$ for which the atoms $A$, $B$ occur in $\pi_1 \wedge \pi_2$. Obviously, this yields the assertion.

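Consider an arbitrary clause $\gamma := A \leftarrow A_1, \ldots, A_n$ and let $G$ be an arbitrary goal derived from $A$ by SLD-resolution with $A_1 \wedge \cdots \wedge A_n \wedge \pi_1$. Then $G$ contains at most one atom, hence in the case of $\mathrm{prolog}(\pi_1, \gamma) = \mathrm{yes}$ we conclude $\mathrm{prolog}(\pi_1, A \leftarrow A_i) = \mathrm{yes}$ for some $1 \leq i \leq n$. Therefore we have $\mathrm{prolog}(\pi_2, A \leftarrow A_i) = \mathrm{yes}$. Since $\pi_2$ is a binary Prolog program we obtain $\mathrm{prolog}(\pi_2, \gamma) = \mathrm{yes}$. As can be seen, $\mathrm{prolog}(\pi_1, \gamma) = \mathrm{no}$ implies $\mathrm{prolog}(\pi_2, \gamma) \neq \mathrm{yes}$. Further, from $\mathrm{prolog}(\pi_1, \gamma) = \mathrm{no}$ we get $\mathrm{prolog}(\pi_1, A) = \mathrm{no}$.

Therefore $\mathrm{prolog}(\pi_2, A) = \mathrm{no}$ holds, which guarantees $\mathrm{prolog}(\pi_2, \gamma) \neq \mathrm{loop}$ because $\pi_2$ is binary. This completes the proof of Lemma 3.11. □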

4. Conclusions

We have presented a linear-time procedure which, for a propositional Prolog program $\pi$ and a question $\gamma$, returns yes, no or loop depending on what the Prolog inference strategy would answer. Based on this procedure we have discussed the loop problem and different equivalence problems for propositional Prolog programs.

If we restrict the programs or introduce built-in predicates some questions remain open, e.g., the complexity of deciding the equivalence problem for deterministic Prolog programs or the problem of whether yes-no equivalence can be decided in linear time. Considering programs with retract and assert, which can be used to modify the program during execution, the undecidability of the equivalence problem can be shown by well-known methods simulating the halting problem of counter machines.

References

1. Aspvall, B., Plass, M. R. and Tarjan, R. E., 'A linear-time algorithm for testing the truth of certain quantified boolean formulas', Information Processing Letters 8, 121-123 (1979).

2. Börger, E., Berechenbarkeit, Komplexität, Logik, Vieweg-Verlag, Braunschweig (1985).

3. Börger, E., 'Logic as machine: complexity relations between programs and formulae', in Current Trend in Computer Science (ed. E. Börger), Computer Science Press.

4. Clocksin, W. F. and Mellish, C. S., Programming in Prolog, Springer-Verlag, Berlin (1984).

5. Cook, S. A., 'The complexity of theorem-proving procedures', Proc. Third ACM Symp. on Theory of Computing, pp. 151-158 (1971).

6. Dowling, W. F. and Gallier, J. H., 'Linear-time algorithms for testing the satisfiability of propositional Horn formulae', J. Logic Programming 1, 267-284 (1984).

7. Kleine Büning, H. and Schmitgen, S., Prolog, Teubner-Verlag, Stuttgart (1986).

8. Kowalski, R., 'Predicate logic as programming language', Information Processing 74 (ed. J. Rosenfeld), North-Holland, Amsterdam, pp. 556-574 (1974).

9. Schaefer, T. J., 'The complexity of satisfiability problems', Proc. 10th ACM Symp. on Theory of Computing, pp. 216-226 (1978).

10. Shoenfield, J. R., Mathematical Logic, Addison-Wesley, London (1967).