
Fakultät Informatik

EMCL Master Project

Proving Properties of Games using Answer Set Programming -

Optimization

Nguyen Trung Kien

Supervisor : Prof. Michael Thielscher

Advisor : Dipl. Stephan Schiffel

Abstract

A General Game Player (GGP) is a system capable of understanding and playing well a variety of games, including games that are previously unknown to it, without human intervention. Given only the rules of a game, a successful player must be able to extract game-specific knowledge and use it to construct efficient and effective search heuristics within a strict time limit. A usual way to extract game properties is to check them via simulation; however, the results are not necessarily correct. A recent approach described in [Schiffel and Thielscher, 2009] shows how to prove extracted knowledge with the help of Answer Set Programming (ASP). Although that system can prove properties of small games very quickly, it runs very slowly on big games like chess or checkers, or cannot ground the ASP programs at all. In this report, we present our optimizations that help the system run faster and more effectively.

Contents

1 Introduction

2 Preliminaries
2.1 Game Description Language
2.2 Answer Set Programming
2.2.1 Syntax and Semantics of Answer Set Prolog
2.2.2 Syntax supported by Potassco Systems
2.2.3 Incremental Answer Set Programming

3 Proving properties of General Games using ASP
3.1 GDL rules as State Transition System
3.2 Proving Properties of General Games using ASP
3.3 The Automated Theorem Prover for GGP

4 Optimizations
4.1 Weakly Restricted Variables
4.2 Current state generating
4.3 Recalculating features domains
4.4 Recalculating moves domain
4.5 Using Incremental ASP
4.6 State abstracting
4.7 Running Clingo and iClingo in parallel
4.8 Removing function symbols from true and next

5 Experiments
5.1 Comparing proving time
5.2 Comparing grounding time and the size of ground programs
5.3 Order of program rules

6 Conclusion

Bibliography


Chapter 1

Introduction

General Game Playing is an area which deals with developing agent systems that can understand and play well previously unseen games encoded in the Game Description Language [Love et al., 2008]. For small games like Tic-tac-toe, a GGP agent can play well simply by brute-force checking of every state. This cannot be applied to big games like chess or checkers. The player must be able to construct appropriate search strategies and heuristics for such games. For example, the player should choose Minimax search with α/β pruning for two-player, zero-sum, turn-taking games, while it may choose the MaxN algorithm for simultaneous games, etc. The player should also estimate evaluation values for intermediate states, so that it can choose a good (if not the best) action out of all legal ones. These tasks require the player to be able to extract game-specific knowledge like boards, pieces, step counters, etc., solely from the rules of the games.

Previous attempts at extracting properties from game description rules include [Kuhlmann et al., 2006] and [Kaiser, 2007]. [Kuhlmann et al., 2006] identify structures like game boards, step counters and movable pieces by syntactically matching the game description clauses against template patterns. Their system then uses self-simulation to check whether the extracted properties are violated. [Kaiser, 2007] approaches the problem from a statistical point of view. First, he categorizes the game state facts into groups based on the core predicates of the facts, and then plays some random matches to identify the arguments of the facts in which the symbols change from turn to turn. Arguments that change often are identified as pieces, whereas arguments that tend to stay the same are recognized as coordinates. Both [Kaiser, 2007] and [Kuhlmann et al., 2006] rely on random matches to guess the properties and do not try to prove them. That means they do not know whether the extracted knowledge holds in every state or not. The first attempt to automatically prove extracted properties is presented in [van der Hoek et al., 2007]. However, their approach searches the whole set of reachable positions of the game, which is not possible in the majority of games since the number of reachable positions is too large.

A recent approach, described in [Schiffel and Thielscher, 2009], shows how to use an automated Answer Set Solving system to prove properties that are valid in all legal positions. They present a way to automatically reduce a proving task so that the prover only needs to prove a simpler induction step and its base case. Their experiments show that for almost all the games, properties like those indicating a player's control or whether a game is zero-sum can be proved very fast, whereas more expensive features like boards can only be proved in some games. The approach provides a well-defined proof for properties, i.e., the properties that are proved to be true definitely hold in all legal positions. However, the bottleneck usually lies in the grounding phase of the logic program and in the execution of the solver [Potassco]. The time needed for grounding and proving is very sensitive to the way the properties are encoded. This project aims to improve the encodings so that the system can work efficiently on a wider range of games.

The rest of the report is organized as follows. The next chapter reviews the basic syntax of the Game Description Language and of Answer Set Prolog, both in general and, in particular, the syntax that is supported by the ASP solvers from the [Potassco] package. The chapter that follows recapitulates the approach to proving properties using Answer Set Programming shown in [Schiffel and Thielscher, 2009]. We then present our work on refining the problem encoding, including changes in fact generation, domain restriction, incremental encoding, state abstraction, etc. The experimental results then show the effect of the changes on the efficiency of the system in proving properties. The last chapter summarizes our work and discusses potential future work.


Chapter 2

Preliminaries

This chapter provides an overview of the Game Description Language as proposed in [Love et al., 2008] and of Answer Set Prolog, both in general and, in particular, the syntax supported by the Answer Set Programming systems of [Potassco].

2.1 Game Description Language

The Game Description Language (GDL) has been proposed and developed by [Love et al., 2008] to encode the rules of games in General Game Playing (GGP). The class of games covered by GDL consists of finite, discrete, deterministic multi-player games with complete information. At the beginning of a GGP contest, the description of a game written in GDL is sent to each player by an agent called the Gamemaster. The players then have to process those rules automatically in order to figure out legal moves and choose moves that may lead to a winning position. The Gamemaster plays the role of a referee, tracking the state of the game, notifying the players about their turn and sending them information about the last moves taken by their opponents. When a terminal state is reached, the Gamemaster sends a termination signal to all players and records their goal values as stated in the rules of the game.

In GDL, the state of a game is described in terms of a set of true facts. The syntax of GDL borrows the form of normal logic programs, using a few distinguished predicate symbols (keywords) to conceptualize games. Moreover, it is purely axiomatic: explicit algebra or mathematics is not included but can be axiomatized as part of the game rules. The keywords used in GDL are as follows:

role(R)        R is a player
init(F)        the term F holds in the initial state
true(F)        the term F holds in the current state
legal(R, M)    M is a legal move of player R in the current state
does(R, M)     player R takes move M
next(F)        the term F will hold in the next state
terminal       the current state is a terminal state
goal(R, N)     player R can get goal value N


GDL also provides a special predicate distinct(X, Y) to state that the two arguments are syntactically different.

It is important to note that the reserved predicates cannot be used freely in GDL rules. Instead, in order to ensure their intended semantics, some restrictions are introduced as below.

Definition 2.1 (Dependency Graph) - Let Π be a set of GDL rules. The dependency graph of Π is obtained by letting all the relation constants in Π be the vertices, and drawing an edge from p to q if there is a rule in Π that has q in the head and p in the body.

Definition 2.2 (GDL Restriction) - Let Π and G be a GDL game description and its dependency graph respectively. Then Π must satisfy the following conditions:

• role only appears in facts

• init only appears in the heads of rules, and in G, init is not connected to any of does, goal, legal, next, true or terminal

• next only appears in the heads of rules

• does only appears in the bodies of rules, and is not connected to any of goal, legal or terminal in G

• true only appears in the bodies of rules

GDL also imposes some syntactic restrictions on a set of clauses to ensure finiteness and decidability of the derivation of instances, i.e., the Safety, Stratified Negation and Recursion restrictions.

Definition 2.3 (Safety) - A rule is safe if every variable in its head or in a negative literal appears in at least one positive subgoal in its body.

Definition 2.4 (Stratified Negation) - A set of rules is said to be stratified if there are no cycles in its dependency graph that involve a negation.
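
As a rough illustration of Definitions 2.1 and 2.4, the following Python sketch builds the edges of a dependency graph and tests whether any negative edge lies on a cycle. The rule representation (head predicate, positive body predicates, negative body predicates) and all function names are assumptions made for this example; they are not taken from GDL tooling.

```python
from collections import defaultdict

def dependency_edges(rules):
    """Return edges (p, q, negated): body predicate p -> head predicate q."""
    edges = set()
    for head, pos, neg in rules:
        for p in pos:
            edges.add((p, head, False))
        for p in neg:
            edges.add((p, head, True))
    return edges

def reachable(edges, start):
    """All predicates reachable from start via one or more edges."""
    succ = defaultdict(set)
    for p, q, _ in edges:
        succ[p].add(q)
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in succ[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def is_stratified(rules):
    """True if no cycle of the dependency graph contains a negative edge."""
    edges = dependency_edges(rules)
    # A negative edge p -> q lies on a cycle iff p is reachable from q.
    return not any(neg and p in reachable(edges, q) for p, q, neg in edges)

# Hypothetical abstraction of some Tic-Tac-Toe rules: (head, positive, negative)
TICTACTOE = [
    ("row", ["true"], []),
    ("line", ["row"], []),
    ("open", ["true"], []),
    ("terminal", ["line"], []),
    ("terminal", [], ["open"]),
    ("goal", [], ["line", "open"]),
]
# A deliberately unstratified pair of rules: p :- not q.  q :- p.
UNSTRATIFIED = [("p", [], ["q"]), ("q", ["p"], [])]
```

Under these assumptions, is_stratified(TICTACTOE) holds, while the second rule set is rejected because the negative edge from q to p closes a cycle.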

Definition 2.5 (Recursion Restriction) - Let Π and G be a GDL game description and its dependency graph respectively. Suppose Π contains a rule

p(u_1, ..., u_m) ⇐ b_1(t_1) ∧ ... ∧ q(v_1, ..., v_k) ∧ ... ∧ b_n(t_n)

such that p and q appear in a cycle in G. Then for all i ∈ {1, ..., k}, v_i contains no variables, or v_i is one of u_1, ..., u_m, or v_i appears in some t_j where b_j is not in any cycle with p in G (1 ≤ j ≤ n).

As an example, Listing 2.1 shows a description of the game Tic-Tac-Toe encoded in GDL. The rules can be summarized as follows. Lines 1-2 declare the two players of the game, xplayer and oplayer. Lines 4-8 show the initial state, with every cell of the Tic-Tac-Toe board initialized with b (blank). A predicate control is used to encode which player's turn it is; it is set to xplayer in the initial state. Rules 10-12 and 19-23 update the state of the game after the players take their actions, while rule 14-16 is a frame axiom, which says that a cell stays the same in the next state if it is already marked. Rules 37-43 define the legal moves of the players, i.e., a player can mark a cell if it has control and that cell is blank in the current state (37-39), and if oplayer has control, then xplayer is only allowed to take the action noop, which here means "do nothing". Rules 57-64 state the conditions that terminate the game, and lines 45-55 specify how to calculate the goal values achieved by each player.

Listing 2.1: Tic-Tac-Toe

 1 (role xplayer)
 2 (role oplayer)
 3
 4 (init (cell 1 1 b))
 5 (init (cell 1 2 b))
 6 ...
 7 (init (cell 3 3 b))
 8 (init (control xplayer))
 9
10 (<= (next (cell ?m ?n x))
11     (does xplayer (mark ?m ?n))
12     (true (cell ?m ?n b)))
13
14 (<= (next (cell ?m ?n ?w))
15     (true (cell ?m ?n ?w))
16     (distinct ?w b))
17 ...
18
19 (<= (next (control xplayer))
20     (true (control oplayer)))
21
22 (<= (next (control oplayer))
23     (true (control xplayer)))
24
25 (<= (row ?m ?x)
26     (true (cell ?m 1 ?x))
27     (true (cell ?m 2 ?x))
28     (true (cell ?m 3 ?x)))
29 ...
30
31 (<= (line ?x) (row ?m ?x))
32 ...
33
34 (<= open
35     (true (cell ?m ?n b)))
36
37 (<= (legal ?w (mark ?x ?y))
38     (true (cell ?x ?y b))
39     (true (control ?w)))
40
41 (<= (legal xplayer noop)
42     (true (control oplayer)))
43 ...
44
45 (<= (goal xplayer 100)
46     (line x))
47
48 (<= (goal xplayer 50)
49     (not (line x))
50     (not (line o))
51     (not open))
52
53 (<= (goal xplayer 0)
54     (line o))
55 ...
56
57 (<= terminal
58     (line x))
59
60 (<= terminal
61     (line o))
62
63 (<= terminal
64     (not open))

2.2 Answer Set Programming

Answer Set Programming (ASP) is a programming paradigm in which a problem is encoded as a logic program whose answer sets correspond to the solutions of the problem. Solving a problem in ASP style consists of two stages:

• Encoding the problem as a logic program such that the answer sets of the logic program correspond to the solutions of the problem

• Grounding the logic program (substituting ground terms for its variables) and then finding answer sets of the generated ground program

To find answer sets, special-purpose inference engines, called answer set solvers, are used. In the last ten years, the need for efficient inference engines to find answer sets of logic programs has led to the development of some mature systems like Smodels [Smodels], DLV [DLV] and Clasp [Potassco], of which the Clasp solver outperformed the others in the recent ASP solver competition [ASPCom, 2007].

In this section, we summarize the syntax and semantics of Answer Set Prolog, a logic programming language built upon the answer set/stable model semantics of logic programs. We also take a closer look at the syntax that the Potassco systems support, especially the way to encode an incremental ASP program.

2.2.1 Syntax and Semantics of Answer Set Prolog

Let L be a first-order language. A positive literal l is of the form p(t) or ¬p(t), where p(t) is an atom in L; not l is a negative literal. Here we only consider normal logic programs, in which every rule has a non-empty head and contains no occurrences of ¬ or of the connective or. A rule r in such a program has the form:

h ← l_0, ..., l_k, not l_{k+1}, ..., not l_n

where h, l_0, ..., l_k, l_{k+1}, ..., l_n are literals without ¬.

We define the following notation: head(r) = h, body(r) = {l_0, ..., l_k, not l_{k+1}, ..., not l_n}, pos(r) = {l_0, ..., l_k}, and neg(r) = {l_{k+1}, ..., l_n}. If head(r) = ∅, r is called a constraint. If body(r) = ∅, then r is a fact.

A ground atom p (one that contains no variables) is said to be satisfied by a set of ground atoms M if p ∈ M; likewise, not p is satisfied by M if p ∉ M. We denote this as M |= p (resp. M |= not p). Given a set of ground literals S, M |= S iff M |= l for all l ∈ S.

A ground rule is obtained from a logic program rule by substituting ground terms for its variables. Let Π be an ASP program. We use Ground(Π) to denote the set of all ground instances of the rules in Π. The reduct of Π relative to M, written Π^M, is obtained from Ground(Π) by removing all rules that have not p in their bodies for some p ∈ M, and by removing all the negative literals from the remaining rules. M is called an answer set of Π if M is the minimal Herbrand model of Π^M.

[Niemela et al., 1999] presented an extension of logic programs with weight atoms, which we also use in this report:

l { p : q(t) } u

The atom above states that every answer set contains at least l and at most u different instances of p that satisfy q(t). Both the lower bound l and the upper bound u can be omitted; in that case, l defaults to 0 and the atom means that an arbitrary number of different instances of p may hold in an answer set. When the curly braces ({ }) are replaced by square brackets ([ ]) in the atom, the instances of p need not be different.

As an example, consider the following simple ASP program, Π:

q(1). q(2). p(1).
z(X) ← q(X), not p(X).

The ground program w.r.t. Π is:

q(1). q(2). p(1).
z(1) ← q(1), not p(1).
z(2) ← q(2), not p(2).

Let M_1 = {q(1), q(2), p(1), z(2)}. The reduct Π^{M_1} of Π relative to M_1 is:

q(1). q(2). p(1).
z(2) ← q(2).

M_1 is the least (minimal) Herbrand model of Π^{M_1}; therefore M_1 is an answer set of Π.

Readers may refer to [Gelfond, 2008] for more information.
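
The reduct construction of this section can be traced mechanically. The Python sketch below is only an illustration for ground normal programs: the triple representation of rules (head, positive body, negative body) and the function names are assumptions of this example, and it does not reflect how an actual ASP solver works internally.

```python
def reduct(ground_rules, m):
    """Reduct relative to m: drop rules with 'not p' where p is in m,
    then strip the negative literals from the remaining rules."""
    return [(head, pos) for head, pos, neg in ground_rules
            if not (set(neg) & set(m))]

def least_model(positive_rules):
    """Least Herbrand model of a negation-free ground program (fixpoint)."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in positive_rules:
            if head not in model and set(pos) <= model:
                model.add(head)
                changed = True
    return model

def is_answer_set(ground_rules, m):
    """m is an answer set iff it equals the least model of the reduct."""
    return least_model(reduct(ground_rules, m)) == set(m)

# Ground program of the example above: (head, positive body, negative body)
GROUND = [
    ("q(1)", [], []), ("q(2)", [], []), ("p(1)", [], []),
    ("z(1)", ["q(1)"], ["p(1)"]),
    ("z(2)", ["q(2)"], ["p(2)"]),
]
M1 = {"q(1)", "q(2)", "p(1)", "z(2)"}
```

With these definitions, is_answer_set(GROUND, M1) holds, whereas the candidate {q(1), q(2), p(1)} is rejected because the least model of its reduct additionally contains z(2).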

2.2.2 Syntax supported by Potassco Systems

In order to solve an ASP program Π, Π must first be grounded by a grounder, which substitutes variable-free terms for all the variables in Π. After this phase, we obtain a propositional program Ground(Π), which is then used as the input for a solver to compute answer sets. Ground(Π) must be finite and equivalent to Π. To guarantee that an equivalent finite ground program exists, different grounders impose different restrictions on ASP rules. For instance, dlv [DLV] requires a program to be safe, the lparse [Smodels] grounder only deals with ω-restricted programs, and the grounder Gringo [Potassco] (stand-alone or integrated in Clingo and iClingo) supports λ-restricted programs as stated in the definition below.

Definition 2.6 (λ-restrictedness) - Let Π be a normal logic program. For each rule r of Π, let var(r) be the set of all variables occurring in r. We call Π λ-restricted if there exists a mapping λ from the set of all predicates appearing in Π to the set of natural numbers such that, for every rule r ∈ Π and each variable V ∈ var(r), there is some atom A ∈ pos(r) such that V occurs in A and λ(p) > λ(q), where p is the predicate of head(r) and q is the predicate of A.

To understand the idea of λ-restrictedness, let us consider the following program (a restatement of Example 3.1 in [Martin et al., 2008]).

Example 2.1 Zigzag program

1: zig(0) ← not zag(0).    zig(1) ← not zag(1).
2: zag(0) ← not zig(0).    zag(1) ← not zig(1).
3: zigzag(X,Y) ← zig(X), zag(Y).
4: zagzig(Y,X) ← zigzag(X,Y).

The set of all predicates in the program with their respective arities is {zag/1, zig/1, zagzig/2, zigzag/2}. Since rules 1 and 2 have ground heads, they trivially satisfy λ-restrictedness and we can map zag/1 ↦ 0 and zig/1 ↦ 0. The mapping for the predicates in rule 3 must satisfy λ(zigzag/2) > λ(zig/1) (relative to variable X) and λ(zigzag/2) > λ(zag/1) (relative to variable Y). Mapping zigzag/2 ↦ 1 guarantees that condition. Finally, zagzig/2 ↦ 2 makes rule 4 λ-restricted; thus the total mapping {zag/1 ↦ 0, zig/1 ↦ 0, zigzag/2 ↦ 1, zagzig/2 ↦ 2} witnesses λ-restrictedness of the above program. If we add the rule

5: zigzag(X,Y) ← zagzig(Y,X).

to the program, then we cannot find any mapping λ that satisfies both conditions λ(zagzig/2) > λ(zigzag/2) (rule 4) and λ(zigzag/2) > λ(zagzig/2) (rule 5), and therefore the program is no longer λ-restricted. Notice that zigzag/2 and zagzig/2 appear in a cycle in the dependency graph of the program, and that the program is still safe, negation stratified and recursion restricted (i.e., it satisfies all the syntactic restrictions imposed on GDL rules). However, if we replace rule 4 with

zagzig(Y,X) ← zigzag(X,Y), zig(X), zag(Y).

the program is again λ-restricted, since we can apply the mapping {zag/1 ↦ 0, zig/1 ↦ 0, zagzig/2 ↦ 1, zigzag/2 ↦ 2} to it.
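
One way to search for a witness mapping λ is a simple fixpoint: a predicate receives a level once every rule defining it binds each of its variables in a positive body atom whose predicate already has a (lower) level. The Python sketch below follows this idea for the Zigzag variants discussed above. It is an illustration only; the rule representation and function names are assumptions of this example, and it is not how Gringo checks λ-restrictedness.

```python
def lambda_levels(rules):
    """Return a level mapping witnessing lambda-restrictedness, or None.

    Each rule is (head_predicate, variables_of_rule, positive_body), where
    positive_body is a list of (predicate, variables_of_atom) pairs.
    """
    predicates = {r[0] for r in rules} | {p for r in rules for p, _ in r[2]}
    levels = {}
    level = 0
    while len(levels) < len(predicates):
        ready = set()
        for pred in predicates - set(levels):
            own_rules = [r for r in rules if r[0] == pred]
            if all(_bound(r, levels) for r in own_rules):
                ready.add(pred)
        if not ready:
            return None          # no further predicate can be levelled
        for pred in ready:
            levels[pred] = level
        level += 1
    return levels

def _bound(rule, levels):
    """Every variable occurs in a positive body atom of an already levelled predicate."""
    _, variables, pos = rule
    return all(any(v in atom_vars and pred in levels for pred, atom_vars in pos)
               for v in variables)

# Rules 1 and 2 have ground heads and no variables; only predicates matter here.
GROUND_RULES = [("zig", set(), []), ("zag", set(), [])]
RULE3 = ("zigzag", {"X", "Y"}, [("zig", {"X"}), ("zag", {"Y"})])
RULE4 = ("zagzig", {"X", "Y"}, [("zigzag", {"X", "Y"})])
RULE5 = ("zigzag", {"X", "Y"}, [("zagzig", {"X", "Y"})])
RULE4_FIXED = ("zagzig", {"X", "Y"},
               [("zigzag", {"X", "Y"}), ("zig", {"X"}), ("zag", {"Y"})])

ORIGINAL = GROUND_RULES + [RULE3, RULE4]
WITH_RULE5 = ORIGINAL + [RULE5]
REPAIRED = GROUND_RULES + [RULE3, RULE4_FIXED, RULE5]
```

Under these assumptions, the original program yields the mapping given in the text, adding rule 5 yields None, and replacing rule 4 restores a mapping with zagzig at level 1 and zigzag at level 2.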

2.2.3 Incremental Answer Set Programming

The [Potassco] package provides the system iClingo to deal with incremental ASP programs. The main difference between a non-incremental and an incremental ASP program lies in the use of an incremental constant. At the beginning, the constant is set to 0 and the solver searches for an answer set. If it fails to find a solution, the incremental constant is increased by one and the process starts again. Let us consider an example of encoding a well-known Blocks-World instance [E. Erdem, 2002] as an incremental ASP program. Informally stated, in the problem instance we have three blocks b0, b1 and b2. We also have a location called table. In the initial state, b0 and b1 are placed on table, while b2 stays on top of b0. The blocks must be moved so that, at the last step, b0 is on table, b1 is on b0 and b2 is placed on top of b1. The condition is that only one block can be moved at a time: it can either be placed on the table or on top of another block. A block which is under (occupied by) another block cannot be moved.

Listing 2.2: Blocks World

 1 #base.
 2 block(b0; b1; b2).
 3 location(table).
 4 location(X) :- block(X).
 5
 6 on(b0, table, 0). on(b1, table, 0). on(b2, b1, 0).
 7
 8 #cumulative t.
 9 1 { move(X,Y,t) : block(X) : location(Y) : X != Y } 1.
10
11 :- move(X,Y,t),
12    1 { on(A,X,t-1) : block(A) : A != X,
13        on(B,Y,t-1) : block(B) : B != X : B != Y : Y != table }.
14
15 on(X,Y,t) :- move(X,Y,t).
16 on(X,Z,t) :- on(X,Z,t-1), block(X), location(Z), X != Z,
17              { move(X,Y,t) : location(Y) : Y != X : Y != Z } 0.
18
19 #volatile t.
20 goal(t) :- on(b0, table, t), on(b1, b0, t), on(b2, b1, t).
21 :- not goal(t).

Listing 2.2 shows an incremental ASP encoding of the problem, which is a shorter, modified version of the encoding in [Martin et al., 2008]. Notice that in an incremental ASP program (we consider only the syntax supported by iClingo), there are three special directives #base, #cumulative and #volatile, which separate the encoding into different parts: the static, cumulative and query parts respectively. The incremental constant t used with the directives #cumulative and #volatile is the placeholder for step numbers. The meaning of the rules is summarized as follows. Lines 2-4 define b0, b1 and b2 as blocks and table as a location, and state that all the blocks are also locations. Line 6 uses the predicate on/3 to describe the location of each block in the initial state, e.g., on(b0, table, 0) means that block b0 is on table at time step 0. The cumulative part starts with rule 9, which states that there is exactly one move at a time that places a block X onto a location Y (Y must be different from X). The constraint on lines 11-13 rules out all the cases that try to move a block X on top of Y when either block X is occupied by another block A, or Y is not table and is under some block B. Line 15 propagates the changes to the state at step t after some block is moved to a new location. Lines 16-17 state a frame axiom: a block stays at the same location if it is not moved to a new one. Rule 20 in the volatile part defines the goal at time step t, and finally, the query on line 21 checks whether the goal is satisfied at step t.

The shortest solution to the problem is move(b2,table,1) move(b1,b0,2) move(b2,b1,3), which can be checked by running iClingo on the encoding (supposing the encoding is saved in the file blocksworld.lp): iclingo blocksworld.lp.
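
The control loop described above (try bound 0, then 1, and so on until a solution exists) can be mimicked in plain Python. The sketch below is not iClingo and performs no incremental grounding; it is a depth-limited search that only illustrates why the first bound that admits a solution gives a shortest plan. The initial state is taken from line 6 of Listing 2.2, the function names are assumptions of this example, and, as a simplification, a block may never be placed on a block that already carries something.

```python
from itertools import product

BLOCKS = ("b0", "b1", "b2")
LOCATIONS = BLOCKS + ("table",)
INIT = {"b0": "table", "b1": "table", "b2": "b1"}   # line 6 of Listing 2.2
GOAL = {"b0": "table", "b1": "b0", "b2": "b1"}      # line 20 of Listing 2.2

def successors(state):
    """Yield (move, next_state) for every admissible single-block move."""
    occupied = set(state.values()) - {"table"}       # blocks carrying a block
    for block, target in product(BLOCKS, LOCATIONS):
        if block == target or block in occupied:
            continue                                  # a covered block cannot move
        if target != "table" and target in occupied:
            continue                                  # target already carries a block
        nxt = dict(state)
        nxt[block] = target
        yield (block, target), nxt

def plan_with_bound(state, bound):
    """Return a plan of exactly `bound` moves reaching GOAL, or None."""
    if bound == 0:
        return [] if state == GOAL else None
    for move, nxt in successors(state):
        rest = plan_with_bound(nxt, bound - 1)
        if rest is not None:
            return [move] + rest
    return None

def shortest_plan(max_bound=10):
    """Increase the step bound t until a plan exists (t mirrors the constant t)."""
    for t in range(max_bound + 1):
        plan = plan_with_bound(INIT, t)
        if plan is not None:
            return plan
    return None
```

For this instance, the bounds 0, 1 and 2 admit no plan, and the bound 3 yields the three moves listed in the text.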

Some important restrictions on the encoding of an incremental ASP program must be followed to obtain a well-defined incremental computation result from iClingo. First, the sets of ground instances of head atoms appearing in the base, cumulative and volatile parts must be pairwise disjoint. Second, ground instances of head atoms in the volatile part must not be used in the base and cumulative parts, and those of the cumulative part must not be used in the base part. Finally, for each pair of distinct steps, the sets of ground instances of head atoms of the cumulative parts, and likewise of the volatile parts, must be disjoint. The Blocks World encoding above satisfies all three conditions.

Lastly, although planning, where finding a shortest path is preferred, is a natural application domain for iClingo, we will show later that proving properties of games, which has nothing to do with planning, can also benefit from using an incremental encoding.


Chapter 3

Proving Properties using Answer Set Programming

This chapter recapitulates the method of using Answer Set Programming for proving properties of games proposed in [Schiffel and Thielscher, 2009]. To understand the idea, we first need to know how their system views a game description written in GDL, namely as a State Transition System.

3.1 GDL rules as State Transition System

As shown in Section 2.1, a program encoded in GDL must satisfy the stratified negation restriction, which means that it admits a standard model. Based on the meaning of the standard model, the program can be seen as a state transition system as follows.

Let Π be a valid GDL description. Since Π is finite, Π has a finite set of function symbols, including constants (functions with arity 0). From this set, a set of ground terms Σ can be computed. Information about the players and the initial state can be extracted easily from the GDL rules. The states that are different from the initial state are encoded using the keyword true. Let S = {f_1, ..., f_n} be a finite subset of Σ. The current state of the game is determined by the following set of facts:

S^true = {true(f_1), ..., true(f_n)}

The legal moves of each player can be derived as instances of legal(r, M) from Π and the facts that hold in the current state (S^true). If terminal is derivable, then S is a terminal state. When a state is terminal, the goal values for each player are the derivable instances of goal(r, N). Let A : {r_1, ..., r_n} ↦ Σ be a function that assigns a move to each player. To encode A as a joint move of all the players, the following set of facts is used:

A^does = {does(r_1, A(r_1)), ..., does(r_n, A(r_n))}

Finally, the derivable instances of next(F) determine the state that results from applying the joint move A to the set S^true.
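
To make the state transition view concrete, the following Python sketch hand-codes the legal and next rules of Tic-Tac-Toe from Listing 2.1, including the frame axiom for blank cells that is elided there. It is an illustration only: a real player derives these relations from the GDL rules, and the fluent representation as tuples and the function names are assumptions of this example.

```python
ROLES = ("xplayer", "oplayer")
MARK = {"xplayer": "x", "oplayer": "o"}

def initial_state():
    """S for the init facts: nine blank cells and control(xplayer)."""
    cells = {("cell", m, n, "b") for m in (1, 2, 3) for n in (1, 2, 3)}
    return frozenset(cells | {("control", "xplayer")})

def legal_moves(state, role):
    """Instances of legal(role, M) derivable in the given state."""
    if ("control", role) in state:
        return sorted(("mark", f[1], f[2])
                      for f in state if f[0] == "cell" and f[3] == "b")
    return ["noop"]

def next_state(state, joint_move):
    """State resulting from the joint move A, given as a dict role -> move."""
    result = set()
    for fluent in state:
        if fluent[0] == "cell":
            _, m, n, content = fluent
            marker = next((r for r in ROLES
                           if joint_move[r] == ("mark", m, n)), None)
            if content == "b" and marker is not None:
                result.add(("cell", m, n, MARK[marker]))   # cell gets marked
            else:
                result.add(fluent)                          # frame axioms
        elif fluent[0] == "control":
            other = "oplayer" if fluent[1] == "xplayer" else "xplayer"
            result.add(("control", other))                  # control alternates
    return frozenset(result)
```

For instance, applying the joint move {xplayer: mark(2, 2), oplayer: noop} to the initial state yields a state that contains cell(2, 2, x) and control(oplayer).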


3.2 Proving Properties of General Games using Answer Set Programming

The properties of games (or game-specific knowledge) we consider here are the properties that hold across all finitely reachable states. Since a GDL game is finite, properties of this kind can in general be detected by searching completely through the state transition diagram of the game. However, there are many games with large search spaces that make this approach impractical. Moreover, if a game is simple enough for a player to do a complete search, then the player does not actually need game-specific knowledge, since it can solve the game by brute-force search anyway.

The idea for proving properties that hold across all finitely reachable states is to reduce the proving task to a simple proof of an induction step and its base case. Informally stated, to prove a property φ by induction, a player must carry out two component proofs:

• that φ holds in the initial state

• and that if φ holds in a state, then once all the players make a joint move, φ also holds in the state that follows

Because a set of GDL rules is close to an Answer Set Prolog program, both in syntax and semantics, using the ASP paradigm to prove properties of general games is straightforward. In addition, the existence of some efficient ASP solvers helps to put this proving task into practice.

Now, let us consider the formal description of the proof method. Let Π be a set of game rules written in GDL. Let DOM be the set of negation-free rules that determine the domains of moves and features according to Π, using the predicates mdom and fdom respectively. For p ∈ {init, true, next}, let φ^p be an atom that, together with a respective set of rules, encodes the fact that φ is satisfied in the state represented by the keyword p. For instance, if M is an answer set for φ^next, then the state determined by {f : next(f) ∈ M} satisfies φ according to the following definition.

Definition 3.1 Let Π be a valid GDL description with its respective set of ground terms Σ. A state property is a first-order formula φ over a signature whose ground atoms are from Σ. Let S ∈ 2^Σ be a state in the state transition system for Π (remember that S is a set of ground terms). Then the notion of S satisfying φ (written as: S |= φ) is defined on the basis of

S |= f iff f ∈ S (where f is atomic and ground)

and with the usual definition for the logical connectives.

Recall that, given a set Γ of logic formulae and a logic formula ϕ, proving Γ |= ϕ is equivalent to proving that the set Γ ∪ {¬ϕ} is unsatisfiable (UNSAT). This proving task can be carried out with the help of an ASP solver by checking whether the program consisting of Γ and

t :- ϕ.
:- t.

is not satisfiable (has no answer set).

The two component proofs that the player must carry out to prove that a property φ holds across all reachable states are as follows.

• Show that the program consisting of Π ∪ DOM and

t0 :- φ^init.
:- t0.

is UNSAT.

• Assume Ω is a conjunction of properties that have been proved to hold across all reachable states (Ω can be empty). The induction step must show that the program consisting of Π ∪ DOM and

1: 0 { true(F) : fdom(F) }.
2: Ω^true.
3: φ^true.
4: 1 { does(R,M) : mdom(M) } 1 :- role(R).        (1)
5: :- does(R,M), not legal(R,M).
6: t :- φ^next.
7: :- t.

is UNSAT.

In the program above, line 1 allows an arbitrary number of instances of features to hold in the current state. Line 2 encodes the properties that have been proved to hold earlier. Line 3 axiomatizes the induction hypothesis that the property φ holds in the current state. Line 4 requires each player to select exactly one move and line 5 excludes all the moves that are not legal. Lines 6-7 encode the query whether φ holds in the next state. Note that does(R,M) is usually used to define φ^next.

For the correctness proof of the method, please refer to Section 4 in thepaper [Schiffel and Thielscher, 2009].
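
The shape of the induction step, a search for a counterexample state that satisfies the hypothesis but whose successor violates the property, can be imitated by brute force on a tiny fragment. The Python sketch below does this for the uniqueness of control in Tic-Tac-Toe. It is only an analogy to the ASP program above: the feature domain is restricted to control, joint moves are omitted because the next rules for control in Listing 2.1 do not depend on them, and all names are assumptions of this example.

```python
from itertools import chain, combinations

ROLES = ("xplayer", "oplayer")
FDOM = [("control", r) for r in ROLES]     # feature domain, control only

def all_states(fdom):
    """Every subset of the feature domain, like '0 { true(F) : fdom(F) }'."""
    return (frozenset(c) for c in chain.from_iterable(
        combinations(fdom, k) for k in range(len(fdom) + 1)))

def unique_control(state):
    """At most one instance of control holds in the state."""
    return sum(1 for f in state if f[0] == "control") <= 1

def next_control(state):
    """next(control(...)) as in lines 19-23 of Listing 2.1."""
    swap = {"xplayer": "oplayer", "oplayer": "xplayer"}
    return frozenset(("control", swap[f[1]]) for f in state)

def counterexamples(prop, step):
    """States where prop holds now but fails after one step."""
    return [s for s in all_states(FDOM) if prop(s) and not prop(step(s))]
```

An empty result of counterexamples(unique_control, next_control) plays the role of UNSAT: no state satisfies the hypothesis and violates the property afterwards. In contrast, a property such as "xplayer has control" is not inductive, and the search returns the state {control(xplayer)} as a counterexample.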

Let us now consider an example of proving the uniqueness of the third argument of the feature cell(X,Y,C) in the game Tic-Tac-Toe (Listing 2.1). Assume that the uniqueness of the argument of the feature control(R) has been proved to hold across all the legal reachable states. Here, we are only interested in the induction step, since the base case can be proved easily by checking whether the third argument of cell(X,Y,C) is unique for each pair of X and Y in the initial state (in fact, it is). The induction step of the proof is encoded in the program shown in Listing 3.1. The rules of the program are to be understood as in the description above. There is one thing to notice: in order to axiomatize the hypothesis that the third argument of cell is unique in the current state, the program first generates all the possible combinations of cell (lines 7, 20) and then preserves only the cases that satisfy the uniqueness of the third argument (lines 15-18). The same holds for encoding the uniqueness of the argument of control.


Listing 3.1: Proving the uniqueness of the third argument of cell

 1 column(1; 2; 3).
 2 row(1; 2; 3).
 3 mark(x; o; b).
 4 role(xplayer; oplayer).
 5
 6 fdom(control(X)) :- role(X).
 7 fdom(cell(X, Y, C)) :- column(X), row(Y), mark(C).
 8
 9 mdom(mark(X, Y)) :- column(X), row(Y).
10 mdom(noop).
11
12 unique_control :- 0 { true(control(R)) : role(R) } 1.
13 :- not unique_control.
14
15 unique_cell1(X, Y) :- 0 { true(cell(X, Y, C)) : mark(C) } 1,
16                       column(X), row(Y).
17 unique_cell :- unique_cell1(X, Y) : column(X) : row(Y).
18 :- not unique_cell.
19
20 0 { true(F) : fdom(F) }.
21 1 { does(R, M) : mdom(M) } 1 :- role(R).
22 :- does(R, M), not legal(R, M).
23
24 legal(R, mark(X, Y)) :- true(cell(X, Y, b)), true(control(R)).
25 legal(xplayer, noop) :- true(control(oplayer)).
26 legal(oplayer, noop) :- true(control(xplayer)).
27
28 next(cell(X, Y, x)) :- does(xplayer, mark(X, Y)), true(cell(X, Y, b)).
29 next(cell(X, Y, o)) :- does(oplayer, mark(X, Y)), true(cell(X, Y, b)).
30 next(cell(X, Y, C)) :- true(cell(X, Y, C)), C != b.
31 next(cell(X, Y, b)) :- true(cell(X, Y, b)), does(R, mark(Z, T)), X != Z.
32 next(cell(X, Y, b)) :- true(cell(X, Y, b)), does(R, mark(Z, T)), Y != T.
33
34 unique_next_cell1(X, Y) :- 0 { next(cell(X, Y, C)) : mark(C) } 1,
35                            column(X), row(Y).
36 unique_next_cell :- unique_next_cell1(X, Y) : column(X) : row(Y).
37 :- unique_next_cell.

3.3 The Automated Theorem Prover for GGP

The above proof method was implemented and integrated into the FLUX general game player [Schiffel and Thielscher, 2007]. The properties to be proved and the ASP programs used to prove those properties are generated automatically from the game description. To make this work, the system first has to compute the domains of the features and moves in the game. This is done using the dependency graph of the game.

Figure 3.1 shows a portion of the dependency graph for the game Tic-Tac-Toe. Ellipses represent arguments of functions or predicates, while squares represent function symbols and constants. There is an edge between an argument node and a constant (or a function symbol) if that constant occurs in the respective argument of a function or predicate in a rule of the game. An edge is added between two argument nodes if there is a rule in the game such that the same variable occurs in both arguments. The arguments in each connected component of the graph share a domain. The constants and function symbols in the connected component are the elements of that domain.


For example, in the description of the Tic-Tac-Toe game (Listing 2.1), rules 10-12 determine that the first argument of mark is connected with the first argument of cell since they share the same variable ?m; therefore (mark, 1) is connected with (cell, 1). From the init rules (lines 4-8), we know that the constants 1, 2, 3 appear in the first argument of cell; therefore they are also connected with (cell, 1).

[Figure: graph with the nodes 1, 2, 3, (cell, 1), (mark, 1), noop, (legal, 2) and mark/2]

Figure 3.1: A portion of a dependency graph to calculate domains of functions and predicates in the game Tic-Tac-Toe
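The domain computation described in this section can be sketched in Python as follows. This is only a minimal illustration, not the implementation used in FLUX: the term encoding (nested tuples), the function names and the three init facts are assumptions made for the example, and constants are attached to argument nodes as labels rather than being graph nodes of their own.

```python
from collections import defaultdict

def find(parent, x):
    """Union-find lookup with path halving."""
    while parent.setdefault(x, x) != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def compute_domains(rules):
    """Map every argument node (symbol, position) to the symbols of its component."""
    parent = {}
    symbols = defaultdict(set)   # argument node -> constants / function symbols seen there

    def visit(term, var_nodes):
        name, args = term
        for i, arg in enumerate(args, start=1):
            node = (name, i)
            find(parent, node)                      # register the node
            if isinstance(arg, tuple):              # function term such as mark(X, Y)
                symbols[node].add(f"{arg[0]}/{len(arg[1])}")
                visit(arg, var_nodes)
            elif arg[:1].isupper():                 # variable
                var_nodes[arg].append(node)
            else:                                   # constant
                symbols[node].add(arg)

    for rule in rules:                              # a rule is a list of atoms, head first
        var_nodes = defaultdict(list)
        for atom in rule:
            visit(atom, var_nodes)
        for nodes in var_nodes.values():            # same variable => same component
            for other in nodes[1:]:
                parent[find(parent, other)] = find(parent, nodes[0])

    merged = defaultdict(set)
    for node in list(parent):
        merged[find(parent, node)] |= symbols[node]
    return {node: merged[find(parent, node)] for node in list(parent)}

# Fragment of Tic-Tac-Toe; the init facts are an illustrative subset.
RULES = [
    [("legal", ("R", ("mark", ("X", "Y")))),
     ("true", (("cell", ("X", "Y", "b")),)),
     ("true", (("control", ("R",)),))],
    [("legal", ("xplayer", "noop")), ("true", (("control", ("oplayer",)),))],
    [("init", (("cell", ("1", "1", "b")),))],
    [("init", (("cell", ("2", "2", "b")),))],
    [("init", (("cell", ("3", "3", "b")),))],
]

if __name__ == "__main__":
    domains = compute_domains(RULES)
    print(sorted(domains[("mark", 1)]))
    print(sorted(domains[("legal", 2)]))
```

In this sketch (mark, 1) ends up in the same component as (cell, 1), and (legal, 2) collects noop and mark/2, which matches the portion of the graph described above.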


Chapter 4

Optimizations

The experimental results in [Schiffel and Thielscher, 2009] show that for almost all the games, properties like those encoding the uniqueness of the argument of control/1 or whether a game is zero-sum can be proved very fast by the system, whereas proving more expensive properties (e.g., board) only succeeds in some games. The problem comes mainly from the fact that in many complex games, the generated ASP programs cannot be successfully grounded because of memory and time limitations¹. In this chapter, we present our optimizations that help the system prove properties for a wider range of games and also speed up the proving process. The efforts concentrate on reducing the size of the ground programs and making the ASP solver run faster.

First we will discuss a way to transform a set of GDL rules to an ASP program so that the ASP program can be processed by the Potassco systems.

4.1 Weakly Restricted Variables

From the Preliminaries chapter, we have seen that GDL rules and Answer Set Prolog programs are very close w.r.t. both syntax and semantics. However, since they satisfy different syntactic restrictions, GDL rules must be adapted to be used with ASP solvers. Recall that the class of logic programs supported by the Potassco systems is the class of λ-restricted programs (Section 2.2.2). The corresponding restrictions imposed on GDL rules are Safety, Stratified Negation and Recursion Restriction (Section 2.1). Consider again Example 2.1. The logic program consisting of rules 1-5 satisfies the restrictions on GDL rules; however, it cannot be used with the Potassco systems since there does not exist a mapping that witnesses λ-restrictedness of the program. Notice that zigzag/2 and zagzig/2 appear in a cycle and use the same set of variables (considered as Weakly Restricted Variables (WRV) by the grounder Gringo). One possible fix is adding the two predicates zig(X) and zag(Y) (i.e., two domain predicates) to rule 4. The resulting program preserves the semantics (i.e., has the same answer sets) as before, but now it can be processed by Clingo. One should also pay attention when transforming the special predicate distinct/2 from GDL syntax to ASP syntax (!=), because != is considered as negative, and when guessing a mapping λ for predicates in a rule r, the grounder Gringo (stand-alone or embedded in Clingo or iClingo) only uses predicates that appear in pos(r).

¹ The memory limitation in the experiments was 1 GB.

The following algorithm shows a method to transform a set of GDL rules to an ASP program that can be processed by the Potassco systems.

Given a set of GDL rules Π, let G be the dependency graph of Π as defined in Definition 2.1. For every rule r in Π, let HeadVar(r) be the set of variables of the atom in the head of r, and let BodyVar(r) be the set of variables of all those atoms in the body of r that are different from distinct and do not appear in a cycle (in G) with the head atom. The set Diff(r) defined by

Diff(r) = HeadVar(r) \ BodyVar(r)

contains the Weakly Restricted Variables of r. For each distinct variable V in Diff(r), add a domain predicate for V to the body of r. The domain of V is the domain of the corresponding argument of the head atom, which can be computed using the dependency graph (as shown in Section 3.3). Finally, add the instances of the domain predicates for V as facts to the program Π.

Now, let us return to Example 2.1 with rules 1-5. The dependency graph used to calculate the domains for the predicates in the program is depicted in Figure 4.1. The graph shows that the second argument of zagzig shares the same domain with the first argument of zigzag and with the single argument of zig, and that this domain contains the two constants 0 and 1. With the same method, we get the common domain of (zagzig, 1), (zigzag, 2) and (zig, 1).

[Figure: graph with the nodes (zagzig, 2), (zigzag, 1), (zig, 1), (zagzig, 1), (zigzag, 2), 0 and 1]

Figure 4.1: The dependency graph to calculate domains for predicates in the program Zigzag

Applying the algorithm to the Zigzag example, the first three lines are preserved. When the rule on line 4 is processed, since zagzig and zigzag appear in a cycle and Diff(r) = {X, Y}, the algorithm adds the domain predicates for X and Y to the body of the rule. Let dom_of_zagzig_1 and dom_of_zagzig_2 be the domain predicates for the first and second argument of the predicate zagzig. Line 4 will be changed to:

zagzig(Y,X) ← zigzag(X,Y), dom_of_zagzig_1(Y), dom_of_zagzig_2(X).

And then the following facts will be added to the program:

dom_of_zagzig_1(0).
dom_of_zagzig_1(1).
dom_of_zagzig_2(0).
dom_of_zagzig_2(1).
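The computation of Diff(r) and the addition of domain predicates can be sketched in Python as below. This is a minimal illustration under several assumptions: atoms are flat (no nested function terms), the function names are made up for the example, and since Example 2.1 is not repeated here, the second rule in RULES is a hypothetical stand-in for rule 5 that merely closes the cycle between zagzig and zigzag. The domain facts themselves are not generated.

```python
from collections import defaultdict

def is_var(term):
    """Variables start with an upper-case letter, as in ASP syntax."""
    return term[:1].isupper()

def reachable(graph, start):
    """All predicates reachable from start by following dependency edges."""
    seen, stack = set(), [start]
    while stack:
        for nxt in graph.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def add_domain_predicates(rules):
    """Add dom_of_<pred>_<i>(V) for every weakly restricted head variable V."""
    graph = defaultdict(set)                 # edge: body predicate -> head predicate
    for (head_pred, _), body in rules:
        for body_pred, _ in body:
            graph[body_pred].add(head_pred)
    fixed = []
    for (head_pred, head_args), body in rules:
        cyclic = reachable(graph, head_pred)  # body predicates in here form a cycle with the head
        body_vars = {a for pred, args in body
                     if pred != "distinct" and pred not in cyclic
                     for a in args if is_var(a)}
        extra, done = [], set()
        for i, a in enumerate(head_args, start=1):
            if is_var(a) and a not in body_vars and a not in done:   # a is in Diff(r)
                done.add(a)
                extra.append((f"dom_of_{head_pred}_{i}", (a,)))
        fixed.append(((head_pred, head_args), list(body) + extra))
    return fixed

RULES = [
    (("zagzig", ("Y", "X")), [("zigzag", ("X", "Y"))]),   # rule 4 of the Zigzag example
    (("zigzag", ("X", "Y")), [("zagzig", ("Y", "X"))]),   # hypothetical rule closing the cycle
]

if __name__ == "__main__":
    for (pred, args), body in add_domain_predicates(RULES):
        atoms = ", ".join(f"{p}({', '.join(a)})" for p, a in body)
        print(f"{pred}({', '.join(args)}) :- {atoms}.")
```

For the first rule the sketch produces the body zigzag(X, Y), dom_of_zagzig_1(Y), dom_of_zagzig_2(X), which agrees with the transformed line 4 above.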


Although the program is already λ-restricted after applying the changes above, the algorithm still processes the rule on line 5 as it did the rule on line 4. Leaving the algorithm this way helps it work in the general case (note that the algorithm only aims to fix weak-restrictedness errors in case there are cycles in the dependency graph of a game description, which happens quite often in general games).

It is obvious that the answer sets of the new program are the answer sets of the old program plus all the facts denoting domain elements, i.e., the set of atoms in each answer set whose predicates appear in the original program stays the same.

4.2 Current state generating

A typical programming pattern in ASP includes Generate, Define and Test parts. The Generate part is used to generate possible states of the world, and the Define part defines new atoms based on the information introduced by the Generate part. Lastly, the Test part consists of constraints that eliminate all answer sets which do not satisfy some specific conditions. The idea is simple; however, care must be taken while encoding a problem, since generating everything and then restricting the answer sets may use up time and resources. Let us consider again the program in Listing 3.1. First, every combination of cell and control is generated by lines 6-7 (together with the choice rule on line 20), including the combinations in which the argument of control and the third argument of cell are not unique. Then, instances of cell and control are used to compute the instances of the predicates next and legal, and therefore instances of next and legal based on non-unique control and cell are also considered when searching for answer sets. The wrong answer sets are then omitted by the constraints in lines 12-18.

The workload of the ASP solver would be reduced if the constraints were taken into account while generating states. For example, in the above program, to encode the fact that at most one player can have control in each state, the following rule should be used:

0 { true(control(R)) : role(R) } 1.

and to generate unique cell:

0 { true(cell(X,Y,C)) : mark(C) } 1 :- column(X), row(Y).

instead of the rules in lines 6-7 and 12-20.
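To get a feeling for the difference, the number of candidate current states admitted by the two ways of generating can be counted directly for Tic-Tac-Toe. The short Python calculation below is only an illustration of that arithmetic; it counts candidate states, not the size of the ground program or the actual effort of the solver, and the variable names are chosen for this example.

```python
cells = 3 * 3      # column x row
marks = 3          # x, o, b
roles = 2          # xplayer, oplayer

# Listing 3.1, line 20: any subset of the fdom instances may hold.
feature_instances = cells * marks + roles
unconstrained = 2 ** feature_instances

# Constrained generation: each cell holds at most one mark (none, x, o or b),
# and at most one role has control (none, xplayer or oplayer).
constrained = (marks + 1) ** cells * (roles + 1)

if __name__ == "__main__":
    print(feature_instances, unconstrained, constrained)
```

With 29 feature instances, the unconstrained choice rule admits 2^29 = 536870912 candidate states, whereas the two constrained rules admit 4^9 · 3 = 786432.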

4.3 Recalculating features domains

The domains for the arguments in all the features are important for the system to generate game states (see the previous chapter). Care must be taken when calculating those domains since the size of the domains affects both grounding and solving a program. Let us consider the following simple program:


Example 4.1

1 : succ(1, 2).
2 : succ(2, 3).
3 : q(1).
4 : p(X) :- q(X), succ(X,Y).

If we apply the method in Section 3.3, then the argument of the predicate p will share the same domain as the argument of q and the first argument of succ, that is {1, 2} (since they use the same variable X). But it is easy to see that the domain of the argument of p only contains the constant 1. Knowing that some predicates share the same domain may help a game player construct some useful strategies. However, in proving properties, where the system has to generate all possible current states, this may badly affect the efficiency of the grounder and the solver. A simple change in the domain computation helps to solve this problem: in order to calculate the domain for an argument in the head of a rule, we take the intersection of the domains of the corresponding arguments in the body of that rule. For example, in the program above, the domain of the argument of q is {1} and that of the first argument of succ is {1, 2}; hence the domain for the argument of p is the intersection of {1} and {1, 2}, that is {1}.
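The difference between the shared domain and the intersected domain for Example 4.1 can be reproduced with a few lines of Python. The dictionary layout and function names below are assumptions made for this sketch; the per-argument domains are read off the facts of the example by hand.

```python
# Domains of body arguments, taken from the facts of Example 4.1.
DOMAINS = {
    ("succ", 1): {1, 2},
    ("succ", 2): {2, 3},
    ("q", 1): {1},
}

def shared_domain(occurrences, domains):
    """Section 3.3 style: all connected arguments share the union of the symbols."""
    return set.union(*(domains[o] for o in occurrences))

def intersected_domain(occurrences, domains):
    """Refined style: intersect the domains of the body occurrences of the variable."""
    return set.intersection(*(domains[o] for o in occurrences))

# In p(X) :- q(X), succ(X, Y) the head variable X occurs at (q, 1) and (succ, 1).
X_OCCURRENCES = [("q", 1), ("succ", 1)]

if __name__ == "__main__":
    print(shared_domain(X_OCCURRENCES, DOMAINS))
    print(intersected_domain(X_OCCURRENCES, DOMAINS))
```

Both helper functions expect at least one body occurrence of the variable; a head variable without any body occurrence would need separate treatment.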

4.4 Recalculating moves domain

As stated in the previous chapter, in order for the automated theorem prover to work, the system uses a dependency graph to calculate the domain of all possible moves. The calculated move domain (mdom) is then used in (1) to generate moves for individual players in each state. Notice that mdom contains all the possible instances of actions (which are the second arguments of the predicate legal) instantiated with every element in the domains of their arguments. For example, in the game Tic-Tac-Toe, mdom contains noop and every instance of mark(X,Y) with X, Y ∈ {1, 2, 3}.

This method turns out to be feasible for single-player games and for multi-player games that are symmetric w.r.t. the legal moves of all the players. However, for games that define different restrictions on the legal moves of each player, this method will generate many illegal moves that can never happen in a real match. For example, in chess, player white is only allowed to move white pieces and player black is only allowed to move black pieces, whereas with mdom calculated and used as above, player white can also manipulate all the black pieces and vice versa, which is incorrect. Although all the answer sets that contain illegal moves will be omitted (by the rule :- does(R,M), not legal(R,M).), letting such spurious information be generated makes the ground program and the search space of the ASP program very large. In addition, game descriptions normally restrict variables in the rules that determine legal moves (e.g., pieces in chess cannot jump arbitrarily; each has its own way of moving); this should also be considered to reduce the domain of moves.


Algorithm 1 Algorithm to make static domain from game rules

function make_static_domain_rules(PName/Arity, GameRules)
    if PName/Arity has been processed then
        return [ ]
    else
        remember PName/Arity
        Rules ← get every rule in GameRules that has PName/Arity in its head
        StaticRules = [ ]
        for every rule R in Rules do
            if R is a fact then
                add static_R to StaticRules
            else
                (Head :- Tail) = R
                DomRules = [ ]
                NewTail = [ ]
                for every literal L in Tail do
                    DomPreds = [ ]
                    if L is static then
                        add L to NewTail
                    else if L is positive then
                        if L is true(...) then
                            DomPreds ← get domain predicates for every variable in L
                            add predicates in DomPreds to NewTail
                        else
                            NP/NA ← L    (NP and NA are the predicate name and arity of L)
                            DomPreds ← make_static_domain_rules(NP/NA, GameRules)
                            add static_L to NewTail
                        end if
                        if NewTail ≠ [ ] then
                            add (static_Head :- NewTail) to DomRules
                        else
                            add static_Head to DomRules
                        end if
                    end if
                    add DomPreds to DomRules
                end for
            end if
            add DomRules to StaticRules
        end for
        return StaticRules
    end if
end function

We propose here a different method to capture the legal moves of players in general games. The idea is to compute a static moves domain recursively from the legal rules. Notice that we can make a static rule from a game rule by first preserving all the static literals (both positive and negative literals that do not depend on the true predicate) and deleting all other negative literals in its body.


For every remaining positive literal, if it has the form true(...), we replace it with the domain predicates for all the variables appearing in it. The domain predicates can be computed using the dependency graph and are added to the game rules as facts. Finally, if the literal depends on the true predicate and its predicate has not been processed so far, we gather from the game rules all the rules that have this predicate in their heads and process them recursively in the same way.

The calculation is shown in Algorithm 1. The initial inputs to the function make_static_domain_rules are legal/2 and the game rules. The result of the process consists of static rules for legal/2 and static rules for the predicates of the positive literals that appear in the bodies of the legal rules. All the predicates that were "made" static are given the prefix static_ in order to differentiate them from the original predicates.
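The idea behind Algorithm 1 can be sketched in Python as follows. This is a simplified reading of the algorithm rather than a transcription of it: terms are encoded as nested tuples, a literal is a pair (positive, atom), the domain predicate names of the form dom_<feature>_<i> are hypothetical, and the facts for the domain predicates are not generated.

```python
def depends_on_true(pred, rules, seen=None):
    """True if pred is 'true' or is defined (directly or indirectly) via 'true'."""
    seen = set() if seen is None else seen
    if pred == "true":
        return True
    if pred in seen:
        return False
    seen.add(pred)
    return any(depends_on_true(atom[0], rules, seen)
               for head, body in rules if head[0] == pred
               for _, atom in body)

def domain_atoms(term):
    """Domain predicates for all variables of a feature term such as cell(X, Y, b)."""
    name, args = term
    atoms = []
    for i, arg in enumerate(args, start=1):
        if isinstance(arg, tuple):
            atoms += domain_atoms(arg)
        elif arg[:1].isupper():
            atoms.append((f"dom_{name}_{i}", (arg,)))
    return atoms

def make_static(pred, rules, processed):
    """Static versions of all rules for pred and for the predicates they depend on."""
    if pred in processed:
        return []
    processed.add(pred)
    out = []
    for head, body in rules:
        if head[0] != pred:
            continue
        new_body, extra = [], []
        for positive, (name, args) in body:
            if not depends_on_true(name, rules):
                new_body.append((positive, (name, args)))       # static literal: keep
            elif positive and name == "true":
                new_body += [(True, d) for d in domain_atoms(args[0])]
            elif positive:
                extra += make_static(name, rules, processed)    # recurse on the definition
                new_body.append((True, ("static_" + name, args)))
            # negative literals that depend on true are dropped
        out.append((("static_" + head[0], head[1]), new_body))
        out += extra
    return out

def show(term):
    if isinstance(term, str):
        return term
    name, args = term
    return f"{name}({', '.join(show(a) for a in args)})" if args else name

def show_rule(rule):
    head, body = rule
    text = show(head)
    if body:
        text += " :- " + ", ".join(("" if pos else "not ") + show(a) for pos, a in body)
    return text + "."

# The legal rules of Tic-Tac-Toe (Listing 3.1, lines 24-26).
LEGAL_RULES = [
    (("legal", ("R", ("mark", ("X", "Y")))),
     [(True, ("true", (("cell", ("X", "Y", "b")),))),
      (True, ("true", (("control", ("R",)),)))]),
    (("legal", ("xplayer", "noop")), [(True, ("true", (("control", ("oplayer",)),)))]),
    (("legal", ("oplayer", "noop")), [(True, ("true", (("control", ("xplayer",)),)))]),
]

if __name__ == "__main__":
    for rule in make_static("legal", LEGAL_RULES, set()):
        print(show_rule(rule))
```

For Tic-Tac-Toe the sketch yields one rule for mark(X, Y) whose body consists only of domain predicates, plus the two facts for noop, one per player, so the resulting move domain is already player-dependent.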

The moves domain calculated as above is now player-dependent. We need to change the way it is used in the program (1). Rule 4 is changed to:

4 : 1 { does(R,M) : static_legal(R,M) } 1 :- role(R).

Recall that in the program (1), does(R,M) is usually used to define φnext. Therefore, instances of φnext that depend on illegal actions are also generated. The wrong answer sets are only omitted later by rule 5. This wastes time and memory. We can eliminate this waste by adding the following rule

legal_does(R,M) :- legal(R,M), does(R,M).

to the program and replacing every occurrence of does(R,M) in the rules that define φnext with legal_does(R,M).

4.5 Using Incremental ASP

To encode the induction step of the proof method, we said that an arbitrary number of instances of features could hold in a current state (Section 3.2), i.e., a state does not necessarily contain instances of all the features. Each feature is generated by the rules described in Section 4.2. In many games, there are cases in which the uniqueness of features can be proved to be false using only some of the dependent features, not all. This characteristic naturally leads to the idea of using an incremental ASP program in which the rules to generate features are introduced step by step, which can be done with the help of the incremental constant. The ASP solver is then configured to run the ASP program until it finds an answer set.

Recall that an incremental ASP program consists of three parts: the base part includes all static data that hold in all steps, the cumulative part contains data that depend on each step, and the volatile part encodes queries. Applied to proving properties in general games, data that are static (do not depend on the true predicate) are put in the base part, while the rules that generate features and the rules whose head atoms depend on the true predicate make up the cumulative part. Lastly, the query is placed in the volatile part.

Listing 4.1 shows an incremental description of a program to prove the uniqueness of the third argument of the feature cell in the game Tic-Tac-Toe. Notice the use of the incremental constant k in the program. The rules to generate the features control and cell (14-15) are introduced in different steps: control in step 1 (k == 1) and cell in step 2 (k == 2). The constant k is also added to the atoms that appear in the head of all the rules in the cumulative and the volatile parts (except the atoms appearing in the generation rules 14-15). The description satisfies the three restrictions on encoding an incremental ASP program presented in Section 2.2.3. First, the sets of ground instances of head atoms appearing in the three parts are pairwise disjoint. Second, ground instances of head atoms of the volatile part are not used in the other parts, and those of the cumulative part are not used in the base part. And finally, since every atom appearing in the head of the cumulative and volatile parts carries the constant k, the sets of ground instances of either part are obviously disjoint for each pair of distinct steps.

Listing 4.1: Incremental encoding of the program in Listing 3.1

 1  #base.
 2  column(1; 2; 3).
 3  row(1; 2; 3).
 4  mark(x; o; b).
 5  role(xplayer; oplayer).
 6
 7  mdom(R, mark(X, Y)) :- column(X), row(Y), role(R).
 8  mdom(xplayer, noop).
 9  mdom(oplayer, noop).
10
11  1 { does(R, M) : mdom(R, M) } 1 :- role(R).
12
13  #cumulative k.
14  0 { true(control(R)) : role(R) } 1 :- k==1.
15  0 { true(cell(X, Y, C)) : mark(C) } 1 :- column(X), row(Y), k==2.
16
17  legal_does(R, M, k) :- legal(R, M, k), does(R, M).
18  legal(R, mark(X, Y), k) :- true(cell(X, Y, b)), true(control(R)).
19  legal(xplayer, noop, k) :- true(control(oplayer)).
20  legal(oplayer, noop, k) :- true(control(xplayer)).
21
22  next(cell(X, Y, x), k) :- legal_does(xplayer, mark(X, Y), k),
23                            true(cell(X, Y, b)).
24  next(cell(X, Y, o), k) :- legal_does(oplayer, mark(X, Y), k),
25                            true(cell(X, Y, b)).
26  next(cell(X, Y, C), k) :- true(cell(X, Y, C)), C != b.
27  next(cell(X, Y, b), k) :- true(cell(X, Y, b)),
28                            legal_does(R, mark(Z, T), k), X != Z.
29  next(cell(X, Y, b), k) :- true(cell(X, Y, b)),
30                            legal_does(R, mark(Z, T), k), Y != T.
31
32  #volatile k.
33  :- does(R, M), not legal(R, M, k).
34  unique_next_cell1(X, Y, k) :- 0 { next(cell(X, Y, C), k) : mark(C) } 1,
35                                column(X), row(Y).
36  unique_next_cell(k) :- unique_next_cell1(X, Y, k) : column(X) : row(Y).
37  :- unique_next_cell(k).

If more than one feature with the same predicate needs to hold in a current state, all of them must be introduced in the same step (same value for k).

Since it is not clear what a good order for introducing the features is (the order may affect the efficiency of the solver), our system sets the step numbers for the features randomly.
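A random assignment of step numbers that keeps features with the same predicate in the same step could look like the following Python sketch. The function name and the representation of features as (predicate, arity) pairs are assumptions for this illustration; the source does not describe how the random choice is implemented.

```python
import random

def assign_steps(features, rng=random):
    """Give every feature predicate a random step number 1..n.

    Features are (predicate, arity) pairs; pairs with the same predicate
    name receive the same step, as required for the incremental encoding.
    """
    predicates = sorted({name for name, _ in features})
    rng.shuffle(predicates)
    return {name: step for step, name in enumerate(predicates, start=1)}

if __name__ == "__main__":
    steps = assign_steps([("control", 1), ("cell", 3)])
    print(steps, "maximum step:", max(steps.values()))
```

The largest assigned step number is the maximum value that the incremental constant k has to reach.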

The program above can be run with iClingo using the option --imax=2 (2 is the maximum value of k in this program).


4.6 State abstracting

It often happens that, in order to prove a property, we need to prove some dependent properties first. For example, in Tic-Tac-Toe, if the argument of the feature control has not been proved to be unique before, the uniqueness of the third argument of cell would be proved to be false. If no argument of a dependent feature is unique, or if the uniqueness of all the arguments of that feature cannot be proved due to a lack of time or memory, the rule to generate that feature will have the form:

0 { true(feature(...)) } :- tail(...).    (2)

(tail(...) represents the conjunction of all the literals in the tail of the rule, if there are any). Notice that the weight atom above does not have an upper bound, which means that every combination of true(feature(...)) can hold or not hold in a current state. Therefore it has the same meaning as the following weight atom:

0 { true(feature(...)) : tail(...) }.

This means that an ASP solver must consider every subset of the set consisting of all the instances of true(feature(...)). Let |N| be the number of instances of tail(...); then the number of cases that an ASP solver must consider is 2^|N|.

Let Π be a set of ASP rules and G be its dependency graph. There is an edge from a predicate q to a predicate p in G if p appears in the head and q appears in the body of a rule in Π. That edge is labeled with not if q occurs in a negative literal (which contains the default negation not). A path from v to u is a sequence of edges such that, starting from v and following the edges in the sequence, we can reach u. We say that u is strictly positive dependent on v if there is no path from v to u that contains an edge labeled with not. It is obvious that if we add more facts containing v to the program, the instances of u in the answer sets of the old program will be preserved in the corresponding answer sets of the new program. This is not true if u is negatively dependent on v (if there is a path from v to u that contains an odd number of not edges). Consider the following example:

Example 4.2

1 : p(1). p(2).
2 : r(1).
3 : q(X) :- p(X), not r(X).

The only answer set of the program above is {p(1), p(2), r(1), q(2)}. q is strictly positive dependent on p. If we add a fact p(3) to the program, the answer set of the new program is {p(1), p(2), p(3), r(1), q(2), q(3)}, which preserves the instance q(2). q is negatively dependent on r. If we add r(2) to the program, the answer set becomes {p(1), p(2), r(1), r(2)}, which no longer contains q(2).
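Whether one predicate is strictly positive dependent on another can be checked by a simple graph search that remembers whether a not edge has been used so far. The Python sketch below illustrates this for Example 4.2; the edge encoding (source, target, is_not) and the function names are assumptions, and the check additionally requires that at least one path exists.

```python
def path_kinds(edges, v, u):
    """Kinds of paths from v to u: True if some path uses a 'not' edge,
    False if some path uses none. Returns a set with zero, one or both values."""
    seen = {(v, False)}
    stack = [(v, False)]
    kinds = set()
    while stack:
        node, negated = stack.pop()
        for src, dst, is_not in edges:
            if src != node:
                continue
            state = (dst, negated or is_not)
            if dst == u:
                kinds.add(state[1])
            if state not in seen:
                seen.add(state)
                stack.append(state)
    return kinds

def strictly_positive(edges, v, u):
    """u depends on v, and no path from v to u contains a 'not' edge."""
    return path_kinds(edges, v, u) == {False}

# Dependency graph of Example 4.2: q(X) :- p(X), not r(X).
EDGES = [("p", "q", False), ("r", "q", True)]

if __name__ == "__main__":
    print(strictly_positive(EDGES, "p", "q"))
    print(strictly_positive(EDGES, "r", "q"))
```

For the example, q is reported as strictly positive dependent on p but not on r, in line with the answer sets discussed above.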

Now we can claim that, in proving the uniqueness of some argument of a feature f1, if f1 is strictly positive dependent on a feature f2 and if, in the program to prove the uniqueness of f1, f2 is generated by the rule (2), then we can abstract that rule to:

true(f2(...)) :- tail(...).


The rule makes every instance of true(f2(...)) hold in a current state. Therefore, instead of considering 2^|N| cases, an ASP solver now only considers one case. Notice that this case is one of the 2^|N| cases, and since f1 is strictly positive dependent on f2, abstracting every other case to this case, i.e., adding more instances of f2 to those cases, will only add more (or no) instances of f1 to the answer sets. Therefore, it is sufficient to reduce the problem of proving the uniqueness of some argument of the feature f1 where an arbitrary number of instances of true(f2(...)) holds to the problem in which every instance of true(f2(...)) holds.

The claim above still holds in case f1 is positively dependent on f2 (every path from f2 to f1 contains zero or an even number of edges labeled with not); however, this case rarely happens in general game rules and therefore our system does not consider it.

4.7 Running Clingo and iClingo in parallel

As we will see in the experimental results in the next chapter, there are cases in which, for the same program, running the normal encoding is much faster than running the incremental encoding and vice versa. There are also cases in which the ASP solvers cannot run either the normal or the incremental encoding of a program due to the limitation of time or memory. This leads to the idea of running ASP solvers on both encodings of a program simultaneously, i.e., running Clingo and iClingo at the same time. When either of the two solvers successfully proves the correctness or incorrectness of a property, the whole process is terminated. Since CPUs with two or more cores are popular nowadays, running two processes at the same time is feasible.

The flowchart of a small program that controls the simultaneous execution of Clingo and iClingo is depicted in Figure 4.2. First, Clingo and iClingo are started in the background. The program then checks whether either of the two processes is still running. If yes, the total memory used by both processes is calculated. If the total used memory exceeds a limit (here 1 GB), either the iClingo process is killed and a variable restart is set to yes to mark this action, or the Clingo process is killed if it is the only process that is still running. If the limit is not exceeded, our program checks whether Clingo or iClingo has finished its work successfully (the property has been proved to be true or false) by reading their log files. If yes, the program prints out the result and stops; otherwise, it returns to the beginning of the loop, i.e., checking whether Clingo or iClingo is still running.

Normally, the three checking conditions in this loop are executed very fast. To prevent the program from reading the log files too often (which requires access to the hard drive), the program sleeps for 0.5 seconds whenever it returns to the first checking condition.

If at some point during the execution both processes have been killed, or if the proving task was so easy that both processes stopped before the first condition was checked, the above loop is terminated. Our program continues by checking whether Clingo was killed (i.e., it ran out of memory) and the variable restart was set to yes (i.e., iClingo was killed before). If this condition is satisfied, iClingo is restarted; otherwise, the program prints out the last result.


[Flowchart with the following nodes: Start; run Clingo & iClingo in background; Clingo / iClingo still running?; total memory used ≥ 1 GB?; (kill iClingo, restart=yes) / kill Clingo; property proved?; print result; Clingo killed & restart=yes?; run iClingo; Stop]

Figure 4.2: System flowchart


If the memory limitation is not strict, then in case the total memory used by both processes exceeds a threshold, instead of killing iClingo we can just suspend it temporarily (keeping its state in memory) and wake it up later if needed. This saves time since iClingo does not have to start computing from the beginning if it is requested to continue its work.

The program above can easily be implemented in any shell programming language.
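The control loop can be sketched as follows, here in Python rather than in a shell language. This is only an outline under stated assumptions: clingo and iclingo are expected to behave like subprocess.Popen objects (poll() and kill()), while memory_of and result_of are hypothetical callables supplied by the caller, for example a function that reads the resident memory of a process and one that parses a log file and returns "yes", "no" or None. The variant that suspends iClingo instead of killing it is not covered.

```python
import time

def supervise(clingo, iclingo, memory_of, result_of,
              limit=2**30, poll=0.5, sleep=time.sleep):
    """Watch both solver processes; return (result, restart_iclingo).

    result is whatever result_of reports (None if nothing was proved);
    restart_iclingo is True when Clingo ran out of memory after iClingo
    had already been killed, i.e. iClingo should now be run on its own.
    """
    restart = False          # iClingo was killed because of the memory limit
    clingo_killed = False    # Clingo alone exceeded the memory limit
    while clingo.poll() is None or iclingo.poll() is None:
        running = [p for p in (clingo, iclingo) if p.poll() is None]
        if sum(memory_of(p) for p in running) >= limit:
            if iclingo in running:
                iclingo.kill()
                restart = True
            else:
                clingo.kill()
                clingo_killed = True
        else:
            for p in (clingo, iclingo):
                if result_of(p) is not None:    # property proved true or false
                    for q in running:
                        q.kill()
                    return result_of(p), False
        sleep(poll)                             # avoid reading the logs too often
    last = result_of(clingo)
    if last is None:
        last = result_of(iclingo)
    return last, clingo_killed and restart
```

A caller would start both solvers with subprocess.Popen, pass the two handles to supervise, and start iClingo again on its own when the second component of the returned pair is True.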

4.8 Removing function symbols from true and next

Another small change we have made to the ASP programs, which should also be mentioned here, is removing function symbols from the two predicates true and next. Notice that all of the true predicates have the form true(feature(...)), so we can simplify them to true_feature(...). The same change is applied to the next predicates. This change helps reduce the size of the groundings of the ASP programs, though only by a small amount.


Chapter 5

Experiments

In this chapter, we will discuss our experimental results.

5.1 Comparing proving time

All of the optimizations we presented in Chapter 4 have been integrated into the system of [Schiffel and Thielscher, 2009]. The experiments were conducted in the same way as in that work, using a variety of games from [Games Repository]. The time for the whole proving process for each game (to prove the uniqueness of all arguments of features and zero-sum) was limited to 30 minutes.

Five running configurations were tested:

• Nonoptimized version as in [Schiffel and Thielscher, 2009] (nonoptimized)

• Optimizations on normal (non-incremental) encodings of ASP programs (opClingo)

• The same as the above configuration but with SatELite-like preprocessing enabled (which helps to reduce the size of an ASP program through variable and clause elimination, see [Martin et al., 2008] and [N. Een and A. Biere, 2005] for more details) (opClingo-SatELite)

• Optimizations on incremental encodings of ASP programs (opiClingo)

• Running the last two configurations in parallel (parallel)

Some results are shown in Table 5.1. The experimental results consist of:

• Proving the uniqueness of the argument of the control feature (provided that the feature exists) in every state

• Proving the uniqueness of the contents of a board cell (boards were identified manually) given that the argument of control has been proved to be unique

• Proving that the game is a zero-sum game (i.e., the goal values assigned to all players add up to 100 in every terminal state)


Properties       nonoptimized   opClingo       opClingo-SatELite   opiClingo      parallel

amazons variant
  control        yes, 0.00      yes, 0.00      yes, 0.00           yes, 0.00      yes, 0.00
  board          MG             MG             MG                  MG             MG
  zero-sum       yes, 0.31      yes, 0.29      yes, 0.29           yes, 0.31      yes, 0.35
checkers variant
  control        yes, 0.01      yes, 0.00      yes, 0.00           yes, 0.00      yes, 0.00
  board          MG             MS             MP                  MS             MS
  piece count    MG             TO, 671.02     yes, 125.30         TO, 694.01     yes, 143.76
  zero-sum       yes, 1.14      yes, 0.52      yes, 0.69           yes, 3.61      yes, 0.79
chinesecheckers4
  control        yes, 0.00      yes, 0.00      yes, 0.00           yes, 0.00      yes, 0.00
  board          yes, 264.03    yes, 95.15     yes, 34.36          yes, 106.50    yes, 40.35
  zero-sum       no, 21.46      no, 19.97      no, 14.94           no, 39.67      no, 16.44
chinesecheckers6
  control        yes, 0.00      yes, 0.00      yes, 0.00           yes, 0.00      yes, 0.00
  board          MS             yes, 224.99    yes, 46.07          MS             yes, 51.85
  zero-sum       no, 31.20      no, 29.71      no, 21.65           no, 26.54      no, 23.13
cubicup
  control        yes, 5.54      yes, 0.79      yes, 0.11           yes, 0.18      yes, 0.11
  board          TO, 883.90     yes, 7.53      yes, 5.54           yes, 7.23      yes, 7.66
  zero-sum       yes, 2.22      yes, 0.06      yes, 0.06           yes, 0.06      yes, 0.06
endgame variant
  control        yes, 0.01      yes, 0.00      yes, 0.00           yes, 0.00      yes, 0.00
  board          MG             MS             MP                  MS             MS
  zero-sum       yes, 0.09      yes, 0.04      yes, 0.03           yes, 0.04      yes, 0.04
knightthrough
  control        yes, 0.01      yes, 0.00      yes, 0.00           yes, 0.00      yes, 0.00
  board          yes, 49.41     yes, 20.55     yes, 8.27           yes, 13.61     yes, 8.95
  zero-sum       yes, 0.00      yes, 0.00      yes, 0.00           yes, 0.00      yes, 0.00
knightwar
  board          MG             no, 0.55       no, 31.80           no, 0.52       no, 0.63
  score          MG             TO, 872.02     yes, 28.64          yes, 740.93    yes, 32.37
  zero-sum       no, 6.4        no, 5.24       no, 7.94            no, 10.65      no, 8.11
minichess
  control        yes, 0.01      yes, 0.00      yes, 0.00           yes, 0.01      yes, 0.00
  board          MP             yes, 2.14      yes, 0.96           yes, 1.90      yes, 1.01
  zero-sum       yes, 0.01      yes, 0.01      yes, 0.01           yes, 0.01      yes, 0.01
othello variant
  control        yes, 0.11      yes, 0.00      yes, 0.00           yes, 0.00      yes, 0.00
  board          MG             MG             MG                  MG             MG
  zero-sum       MG             MG             MG                  MG             MG
pacman3p
  control        yes, 0.00      yes, 0.00      yes, 0.00           yes, 0.00      yes, 0.00
  board          no, 0.18       no, 0.17       no, 1.35            no, 0.15       no, 0.16
  zero-sum       no, 0.07       no, 0.05       no, 23.73           no, 0.05       no, 0.05
pawntoqueen
  control        yes, 0.01      yes, 0.00      yes, 0.00           yes, 0.00      yes, 0.00
  board          MG             yes, 237.72    yes, 128.12         MS             yes, 222.65
  zero-sum       yes, 0.07      yes, 0.02      yes, 0.02           yes, 0.07      yes, 0.02
quarto
  board          no, 2.22       no, 1.79       no, 39.36           no, 2.32       no, 2.63
  zero-sum       yes, 2.77      yes, 2.01      yes, 191.11         yes, 2.12      yes, 2.28
tictactoe
  control        yes, 0.00      yes, 0.00      yes, 0.00           yes, 0.00      yes, 0.00
  board          yes, 0.00      yes, 0.00      yes, 0.00           yes, 0.00      yes, 0.00
  zero-sum       yes, 0.00      yes, 0.00      yes, 0.00           yes, 0.00      yes, 0.00
tictactoex9
  control        yes, 0.01      yes, 0.00      yes, 0.00           yes, 0.00      yes, 0.00
  board          yes, 829.79    yes, 14.15     yes, 0.84           yes, 4.56      yes, 0.87
  zero-sum       yes, 0.02      yes, 0.00      yes, 0.00           yes, 0.00      yes, 0.00
tttcc4 variant
  control        yes, 0.01      yes, 0.00      yes, 0.00           yes, 0.00      yes, 0.00
  board          MG             MS             MP                  no, 19.50      no, 19.72
  zero-sum       no, 0.01       no, 0.01       no, 0.00            no, 0.00       no, 0.00
wargame01
  board          MS             yes, 765.35    yes, 39.08          TO, 894.05     yes, 43.00

Table 5.1: Comparing proving time (in seconds) for some properties in a variety of games. Experiments were run on an Intel Core 2 Duo 3.16 GHz with RAM limited to 1 GB. M means Out Of Memory Error (G: Grounding, P: Preprocessing, S: Solving). TO means TimeOut while solving.

Note that in Table 5.1, the games with the suffix variant were modified so that they could be run with the nonoptimized system, since that system does not fix the Weakly Restricted Variables error when transforming GDL rules into ASP rules.

The symbols in the table are to be read as follows. M means that the system encountered an Out of Memory Error. Since Clingo and iClingo process an ASP program in three phases (grounding, preprocessing and solving), the errors that occur in these phases are denoted by MG, MP and MS respectively. TO means Time Out, i.e., the system could not finish its work within the running time allotted to it. yes (no) indicates that the property was proved to be true (false). Finally, every value appearing in the table is given in seconds.

From the table, we can see that the optimizations we have made so far speed up the system dramatically in the majority of cases. For example, the nonoptimized configuration takes 264 seconds to prove the uniqueness of the content of a board cell in chinesecheckers4, while the times needed by opClingo and opiClingo are 95.15 seconds and 106.50 seconds respectively. The improvement is even larger for the same proof in the game tictactoex9: nonoptimized takes 829.79 seconds while the times taken by opClingo and opiClingo are only 14.15 seconds and 4.56 seconds respectively. In many cases in which the nonoptimized system uses up 1 GB of RAM when grounding, preprocessing or solving, the optimized systems still run very efficiently, e.g., proving the feature piece count in checkers or the board cell in minichess. There are still some cases in which even the optimized systems cannot ground the ASP programs, for instance, proving the uniqueness of the content of a board cell in amazons or othello.
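The speed-ups behind these two examples can be computed directly from the Table 5.1 entries. The small Python snippet below does only this arithmetic; the dictionary layout and the helper name are chosen for the illustration.

```python
# Board-cell proving times in seconds from Table 5.1:
# (nonoptimized, opClingo, opiClingo)
TIMES = {
    "chinesecheckers4": (264.03, 95.15, 106.50),
    "tictactoex9": (829.79, 14.15, 4.56),
}

def speedup(baseline, optimized):
    """How many times faster the optimized configuration is than the baseline."""
    return baseline / optimized

if __name__ == "__main__":
    for game, (base, clingo, iclingo) in TIMES.items():
        print(f"{game}: opClingo {speedup(base, clingo):.1f}x, "
              f"opiClingo {speedup(base, iclingo):.1f}x")
```

For chinesecheckers4 the factors are below 3, whereas for tictactoex9 they are well above 50, which shows how strongly the gain depends on the game.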

The system with SatELite-like preprocessing enabled (opClingo-SatELite) works impressively in many cases, e.g., proving the uniqueness of an argument of the piece count feature in checkers or of the score feature in knightwar. However, it performs very poorly in the games pacman3p and quarto. It would also be worthwhile to try the SatELite-like preprocessing option with iClingo; unfortunately, this combination had some bugs at the time we ran the experiments.

It is not clear which of the two systems opClingo and opiClingo performs better. Here we chose the opiClingo system to run in parallel with the opClingo-SatELite system, and the two appear to complement each other well. The running time of this parallel system is usually slightly greater than the smaller running time of the two component systems (except for the case proving the uniqueness of the content of a board cell in pawntoqueen). This comes from the fact that running the two systems simultaneously fully uses both cores of the CPU.
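The idea of running two configurations side by side and keeping the answer of whichever finishes first can be sketched as follows. This is a minimal illustration and not the implementation used in our system: the function run_portfolio, its parameters and the stand-in commands are hypothetical, and the default test for success (exit code 0) is an assumption that has to be adapted to the exit codes the actual solvers use.

```python
import subprocess
import sys
import tempfile
import time


def run_portfolio(commands, timeout=None, poll_interval=0.05,
                  succeeded=lambda code: code == 0):
    """Start all commands at once; return (name, output) of the first one that
    finishes successfully, or None if none does within the timeout."""
    running = {}
    for name, cmd in commands.items():
        # Output goes to a temporary file so that a full pipe cannot block a solver.
        out = tempfile.TemporaryFile(mode="w+")
        proc = subprocess.Popen(cmd, stdout=out, stderr=subprocess.DEVNULL)
        running[name] = (proc, out)
    deadline = None if timeout is None else time.monotonic() + timeout
    winner = None
    try:
        while running and winner is None:
            for name, (proc, out) in list(running.items()):
                code = proc.poll()
                if code is None:
                    continue  # still running
                del running[name]
                if succeeded(code):
                    out.seek(0)
                    winner = (name, out.read())
                out.close()
                if winner is not None:
                    break
            if winner is None:
                if deadline is not None and time.monotonic() > deadline:
                    break
                time.sleep(poll_interval)
    finally:
        # Stop every configuration that is still running.
        for proc, out in running.values():
            proc.kill()
            proc.wait()
            out.close()
    return winner


if __name__ == "__main__":
    # Stand-in commands; real solver invocations would replace them.
    demo = {
        "slow": [sys.executable, "-c", "import time; time.sleep(5); print('slow')"],
        "fast": [sys.executable, "-c", "print('fast')"],
    }
    print(run_portfolio(demo, timeout=30))
```

Because each configuration runs in its own process, a dual-core machine can serve both at the same time, which is consistent with the parallel running time staying close to that of the faster component.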

5.2 Comparing grounding time and the size of ground programs

In order to study the influence of the optimizations on grounding, we compared the nonoptimized, opClingo and opiClingo configurations with respect to grounding. The grounding times and the sizes of the ground programs for the ASP programs proving the uniqueness of the content of a board cell in a collection of games are presented in Table 5.2. To ground a non-incremental program, we simply used Gringo. To ground an incremental program, we used Gringo with the option --ifixed=k, where k is the number of steps in that program (see Section 4.5). Memory was limited to 1 GB. No time limit was set.

Games               nonoptimized           opClingo             opiClingo
checkers variant    MG, 16.06 s            19.06 MB, 12.74 s    20.67 MB, 4.66 s
chinesecheckers6    47.58 MB, 8.72 s       2.94 MB, 0.59 s      3.37 MB, 0.67 s
endgame variant     27.25 GB, 74.42 m      18.47 MB, 3.46 s     18.48 MB, 3.40 s
knightthrough       24.54 MB, 4.77 s       0.93 MB, 0.32 s      0.97 MB, 0.25 s
minichess           205.15 MB, 38.86 s     0.36 MB, 0.20 s      0.38 MB, 0.14 s
pawntoqueen         121.28 GB, 420.00 m    3.39 MB, 3.60 s      3.65 MB, 0.81 s
tictactoex9         8.65 MB, 1.88 s        0.79 MB, 0.23 s      0.80 MB, 0.20 s

Table 5.2: Comparing the size of the ground programs (MB/GB) and the grounding time (seconds/minutes). MG means Out of Memory Error while grounding.

Each pair of values in the table gives the size of the ground program and the time used for grounding, respectively. The experimental results show that, in every case, both the time to ground an optimized program and the size of the resulting ground program are smaller than for the nonoptimized version, in most cases by an order of magnitude or more.
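To quantify these reductions, the cells of Table 5.2 can be converted to common units and divided. The sketch below is only illustrative: the cell strings are copied from the table, the helper names parse_cell and reduction are hypothetical, and the conversion 1 GB = 1024 MB is an assumption.

```python
MB_PER_UNIT = {"MB": 1.0, "GB": 1024.0}  # assumption: 1 GB = 1024 MB
SECONDS_PER_UNIT = {"s": 1.0, "m": 60.0}
CONFIGS = ("nonoptimized", "opClingo", "opiClingo")

# Cells copied from Table 5.2 as "size, time"; "MG" marks out of memory while grounding.
TABLE_5_2 = {
    "checkers variant": ("MG, 16.06 s", "19.06 MB, 12.74 s", "20.67 MB, 4.66 s"),
    "chinesecheckers6": ("47.58 MB, 8.72 s", "2.94 MB, 0.59 s", "3.37 MB, 0.67 s"),
    "endgame variant": ("27.25 GB, 74.42 m", "18.47 MB, 3.46 s", "18.48 MB, 3.40 s"),
    "knightthrough": ("24.54 MB, 4.77 s", "0.93 MB, 0.32 s", "0.97 MB, 0.25 s"),
    "minichess": ("205.15 MB, 38.86 s", "0.36 MB, 0.20 s", "0.38 MB, 0.14 s"),
    "pawntoqueen": ("121.28 GB, 420.00 m", "3.39 MB, 3.60 s", "3.65 MB, 0.81 s"),
    "tictactoex9": ("8.65 MB, 1.88 s", "0.79 MB, 0.23 s", "0.80 MB, 0.20 s"),
}


def parse_cell(cell):
    """Turn a cell such as '27.25 GB, 74.42 m' into (size in MB or None, seconds)."""
    size_part, time_part = [part.strip() for part in cell.split(",")]
    if size_part == "MG":
        size_mb = None  # grounding ran out of memory, so no size is available
    else:
        value, unit = size_part.split()
        size_mb = float(value) * MB_PER_UNIT[unit]
    value, unit = time_part.split()
    return size_mb, float(value) * SECONDS_PER_UNIT[unit]


def reduction(game, config):
    """Return (size factor or None, time factor) of config versus nonoptimized."""
    row = dict(zip(CONFIGS, TABLE_5_2[game]))
    base_size, base_time = parse_cell(row["nonoptimized"])
    size, seconds = parse_cell(row[config])
    size_factor = None if base_size is None else base_size / size
    return size_factor, base_time / seconds


if __name__ == "__main__":
    for game in TABLE_5_2:
        print(game, reduction(game, "opClingo"), reduction(game, "opiClingo"))
```

For minichess with opClingo, for instance, the ground program shrinks by a factor of about 570 and the grounding time by a factor of about 194.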

5.3 Order of program rules

Some of our experiments show that the running time of the solvers in the Potassco package is very sensitive to the order of the rules in an ASP program (whereas the grounding time is stable). For example, Table 5.3 shows the running times of the programs proving that knightthrough is a board game under the two configurations nonoptimized and opClingo. The rules in the programs were randomly reordered with the shell command sort -R. Each configuration was run five times.

nonoptimized    56.95    44.46    66.10    59.54    45.06
opClingo        20.05    27.30    19.50    23.91    20.79

Table 5.3: Comparing proving time (in seconds) when the rules in the program are randomly reordered
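The spread of the five runs in Table 5.3 can be summarized with a few standard statistics. The following sketch uses only the values from the table; the function name summarize is a hypothetical helper.

```python
import statistics

# Running times in seconds, copied from Table 5.3.
RUNS = {
    "nonoptimized": [56.95, 44.46, 66.10, 59.54, 45.06],
    "opClingo": [20.05, 27.30, 19.50, 23.91, 20.79],
}


def summarize(times):
    """Return mean, sample standard deviation, extremes and their ratio."""
    return {
        "mean": statistics.mean(times),
        "stdev": statistics.stdev(times),
        "min": min(times),
        "max": max(times),
        "max_over_min": max(times) / min(times),
    }


if __name__ == "__main__":
    for config, times in RUNS.items():
        print(config, summarize(times))
```

In both configurations the slowest run takes roughly 1.4 to 1.5 times as long as the fastest one, although the programs differ only in the order of their rules.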

The same happens with an incremental ASP program when we change the order in which the features are generated. Finding out which order of the rules in an ASP program is a good one requires further research.
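For repeatable reordering experiments, the random permutation of rules can also be produced with a fixed seed. The sketch below is an illustration under the assumption that every rule occupies exactly one line of the program text; the function shuffle_rules is a hypothetical helper and not part of our system. Unlike sort -R, which orders lines by a random hash and therefore keeps identical lines together, it draws a uniformly random permutation.

```python
import random


def shuffle_rules(program_text, seed=None):
    """Return the program with its non-empty lines in random order.

    Assumes one rule per line; rules spanning several lines would be torn apart.
    """
    lines = [line for line in program_text.splitlines() if line.strip()]
    rng = random.Random(seed)  # a fixed seed reproduces the same order
    rng.shuffle(lines)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    program = "a :- not b.\nb :- not a.\nc :- a.\nd :- b.\n"
    print(shuffle_rules(program, seed=1), end="")
```

Recording the seed together with each measured running time makes it possible to regenerate an order that turned out to be particularly good or bad.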

Finally, we should mention that the time needed to apply all of the optimizations presented in Chapter 4 to an ASP program is negligible compared to the time needed to solve that program.


Chapter 6

Conclusion

This chapter summarizes our work and discusses some potential future work.

This project aims to optimize the Automated Theorem Proving system of [Schiffel and Thielscher, 2009]. We have shown that by changing the way the current states are generated and by recalculating the feature domains and the move domain, the size of the ground programs of the ASP programs used to prove the extracted properties in general games can be reduced dramatically. We also proposed a method to transform a set of GDL rules into a set of ASP rules that can be processed by the ASP systems in the [Potassco] package. The use of an incremental ASP encoding for proving properties in general games was also discussed. Our experiments show that executing the normal and the incremental ASP solvers simultaneously can solve many problems that one of the two systems alone cannot solve (given the same limits on time and memory usage).

Future work

Although the optimized system can prove properties in a wider range of games, it still fails on a subset of games for which the grounding of the ASP programs becomes too large. We propose here some methods to deal with this problem.

• Section 5.3 shows that the order of the rules in an ASP program strongly affects the efficiency of the Potassco ASP solvers. Finding a good order of the rules that speeds up the solvers for every ASP program may be difficult. However, given that we only consider the use of the solvers for proving properties in general games, where the programs of all games have a similar structure, finding a good order of the rules for this class of programs appears practical.

• A new bottom-up approach to solving ASP programs with lazy grounding [A. Dal Palu et al., 2008], or a top-down, goal-directed approach to credulous reasoning in ASP [P. A. Bonatti et al., 2008], may help to solve the ASP programs when grounding is impossible.

• [L. Tari et al., 2005] describe a declarative language that allows an ASP module to import processed answer sets from another ASP module. This language therefore allows an ASP program to be broken into several modules that may be easier to solve. When the answer sets of all the modules have been computed, they are combined in one ASP program to compute the final answer sets of the original program. This method may help improve our system.


Bibliography

[A. Dal Palu et al., 2008] A. Dal Palu et al., GASP: Answer Set Programming with Lazy Grounding, In LaSh 2008: Logic And Search Computation of structures from declarative descriptions, 2008

[ASPCom, 2007] ASP Solver Competition - Potsdam 2007, http://asparagus.cs.uni-potsdam.de/contest

[DLV] The DLV Project, http://www.dbai.tuwien.ac.at/proj/dlv

[E. Erdem, 2002] E. Erdem, The Blocks World. Available at: http://people.sabanciuniv.edu/esraerdem/ASP-benchmarks/bw.html

[Games Repository] Games Repository on TU-Dresden GGP Server, http://euklid.inf.tu-dresden.de:8180/ggpserver/public/show_games.jsp

[Gelfond, 2008] M. Gelfond, Answer sets, Handbook of Knowledge Representation, p. 285-316, Elsevier, 2008

[Kaiser, 2007] D. M. Kaiser, Automatic Feature Extraction for Autonomous General Game Playing Agents, Florida International University

[Kuhlmann et al., 2006] G. Kuhlmann et al., Automatic Heuristic Construction in a Complete General Game Player, The University of Texas at Austin

[Love et al., 2008] N. Love et al., General Game Playing: Game Description Language Specification, Stanford University, 2008. Available at: http://games.stanford.edu

[L. Tari et al., 2005] L. Tari, C. Baral and S. Anwar, A Language for Modular Answer Set Programming: Application to ACC Tournament Scheduling, 2005

[Martin et al., 2008] Martin et al., A User’s guide to gringo, clasp, clingo and iclingo, 2008

[Niemela et al., 1999] I. Niemela, P. Simons, and T. Soininen, Stable Model Semantics of Weight Constraint Rules, Lecture Notes in Computer Sciences, Springer-Verlag

[N. Een and A. Biere, 2005] N. Een and A. Biere, Effective preprocessing in SAT through variable and clause elimination, Proc. of the Eighth International Conference on Theory and Applications of Satisfiability Testing, p. 61-75, 2005


[Potassco] Potassco, the Potsdam Answer Set Solving Collection, http://potassco.sourceforge.net

[P. A. Bonatti et al., 2008] P. A. Bonatti, E. Pontelli and T. Son, Credulous Resolution for Answer Set Programming, AAAI, 2008

[Schiffel and Thielscher, 2007] S. Schiffel, M. Thielscher, Fluxplayer: A successful general game player. In Proc. of AAAI, p. 1191-1196, 2007

[Schiffel and Thielscher, 2009] S. Schiffel, M. Thielscher, Automated Theorem Proving for General Game Playing

[Smodels] Computing the Stable Model Semantics, http://www.tcs.hut.fi/Software/smodels

[van der Hoek et al., 2007] W. van der Hoek, J. Ruan, and M. Wooldridge, Strategy logics and the game description language, Proc. of the Workshop on Logic, Rationality and Interaction, Beijing 2007
