Speeding Up the Calculation of Heuristics for Heuristic Search-Based Planning

Yaxin Liu, Sven Koenig and David Furcy
College of Computing
Georgia Institute of Technology
Atlanta, GA 30312-0280
{yxliu,skoenig,dfurcy}@cc.gatech.edu

Abstract

Heuristic search-based planners, such as HSP 2.0, solve STRIPS-style planning problems efficiently but spend about eighty percent of their planning time on calculating the heuristic values. In this paper, we systematically evaluate alternative methods for calculating the heuristic values for HSP 2.0 and demonstrate that the resulting planning times differ substantially. HSP 2.0 calculates each heuristic value by solving a relaxed planning problem with a dynamic programming method similar to value iteration. We identify two different approaches for speeding up the calculation of heuristic values, namely to order the value updates and to reuse information from the calculation of previous heuristic values. We then show how these two approaches can be combined, resulting in our PINCH method. PINCH outperforms both of the other approaches individually as well as the methods used by HSP 1.0 and HSP 2.0 for most of the large planning problems tested. In fact, it speeds up the planning time of HSP 2.0 by up to eighty percent in several domains and, in general, the amount of savings grows with the size of the domains, allowing HSP 2.0 to solve larger planning problems than was possible before in the same amount of time and without changing its overall operation.

Introduction

Heuristic search-based planners were introduced by (McDermott 1996) and (Bonet, Loerincs, & Geffner 1997) and are now very popular. Several of them entered the second planning competition at AIPS-2000, including HSP 2.0 (Bonet & Geffner 2001a), FF (Hoffmann & Nebel 2001a), GRT (Refanidis & Vlahavas 2001), and AltAlt (Nguyen, Kambhampati, & Nigenda 2002). Heuristic search-based planners perform a heuristic forward or backward search in the space of world states to find a path from the start state to a goal state. In this paper, we study HSP 2.0, a prominent heuristic search-based planner that won one of four honorable mentions for overall exceptional performance at the AIPS-2000 planning competition. It was one of the first planners that demonstrated how one can obtain informed heuristic values for STRIPS-style planning problems to make planning tractable. In its default configuration, it uses weighted A* searches with inadmissible heuristic values to perform forward searches in the space of world states. However, the calculation of heuristic values is time-consuming since HSP 2.0 calculates the heuristic value of each state that it encounters during the search by solving a relaxed planning problem. Consequently, its calculation of heuristic values comprises about 80 percent of its planning time (Bonet & Geffner 2001b). This suggests that one might be able to speed up its planning time by speeding up its calculation of heuristic values. Some heuristic search-based planners, for example, remove irrelevant operators before calculating the heuristic values (Hoffmann & Nebel 2001b) and other planners cache information obtained in a preprocessing phase to simplify the calculation of all heuristic values for a given planning problem (Bonet & Geffner 1999; Refanidis & Vlahavas 2001; Edelkamp 2001; Nguyen, Kambhampati, & Nigenda 2002).

Copyright © 2002, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.

Different from these approaches, we speed up HSP 2.0 without changing its heuristic values or overall operation. In this paper, we study whether different methods for calculating the heuristic values for HSP 2.0 result in different planning times. We systematically evaluate a large number of different methods and demonstrate that, indeed, the resulting planning times differ substantially. HSP 2.0 calculates each heuristic value by solving a relaxed planning problem with a dynamic programming method similar to value iteration. We identify two different approaches for speeding up the calculation of heuristic values, namely to order the value updates and to reuse information from the calculation of previous heuristic values by performing incremental calculations. Each of these approaches can be implemented individually in rather straightforward ways. The question arises, then, how to combine them and whether this is beneficial. Since the approaches cannot be combined easily, we develop a new method. The PINCH (Prioritized, INCremental Heuristics calculation) method exploits the fact that the relaxed planning problems that are used to determine the heuristic values for two different states are similar if the states are similar. Thus, we use a method from algorithm theory to avoid the parts of the plan-construction process that are identical to the previous ones and order the remaining value updates in an efficient way. We demonstrate that PINCH outperforms both of the other approaches individually as well as the methods used by HSP 1.0 and HSP 2.0 for most of the large planning problems tested. In fact, it speeds up the planning time of HSP 2.0 by up to eighty

percent in several domains and, in general, the amount of savings grows with the size of the domains, allowing HSP 2.0 to solve larger planning problems than was possible before in the same amount of time and without changing its overall operation.

                       Non-Incremental Computations    Incremental Computations
    Unordered Updates  VI, HSP2                        IVI
    Ordered Updates    GBF, HSP1, GD                   PINCH

    Table 1: Classification of Heuristic-Calculation Methods

Heuristic Search-Based Planning: HSP 2.0

In this section, we describe how HSP 2.0 operates. We first describe the planning problems that it solves and then how it calculates its heuristic values. We follow the notation and description from (Bonet & Geffner 2001b).

The Problem

HSP 2.0 solves STRIPS-style planning problems with ground operators. Such STRIPS-style planning problems consist of a set of propositions P that are used to describe the states and operators, a set of ground operators O, the start state s_0 ⊆ P, and the partially specified goal G ⊆ P. Each operator o ∈ O has a precondition list Prec(o) ⊆ P, an add list Add(o) ⊆ P, and a delete list Del(o) ⊆ P. The STRIPS planning problem induces a graph search problem that consists of a set of states (vertices) 2^P, the start state s_0, a set of goal states {s ⊆ P : G ⊆ s}, and a set of actions (directed edges) A(s) = {o ∈ O : Prec(o) ⊆ s} for each state s ⊆ P, where action o ∈ A(s) transitions from state s ⊆ P to state s' = (s − Del(o)) ∪ Add(o) with cost one. All operator sequences (paths) from the start state to any goal state in the graph (plans) are solutions of the STRIPS-style planning problem. The shorter the path, the higher the quality of the solution.
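To make these definitions concrete, here is a small illustrative sketch in Python (our example, not code from HSP 2.0; all names are ours) of a ground operator, the applicable-action set A(s), the successor function, and the delete-list relaxation that underlies the heuristic:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Operator:
    name: str
    prec: frozenset    # precondition list Prec(o)
    add: frozenset     # add list Add(o)
    delete: frozenset  # delete list Del(o)

def applicable(state, operators):
    """A(s) = the operators whose preconditions are a subset of state s."""
    return [o for o in operators if o.prec <= state]

def successor(state, op):
    """succ(s, o) = (s - Del(o)) | Add(o); each step costs one."""
    return frozenset((state - op.delete) | op.add)

def relax(op):
    """The relaxation behind the heuristic: drop the delete list."""
    return Operator(op.name, op.prec, op.add, frozenset())

# Tiny example: an operator that consumes proposition 'a' and produces 'b'.
o1 = Operator("make-b", frozenset({"a"}), frozenset({"b"}), frozenset({"a"}))
s0 = frozenset({"a"})
assert applicable(s0, [o1]) == [o1]
assert successor(s0, o1) == frozenset({"b"})
assert successor(s0, relax(o1)) == frozenset({"a", "b"})  # 'a' survives
```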

The Method and its Heuristics

In its default configuration, HSP 2.0 performs a forward search in the space of world states using weighted A* (Pearl 1985) with inadmissible heuristic values. It calculates the heuristic value of a given state by solving a relaxed version of the planning problem, where it recursively approximates (by ignoring all delete lists) the cost of achieving each goal proposition individually from the given state and then combines the estimates to obtain the heuristic value of the given state. In the following, we explain the calculation of heuristic values in detail. We use g_s(p) to denote the approximate cost of achieving proposition p ∈ P from state s ⊆ P, and g_s(o) to denote the approximate cost of achieving the preconditions of operator o ∈ O from state s ⊆ P. HSP 2.0 defines these quantities recursively. It defines, for all s ⊆ P, p ∈ P, and o ∈ O (the minimum of an empty set is defined to be infinity and an empty sum is defined to be zero):

    g_s(p) = 0                                        if p ∈ s
    g_s(p) = min_{o ∈ O : p ∈ Add(o)} [1 + g_s(o)]    otherwise       (1)

    g_s(o) = Σ_{p ∈ Prec(o)} g_s(p)                                   (2)

procedure Main()
forever do
    set s to the state whose heuristic value needs to get computed next
    for each q ∈ P ∪ O do set x_q := ∞
    for each p ∈ s do set x_p := 0
    repeat
        for each o ∈ O do set x_o := Σ_{p ∈ Prec(o)} x_p
        for each p ∈ P \ s do set x_p := min_{o ∈ O : p ∈ Add(o)} [1 + x_o]
    until the values of all x_q remain unchanged during an iteration
    /* use h_add(s) = Σ_{p ∈ G} x_p */

Figure 1: VI

Then, the heuristic value h_add(s) of state s ⊆ P can be calculated as h_add(s) = Σ_{p ∈ G} g_s(p). This allows HSP 2.0 to solve large planning problems, although it is not guaranteed to find shortest paths.

Calculation of Heuristics

In the next sections, we describe different methods that solve Equations 1 and 2. These methods are summarized in Table 1. To describe them, we use a variable p if its values are guaranteed to be propositions, a variable o if its values are guaranteed to be operators, and a variable q if its values can be either propositions or operators. We also use variables x_p (a proposition variable) for p ∈ P and x_o (an operator variable) for o ∈ O. In principle, the value of variable x_p satisfies x_p = g_s(p) after termination and the value of variable x_o satisfies x_o = g_s(o) after termination, except for some methods that terminate immediately after the variables x_p for p ∈ G have their correct values because this already allows them to calculate the heuristic value.

VI and HSP2: Simple Calculation of Heuristics

Figure 1 shows a simple dynamic programming method that solves the equations using a form of value iteration. This VI (Value Iteration) method initializes the variables x_p to zero if p ∈ s and initializes all other variables to infinity. It then repeatedly sweeps over all variables and updates them according to Equations 1 and 2, until no value changes any longer.

HSP 2.0 uses a variant of VI that eliminates the variables x_o by combining Equations 1 and 2. Thus, this HSP2 method performs only the following operation as part of its repeat-until loop:

    for each p ∈ P \ s do set x_p := min_{o ∈ O : p ∈ Add(o)} [1 + Σ_{q ∈ Prec(o)} x_q].

HSP2 needs more time than VI for each sweep but reduces the number of sweeps. For example, it needs only 5.0 sweeps on average in the Logistics domain, while VI needs 7.9 sweeps.
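For concreteness, the sweep-based calculation can be sketched in Python as follows (an illustrative re-implementation, not HSP 2.0's actual code). This sketch uses a slight variant that only ever decreases values; starting from the all-infinity initialization, that converges to the same fixed point of Equations 1 and 2:

```python
INF = float("inf")

def h_add(state, goal, operators):
    """Sweep-based solution of Equations 1 and 2 (delete lists ignored).

    operators: list of (prec, add) pairs of sets; every operator costs one.
    """
    props = set(state) | set(goal)
    for prec, add in operators:
        props |= set(prec) | set(add)
    x = {p: (0 if p in state else INF) for p in props}
    changed = True
    while changed:                                  # one sweep per iteration
        changed = False
        for prec, add in operators:
            xo = 1 + sum(x[p] for p in prec)        # 1 + g_s(o), Equation 2
            for p in add:
                if p not in state and xo < x[p]:    # Equation 1 update
                    x[p] = xo
                    changed = True
    return sum(x[p] for p in goal)                  # h_add(s)

ops = [({"a"}, {"b"}), ({"b"}, {"c"})]
assert h_add({"a"}, {"c"}, ops) == 2        # a -> b -> c: two relaxed steps
assert h_add({"a"}, {"b", "c"}, ops) == 3   # goal-proposition costs add up
```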

Speeding Up the Calculation of Heuristics

We investigate two orthogonal approaches for speeding up HSP 2.0's calculation of heuristic values, namely to order the value updates (ordered updates) and to reuse information from the calculation of previous heuristic values (incremental computations), as shown in Table 1. We then describe how to combine them.

procedure Main()
forever do
    set s to the state whose heuristic value needs to get computed next
    for each q ∈ P ∪ O do set x_q := ∞
    for each p ∈ s do set x_p := 0
    repeat
        for each o ∈ O do
            set v := Σ_{p ∈ Prec(o)} x_p
            if v < x_o then
                set x_o := v
                for each p ∈ Add(o) with p ∉ s do
                    set x_p := min(x_p, 1 + x_o)
    until the values of all x_q remain unchanged during an iteration
    /* use h_add(s) = Σ_{p ∈ G} x_p */

Figure 2: GBF

IVI: Reusing Results from Previous Searches

When HSP 2.0 calculates a heuristic value h_add(s), it solves equations in g_s(p) and g_s(o). When HSP 2.0 then calculates the heuristic value h_add(s') of some other state s', it solves equations in g_{s'}(p) and g_{s'}(o). If g_s(p) = g_{s'}(p) for some p ∈ P and g_s(o) = g_{s'}(o) for some o ∈ O, one could just cache these values during the calculation of h_add(s) and then reuse them during the calculation of h_add(s'). Unfortunately, it is nontrivial to determine which values remain unchanged. We explain later, in the context of our PINCH method, how this can be done. For now, we exploit the fact that the values g_s(p) and g_{s'}(p) for the same p ∈ P and the values g_s(o) and g_{s'}(o) for the same o ∈ O are often similar if s and s' are similar. Since HSP 2.0 calculates the heuristic values of all children of a state in a row when it expands the state, it often calculates the heuristic values of similar states in succession. This fact can be exploited by changing VI so that it initializes x_p to zero if p ∈ s but does not re-initialize the other variables, resulting in the IVI (Incremental Value Iteration) method. IVI repeatedly sweeps over all variables until no value changes any longer. If the number of sweeps becomes larger than a given threshold then IVI terminates the sweeps and simply calls VI to determine the values. IVI needs fewer sweeps than VI if HSP 2.0 calculates the heuristics of similar states in succession because then the values of the corresponding variables x_q tend to be similar as well. For example, IVI needs only 6.2 sweeps on average in the Logistics domain, while VI needs 7.9 sweeps.
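The IVI idea can be sketched as follows (our illustration; the class, the sweep threshold, and all names are ours, not HSP 2.0's): the variable table persists across calls, each sweep fully reassigns every variable from Equations 1 and 2 so stale values can rise as well as fall, and a fresh re-initialization serves as fallback when the sweeps do not settle within the threshold:

```python
INF = float("inf")

class IVI:
    """Incremental value iteration sketch; 'threshold' is an assumed default.

    operators: list of (prec, add) pairs of sets, unit cost.
    """
    def __init__(self, operators, threshold=100):
        self.ops = [(frozenset(p), frozenset(a)) for p, a in operators]
        self.props = set().union(*(p | a for p, a in self.ops))
        # for each proposition, the precondition sets of operators adding it
        self.adders = {q: [p for p, a in self.ops if q in a] for q in self.props}
        self.threshold = threshold
        self.x = {q: INF for q in self.props}   # table reused between calls

    def _sweeps(self, state):
        for _ in range(self.threshold):
            changed = False
            for q in self.props:                # full reassignment, Eq. 1 and 2
                new = 0 if q in state else min(
                    (1 + sum(self.x[r] for r in prec)
                     for prec in self.adders[q]), default=INF)
                if new != self.x[q]:
                    self.x[q], changed = new, True
            if not changed:
                return True
        return False                            # did not settle in time

    def h_add(self, state, goal):
        for q in self.props:
            if q in state:
                self.x[q] = 0                   # re-clamp, keep other values
        if not self._sweeps(state):             # fall back to plain VI
            self.x = {q: (0 if q in state else INF) for q in self.props}
            self._sweeps(state)
        return sum(0 if q in state else self.x.get(q, INF) for q in goal)

ivi = IVI([({"a"}, {"b"}), ({"b"}, {"c"})])
assert ivi.h_add({"a"}, {"c"}) == 2
assert ivi.h_add({"b"}, {"c"}) == 1   # reuses the previous table
```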

GBF, HSP1 and GD: Ordering the Value Updates

VI and IVI sweep over all variables in an arbitrary order. Their number of sweeps can be reduced by ordering the variables appropriately, similarly to the way values are ordered in the Prioritized Sweeping method used for reinforcement learning (Moore & Atkeson 1993).

Figure 2 shows the GBF (Generalized Bellman-Ford) method, which is similar to the variant of the Bellman-Ford method in (Cormen, Leiserson, & Rivest 1990). It orders the value updates of the variables x_p. It sweeps over all variables x_o and updates their values according to Equation 2. If the update changes the value of variable x_o, GBF iterates over the variables x_p for the propositions p in the add list of operator o and updates their values according to Equation 1. Thus, GBF updates the values of the variables x_p only if their values might have changed.

procedure Main()
forever do
    set s to the state whose heuristic value needs to get computed next
    for each q ∈ P ∪ O do set x_q := ∞
    for each p ∈ s do set x_p := 0
    set L := s
    while L ≠ ∅ do
        set O' := the set of operators with no preconditions or with at least one precondition in L
        set L := ∅
        for each o ∈ O' do
            set x_o := Σ_{p ∈ Prec(o)} x_p
            for each p ∈ Add(o) with p ∉ s do
                if x_p > 1 + x_o then
                    set x_p := 1 + x_o
                    set L := L ∪ {p}
    /* use h_add(s) = Σ_{p ∈ G} x_p */

Figure 3: HSP1

Figure 3 shows the method used by HSP 1.0. The HSP1 method orders the value updates of the variables x_p and x_o, similar to the variant of the Bellman-Ford method in (Bertsekas 2001). It uses an unordered list to remember the propositions whose variables changed their values. It sweeps over all variables x_o that have these propositions as preconditions and updates their values according to Equation 2. After each update of the value of a variable x_o, HSP1 iterates over the variables x_p for the propositions p in the add list of operator o and updates their values according to Equation 1. The unordered list is then updated to contain all propositions whose variables changed their values in this step. Thus, HSP1 updates the values of the variables x_p and x_o only if their values might have changed.

Finally, the GD (Generalized Dijkstra) method (Knuth 1977) is a generalization of Dijkstra's graph search method (Dijkstra 1959) and uses a priority queue to sweep over the variables x_p and x_o in the order of increasing values. Similar to HSP1, its priority queue contains only those variables x_p and x_o whose values might have changed. To make it even more efficient, we terminate it immediately once the values of all g_s(p) for p ∈ G are known to be correct. We do not give pseudocode for GD because it is a special case of the PINCH method, which we discuss next.¹
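For intuition, the Dijkstra-style ordering that GD exploits can be sketched with an ordinary binary heap (our simplified, from-scratch illustration; GD itself follows the more general setup of (Knuth 1977)). Values are finalized in increasing order, and an operator "fires" once all of its preconditions have final values:

```python
import heapq

INF = float("inf")

def h_add_dijkstra(state, goal, operators):
    """Compute the g_s values of Equations 1 and 2 in order of increasing
    value. operators: list of (prec, add) pairs of sets, unit cost."""
    x = {}                                   # finalized proposition values
    missing = [len(prec) for prec, _ in operators]
    cost = [1] * len(operators)              # accumulates 1 + sum of prec values
    by_prec = {}
    for i, (prec, _) in enumerate(operators):
        for p in prec:
            by_prec.setdefault(p, []).append(i)
    heap = [(0, p) for p in state]           # propositions in s cost zero
    for prec, add in operators:
        if not prec:                         # no preconditions: fires at cost 1
            for q in add:
                heap.append((1, q))
    heapq.heapify(heap)
    while heap:
        v, p = heapq.heappop(heap)
        if p in x:
            continue                         # stale entry, value already final
        x[p] = v
        for i in by_prec.get(p, []):
            cost[i] += v
            missing[i] -= 1
            if missing[i] == 0:              # all preconditions finalized
                for q in operators[i][1]:
                    if q not in x:
                        heapq.heappush(heap, (cost[i], q))
    return sum(x.get(p, INF) for p in goal)

ops = [({"a"}, {"b"}), ({"b"}, {"c"})]
assert h_add_dijkstra({"a"}, {"c"}, ops) == 2
```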

PINCH: The Best of Both Worlds

The two approaches for speeding up the calculation of heuristic values discussed above are orthogonal. We now describe our main contribution in this paper, namely how to combine them. This is complicated by the fact that the approaches themselves cannot be combined easily because the methods that order the value updates exploit the property that the values of the variables cannot increase during each computation of a heuristic value. Unfortunately, this property no longer holds when reusing the values of the variables from the calculation of previous heuristic values. We

¹PINCH from Figure 5 reduces to GD if one empties its priority queue and reinitializes the variables x_p and x_o before SolveEquations() is called again, as well as modifies the while loop of SolveEquations() to terminate immediately once the values of all g_s(p) for p ∈ G are known to be correct.

thus need to develop a new method based on a method from algorithm theory. The resulting PINCH (Prioritized, INCremental Heuristics calculation) method solves Equations 1 and 2 by calculating only those values g_s(p) and g_s(o) that are different from the corresponding values g_{s'}(p) and g_{s'}(o) from the calculation of previous heuristic values. PINCH also orders the variables so that it updates the value of each variable at most twice. We will demonstrate in the experimental section that PINCH outperforms the other methods in many large planning domains and is not worse in most other large planning domains.

DynamicSWSF-FP

In this section, we describe DynamicSWSF-FP (Ramalingam & Reps 1996), the method from algorithm theory that we extend to speed up the calculation of the heuristic values for HSP 2.0. We follow the presentation in (Ramalingam & Reps 1996). A function g(x_1, ..., x_n) over non-negative values (including infinity) is called a strict weakly superior function (in short: swsf) if, for every j ∈ {1, ..., n}, it is monotone non-decreasing in variable x_j and satisfies: g(x_1, ..., x_n) ≤ x_j implies g(x_1, ..., x_n) = g(x_1, ..., x_{j-1}, ∞, x_{j+1}, ..., x_n). The swsf fixed point (in short: swsf-fp) problem is to compute the unique fixed point of n equations, namely the equations x_i = g_i(x_1, ..., x_n), in the n variables x_1, ..., x_n, where the g_i are swsf for i = 1, ..., n. The dynamic swsf-fp problem is to maintain the unique fixed point of the swsf equations after some or all of the functions g_i have been replaced by other swsfs. DynamicSWSF-FP solves the dynamic swsf-fp problem efficiently by recalculating only the values of variables that change, rather than the values of all variables. The authors of DynamicSWSF-FP have proved its correctness, completeness, and other properties and applied it to grammar problems and shortest path problems (Ramalingam & Reps 1996).
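As a small numerical sanity check (our example, not from (Ramalingam & Reps 1996)), a two-variable function of the min-of-costs shape that arises in this paper, g(x1, x2) = min(1 + x1, 1 + x2), can be tested against the two swsf conditions on a few sample points:

```python
INF = float("inf")

def g(x1, x2):
    """A min of '1 + variable' terms, the shape used for the heuristic values."""
    return min(1 + x1, 1 + x2)

# monotone non-decreasing in each variable
assert g(1, 2) <= g(3, 2) and g(1, 2) <= g(1, 5)

# strict weak superiority: whenever g(x) <= x_j, the value of g does not
# depend on x_j (replacing x_j by infinity leaves the value unchanged)
for x1, x2 in [(0, 5), (2, 2), (7, 1)]:
    v = g(x1, x2)
    if v <= x1:
        assert v == g(INF, x2)
    if v <= x2:
        assert v == g(x1, INF)
```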

Terminology and Variables

We use the following terminology to explain how to use DynamicSWSF-FP to calculate the heuristic values for HSP 2.0. If x_i = g_i(x_1, ..., x_n) then x_i is called consistent. Otherwise it is called inconsistent. If x_i is inconsistent then either x_i < g_i(x_1, ..., x_n), in which case we call x_i underconsistent, or x_i > g_i(x_1, ..., x_n), in which case we call x_i overconsistent. We use the variables rhs_i to keep track of the current values of g_i(x_1, ..., x_n). It always holds that rhs_i = g_i(x_1, ..., x_n) (Invariant 1). We can therefore compare x_i and rhs_i to check whether x_i is overconsistent or underconsistent. We maintain a priority queue that always contains exactly the inconsistent x_i (to be precise: it stores i rather than x_i) with priorities min(x_i, rhs_i) (Invariant 2).
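In code, the consistency classification and the queue priority of Invariant 2 look like this (a minimal illustration with our own names):

```python
INF = float("inf")

def classify(x, rhs):
    """Classify a variable's value x against its right-hand side rhs."""
    if x == rhs:
        return "consistent"
    return "underconsistent" if x < rhs else "overconsistent"

def priority(x, rhs):
    """Inconsistent variables are queued with priority min(x, rhs)."""
    return min(x, rhs)

assert classify(3, 3) == "consistent"
assert classify(2, 5) == "underconsistent"
assert classify(INF, 4) == "overconsistent"
assert priority(INF, 4) == 4
```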

Transforming the Equations

We transform Equations 1 and 2 as follows to ensure that they specify a swsf-fp problem, for all s ⊆ P, p ∈ P, and o ∈ O:

    g_s(p) = 0                                    if p ∈ s
    g_s(p) = min_{o ∈ O : p ∈ Add(o)} [g_s(o)]    otherwise       (3)

    g_s(o) = 1 + Σ_{p ∈ Prec(o)} g_s(p)                           (4)

procedure AdjustVariable(q)
if q is a proposition p ∈ P then
    if p ∈ s then set rhs_p := 0
    else set rhs_p := min_{o ∈ O : p ∈ Add(o)} [x_o]
else /* q is an operator o ∈ O */ set rhs_o := 1 + Σ_{p ∈ Prec(o)} x_p
if q is in the priority queue then delete it
if x_q ≠ rhs_q then insert q into the priority queue with priority min(x_q, rhs_q)

procedure SolveEquations()
while the priority queue is not empty do
    delete the element with the smallest priority from the priority queue and assign it to q
    if x_q > rhs_q then
        set x_q := rhs_q
        if q ∈ P then for each o ∈ O such that q ∈ Prec(o) do AdjustVariable(o)
        else if q ∈ O then for each p ∈ Add(q) with p ∉ s do AdjustVariable(p)
    else
        set x_q := ∞
        AdjustVariable(q)
        if q ∈ P then for each o ∈ O such that q ∈ Prec(o) do AdjustVariable(o)
        else if q ∈ O then for each p ∈ Add(q) with p ∉ s do AdjustVariable(p)

procedure Main()
empty the priority queue
set s to the state whose heuristic value needs to get computed
for each q ∈ P ∪ O do set x_q := ∞
for each q ∈ P ∪ O do AdjustVariable(q)
forever do
    SolveEquations()
    /* use h_add(s) = Σ_{p ∈ G} x_p */
    set s' := s and set s to the state whose heuristic value needs to get computed next
    for each p ∈ (s \ s') ∪ (s' \ s) do AdjustVariable(p)

Figure 4: PINCH

The only difference is in the calculation of g_s(o). The transformed equations specify a swsf-fp problem since 0, min_{o ∈ O : p ∈ Add(o)} [g_s(o)], and 1 + Σ_{p ∈ Prec(o)} g_s(p) are all swsf in the g_s(p) and g_s(o) for all p ∈ P and all o ∈ O. This means that the transformed equations can be solved with DynamicSWSF-FP. They can be used to calculate h_add(s) since it is easy to show that the transformed value g_s(o) equals one plus the untransformed value g_s(o), and thus the values g_s(p) remain unchanged and h_add(s) = Σ_{p ∈ G} g_s(p).

Calculation of Heuristics with DynamicSWSF-FP

We apply DynamicSWSF-FP to the problem of calculating the heuristic values of HSP 2.0, which reduces to solving the swsf fixed point problem defined by Equations 3 and 4. The solution is obtained by SolveEquations() shown in Figure 4. In the algorithm, each x_p is a variable that contains the corresponding g_s(p) value, and each x_o is a variable that contains the corresponding g_s(o) value.

PINCH calls AdjustVariable() for each q ∈ P ∪ O to ensure that Invariants 1 and 2 hold before it calls SolveEquations() for the first time. It needs to call AdjustVariable() only for those q whose function g_q has changed before it calls SolveEquations() again. The invariants will automatically continue to hold for all other q. If the state whose heuristic value needs to get computed changes from s' to s, then this changes only those functions that correspond to the right-hand side of Equation 3 for which p ∈ (s \ s') ∪ (s' \ s), in other words, where p is no longer part of the state (and the corresponding variable thus is no longer clamped to zero) or just became part of the state (and the corresponding variable thus just became clamped to zero). SolveEquations() then operates as follows. The x_q solve Equations 3 and 4 if they are all consistent. Thus, SolveEquations() adjusts the values of the inconsistent x_q. It always removes the x_q with the smallest priority from the priority queue. If x_q is overconsistent then SolveEqua-

procedure AdjustVariable(q)
if x_q ≠ rhs_q then
    if q is not in the priority queue then insert it with priority min(x_q, rhs_q)
    else change the priority of q in the priority queue to min(x_q, rhs_q)
else if q is in the priority queue then delete it

procedure SolveEquations()
while the priority queue is not empty do
    assign the element with the smallest priority in the priority queue to q
    if q is an operator o ∈ O then
        if x_o > rhs_o then
            delete o from the priority queue
            set x_o := rhs_o
            for each p ∈ Add(o) with p ∉ s do
                if x_o < rhs_p then
                    set rhs_p := x_o
                    AdjustVariable(p)
        else
            set x_old := x_o
            set x_o := ∞
            AdjustVariable(o)
            for each p ∈ Add(o) with p ∉ s and rhs_p = x_old do
                set rhs_p := min_{o' ∈ O : p ∈ Add(o')} [x_{o'}]
                AdjustVariable(p)
    else /* q is a proposition p ∈ P */
        if x_p > rhs_p then
            delete p from the priority queue
            set x_old := x_p
            set x_p := rhs_p
            for each o ∈ O with p ∈ Prec(o) do
                if x_old = ∞ then set rhs_o := 1 + Σ_{p' ∈ Prec(o)} x_{p'}
                else set rhs_o := rhs_o − x_old + x_p
                AdjustVariable(o)
        else
            set x_p := ∞
            AdjustVariable(p)
            for each o ∈ O with p ∈ Prec(o) and rhs_o ≠ ∞ do
                set rhs_o := ∞
                AdjustVariable(o)

procedure Main()
empty the priority queue
set s to the state whose heuristic value needs to get computed
for each q ∈ P ∪ O do set x_q := ∞ and set rhs_q := ∞
for each p ∈ s do
    set rhs_p := 0
    AdjustVariable(p)
for each o ∈ O with Prec(o) = ∅ do
    set rhs_o := 1
    AdjustVariable(o)
forever do
    SolveEquations()
    /* use h_add(s) = Σ_{p ∈ G} x_p */
    set s' := s and set s to the state whose heuristic value needs to get computed next
    for each p ∈ s \ s' do
        set rhs_p := 0
        AdjustVariable(p)
    for each p ∈ s' \ s do
        set rhs_p := min_{o ∈ O : p ∈ Add(o)} [x_o]
        AdjustVariable(p)

Figure 5: Optimized PINCH

tions() sets it to the value of its right-hand side. This makes the variable consistent. Otherwise the variable is underconsistent and SolveEquations() sets it to infinity. This makes the variable either consistent or overconsistent. In the latter case, it remains in the priority queue.

Whether the variable was underconsistent or overconsistent, its value got changed. SolveEquations() then calls AdjustVariable() to maintain Invariants 1 and 2. Once the priority queue is empty, SolveEquations() terminates since all variables are consistent and thus solve Equations 3 and 4. One can prove that it changes the value of each variable at most twice, namely at most once when it is underconsistent and at most once when it is overconsistent, and thus terminates in finite time (Ramalingam & Reps 1996).
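To make the ordered-updates idea concrete, here is a minimal one-shot sketch in the style of the generalized Dijkstra's algorithm (Knuth 1977). It is not the paper's PINCH implementation (in particular, it reuses nothing across states); it solves the h_add equation system once for a STRIPS state under unit operator costs, and all function and variable names are hypothetical.

```python
import heapq

def hadd(props, ops, state, goal):
    """One-shot ordered-updates solver for the h_add equations under unit
    operator costs:
        x_o = 1 + sum of x_p over the preconditions p of operator o
        x_p = 0 if p is in the state, else min of x_o over operators adding p
    ops maps a (hypothetical) operator name to a (preconditions, add_list) pair."""
    INF = float("inf")
    x = {p: (0 if p in state else INF) for p in props}
    users = {p: [o for o, (pre, _) in ops.items() if p in pre] for p in props}
    remaining = {o: len(pre) for o, (pre, _) in ops.items()}
    cost = {o: 1 for o in ops}  # accumulates 1 + sum of precondition values
    queue = [(0, p) for p in props if p in state]
    heapq.heapify(queue)

    def relax(o):
        # All preconditions of o are finalized, so x_o = cost[o]; propagate
        # it to the propositions in o's add list.
        for q in ops[o][1]:
            if cost[o] < x[q]:
                x[q] = cost[o]
                heapq.heappush(queue, (x[q], q))

    for o, n in remaining.items():
        if n == 0:
            relax(o)  # operators without preconditions apply immediately
    done = set()
    while queue:
        v, p = heapq.heappop(queue)
        if p in done:
            continue  # stale entry from an earlier priority change
        done.add(p)
        for o in users[p]:
            cost[o] += v
            remaining[o] -= 1
            if remaining[o] == 0:
                relax(o)
    return sum(x[g] for g in goal)
```

Processing propositions in order of increasing value is what bounds the number of updates per variable; PINCH additionally reuses values from the previous heuristic computation, which this sketch does not attempt.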

Algorithmic Optimizations

PINCH can be optimized further. Its main inefficiency is that it often iterates over a large number of propositions or operators. Consider, for example, the case where an operator variable x_o has the smallest priority during an iteration of the while-loop in SolveEquations() and x_o is overconsistent. At some point in time, SolveEquations() then executes the following loop:

for each proposition p in the add list of o do AdjustVariable(x_p)

The for-loop iterates over all propositions that satisfy its condition. For each of them, the call AdjustVariable(x_p) executes

set rhs(x_p) to the minimum of x_o' over all operators o' that contain p in their add list

The calculation of rhs(x_p) therefore iterates over all operators that contain p in their add list. However, this iteration can be avoided. Since x_o was overconsistent according to our assumption, SolveEquations() sets the value of x_o to rhs(x_o) and thus decreases it. All other values remain the same. Thus, rhs(x_p) cannot increase and one can recalculate it faster as follows:

set rhs(x_p) to the minimum of rhs(x_p) and the new value of x_o

Figure 5 shows PINCH after this and other optimizations. In the experimental section, we use this method rather than the unoptimized one. This reduces the planning time, for example, by 20 percent in the Logistics domain and by up to 90 percent in the Freecell domain.
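The optimization can be illustrated with a small sketch (hypothetical names; the +1 operator cost is assumed folded into the operator values). The full recomputation iterates over every operator whose add list contains p, while the fast update exploits the fact that only x_o changed, and that it decreased, so the minimum can be refreshed in constant time.

```python
def rhs_full(p, ops, x_op):
    """Full recomputation of rhs(x_p): iterate over every operator whose
    add list contains p. ops maps names to (preconditions, add_list)."""
    return min((x_op[o] for o, (pre, add) in ops.items() if p in add),
               default=float("inf"))

def rhs_fast(rhs_p, new_x_o):
    """Fast update, valid only when a single x_o just decreased and all
    other values stayed the same: the minimum can only drop to new x_o."""
    return min(rhs_p, new_x_o)
```

If x_o could also increase, the fast update would be unsound, since the old x_o might have been the unique minimizer; this is why the optimization is tied to the overconsistent case.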

Summary of Methods

Table 1 summarizes the methods that we have discussed and classifies them according to whether they order the value updates (ordered updates) and whether they reuse information from the calculation of previous heuristic values (incremental computations).

Both VI and IVI perform full sweeps over all variables and thus perform unordered updates. IVI reuses the values of variables from the computation of the previous heuristic value and is an incremental version of VI. We included HSP2 although it is very similar to VI and belongs to the same class because it is the method used by HSP 2.0. Its only difference from VI is that it eliminates all operator variables, which simplifies the code.
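A minimal sketch of the unordered full-sweep scheme (VI) for an equation system of this kind, namely x_p = 0 if p is in the state and otherwise the minimum over operators o that add p of 1 + the sum of x_q over the preconditions q of o, with hypothetical names and unit operator costs:

```python
def hadd_vi(props, ops, state, goal):
    """Unordered updates: full sweeps over all proposition variables until
    no value changes. ops maps names to (preconditions, add_list) pairs."""
    INF = float("inf")
    x = {p: (0 if p in state else INF) for p in props}
    changed = True
    while changed:
        changed = False
        for p in props:
            if p in state:
                continue  # x_p is fixed at 0
            new = min((1 + sum(x[q] for q in pre)
                       for (pre, add) in ops.values() if p in add),
                      default=INF)
            if new != x[p]:
                x[p] = new
                changed = True
    return sum(x[g] for g in goal)
```

Because the sweeps are unordered, a value may be recomputed many times before it settles, which is exactly the inefficiency that the ordered-updates methods avoid.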

GBF, HSP1, and GD order the value updates. They are listed from left to right in order of increasing number of binary ordering constraints between value updates. GBF performs full sweeps over all operator variables interleaved with partial sweeps over those proposition variables whose values might have changed because they depend on operator variables whose values have just changed. HSP1, the method used by HSP 1.0, alternates partial sweeps over operator variables and partial sweeps over proposition variables. Finally, both GD and PINCH order the value updates completely and thus do not perform any sweeps. This enables GD to update the value of each variable only once and PINCH to update the value of each variable only twice. PINCH reuses the values of variables from the computation of the previous heuristic value and is an incremental version of GD.


Figure 6: Relative Speedup in Planning Time over HSP2 (one panel per domain: Blocks World, Blocks World: Random Instances, Elevator, Freecell, Logistics, Schedule, Gripper, Mprime, and Mystery; each panel plots the relative speedup in planning time (%) of PINCH, GBF, HSP1, GD, IVI, and VI against the problem size, #P + #O)

The relative speedup in planning time of PINCH over HSP2 tends to increase with the problem size. For example, PINCH outperforms HSP2 by over 80 percent for large problems in several domains. According to theoretical results given in (Ramalingam & Reps 1996), the complexity of PINCH depends on the number of operator and proposition variables whose values change from the calculation of one heuristic value to the next. Since some methods do not use operator variables, we only counted the number of proposition variables whose values changed. Table 2 lists the resulting dissimilarity measure for the Logistics domain. In this domain, #CV/#P is a good predictor of the relative speedup in planning time of PINCH over HSP2, with a negative correlation coefficient of -0.96. More generally, the dissimilarity measure is a good predictor of the relative speedup in planning time of PINCH over HSP2 in all domains, with negative correlation coefficients ranging from -0.96 to -0.36. This insight can be used to explain the relatively poor scaling behavior of PINCH in the Freecell domain. In all other domains, the dissimilarity measure is negatively correlated with the problem size, with negative correlation coefficients ranging from -0.92 to -0.57. In the Freecell domain, however, the dissimilarity measure is positively correlated with the problem size, with a positive correlation coefficient of 0.68. The Freecell domain represents a solitaire game. As the problem size increases, the number of cards increases while the numbers of columns and free cells remain constant. This results in additional constraints on the values of both proposition and operator variables and thus in a larger number of value changes from one calculation of the heuristic value to the next. This increases the dissimilarity measure and thus decreases the relative speedup of PINCH over HSP2. Finally, there are problems that HSP 2.0 with PINCH could solve within the 10-minute time limit that neither the standard HSP 2.0 distribution nor HSP 2.0 with GBF could solve. The Logistics domain is such a domain.
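The dissimilarity measure described above can be sketched as a comparison of the proposition values of two consecutive heuristic computations; the helper name and the dictionary representation are hypothetical:

```python
def dissimilarity(prev_values, cur_values):
    """#CV/#P: the fraction of proposition variables whose value changed
    between two consecutive heuristic computations. Both arguments map the
    same set of propositions to their computed values."""
    changed = sum(1 for p in prev_values if prev_values[p] != cur_values[p])
    return changed / len(prev_values)
```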


Second, GBF is at least as fast as HSP1, and HSP1 is at least as fast as GD. Thus, the stricter the ordering of the variable updates among the three methods with ordered updates and nonincremental computations, the smaller their relative speedup over HSP2. This can be explained with the increasing overhead that results from the need to maintain the ordering constraints with increasingly complex data structures. GBF does not use any additional data structures, HSP1 uses an unordered list, and GD uses a priority queue. (We implemented the priority queue in turn as a binary heap, a Fibonacci heap, and a multi-level bucket to ensure that this conclusion is independent of its implementation.) Since PINCH without value reuse reduces to GD, our experiments demonstrate that it is value reuse that allows PINCH to counteract the overhead associated with the priority queue and makes the planning time of PINCH so competitive.
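The priority-queue overhead discussed here depends on how priority changes are supported. One common workaround for binary heaps, which lack an efficient decrease-key operation, is lazy deletion: stale entries stay in the heap and are skipped on extraction. A minimal sketch (the class name is hypothetical):

```python
import heapq
import itertools

class LazyPQ:
    """Priority queue supporting priority changes via lazy deletion."""
    def __init__(self):
        self.heap = []
        self.entry = {}  # item -> (priority, sequence number) of live entry
        self.counter = itertools.count()

    def set(self, item, priority):
        """Insert item, or change its priority; the old entry goes stale."""
        cnt = next(self.counter)
        self.entry[item] = (priority, cnt)
        heapq.heappush(self.heap, (priority, cnt, item))

    def delete(self, item):
        """Logically remove item; its heap entry is skipped on pop."""
        self.entry.pop(item, None)

    def pop(self):
        """Return the live item with the smallest priority, or None."""
        while self.heap:
            priority, cnt, item = heapq.heappop(self.heap)
            if self.entry.get(item) == (priority, cnt):
                del self.entry[item]
                return item
        return None
```

Lazy deletion trades heap size for simplicity; a Fibonacci heap or multi-level bucket structure avoids the stale entries at the cost of more complex bookkeeping, which mirrors the overhead trade-off observed above.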

Conclusions

In this paper, we systematically evaluated methods for calculating the heuristic values for HSP 2.0 with the h_add heuristic and demonstrated that the resulting planning times differ substantially. We identified two different approaches for speeding up the calculation of the heuristic values, namely to order the value updates and to reuse information from the calculation of previous heuristic values. We then showed how these two approaches can be combined, resulting in our PINCH (Prioritized, INCremental Heuristics calculation) method. PINCH outperforms both of the other approaches individually as well as the methods used by HSP 1.0 and HSP 2.0 for most of the large planning problems tested. In fact, it speeds up the planning time of HSP 2.0 by up to eighty percent in several domains and, in general, the amount of savings grows with the size of the domains. This is an important property since, if a method has a high relative speedup for small problems, it wins by fractions of a second, which is insignificant in practice. However, if a method has a high relative speedup for large problems, it wins by minutes. Thus, PINCH allows HSP 2.0 to solve larger planning problems than was possible before in the same amount of time and without changing its operation. We also have preliminary results that show that PINCH speeds up HSP 2.0 with the h_max heuristic (Bonet & Geffner 2001b) by over 20 percent and HSP 2.0 with the h^2 heuristic (Haslum & Geffner 2000) by over 80 percent for small blocks world instances. We are currently working on demonstrating similar savings for other heuristic search-based planners. For example, PINCH also applies in principle to the first stage of FF (Hoffmann & Nebel 2001a), where FF builds the planning graph.

Acknowledgments

We thank Blai Bonet and Hector Geffner for making the code of HSP 1.0 and HSP 2.0 available to us and for answering our questions. We also thank Sylvie Thiebaux and John Slaney for making their blocksworld planning task generator available to us. The Intelligent Decision-Making Group is partly supported by NSF awards to Sven Koenig under contracts IIS-9984827, IIS-0098807, and ITR/AP-0113881 as well as an IBM faculty partnership award. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the sponsoring organizations, agencies, companies or the U.S. government.

References

Bertsekas, D. 2001. Dynamic Programming and Optimal Control. Athena Scientific, 2nd edition.

Bonet, B., and Geffner, H. 1999. Planning as heuristic search: New results. In Proceedings of the 5th European Conference on Planning, 360–372.

Bonet, B., and Geffner, H. 2001a. Heuristic search planner 2.0. Artificial Intelligence Magazine 22(3):77–80.

Bonet, B., and Geffner, H. 2001b. Planning as heuristic search. Artificial Intelligence – Special Issue on Heuristic Search 129(1):5–33.

Bonet, B.; Loerincs, G.; and Geffner, H. 1997. A robust and fast action selection mechanism. In Proceedings of the National Conference on Artificial Intelligence, 714–719.

Cormen, T.; Leiserson, C.; and Rivest, R. 1990. Introduction to Algorithms. MIT Press.

Dijkstra, E. 1959. A note on two problems in connexion with graphs. Numerische Mathematik 1:269–271.

Edelkamp, S. 2001. Planning with pattern databases. In Proceedings of the 6th European Conference on Planning, 13–24.

Haslum, P., and Geffner, H. 2000. Admissible heuristics for optimal planning. In Proceedings of the International Conference on Artificial Intelligence Planning and Scheduling, 70–82.

Hoffmann, J., and Nebel, B. 2001a. The FF planning system: Fast plan generation through heuristic search. Journal of Artificial Intelligence Research 14:253–302.

Hoffmann, J., and Nebel, B. 2001b. RIFO revisited: Detecting relaxed irrelevance. In Proceedings of the 6th European Conference on Planning, 325–336.

Knuth, D. 1977. A generalization of Dijkstra's algorithm. Information Processing Letters 6(1):1–5.

McDermott, D. 1996. A heuristic estimator for means-ends analysis in planning. In Proceedings of the International Conference on Artificial Intelligence Planning and Scheduling, 142–149.

Moore, A., and Atkeson, C. 1993. Prioritized sweeping: Reinforcement learning with less data and less time. Machine Learning 13(1):103–130.

Nguyen, X.; Kambhampati, S.; and Nigenda, R. S. 2002. Planning graph as the basis for deriving heuristics for plan synthesis by state space and CSP search. Artificial Intelligence 135(1–2):73–123.

Pearl, J. 1985. Heuristics: Intelligent Search Strategies for Computer Problem Solving. Addison-Wesley.

Ramalingam, G., and Reps, T. 1996. An incremental algorithm for a generalization of the shortest-path problem. Journal of Algorithms 21:267–305.

Refanidis, I., and Vlahavas, I. 2001. The GRT planning system: Backward heuristic construction in forward state-space planning. Journal of Artificial Intelligence Research 15:115–161.