forward-chaining partial-order planning amanda coles, andrew coles, maria fox and derek long (to...

Forward-Chaining Partial-Order Planning

Amanda Coles, Andrew Coles,
Maria Fox and Derek Long

(to appear, ICAPS 2010)

Summary

• Forward-chaining planning eliminates the threat resolution of POP, at the price of over-commitment.
• Issues arise in temporal planning, due to needless ordering constraints leading to backtracking.
• Can modify a forward-chaining approach to construct a partial order, avoiding this.
• Further, can modify a TRPG heuristic to encourage search to find lower-makespan plans.
• Implemented and evaluated in the planner POPF.

Overview

• (Temporal) Forward-Chaining Planning
• Issues with using a Total Order
• Reducing Commitment
• Heuristic Guidance for Lower Makespan Plans
• Evaluation
• Conclusions

Forward Chaining Temporal Planning

A state S is a tuple <F,V,Q,P,C> of:
• Propositional Facts
• Values of task variables
• A Queue of actions that have not yet finished
• The Plan to reach S
• The Constraints on the steps in P

The plan consists of the starts and ends of actions:
• A├ and A┤ denote the start/end of A, resp.

Simple Example

Actions: light_match match1 (light m1 holds from its start, ¬light m1 from its end) and mend_fuse fuse1 match1.

Plan steps:
0: light_match_start match1
1: mend_fuse_start fuse1 match1
2: mend_fuse_end fuse1 match1
3: light_match_end match1

[Figure: temporal constraint network over the steps lms, mfs1, mfe1 and lme. Edge labels: 8.0 / -8.0, 5.0 / -5.0, and -0.01 for the ‘epsilon’ separation (0.01) between consecutive steps.]
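The constraints in the example can be checked mechanically. The sketch below is not POPF's code: it assumes the labels 8.0 and 5.0 in the figure are the durations of light_match and mend_fuse, treats consecutive plan steps as separated by epsilon, and uses a plain longest-path relaxation (the function name `earliest_times` is made up for illustration).

```python
EPS = 0.01  # the 'epsilon' separation from the slide

# Each entry (x, y, w) encodes t(y) - t(x) >= w.
# Durations 8.0 and 5.0 are read off the figure labels (an assumption).
CONSTRAINTS = [
    ("lms", "lme", 8.0), ("lme", "lms", -8.0),      # light_match lasts exactly 8.0
    ("mfs1", "mfe1", 5.0), ("mfe1", "mfs1", -5.0),  # mend_fuse lasts exactly 5.0
    ("lms", "mfs1", EPS),                           # step 0 before step 1
    ("mfs1", "mfe1", EPS),                          # step 1 before step 2
    ("mfe1", "lme", EPS),                           # step 2 before step 3
]


def earliest_times(steps, constraints):
    """Earliest consistent time for each step, or None if the constraints conflict."""
    t = {s: 0.0 for s in steps}
    # Repeated relaxation; if values are still growing after len(steps) + 1
    # rounds, there is a positive cycle and no schedule exists.
    for _ in range(len(steps) + 1):
        changed = False
        for x, y, w in constraints:
            if t[x] + w > t[y] + 1e-9:
                t[y] = t[x] + w
                changed = True
        if not changed:
            return t
    return None


times = earliest_times(["lms", "mfs1", "mfe1", "lme"], CONSTRAINTS)
for step in ["lms", "mfs1", "mfe1", "lme"]:
    print(f"{times[step]:.2f}: {step}")
```

Under these assumptions the match is lit at 0.00, the fuse is mended between 0.01 and 5.01, and the match goes out at 8.00.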

Issues with Using a Total Order

To resolve threats, F.C. planning uses a total order. When applying an action A:
• A cannot violate preconditions of earlier actions, as it comes after them (demotion);
• Subsequent actions cannot delete its preconditions, as A comes sooner (promotion).

The drawback is that needless ordering constraints are added:
• If A does not interfere with the preceding step, it still must come after it.

Motivates partial-order lifting, but this first needs a solution to be found.

Total Orders of Start/End Actions

Two actions, A and B:
• B is longer than A;
• No interaction between A├ and B├;
• But, B┤ must precede A┤.

The planner chooses a (partial) plan:

A├  B├  B┤
A├  B├  B┤  A┤

[Figure: temporal constraints over these steps, with edge labels 2 / -2, 5 / -5 and -0.01.]

• Because A├ was added to the plan before B├, they are ordered as shown (in a total order).
• But A┤ will not be applicable until after B┤.
• The planner will have to backtrack over all the intermediate decisions, and add B├ to the plan earlier than A├.
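The conflict can be worked through by hand. The derivation below assumes the figure labels give dur(A) = 2 and dur(B) = 5, with epsilon = 0.01 between consecutive steps; A_⊢ and A_⊣ stand for A├ and A┤.

```latex
\begin{align*}
t(B_{\vdash}) &\ge t(A_{\vdash}) + \epsilon
  && \text{(total order: } A_{\vdash} \text{ was added first)} \\
t(B_{\dashv}) &= t(B_{\vdash}) + 5 \;\ge\; t(A_{\vdash}) + 5 + \epsilon \\
t(A_{\dashv}) &= t(A_{\vdash}) + 2 \;<\; t(A_{\vdash}) + 5 + \epsilon \;\le\; t(B_{\dashv})
\end{align*}
```

So A┤ necessarily falls before B┤, which contradicts the requirement that B┤ precede A┤. No assignment of times repairs this while A├ stays ahead of B├, hence the backtracking.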

Reducing Commitment

• Record additional information at each state concerning which steps achieve / delete / depend on each fact.
• Use this information to commit to fewer ordering constraints.
• Still resolve threats based on the intuition of forward-chaining expansion: new actions cannot threaten the preconditions of earlier actions.

Extending the State: Propositional

To capture ordering information we add:
• F+, F-, where F+(p) (F-(p)) is the index of the step that most recently added (deleted) p
• FP, where FP(p) is a set of pairs <j,d>:
  - <j,ε> denotes that step j has an instantaneous condition on p (at start or at end)
  - <j,0> denotes that step j marks the end of an action with an over all condition on p

Starting an Action A at Step i

• For each at start condition p:
  t(F+(p)) + ε ≤ t(i)
• For each at start del. effect p, assign F-(p) = i,
  t(F+(p)) + ε ≤ t(i), and for each <j,d> in FP(p), t(j) + d ≤ t(i)
• For each at start add effect p, assign F+(p) = i, and
  if F-(p) ≠ i, t(F-(p)) + ε ≤ t(i)
• For each over all condition p:
  if F+(p) ≠ i, t(F+(p)) ≤ t(i)

(To apply the end of an action: similar process, but without over all conditions)
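The start-of-action rules above can be sketched as a small bookkeeping routine. This is an illustrative sketch, not POPF's implementation: the class `PropState` and function `start_action` are invented names, facts with no recorded achiever (e.g. initial-state facts) are assumed to impose no constraint, and recording <i,ε> in FP(p) for each at start condition is inferred from the definition of FP rather than stated in the rules.

```python
from dataclasses import dataclass, field

EPS = 0.01


@dataclass
class PropState:
    f_plus: dict = field(default_factory=dict)       # p -> step that last added p
    f_minus: dict = field(default_factory=dict)      # p -> step that last deleted p
    fp: dict = field(default_factory=dict)           # p -> list of (j, d) pairs
    constraints: list = field(default_factory=list)  # (j, i, gap): t(j) + gap <= t(i)


def start_action(state, i, conds=(), dels=(), adds=(), over_all=()):
    """Apply the start of an action as step i, recording ordering constraints."""
    c = state.constraints
    for p in conds:                                  # at start conditions
        if p in state.f_plus:
            c.append((state.f_plus[p], i, EPS))
        state.fp.setdefault(p, []).append((i, EPS))  # assumption, see lead-in
    for p in dels:                                   # at start delete effects
        state.f_minus[p] = i
        if p in state.f_plus:
            c.append((state.f_plus[p], i, EPS))
        for j, d in state.fp.get(p, []):
            if j != i:                               # a step never waits for itself
                c.append((j, i, d))
    for p in adds:                                   # at start add effects
        state.f_plus[p] = i
        j = state.f_minus.get(p)
        if j is not None and j != i:
            c.append((j, i, EPS))
    for p in over_all:                               # over all conditions
        j = state.f_plus.get(p)
        if j is not None and j != i:
            c.append((j, i, 0.0))
    return state


s = PropState()
start_action(s, 0, adds=["light"])    # step 0 adds 'light'
start_action(s, 1, conds=["light"])   # step 1 needs 'light' at its start
start_action(s, 2, dels=["light"])    # step 2 deletes 'light'
print(s.constraints)
```

In this usage, step 2 is ordered after both the achiever (step 0) and the step that depends on the fact (step 1), but nothing orders steps that do not touch 'light'.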

[Figure: the steps A├, B├, B┤ and A┤ as a partial order, with edge labels 2 / -2, 5 / -5 and -0.01.]

Resulting plan:
0.00: (action B) [5.00]
3.01: (action A) [2.00]
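The timestamps in this plan can be reproduced by hand, assuming the only ordering kept between the two actions is that B┤ precedes A┤ by epsilon = 0.01 (A_⊢ and A_⊣ stand for A├ and A┤):

```latex
\begin{align*}
t(B_{\vdash}) &= 0.00, \qquad t(B_{\dashv}) = t(B_{\vdash}) + 5.00 = 5.00 \\
t(A_{\dashv}) &\ge t(B_{\dashv}) + \epsilon = 5.01 \\
t(A_{\vdash}) &= t(A_{\dashv}) - 2.00 \ge 3.01
\end{align*}
```

Taking the earliest value gives A starting at 3.01, with no backtracking needed because A├ was never forced ahead of B├.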

Extending the State: Numeric

For numbers we are a little more strict:
• V_eff, where V_eff(v) is the step of the action to most recently have an effect on v
• VP, where VP(v) contains steps that depend on the value of v, each step i such that:
  - i has a precondition on v, or is the start of an action whose duration constraint contains v; or,
  - i has an effect that depends on v
• VI, where VI(v) is a set of pairs (s,e), marking the start/end indices of actions in the event queue (Q) with an over all condition depending on v

(Also, V_cts to handle linear continuous numeric change; see paper for details.)

Starting an Action A at Step i:

• For each variable v relevant to at start conditions, effects, or the action’s duration:
  t(V_eff(v)) + ε ≤ t(i)
• For each v on which A has an at start eff, apply the effect to V, and:
  for each (s,e) in VI(v), t(s) + ε ≤ t(i) and t(i) + ε ≤ t(e)
• For each variable v relevant to an over all, add (i,j) to VI(v), where j is the index of the end of A, and if v was not relevant to the start of A:
  t(V_eff(v)) + ε ≤ t(i)
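A compact sketch of the numeric rules, under stated assumptions: `start_numeric` is an invented name, variables with no recorded effect are assumed to impose no constraint, updating V_eff(v) after an at start effect is inferred from the definition of V_eff, and the change to the variable's value itself (V) is left out.

```python
EPS = 0.01


def start_numeric(v_eff, vi, i, j, start_vars=(), start_effect_vars=(), over_all_vars=()):
    """Constraints (a, b, gap), meaning t(a) + gap <= t(b), for starting an
    action at step i whose end will be step j."""
    out = []
    relevant = set(start_vars) | set(start_effect_vars)
    for v in start_vars:                     # at start conditions/effects/duration
        if v in v_eff:
            out.append((v_eff[v], i, EPS))
    for v in start_effect_vars:              # at start effects on v
        for s, e in vi.get(v, []):
            out.append((s, i, EPS))          # after the invariant's start
            out.append((i, e, EPS))          # before the invariant's end
        v_eff[v] = i                         # assumption: i is now the latest effect
    for v in over_all_vars:                  # over all conditions on v
        vi.setdefault(v, []).append((i, j))
        if v not in relevant and v in v_eff:
            out.append((v_eff[v], i, EPS))
    return out


v_eff = {"fuel": 0}        # hypothetical variable, last changed by step 0
vi = {"fuel": [(1, 4)]}    # an action spanning steps 1..4 has an over all on fuel
cons = start_numeric(v_eff, vi, 2, 3, start_vars=["fuel"], start_effect_vars=["fuel"])
print(cons)
```

Here the new step is ordered after the previous effect on fuel and is placed strictly inside the interval of the action whose invariant refers to fuel.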

Heuristic Guidance

• Have seen how the search space can be modified to reduce excessive ordering constraints;
• There is still no pressure to prefer choices that lead to a partial order with a lower makespan
  - Could use partial-order lifting a posteriori for similar quality results?
• Given we know the makespan implications of action choices, how can we factor this into the decision making during search?

Revisiting the Temporal RPG

The Temporal RPG consists of time-stamped fact and action layers.

To evaluate a state S:
• Fact layer f=0.0 contains the facts in S;
• Action layer a=0.00 contains actions whose preconditions are satisfied in f=0.0;
• Effects of actions appear in the next layer; the end of an action A is delayed until dur(A) after the start of A first appears.

What about the extra information we now have in S?
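As a rough illustration of the layered construction just described, the sketch below computes the earliest time at which each fact can appear. It is a simplification, not the TRPG used in COLIN or POPF: it iterates to a fixpoint instead of building explicit layers, ignores delete effects, assumes "the next layer" means epsilon later, and uses hypothetical action and fact names.

```python
import math

EPS = 0.01


def trpg_fact_times(state_facts, actions):
    """Earliest appearance time of each reachable fact (delete effects ignored)."""
    earliest = {f: 0.0 for f in state_facts}   # fact layer 0.0 holds the state's facts
    changed = True
    while changed:
        changed = False
        for a in actions:
            if not all(p in earliest for p in a["start_pre"]):
                continue
            start = max((earliest[p] for p in a["start_pre"]), default=0.0)
            new = [(f, start + EPS) for f in a["start_add"]]
            if all(p in earliest for p in a["end_pre"]):
                # The end cannot appear before dur(A) after the start.
                end = max([start + a["dur"]] + [earliest[p] for p in a["end_pre"]])
                new += [(f, end + EPS) for f in a["end_add"]]
            for f, t in new:
                if t < earliest.get(f, math.inf):
                    earliest[f] = t
                    changed = True
    return earliest


actions = [
    {"name": "light_match", "start_pre": ["match"], "start_add": ["light"],
     "dur": 8.0, "end_pre": [], "end_add": []},
    {"name": "mend_fuse", "start_pre": ["light"], "start_add": [],
     "dur": 5.0, "end_pre": ["light"], "end_add": ["fixed"]},
]
layers = trpg_fact_times(["match"], actions)
print(layers)
```

None of this yet uses F+, F- or V_eff from the extended state, which is the gap the question above points at.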

Bounding Preconditions and Effects on Facts

When adding actions to the partial order, for a proposition p:
• Any action requiring p to satisfy a precondition will need to come after t(F+(p)) and t(F-(p))
• Any action with an add (delete) effect on p will need to come after t(F-(p)) (t(F+(p)) resp.)

From checking temporal constraints, we have a lower bound on each step, t_min(i)

Thus, the earliest point we can use p is:

L(p) = max { t_min(F+(p)), t_min(F-(p)) + ε }

Bounding (continued)

Similarly, for each numeric precondition/effect referring to a variable set vars, it cannot be used until:

L(vars) = max over v in vars of t_min(V_eff(v))

With these bounds, for any state S, we can build a TRPG starting at time zero:
• Delay fact p until layer L(p)
• Delay numeric preconditions/effects until L(vars) for their respective variable sets

Then, actions which do not interfere with existing choices will appear sooner in the TRPG.
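The two bounds translate directly into code. In this sketch the function names and the example values are invented, and a fact or variable with no recorded step is assumed to be available from time zero.

```python
EPS = 0.01


def fact_bound(p, f_plus, f_minus, t_min):
    """L(p) = max{ t_min(F+(p)), t_min(F-(p)) + eps }."""
    added = t_min[f_plus[p]] if p in f_plus else 0.0
    deleted = t_min[f_minus[p]] + EPS if p in f_minus else 0.0
    return max(added, deleted)


def vars_bound(variables, v_eff, t_min):
    """L(vars) = max over v in vars of t_min(V_eff(v))."""
    return max((t_min[v_eff[v]] for v in variables if v in v_eff), default=0.0)


# Hypothetical lower bounds on three plan steps.
t_min = {0: 0.0, 1: 0.01, 2: 5.01}
lp = fact_bound("light", {"light": 0}, {"light": 2}, t_min)
lv = vars_bound(["fuel", "load"], {"fuel": 1}, t_min)
print(round(lp, 2), round(lv, 2))
```

A fact or numeric expression that no existing step touches gets a bound of zero, so actions using it can appear in the earliest layers of the TRPG.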

Evaluation

Planner POPF, based on the code for COLIN (IJCAI’09)

First test:
• Control: run COLIN, then apply partial-order lifting to the solution
• POPF, but using the original heuristic from COLIN.

Second test, also considering domains with deadlines:
• COLIN then partial-order lifter
• POPF, new heuristic.

[Figure: Test 1: Time Taken]

[Figure: Test 1: Makespan]

[Figure: Test 2: Time Taken]

[Figure: Test 2: Makespan]

[Figure: Test 2: Time Taken (Deadlines)]


Conclusions

• Have shown how a partial order can be expanded in a forwards direction;
• Adapting the heuristic allows one to trade time performance for a reduction in makespan;
• In domains with deadlines, performance is substantially improved (fivefold improvement in coverage in the Satellite variants).
• In the paper: approach also works with domains containing linear-continuous change.
