final presentation - research school of computer …...title final presentation created date...

Reporter: Xiaodi Zhang
Supervisor: Alban Grastien

Improving Sample Strategies for Conformant Planning

Content

• Background
• Method
• Experiment Results
• Conclusion

Background

• Conformant planning
• CPCES

Conformant Planning

Actions: go_North, go_East, go_West, go_South

Goal: go to “𝑔”

Figure 1. An example of conformant planning (Grastien and Scala, 2018).

Go there first. Then go to “𝑔”.

Plan: go_North*4; go_East*4; go_West*2; go_South*2
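
The plan above can be checked with a small simulation. This is only a sketch: the grid size (5 by 5), the coordinate system, and the rule that bumping into a wall leaves the agent in place are assumptions, since the figure itself is not reproduced in this text.

```python
from itertools import product

SIZE = 5  # assumed grid size (hypothetical; not stated on the slide)
MOVES = {
    "go_North": (0, 1),
    "go_South": (0, -1),
    "go_East": (1, 0),
    "go_West": (-1, 0),
}

def step(pos, action):
    """Apply one action; moving into a wall leaves the position unchanged."""
    dx, dy = MOVES[action]
    x, y = pos[0] + dx, pos[1] + dy
    if 0 <= x < SIZE and 0 <= y < SIZE:
        return (x, y)
    return pos

def execute(pos, plan):
    """Run the whole plan from one concrete initial position."""
    for action in plan:
        pos = step(pos, action)
    return pos

plan = ["go_North"] * 4 + ["go_East"] * 4 + ["go_West"] * 2 + ["go_South"] * 2

# The initial position is unknown: every cell is a possible initial state.
initial_belief = set(product(range(SIZE), repeat=2))
final_positions = {execute(p, plan) for p in initial_belief}
print(final_positions)
```

Under these assumptions every initial position ends in the same cell, which is what makes the plan conformant: the first eight moves push the agent into a known corner, and the last four walk from that corner to the goal.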

Conformant Planning

• Definition: 𝑃 = ⟨𝑉, 𝐴, 𝑆_𝐼, 𝑆_𝐺⟩
• 𝑉: a set of state variables, 𝑣 ∈ 𝑉
• 𝐴: a set of actions, 𝑎 ∈ 𝐴; each action is 𝑎 = ⟨𝑝𝑟𝑒, 𝑒𝑓𝑓⟩, where 𝑝𝑟𝑒 is the precondition and 𝑒𝑓𝑓 is the conditional effect.
• 𝑆_𝐼: a set of initial states (a belief state)
• 𝑆_𝐺: a set of goal states
• The solution is a plan (a list of actions): 𝜋 = 𝑎_1, … , 𝑎_𝑛

CPCES

• Created by Grastien and Scala in 2017.
• Uses counter-examples (belief states) to avoid invalid plans.
• Example:

Empty belief state → Find a candidate plan 𝜋_1

Find a counter-example 𝐶𝐸_1 that breaks 𝜋_1 → Find a candidate plan 𝜋_2

Find a counter-example 𝐶𝐸_2 that breaks 𝜋_2 → Find a candidate plan 𝜋_3

No counter-example can be found → Plan 𝜋_3 is a valid plan

CPCES Algorithm

The counter-example is generated randomly.

Figure 2. The algorithm of CPCES (Grastien and Scala, 2018).
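
The loop in Figure 2 can be sketched on a toy problem. Everything specific here is an assumption made for illustration: the one-dimensional corridor, the `left`/`right` actions, and the breadth-first search used to find a candidate plan stand in for the real planner and the real counter-example search, which are not reproduced in this text.

```python
import random
from collections import deque

N, GOAL = 5, 2                      # hypothetical corridor with cells 0..4
ACTIONS = {"left": -1, "right": +1}

def step(pos, action):
    """Move one cell; walls at both ends stop the agent."""
    return min(max(pos + ACTIONS[action], 0), N - 1)

def run(pos, plan):
    for action in plan:
        pos = step(pos, action)
    return pos

def find_plan(sample):
    """Breadth-first search for a plan valid for every state in the sample."""
    start = tuple(sorted(sample))
    queue, seen = deque([(start, [])]), {start}
    while queue:
        states, plan = queue.popleft()
        if all(s == GOAL for s in states):
            return plan
        for action in ACTIONS:
            nxt = tuple(sorted({step(s, action) for s in states}))
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, plan + [action]))
    return None

def find_counter_example(plan, belief, rng):
    """Pick, at random, an initial state for which the plan fails."""
    failing = sorted(s for s in belief if run(s, plan) != GOAL)
    return rng.choice(failing) if failing else None

def cpces(belief, rng):
    sample, iterations = set(), 0   # start from an empty sample
    while True:
        iterations += 1
        plan = find_plan(sample)
        if plan is None:
            return None, iterations         # no conformant plan exists
        ce = find_counter_example(plan, belief, rng)
        if ce is None:
            return plan, iterations         # plan is valid for the whole belief
        sample.add(ce)

plan, iterations = cpces(set(range(N)), random.Random(0))
print(plan, iterations)
```

Each counter-example is an initial state the current plan fails on, so it is never already in the sample and the loop ends after at most one iteration per initial state, plus one.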

Method

• Good Counter-example
• Context
• Tag
• Algorithm

Good Counter-example

• Rather than generating counter-examples randomly, always look for good counter-examples.
• For example: suppose there are 3 variables 𝐿_1, 𝐿_2, 𝐿_3 and 4 values 𝐴, 𝐵, 𝐶, 𝐷.

Bad counter-examples (10 bad counter-examples)

Good counter-examples (4 good counter-examples)

Figure 3. An example of bad and good counter-examples.
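
The figure is not reproduced in this text, so the following is only one plausible reading of the counts 10 and 4. It assumes that the three variables are independent of each other and that a sample is useful when it shows a (variable, value) pair that no earlier sample showed. Both sets below cover all 12 pairs, but one needs 10 samples and the other only 4.

```python
VARIABLES = ["L1", "L2", "L3"]
VALUES = ["A", "B", "C", "D"]

def covered(samples):
    """All (variable, value) pairs that appear in at least one sample."""
    return {(var, s[var]) for s in samples for var in VARIABLES}

# "Bad" samples (assumed reading): change one variable at a time.
base = {var: "A" for var in VARIABLES}
one_at_a_time = [base]
for var in VARIABLES:
    for val in VALUES[1:]:
        one_at_a_time.append({**base, var: val})

# "Good" samples (assumed reading): every sample shows a new value
# for all three variables at once.
diagonal = [{var: val for var in VARIABLES} for val in VALUES]

print(len(one_at_a_time), len(diagonal))
print(covered(one_at_a_time) == covered(diagonal))
```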

Context

• We define a subgoal 𝜑 as a conjunct of the preconditions of an action, or of the goal.
• We use (𝑣, 𝑙) to represent an assigned variable, where 𝑣 is the variable and 𝑙 is the value of 𝑣.
• We use 𝑒𝑓𝑓(𝑣, 𝑙) to represent the conditional effect of an action.
• We say (𝑣, 𝑙) depends on (𝑣′, 𝑙′) if the conditional effect 𝑒𝑓𝑓(𝑣, 𝑙) mentions (𝑣′, 𝑙′).

Context

(𝑑𝑖𝑠𝑝𝑜𝑠𝑒𝑑 𝑜1) depends on (ℎ𝑜𝑙𝑑𝑖𝑛𝑔 𝑜1)

Figure 4. PDDL definition of the 𝐷𝐼𝑆𝑃𝑂𝑆𝐸 domain.

Context

• The context of a subgoal, 𝑐𝑡𝑥(𝜑), is the set of all predicates that appear in the dependency network of the subgoal.

Figure 5. Contexts of 𝐷𝐼𝑆𝑃𝑂𝑆𝐸 domain with two objects.
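
A context can be computed as a transitive closure over the "depends on" relation. The sketch below assumes that the dependencies are already available as a dictionary. Only the first entry comes from the slides; the second entry and the predicate name `obj-at o1 p1` are hypothetical and added just to show a chain of length two.

```python
def context(subgoal, depends_on):
    """Collect every predicate reachable from the subgoal's predicates."""
    ctx, stack = set(), list(subgoal)
    while stack:
        pred = stack.pop()
        if pred not in ctx:
            ctx.add(pred)
            stack.extend(depends_on.get(pred, ()))
    return ctx

depends_on = {
    "disposed o1": ["holding o1"],   # stated on the slide
    "holding o1": ["obj-at o1 p1"],  # hypothetical, for illustration only
}

ctx = context({"disposed o1"}, depends_on)
print(sorted(ctx))
```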

Tag

• We use 𝑐 to denote a context, and 𝑠 to denote a state.
• The tag of a state 𝑠 for 𝑐 is 𝑡𝑎𝑔_𝑐(𝑠) = 𝑠 ∩ 𝑐.
• Example: a state is {𝐴, 𝐵, 𝐶, 𝐷} and a context is {𝐴, 𝐸}. The tag is the intersection of the two sets, {𝐴}.
• Each tag 𝑡 can be associated with a set of plans Π(𝑡), so that for every belief state 𝐵, the following formula holds:

• Therefore we want as many tags in 𝐵 as possible.
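
The tag definition maps directly onto set intersection. A minimal sketch, assuming states and contexts are represented as Python sets of predicate names (the function names are illustrative):

```python
def tag(state, context):
    """Tag of a state for a context: the part of the state inside the context."""
    return frozenset(state) & frozenset(context)

def tags_of_belief(belief, contexts):
    """All (context name, tag) pairs produced by the states of a belief state."""
    return {(name, tag(s, c)) for s in belief for name, c in contexts.items()}

example = tag({"A", "B", "C", "D"}, {"A", "E"})
print(set(example))
```

Two states that differ only outside a context produce the same tag for it, so the second one adds no new tag for that context.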

Using Tags to Compute a Good Counter-example

Figure 6. An example of computing an optimal counter-example.

Suppose there are three contexts: 𝑐_1 = {𝐴, 𝐵, 𝐶}, 𝑐_2 = {𝐷, 𝐸}, 𝑐_3 = {𝐹, 𝐺}.
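
With these three contexts, the idea of preferring the counter-example that brings the most new tags can be sketched as follows. This is not the SUPERB procedure itself: it simply scores a hand-written list of candidate states, and the sample and candidates are hypothetical.

```python
CONTEXTS = {
    "c1": {"A", "B", "C"},
    "c2": {"D", "E"},
    "c3": {"F", "G"},
}

def tags(state):
    """One (context name, tag) pair per context."""
    return {(name, frozenset(state & ctx)) for name, ctx in CONTEXTS.items()}

def new_tag_count(state, seen):
    """How many tags of this state are not in the sample yet."""
    return len(tags(state) - seen)

def best_counter_example(candidates, sample):
    """Pick the candidate that contributes the most new tags."""
    seen = set().union(*(tags(s) for s in sample))
    return max(candidates, key=lambda s: new_tag_count(s, seen))

sample = [{"A", "D", "F"}]                       # hypothetical current sample
candidates = [{"A", "D", "G"}, {"B", "E", "G"}]  # hypothetical counter-examples

best = best_counter_example(candidates, sample)
print(sorted(best))
```

The first candidate only changes the tag for 𝑐_3, while the second changes the tag for all three contexts, so the second one is preferred.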

Algorithm (SUPERB+CPCES)

Figure 7. The algorithm of SUPERB.

Experiment Results

• Definition of horizontal and vertical instances
• The number of iterations
• Plan length
• The actual elapsed time
• Counter-example improving time

Definition of Horizontal and Vertical Instance

• An instance is vertical (non-horizontal) if every context contains only one variable whose initial value is unknown.
• An instance is horizontal if it features a context that spans all state variables whose initial value is uncertain.
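
The two definitions can be written as simple set tests. A minimal sketch, assuming contexts and the set of initially uncertain variables are given as Python sets; "only one" is read here as "exactly one", which is an interpretation, and the variable names are hypothetical.

```python
def is_vertical(contexts, uncertain):
    """Every context holds exactly one initially uncertain variable (assumed reading)."""
    return all(len(ctx & uncertain) == 1 for ctx in contexts)

def is_horizontal(contexts, uncertain):
    """Some context spans all initially uncertain variables."""
    return any(uncertain <= ctx for ctx in contexts)

uncertain = {"x", "y"}                          # hypothetical uncertain variables
vertical_contexts = [{"x", "g1"}, {"y", "g2"}]  # one uncertain variable each
horizontal_contexts = [{"x", "y", "g"}]         # one context spans both

print(is_vertical(vertical_contexts, uncertain), is_horizontal(vertical_contexts, uncertain))
print(is_vertical(horizontal_contexts, uncertain), is_horizontal(horizontal_contexts, uncertain))
```

With a single uncertain variable both tests can succeed at once, so the distinction only matters when two or more variables are uncertain.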

Definition of Horizontal and Vertical Instance

Vertical instance Horizontal instance

Figure 8. An example of horizontal and vertical instances.

The Importance of Horizontal and Vertical Instance

• If the problem is horizontal, there is no difference between SUPERB and CPCES.

Figure 9. Example of horizontal instance

The Number of Iterations

Figure 10. The number of iterations for horizontal, vertical and UTS instances.

Plan Length

Figure 11. The plan length for horizontal, vertical and UTS instances.

The Actual Elapsed Time

Figure 12. The actual elapsed time for horizontal, vertical and UTS instances.

Counter-example Improving Time

Figure 13. Comparing counter-example improving time and actual elapsed time.

Conclusion

• Improve CPCES by finding good counter-examples.
• A good counter-example contains as many new tags as possible.
• This new algorithm is called SUPERB.

Conclusion

• SUPERB is more efficient than CPCES on vertical instances in terms of the number of iterations.
• SUPERB has almost no effect on plan length.
• SUPERB decreases the actual elapsed time for vertical instances, while the time for horizontal instances is almost the same as with CPCES.
• Counter-example improving time accounts for a small part of the total time.

Acknowledgement

References

• Grastien, A., & Scala, E. (2018, June). Sampling Strategies for Conformant Planning. In Twenty-Eighth International Conference on Automated Planning and Scheduling.

Thank you

• Each tag 𝑡 can be associated with a set of plans Π(𝑡), so that for every belief state 𝐵, the following formula holds:

• This is because if a plan 𝜋 is invalid, one of the subgoals must fail, which means that one of the tags in the context of this subgoal does not satisfy the plan. In other words, if a plan 𝜋 is valid, all the tags satisfy it.
