Reporter: Xiaodi Zhang
Supervisor: Alban Grastien
Improving Sample Strategies for Conformant Planning
Content
• Background
• Method
• Experiment Results
• Conclusion
Background
• Conformant planning
• CPCES
Conformant Planning
Actions: go_North, go_East, go_West, go_South
Goal: go to “$g$”
Figure 1. An example of conformant planning (Grastien and Scala, 2018).
Go there first. Then go to “$g$”.
Plan: go_North*4; go_East*4; go_West*2; go_South*2
Conformant Planning
• Definition: $P = \langle V, A, S_I, S_G \rangle$
• $V$: a set of state variables, $v \in V$
• $A$: a set of actions, $a \in A$
  $a = \langle pre, eff \rangle$, where $pre$ is the precondition and $eff$ is the conditional effect.
• $S_I$: a set of initial states (belief states)
• $S_G$: a set of goal states
• The solution is a plan (a list of actions): $\pi = a_1, \ldots, a_n$
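The definition above can be illustrated with a minimal Python sketch. It assumes a hypothetical one-dimensional corridor with a single state variable `x` (this is not the grid from Figure 1), and the names `Action`, `execute` and `is_valid` are introduced here for illustration only. It shows the defining property of a conformant plan: it must reach the goal from every initial state.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

State = Dict[str, int]  # one value per state variable in V


@dataclass(frozen=True)
class Action:
    name: str
    pre: Callable[[State], bool]   # precondition
    eff: Callable[[State], State]  # conditional effect: the outcome depends on the state


def execute(plan: List[Action], state: State) -> Optional[State]:
    """Run the plan from one state; return None if a precondition fails."""
    for action in plan:
        if not action.pre(state):
            return None
        state = action.eff(state)
    return state


def is_valid(plan: List[Action], initial_states: List[State],
             goal: Callable[[State], bool]) -> bool:
    """A conformant plan must reach the goal from every initial state."""
    for s in initial_states:
        final = execute(plan, s)
        if final is None or not goal(final):
            return False
    return True


# Hypothetical corridor with cells 0..3; walls stop the robot at both ends.
WIDTH = 4
go_east = Action("go_East", lambda s: True,
                 lambda s: {"x": min(s["x"] + 1, WIDTH - 1)})
go_west = Action("go_West", lambda s: True,
                 lambda s: {"x": max(s["x"] - 1, 0)})

initial_states = [{"x": i} for i in range(WIDTH)]  # the start cell is unknown
goal = lambda s: s["x"] == 1

short_plan = [go_east]                 # only works when the robot starts in cell 0
safe_plan = [go_west] * 3 + [go_east]  # first hit the west wall, then step east

print(is_valid(short_plan, initial_states, goal))
print(is_valid(safe_plan, initial_states, goal))
```

As in Figure 1, the valid plan first moves the robot to a known position (here the west wall) and only then heads for the goal.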
CPCES
• Created by Grastien and Scala in 2017.
• Uses counter-examples (belief states) to avoid invalid plans.
• Example:
  Empty belief state → Find a candidate plan $\pi_1$
  Find a counter-example $CE_1$ that breaks $\pi_1$ → Find a candidate plan $\pi_2$
  Find a counter-example $CE_2$ that breaks $\pi_2$ → Find a candidate plan $\pi_3$
  No counter-example can be found → Plan $\pi_3$ is a valid plan
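The loop in this example can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the CPCES implementation: the corridor domain is hypothetical, and both `find_plan` and `find_counter_example` are brute-force stand-ins written for this sketch in place of the solvers a real system would call.

```python
import itertools
import random

# Hypothetical corridor with cells 0..3; walls stop the robot at both ends.
WIDTH = 4
ACTIONS = {
    "go_East": lambda x: min(x + 1, WIDTH - 1),
    "go_West": lambda x: max(x - 1, 0),
}
INITIAL_STATES = list(range(WIDTH))  # the start cell is unknown
GOAL = 1


def run(plan, x):
    """Apply the plan from start cell x and return the final cell."""
    for name in plan:
        x = ACTIONS[name](x)
    return x


def find_plan(sample, max_len=6):
    """Shortest plan that reaches the goal from every state in the sample."""
    for length in range(max_len + 1):
        for plan in itertools.product(ACTIONS, repeat=length):
            if all(run(plan, x) == GOAL for x in sample):
                return list(plan)
    return None


def find_counter_example(plan, rng):
    """A randomly chosen initial state from which the plan misses the goal."""
    failing = [x for x in INITIAL_STATES if run(plan, x) != GOAL]
    return rng.choice(failing) if failing else None


def cpces(rng):
    sample = []  # starts empty and grows by one counter-example per iteration
    iterations = 0
    while True:
        iterations += 1
        plan = find_plan(sample)
        if plan is None:
            return None, iterations
        ce = find_counter_example(plan, rng)
        if ce is None:
            return plan, iterations  # no counter-example: the plan is valid
        sample.append(ce)


plan, iterations = cpces(random.Random(0))
print(plan, iterations)
```

Each counter-example is a state the current candidate fails on, so it is always new to the sample, and the loop ends after at most one iteration per initial state plus a final check.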
CPCES Algorithm
The counter-example is generated randomly.
Figure 2. The algorithm of CPCES (Grastien and Scala, 2018).
Method
• Good Counter-example
• Context
• Tag
• Algorithm
Good Counter-example
• Rather than generating counter-examples randomly, always look for good counter-examples.
• For example: suppose there are 3 variables $L_1$, $L_2$, $L_3$ and 4 values $A$, $B$, $C$, $D$.
Bad counter-examples (10 bad counter-examples)
Good counter-examples (4 good counter-examples)
Figure 3. An example of bad and good counter-examples.
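One way to arrive at the counts 10 and 4 is sketched below. It rests on an assumption that the slide does not state: that the aim is to see every (variable, value) pair at least once, that "bad" counter-examples change one variable at a time from a base assignment, and that "good" ones change all variables at once. The names `one_at_a_time` and `all_at_once` are introduced here for illustration.

```python
VARIABLES = ["L1", "L2", "L3"]
VALUES = ["A", "B", "C", "D"]

ALL_PAIRS = {(var, val) for var in VARIABLES for val in VALUES}


def covered_pairs(samples):
    """All (variable, value) pairs that appear in at least one sample."""
    return {(var, val) for s in samples for var, val in s.items()}


# Assumed "bad" strategy: start from (A, A, A) and change one variable at a time.
base = {var: "A" for var in VARIABLES}
one_at_a_time = [base]
for var in VARIABLES:
    for val in VALUES[1:]:
        one_at_a_time.append({**base, var: val})

# Assumed "good" strategy: every counter-example shows a new value for all variables.
all_at_once = [{var: val for var in VARIABLES} for val in VALUES]

print(len(one_at_a_time), covered_pairs(one_at_a_time) == ALL_PAIRS)
print(len(all_at_once), covered_pairs(all_at_once) == ALL_PAIRS)
```

Under this reading both sets cover the same 12 pairs, but the second needs far fewer counter-examples, and therefore fewer iterations of the planning loop.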
Context
• We define a subgoal $\varphi$ as a conjunct of the preconditions of an action or of the goal.
• We use $(v, l)$ to represent an assigned variable, where $v$ is the variable and $l$ is the value of $v$.
• We use $eff(v, l)$ to represent the conditional effect of an action.
• We say $(v, l)$ depends on $(v', l')$ if the conditional effect $eff(v, l)$ mentions $(v', l')$.
Context
(disposed o1) depends on (holding o1)
Figure 4. The DISPOSE domain in PDDL.
Context
• The context of a subgoal, $ctx(\varphi)$, is the set of all predicates in the dependency network of that subgoal.
Figure 5. Contexts of the DISPOSE domain with two objects.
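A context can be computed as the set of predicates reachable through the "depends on" relation. The sketch below assumes the dependency network is given as a dictionary and that the context includes the subgoal's own predicates. Only the dependency of `disposed o1` on `holding o1` comes from the slides; every other entry in `DEPENDS_ON`, including the `obj_at` predicate names, is hypothetical.

```python
from typing import Dict, Iterable, Set


def context(subgoal: Iterable[str], depends_on: Dict[str, Set[str]]) -> Set[str]:
    """All predicates reachable from the subgoal via the 'depends on' relation."""
    seen: Set[str] = set()
    stack = list(subgoal)
    while stack:
        predicate = stack.pop()
        if predicate in seen:
            continue
        seen.add(predicate)
        stack.extend(depends_on.get(predicate, ()))
    return seen


DEPENDS_ON = {
    "disposed o1": {"holding o1"},                   # stated on the slide
    "holding o1": {"obj_at o1 p1", "obj_at o1 p2"},  # hypothetical
    "disposed o2": {"holding o2"},                   # hypothetical
    "holding o2": {"obj_at o2 p1", "obj_at o2 p2"},  # hypothetical
}

ctx_o1 = context(["disposed o1"], DEPENDS_ON)
ctx_o2 = context(["disposed o2"], DEPENDS_ON)
print(sorted(ctx_o1))
print(sorted(ctx_o2))
```

In this hypothetical network the two objects give two disjoint contexts, so uncertainty about one object does not interact with uncertainty about the other.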
Tag
• We use $c$ to denote a context and $s$ to denote a state.
• The tag of a state $s$ for $c$ is $tag_c(s) = s \cap c$.
• Example: a state is $\{A, B, C, D\}$ and a context is $\{A, E\}$. The tag is the intersection of the two sets, $\{A\}$.
• Each tag $t$ can be associated with a set of plans $\Pi(t)$, so that for every belief state $B$, the following formula holds:
• Therefore we want as many tags in $B$ as possible.
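The tag definition is a plain set intersection, as this short Python sketch shows. The state and context are the ones from the example above; the helper `tags_of`, which collects the tags of every state in a belief state, and its input values are assumptions added for illustration.

```python
def tag(state, context):
    """Tag of a state for a context: the part of the state the context mentions."""
    return frozenset(state) & frozenset(context)


def tags_of(belief_state, contexts):
    """All (context index, tag) pairs found in a belief state (a list of states)."""
    return {(i, tag(s, c)) for s in belief_state for i, c in enumerate(contexts)}


state = {"A", "B", "C", "D"}
ctx = {"A", "E"}
print(sorted(tag(state, ctx)))

# Hypothetical belief state with two states and a single context.
belief_state = [{"A", "B"}, {"E", "B"}]
print(len(tags_of(belief_state, [ctx])))
```

Two states that differ only outside a context share the same tag for that context, which is why counting tags, rather than states, measures how much a belief state has covered.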
Use Tag to Compute a Good Counter-example
Figure 6. An example of computing an optimal counter-example.
Suppose there are three contexts: $c_1 = \{A, B, C\}$, $c_2 = \{D, E\}$, $c_3 = \{F, G\}$.
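With these three contexts, a counter-example can be scored by the number of tags it adds to the current sample. The sketch below is an illustration under assumptions: the sample and the candidate states are hypothetical, and the candidates are simply enumerated and compared, which is a simplification for this sketch and not necessarily how SUPERB searches for a better counter-example.

```python
CONTEXTS = {
    "c1": frozenset("ABC"),
    "c2": frozenset("DE"),
    "c3": frozenset("FG"),
}


def tags(state):
    """One (context name, tag) pair per context."""
    return {(name, frozenset(state) & ctx) for name, ctx in CONTEXTS.items()}


def new_tag_count(candidate, sample):
    """Number of tags of the candidate that no state in the sample has yet."""
    known = set()
    for s in sample:
        known |= tags(s)
    return len(tags(candidate) - known)


def best_counter_example(candidates, sample):
    """Candidate that contributes the most new tags."""
    return max(candidates, key=lambda c: new_tag_count(c, sample))


sample = [{"A", "D", "F"}]  # hypothetical counter-examples found so far
candidates = [              # hypothetical counter-examples for the current plan
    {"A", "D", "G"},        # new tag in c3 only
    {"B", "D", "G"},        # new tags in c1 and c3
    {"B", "E", "G"},        # new tags in all three contexts
]

best = best_counter_example(candidates, sample)
print(sorted(best), new_tag_count(best, sample))
```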
Algorithm (SUPERB+CPCES)
Figure 7. The algorithm of SUPERB.
Experiment Results
• Definition of horizontal and vertical instances
• The number of iterations
• Plan length
• The actual elapsed time
• Counter-example improving time
Definition of Horizontal and Vertical Instance
• We define an instance as vertical (non-horizontal) if every context has only one initially unknown variable.
• We define an instance as horizontal if it features a context that spans all state variables whose initial value is uncertain.
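Both definitions can be checked mechanically once the contexts and the initially unknown variables are known. The sketch below reads "only one" as exactly one, which is an assumption, and the variable names are hypothetical. An instance can satisfy neither test, so the two functions are kept separate.

```python
def is_vertical(contexts, unknown):
    """Every context contains exactly one initially unknown variable."""
    return all(len(ctx & unknown) == 1 for ctx in contexts)


def is_horizontal(contexts, unknown):
    """Some context spans all initially unknown variables."""
    return any(unknown <= ctx for ctx in contexts)


unknown = {"x1", "x2", "x3"}  # hypothetical variables with uncertain initial value

vertical_contexts = [{"x1", "g1"}, {"x2", "g2"}, {"x3", "g3"}]
horizontal_contexts = [{"x1", "x2", "x3", "g"}]
mixed_contexts = [{"x1", "x2", "g1"}, {"x3", "g2"}]

for contexts in (vertical_contexts, horizontal_contexts, mixed_contexts):
    print(is_vertical(contexts, unknown), is_horizontal(contexts, unknown))
```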
Definition of Horizontal and Vertical Instance
Left: vertical instance. Right: horizontal instance.
Figure 8. An example of a horizontal and a vertical instance.
The Importance of Horizontal and Vertical Instance
• If the problem is horizontal, there is no difference between SUPERB and CPCES.
Figure 9. An example of a horizontal instance.
The Number of Iterations
Figure 10. The number of iterations for horizontal, vertical and UTS instances.
Plan Length
Figure 11. The plan length for horizontal, vertical and UTS instances.
The Actual Elapsed Time
Figure 12. The actual elapsed time for horizontal, vertical and UTS instances.
Counter-example Improving Time
Figure 13. Comparing counter-example improving time and actual elapsed time.
Conclusion
• We improve CPCES by finding good counter-examples.
• A good counter-example contains as many new tags as possible.
• The new algorithm is called SUPERB.
Conclusion
• SUPERB is more efficient for vertical instances in terms of the number of iterations.
• SUPERB has almost no effect on plan length.
• SUPERB decreases the actual elapsed time for vertical instances, while the time for horizontal instances is almost the same as with CPCES.
• Counter-example improving time accounts for a small part of the total time.
Acknowledgement
References
• Grastien, A., & Scala, E. (2018, June). Sampling Strategies for Conformant Planning. In Twenty-Eighth International Conference on Automated Planning and Scheduling.
Thank you
• Each tag $t$ can be associated with a set of plans $\Pi(t)$, so that for every belief state $B$, the following formula holds:
• This is because if a plan $\pi$ is invalid, at least one subgoal must fail, which means that one of the tags in the context of this subgoal does not satisfy the plan. Conversely, if a plan $\pi$ is valid, every tag must satisfy it.
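A formula consistent with this argument can be written as follows. It is an assumption reconstructed from the explanation above, not a quotation. The names $\Pi(B)$ (the set of plans valid for $B$) and $tags(B)$ (the set of tags of the states in $B$, over all contexts) are introduced here for illustration.

```latex
\Pi(B) \;=\; \bigcap_{t \,\in\, tags(B)} \Pi(t)
```

Under this reading, every additional tag in $B$ can only shrink the set of candidate plans, which is why a counter-example that contributes many new tags is preferred.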