![Page 1: Siddharth Srivastava, Shlomo Zilberstein, Neil Immerman University of Massachusetts Amherst Hector Geffner Universitat Pompeu Fabra](https://reader038.vdocuments.us/reader038/viewer/2022110304/5518c703550346881f8b5891/html5/thumbnails/1.jpg)
Qualitative Numeric Planning
Siddharth Srivastava, Shlomo Zilberstein, Neil Immerman
University of Massachusetts Amherst
Hector Geffner
Universitat Pompeu Fabra
The Story So Far…
Abacus Programs [Lambek, ‘61]
• Finite sets of states & registers
• Actions with unit increments/decrements
Abacus Programs
The reachability problem for abacus programs as a method for reasoning about cyclic control flows
But reachability is equivalent to the halting problem for Turing machines …
Undecidable
Approach: identify subclasses or less expressive frameworks
… cannot capture Turing machines, but still useful [Srivastava et al., ICAPS-10]
ND Quantitative Planning Problems
Consider situations where
• Actions increase or decrease numeric variables by unpredictable amounts
• Propositional variables can be added
• Plans require cyclic control
E.g., delivery problem with unknown
▪ Fuel
▪ Distances
▪ Quantities of deliverables
Driving will use an unpredictable amount of fuel
Formulation
X: set of non-negative valued variables, O: set of actions, I: initial state, G: goal condition
States: numeric assignments to variables
Action effects:
• inc(x): increases value of variable x
• dec(x): decreases value of variable x
Actions may have multiple effects
Action preconditions & goal condition: x = 0 or x > 0, for some subset of variables
Lower bound on the amount of change is specific to the execution. Need not be known
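The formulation above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Action` class, the `apply` function, the `drain` action, and the way the amount of change is sampled are all hypothetical and not taken from the slides. The sketch assumes every increase or decrease is by at least some `eps`, and that a decrease never takes a variable below 0.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    pre_zero: frozenset = frozenset()      # variables that must be 0
    pre_positive: frozenset = frozenset()  # variables that must be > 0
    inc: frozenset = frozenset()           # variables the action increases
    dec: frozenset = frozenset()           # variables the action decreases


def applicable(action, state):
    """Check the zero / positive preconditions against a concrete state."""
    return (all(state[v] == 0 for v in action.pre_zero)
            and all(state[v] > 0 for v in action.pre_positive))


def apply(action, state, eps, rng):
    """Return one possible outcome: every change is at least eps (assumed)."""
    new = dict(state)
    for v in action.inc:
        new[v] += eps + rng.random()
    for v in action.dec:
        # A decrease never takes the variable below zero.
        new[v] = max(0.0, new[v] - (eps + rng.random()))
    return new


# Hypothetical action: decrease x whenever x > 0.
drain = Action("drain", pre_positive=frozenset({"x"}), dec=frozenset({"x"}))

rng = random.Random(0)
state, steps = {"x": 10.0}, 0
while applicable(drain, state):
    state = apply(drain, state, eps=0.5, rng=rng)
    steps += 1
print(steps, state)
```

Because each decrease is at least `eps`, the loop must stop after at most `10 / eps` iterations even though the exact amounts are unpredictable.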
Example
A1: <>
A2: <>
Initial state: x=10, y=5; Goal: x=0
No finite acyclic solution!
Solution (intuitive):
repeat (until x=0) {
    repeat (until y=0) { <> }
    <>
}
ND Quantitative Planning Problems: Solutions
Policy: States → Actions
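Policy trajectory for a policy π: a sequence of states s0, s1, … where s0 is the initial state and each next state is a possible outcome of applying the action π assigns to the current state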
Solution criterion: Every bounded policy trajectory must terminate at a goal state in finitely many steps.
But how do we express policies?
Cannot map all possible states (real-valued assignments to variables)
Expressing Solutions: Qualitative Formulation
Capture sets of ND numeric planning problems
Abstract/Qualitative states
• For each x in X, only record x = 0 or x > 0
• Initial state for previous example (x=10, y=5) abstracted to: x > 0, y > 0
• Also represents infinitely many other non-zero assignments to x and y
Qualitative states capture sets of concrete states
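A minimal sketch of this abstraction, assuming a qualitative state records only which variables are positive. The function name `abstract` and the representation as a `frozenset` of positive variables are illustrative choices, not the authors' notation.

```python
def abstract(state):
    """Return the set of variables that are positive; all others are zero."""
    return frozenset(v for v, value in state.items() if value > 0)


# The initial state of the previous example, x=10 and y=5.
example = abstract({"x": 10, "y": 5})
print(sorted(example))

# Any other assignment with both variables non-zero maps to the same
# qualitative state.
print(abstract({"x": 0.25, "y": 700}) == example)
```

Under this representation a policy over qualitative states needs at most 2^|X| entries, regardless of how many concrete states exist.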
Qualitative Formulation
X: Boolean variables; I: initial state, G: goal condition, O: action operators
State = Boolean assignment to each x in X (x = 0 or x > 0)
Action effects (non-deterministic but finite)
Preconditions & goal condition
Solutions to the Qualitative Problem
Solutions represented as policies over qualitative states
Solution criterion: policy must solve every represented quantitative problem
• Termination of all ε-bounded trajectories for all possible problem instantiations
• Goal achievement in all possible problem instantiations
All We Need Are Qualitative Policies
A quantitative policy is essentially qualitative iff:
• It maps all states represented by a qualitative state to the same action
Very useful: Cannot have explicit policy representations over quantitative states anyway
Theorem
A non-deterministic quantitative planning problem P has a solution policy iff P has a solution policy that is essentially qualitative
Solution Policy: Example
[Figure: a policy and its transition graph]
Qualitative Solution Tests
Can we tell if a policy is correct without ever having to instantiate the problems?
Define the transition graph for a policy:
• Nodes = qualitative states
• Edge (s1, s2) iff s2 is a possible result of applying the policy's action for s1 in s1
Two aspects of the solution criterion:
• Goal-closed: termination possible only at goal states
  ▪ Traverse the transition graph to check this
• Finiteness of all possible instantiated trajectories
  ▪ ??
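The transition graph and the goal-closed test can be sketched as below. Everything concrete here is an assumption made for illustration: the two actions `a` and `b`, the policy, and the goal x = 0 are hypothetical, and qualitative states are represented as the set of positive variables. The sketch also assumes that increasing a variable makes it positive and that decreasing a positive variable may leave it positive or make it zero.

```python
from collections import deque

# Hypothetical actions:
# name -> (must be positive, must be zero, increased, decreased)
ACTIONS = {
    "a": ({"x"}, {"y"}, {"y"}, {"x"}),
    "b": ({"y"}, set(), set(), {"y"}),
}


def successors(state, action):
    """All qualitative outcomes of applying `action` in `state`."""
    pos, zero, inc, dec = ACTIONS[action]
    assert pos <= state and not (zero & state), "precondition violated"
    results = {state | inc}           # increased variables become positive
    for v in dec:                     # a decreased variable may reach zero
        results |= {r - {v} for r in results}
    return results


def transition_graph(initial, policy):
    """Nodes and labeled edges reachable from `initial` under `policy`."""
    edges, seen, queue = set(), {initial}, deque([initial])
    while queue:
        s = queue.popleft()
        action = policy.get(s)
        if action is None:            # terminal node: policy is undefined
            continue
        for t in successors(s, action):
            edges.add((s, action, t))
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return seen, edges


def goal_closed(nodes, policy, is_goal):
    """True iff every node where the policy stops is a goal node."""
    return all(is_goal(s) for s in nodes if s not in policy)


XY, X, Y = frozenset({"x", "y"}), frozenset({"x"}), frozenset({"y"})
policy = {XY: "b", X: "a"}
nodes, edges = transition_graph(XY, policy)
closed = goal_closed(nodes, policy, lambda s: "x" not in s)
print(len(nodes), len(edges), closed)
```

This check says nothing about whether the cycles in the graph can run forever; that is the second aspect of the solution criterion.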
Sieve Algorithm for Determining Termination
For every SCC:
• Identify edges that cannot be executed infinitely often
• Remove them, signifying the stage when their executions have been exhausted
• Recurse on each resulting SCC
• Finally: terminating iff no SCC left at the fixed point
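The steps above can be sketched in Python as follows. The slide does not spell out which edges "cannot be executed infinitely often"; this sketch assumes an edge qualifies when its action decreases some variable that no edge in the same SCC increases. The function names, the edge format `(source, action, target)`, and the example graphs are hypothetical.

```python
def sccs(nodes, edges):
    """Strongly connected components (Kosaraju's algorithm)."""
    succ = {n: [] for n in nodes}
    pred = {n: [] for n in nodes}
    for s, _, t in edges:
        succ[s].append(t)
        pred[t].append(s)
    order, seen = [], set()

    def visit(n):                     # first pass: record finishing order
        seen.add(n)
        for m in succ[n]:
            if m not in seen:
                visit(m)
        order.append(n)

    for n in nodes:
        if n not in seen:
            visit(n)
    comps, assigned = [], set()

    def collect(n, comp):             # second pass: follow reversed edges
        assigned.add(n)
        comp.add(n)
        for m in pred[n]:
            if m not in assigned:
                collect(m, comp)

    for n in reversed(order):
        if n not in assigned:
            comp = set()
            collect(n, comp)
            comps.append(comp)
    return comps


def inner_edges(comp, edges):
    return {e for e in edges if e[0] in comp and e[2] in comp}


def sieve_terminates(nodes, edges, effects):
    """effects[action] = (increased variables, decreased variables)."""
    edges = set(edges)
    while True:
        removed = False
        for comp in sccs(nodes, edges):
            inner = inner_edges(comp, edges)
            increased = set().union(*(effects[a][0] for _, a, _ in inner))
            # Assumed rule: drop edges that decrease a variable which
            # nothing in this SCC increases.
            doomed = {e for e in inner if effects[e[1]][1] - increased}
            if doomed:
                edges -= doomed
                removed = True
        if not removed:
            # Fixed point: terminating iff no SCC still contains an edge.
            return not any(inner_edges(c, edges) for c in sccs(nodes, edges))


effects = {"a": ({"y"}, {"x"}), "b": (set(), {"y"}), "c": ({"x"}, {"y"})}
nested = {("xy", "b", "xy"), ("xy", "b", "x"), ("x", "a", "xy"), ("x", "a", "y")}
print(sieve_terminates({"xy", "x", "y"}, nested, effects))

ping_pong = {("p", "a", "q"), ("q", "c", "p")}
print(sieve_terminates({"p", "q"}, ping_pong, effects))
```

In the first graph the `a` edge is removed first because nothing increases x, which breaks the outer cycle; the `b` self-loop is then removed because nothing left in its SCC increases y. In the second graph each action refills the variable the other one drains, so no edge can be removed and the cycle remains.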
Sieve Algorithm: Properties
Completeness: if the sieve algorithm returns non-terminating, an infinite execution is possible
• Surprising because of similarity to abacus programs
Theorem
The sieve algorithm for determining termination of a qualitative policy is sound and complete.
A Generate and Test Planner
Enumerate all possible policies (yes, this is impractical in general!)
But computable! Check for
1. Goal-closed (any terminal nodes in transition graph must be goal nodes)
2. Termination using sieve algorithm
Results
Problems
• Nested variables
• Snow plow: using snow blower spills snow onto the driveway
• Delivery with fuel, unknown number of objects and truck capacities
• Trash-collection
[Chart: Solution Time (s)]
Future Work
Improve generate and test:
• Start with strong cyclic qualitative policies
Introduce constant landmarks/intervals of values
Identify limits of sieve algorithm’s applicability in abacus programs
Conclusions
QNP gives the first framework for planning with loops where termination and correctness are decidable properties
• For any class of loops
• Any number of unbounded variables