Learning control knowledge and case-based planning
Jim Blythe, with additional slides from presentations by
Manuela Veloso
USC INFORMATION SCIENCES INSTITUTE
Motivation
Planning is hard: PSPACE-hard.
But this is a worst-case result
In many domains there may exist efficient strategies for planning
May be able to derive them automatically from experience
Controlling search
Every planning algorithm does search
At a choice point, if the planner makes an incorrect choice, it must backtrack and try other choices
If we can make the right choice the first time…
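The effect of making the right choice first can be illustrated with a toy depth-first search. This is only a sketch under stated assumptions: the binary choice points, the `GOAL` tuple, and the `prefer_goal_value` rule are hypothetical and are not Prodigy's actual search or control-rule language.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

State = Tuple[int, ...]
OrderFn = Callable[[State, Sequence[int]], Sequence[int]]

GOAL: State = (1, 1, 1, 1)  # hypothetical goal: one binary choice per step


def dfs(state: State, goal: State, order: Optional[OrderFn],
        stats: Dict[str, int]) -> Optional[State]:
    """Depth-first search; every failed choice counts as one backtrack."""
    if len(state) == len(goal):
        return state if state == goal else None
    options: Sequence[int] = (0, 1)
    if order is not None:
        options = order(state, options)  # control knowledge reorders the choices
    for choice in options:
        result = dfs(state + (choice,), goal, order, stats)
        if result is not None:
            return result
        stats["backtracks"] += 1  # wrong choice: undo it and try the next one
    return None


def prefer_goal_value(state: State, options: Sequence[int]) -> Sequence[int]:
    """Hypothetical control rule: try the value the goal needs here first."""
    wanted = GOAL[len(state)]
    return sorted(options, key=lambda c: c != wanted)


blind = {"backtracks": 0}
guided = {"backtracks": 0}
plan_blind = dfs((), GOAL, None, blind)
plan_guided = dfs((), GOAL, prefer_goal_value, guided)
print(blind["backtracks"], guided["backtracks"])
```

Both runs find the same solution; the guided run never backtracks, while the blind run explores and abandons failed branches first.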
Prodigy
Explicit search control rules can apply to any decision point
Many different learning approaches have been implemented
Relatively old planning approach
Learning methods in Prodigy
Overview of Prodigy planning algorithm
Prodigy algorithm
Prodigy algorithm, part II
Decision points in Prodigy
Example domain: process planning
Example control rules in Prodigy
Review of explanation-based learning
Inputs:
Target concept definition
Training example
Domain theory
Operationality criterion
Output: a generalization of the training example that is sufficient to describe the target concept and satisfies the operationality criterion
MV
The safe-to-stack example
Input: Target concept: safe-to-stack(x,y)
Training example:
on(obj1, obj2)
isa(obj1, box) isa(obj2, endtable)
color(obj1, red) color(obj2, blue)
volume(obj1, 1) density(obj1, 0.1), …
MV
The safe-to-stack example, cont.
Input: Domain theory:
not(fragile(y)) or lighter(x, y) => safe-to-stack(x, y)
volume(x, v) and density(x, d) => weight(x, v*d)
weight(x1, w1) and weight(x2, w2) and less(w1, w2) => lighter(x1, x2)
isa(x, endtable) => weight(x, 5)
less(0.1, 5), …
Operationality criterion: the learned description should use terms that describe objects directly, or that are ‘easy’ to evaluate, e.g. ‘less’
MV
The safe-to-stack example
Explain why obj1 is safe-to-stack on obj2
Construct a proof
Do goal regression: regress the target concept through the proof structure
The proof isolates the relevant features
MV
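The explanation step can be traced on the training example. The sketch below hard-codes the `lighter` branch of the domain theory and records which facts the proof touches. The dictionary encoding and the function name `explain_safe_to_stack` are assumptions made for illustration, not part of any EBL system described in the slides.

```python
from typing import List, Tuple

# Training example from the slides, encoded as (predicate, object) -> value.
FACTS = {
    ("on", "obj1"): "obj2",
    ("isa", "obj1"): "box",
    ("isa", "obj2"): "endtable",
    ("color", "obj1"): "red",
    ("color", "obj2"): "blue",
    ("volume", "obj1"): 1.0,
    ("density", "obj1"): 0.1,
}


def explain_safe_to_stack(x: str, y: str) -> Tuple[bool, List[Tuple[str, str]]]:
    """Prove safe-to-stack(x, y) via lighter(x, y); return success and facts used."""
    used = [("volume", x), ("density", x), ("isa", y)]
    # volume(x, v) and density(x, d) => weight(x, v*d)
    weight_x = FACTS[("volume", x)] * FACTS[("density", x)]
    # isa(y, endtable) => weight(y, 5)
    if FACTS[("isa", y)] != "endtable":
        return False, []
    weight_y = 5.0
    # weight(x, w1) and weight(y, w2) and less(w1, w2) => lighter(x, y)
    # lighter(x, y) => safe-to-stack(x, y)
    return weight_x < weight_y, used


proved, used_facts = explain_safe_to_stack("obj1", "obj2")
irrelevant = sorted(set(FACTS) - set(used_facts))
print(proved, used_facts, irrelevant)
```

The facts left in `irrelevant` (the colors, the `on` relation, and the type of obj1) never appear in the proof, which is the sense in which the proof isolates the relevant features.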
Generating operational knowledge
Generalize the proof
Sometimes, simply replace constants by variables
Prove that all identified relevant features are necessary in general
Output:
volume(x,v1) and density(x,d1) and isa(y, endtable)
and less(v1*d1, 5)
=> safe-to-stack(x,y)
MV
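Read as code, the learned rule is a single operational test that needs no further inference. The function below is a minimal sketch; its name and argument layout are assumptions, and the second and third calls use made-up objects purely to exercise the rule.

```python
def learned_safe_to_stack(volume_x: float, density_x: float, isa_y: str) -> bool:
    """volume(x,v1) and density(x,d1) and isa(y,endtable) and less(v1*d1, 5)."""
    return isa_y == "endtable" and volume_x * density_x < 5


# The original training example: obj1 (volume 1, density 0.1) on an endtable.
on_training_example = learned_safe_to_stack(1, 0.1, "endtable")

# Hypothetical unseen objects, not from the slides.
on_light_object = learned_safe_to_stack(10, 0.2, "endtable")
on_heavy_object = learned_safe_to_stack(2, 3, "endtable")
print(on_training_example, on_light_object, on_heavy_object)
```

The rule is a sufficient condition only: a `False` result means the rule does not fire, not that stacking is unsafe, since the domain theory also allows safe-to-stack when y is not fragile.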
Using EBL to improve plan quality
Given: planning domain, evaluation function, planner’s plan, a better plan
Learn: control knowledge to produce the better plan
Explanation used: explain why the alternative plan is better
Target concept: control rules that make choices based on the planner state and meta-state
EBL in Prodigy
Used by Minton (88) to improve efficiency of planning
Version used in Quality (95) to improve quality of solution
Architecture of Quality system
Explaining better plans recursively
Explaining better plans recursively: target concept: shared subgoal
Example from process planning
Learned rules
Discussion
EBL is always correct, but Quality isn’t: it only learns why plan B is better than plan A
No guarantee of optimality
Linear additive evaluation function: how well does this model the metrics we care about?
Generality of control rules
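A linear additive evaluation function, as raised in the discussion above, can be sketched as a sum of per-operator costs. The operator names and cost values here are hypothetical; the slides do not give the actual functions used by Quality.

```python
from typing import Dict, List

# Hypothetical per-operator costs; the slides do not state real values.
COSTS: Dict[str, float] = {"setup": 5.0, "drill": 2.0, "polish": 3.0}


def plan_cost(plan: List[str], costs: Dict[str, float]) -> float:
    """Linear additive evaluation: total cost is the sum of operator costs."""
    return sum(costs[op] for op in plan)


plan_a = ["setup", "drill", "setup", "polish"]  # repeats the setup step
plan_b = ["setup", "drill", "polish"]           # shares one setup across steps

cost_a = plan_cost(plan_a, COSTS)
cost_b = plan_cost(plan_b, COSTS)
print(cost_a, cost_b)
```

Under this model plan B is better simply because it has one fewer costly operator. Anything the sum cannot express, such as interactions between operators, is invisible to the evaluation, which is the concern the discussion point raises.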