Learning control knowledge and case-based planning

Jim Blythe, with additional slides from presentations by Manuela Veloso

USC INFORMATION SCIENCES INSTITUTE

Motivation

Planning is hard: PSPACE-hard.

But this is a worst-case result

In many domains there may exist efficient strategies for planning

May be able to derive them automatically from experience


Controlling search

Every planning algorithm does search

At a choice point, if the planner makes an incorrect choice, it needs to backtrack and try other choices

If we can make the right choice the first time…
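The effect of choice ordering on backtracking can be illustrated with a minimal sketch. It is not Prodigy's algorithm: the toy problem, the function names, and the hand-written preference rule are all assumptions made for illustration. A depth-first search resolves one choice per choice point and counts how often it has to undo a choice.

```python
from typing import Callable, Dict, List, Optional, Sequence

# An ordering function decides in which order the options at a choice point are tried.
Ordering = Callable[[int, Sequence[str]], List[str]]


def search(options: Sequence[Sequence[str]],
           is_dead_end: Callable[[List[str]], bool],
           order: Ordering,
           stats: Dict[str, int],
           partial: Optional[List[str]] = None) -> Optional[List[str]]:
    """Depth-first search with backtracking: one choice per choice point."""
    partial = [] if partial is None else partial
    if len(partial) == len(options):
        return list(partial)  # every choice point resolved
    depth = len(partial)
    for choice in order(depth, options[depth]):
        partial.append(choice)
        if not is_dead_end(partial):
            result = search(options, is_dead_end, order, stats, partial)
            if result is not None:
                return result
        partial.pop()              # undo the choice
        stats["backtracks"] += 1   # and record the backtrack
    return None


# Hypothetical toy problem: three choice points, only option "c" ever works.
OPTIONS = [["a", "b", "c"]] * 3


def dead_end(partial: List[str]) -> bool:
    return partial[-1] != "c"


def default_order(depth: int, opts: Sequence[str]) -> List[str]:
    return list(opts)  # try options in the order they are listed


def preferring_c(depth: int, opts: Sequence[str]) -> List[str]:
    # Hand-written stand-in for a control rule: try "c" first.
    return sorted(opts, key=lambda o: o != "c")


blind = {"backtracks": 0}
guided = {"backtracks": 0}
plan_blind = search(OPTIONS, dead_end, default_order, blind)
plan_guided = search(OPTIONS, dead_end, preferring_c, guided)
print(blind["backtracks"], guided["backtracks"])
```

Both runs find the same solution; the only difference is how many wrong choices had to be undone along the way, which is the cost that search control knowledge aims to remove.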


Prodigy

Explicit search control rules can apply to any decision point

Many different learning approaches have been implemented

Relatively old planning approach


Learning methods in Prodigy


Overview of Prodigy planning algorithm


Prodigy algorithm


Prodigy algorithm, part II


Decision points in Prodigy


Example domain: process planning


Example control rules in Prodigy


Review of explanation-based learning

Inputs: target concept definition, training example, domain theory, operationality criterion

Output: generalization of the training example that is sufficient to describe the target concept and satisfies the operationality criterion

MV


The safe-to-stack example

Input: Target concept: safe-to-stack(x,y)

Training example:

on(obj1, obj2)

isa(obj1, box) isa(obj2, endtable)

color(obj1, red) color(obj2, blue)

volume(obj1, 1) density(obj1, 0.1), …

MV


The safe-to-stack example, cont.

Input: Domain theory:

not(fragile(y)) or lighter(x, y) => safe-to-stack(x, y)

volume(x, v) and density(x, d) => weight(x, v*d)

weight(x1, w1) and weight(x2, w2) and less(w1, w2) => lighter(x1, x2)

isa(x, endtable) => weight(x, 5)

less(0.1, 5), …

Operationality criterion: learned description should use terms that describe objects directly, or are ‘easy’ to evaluate, e.g. ‘less’

MV
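As a rough illustration of how the domain theory applies to the training example, the rules can be encoded as plain Python functions that record which facts each inference touches. This is a sketch under assumptions, not a general theorem prover: the fact representation and function names are invented here, and only the `lighter` branch of the safe-to-stack rule is followed, since the training example says nothing about fragility.

```python
from typing import Optional, Set, Tuple

Fact = Tuple[str, str, object]

# Training example from the slides, as (predicate, object, value) triples.
FACTS: Set[Fact] = {
    ("on", "obj1", "obj2"),
    ("isa", "obj1", "box"), ("isa", "obj2", "endtable"),
    ("color", "obj1", "red"), ("color", "obj2", "blue"),
    ("volume", "obj1", 1), ("density", "obj1", 0.1),
}


def lookup(pred: str, obj: str, facts: Set[Fact]):
    """Return v for a fact pred(obj, v), or None if no such fact exists."""
    for f in facts:
        if f[0] == pred and f[1] == obj:
            return f[2]
    return None


def weight(obj: str, facts: Set[Fact], used: Set[Fact]) -> Optional[float]:
    # volume(x, v) and density(x, d) => weight(x, v*d)
    v, d = lookup("volume", obj, facts), lookup("density", obj, facts)
    if v is not None and d is not None:
        used.update({("volume", obj, v), ("density", obj, d)})
        return v * d
    # isa(x, endtable) => weight(x, 5)
    if ("isa", obj, "endtable") in facts:
        used.add(("isa", obj, "endtable"))
        return 5
    return None


def lighter(x: str, y: str, facts: Set[Fact], used: Set[Fact]) -> bool:
    # weight(x1, w1) and weight(x2, w2) and less(w1, w2) => lighter(x1, x2)
    wx, wy = weight(x, facts, used), weight(y, facts, used)
    return wx is not None and wy is not None and wx < wy


def explain_safe_to_stack(x: str, y: str, facts: Set[Fact]) -> Optional[Set[Fact]]:
    """Return the facts used to prove safe-to-stack(x, y) via lighter, else None."""
    used: Set[Fact] = set()
    return used if lighter(x, y, facts, used) else None


relevant = explain_safe_to_stack("obj1", "obj2", FACTS)
print(sorted(relevant, key=str))
```

The returned set contains only the volume and density of obj1 and the type of obj2. The colors and the `on` fact never enter the proof, so they play no part in the explanation.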


The safe-to-stack example

Explain why obj1 is safe-to-stack on obj2

Construct a proof

Do goal regression: regress target concept through the proof structure

Proof isolates relevant features

MV


Generating operational knowledge

Generalize proof

Sometimes, simply replace constants by variables

Prove that all identified relevant features are necessary in general

Output:

volume(x, v1) and density(x, d1) and isa(y, endtable) and less(v1*d1, 5) => safe-to-stack(x, y)

MV
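The learned rule can be read as a directly evaluable test that needs no further proof construction. The sketch below is one possible encoding, assuming objects are represented as attribute dictionaries; the `vase` and `crate` objects are hypothetical and do not come from the slides.

```python
from typing import Any, Dict


def safe_to_stack_learned(x: Dict[str, Any], y: Dict[str, Any]) -> bool:
    """volume(x,v1) and density(x,d1) and isa(y, endtable) and less(v1*d1, 5)."""
    if "volume" not in x or "density" not in x:
        return False  # the rule does not fire; this says nothing about safety
    return y.get("isa") == "endtable" and x["volume"] * x["density"] < 5


# Training example from the slides.
obj1 = {"isa": "box", "color": "red", "volume": 1, "density": 0.1}
obj2 = {"isa": "endtable", "color": "blue"}

# Hypothetical unseen objects.
vase = {"isa": "vase", "color": "green", "volume": 2, "density": 0.5}
crate = {"isa": "box", "color": "red", "volume": 10, "density": 2}

print(safe_to_stack_learned(obj1, obj2))   # training example
print(safe_to_stack_learned(vase, obj2))   # different type and color, weight 1.0
print(safe_to_stack_learned(crate, obj2))  # weight 20, rule does not fire
```

The rule covers the vase even though it shares neither type nor color with obj1, because only the features isolated by the proof appear in the rule. When the rule returns False, it means only that this sufficient condition does not hold.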


Using EBL to improve plan quality

Given: planning domain, evaluation function, planner’s plan, a better plan

Learn: control knowledge to produce the better plan

Explanation used: explain why the alternative plan is better

Target concept: control rules that make choices based on the planner state and meta-state


EBL in Prodigy

Used by Minton (88) to improve efficiency of planning

Version used in Quality (95) to improve quality of solution


Architecture of Quality system


Explaining better plans recursively


Explaining better plans recursively: target concept: shared subgoal


Example from process planning


Learned rules


Discussion

EBL is always correct, but Quality isn’t: it only learns why plan B is better than plan A

No guarantee of optimality

Linear additive evaluation function: how well does this model the metrics we care about?

Generality of control rules
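A linear additive evaluation function, as mentioned above, scores a plan by summing a fixed cost per operator. The sketch below shows the idea only. The operator names and costs are hypothetical and are not taken from the Quality system.

```python
from typing import Dict, Sequence


def plan_cost(plan: Sequence[str], op_cost: Dict[str, float]) -> float:
    """Linear additive evaluation: the sum of independent per-operator costs."""
    return sum(op_cost[op] for op in plan)


# Hypothetical operator costs.
OP_COST = {"set-up": 4.0, "drill": 2.0, "polish": 1.0}

plan_a = ["set-up", "drill", "set-up", "polish"]  # sets the part up twice
plan_b = ["set-up", "drill", "polish"]            # reuses a single set-up

print(plan_cost(plan_a, OP_COST), plan_cost(plan_b, OP_COST))
```

Because each operator contributes independently of its context, a function of this form cannot express interactions between steps, which is one reason to ask how well it models the metrics of interest.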
