

Learning control knowledge and case-based planning

Jim Blythe, with additional slides from presentations by Manuela Veloso


Motivation

Planning is hard: PSPACE-hard in the general case.

But this is a worst-case result.

In many domains there may exist efficient strategies for planning

We may be able to derive them automatically from experience


Controlling search

Every planning algorithm does search

At each choice point, if the planner makes an incorrect choice, it must backtrack and try the other options

If we can make the right choice the first time…
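
To make the decision-point structure concrete, here is a minimal sketch (not Prodigy's actual algorithm) of backtracking search in Python. The helpers `choices`, `apply_choice`, and `heuristic` are hypothetical; the heuristic stands in for learned control knowledge that orders the options at each decision point.

def search(state, is_goal, choices, apply_choice, heuristic=None):
    """Depth-first search; returns a list of choices leading to a goal, or None."""
    if is_goal(state):
        return []
    options = choices(state)
    if heuristic is not None:
        # Good control knowledge puts the right choice first, so the
        # backtracking loop below rarely (ideally never) tries a second option.
        options = heuristic(state, options)
    for option in options:
        result = search(apply_choice(state, option), is_goal,
                        choices, apply_choice, heuristic)
        if result is not None:          # success: no need to backtrack
            return [option] + result
    return None                          # every option failed: backtrack

With perfect control knowledge the first option tried always succeeds, which is exactly the "right choice the first time" case above.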


Prodigy

Explicit search control rules can apply to any decision point

Many different learning approaches have been implemented

Relatively old planning approach


Learning methods in Prodigy


Overview of Prodigy planning algorithm


Prodigy algorithm


Prodigy algorithm, part II


Decision points in Prodigy


Example domain: process planning


Example control rules in Prodigy
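
The actual rules from this slide are not preserved in the transcript. As a hypothetical illustration only: a Prodigy-style "select" rule pairs a condition on the current goal and state with a commitment at one decision point. Sketched below as a Python predicate; the operator and predicate names are invented, merely in the spirit of the process-planning domain.

def select_operator(goal, state):
    """Return the operator to select at this decision point, or None to leave
    the choice to ordinary backtracking search. `goal` is a tuple such as
    ("has-hole", "part1") and `state` is a set of ground fact tuples."""
    # Invented rule: if the goal is to put a hole in a part and the part is
    # already clamped, commit to the drill-press operator.
    if goal[0] == "has-hole" and ("is-clamped", goal[1]) in state:
        return "drill-with-drill-press"
    return None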


Review of explanation-based learning

Inputs:
Target concept definition
Training example
Domain theory
Operationality criterion

Output:
A generalization of the training example that is sufficient to describe the target concept and satisfies the operationality criterion
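
As an illustrative sketch (the names are ours, not Prodigy's), the four inputs and the output can be written down as Python types:

from dataclasses import dataclass
from typing import Callable, List, Tuple

Fact = Tuple[str, ...]          # e.g. ("isa", "obj2", "endtable")
Rule = Tuple[List[Fact], Fact]  # (antecedent facts, consequent fact)

@dataclass
class EBLProblem:
    target_concept: Fact                 # e.g. ("safe-to-stack", "?x", "?y")
    training_example: List[Fact]         # ground facts describing one case
    domain_theory: List[Rule]            # inference rules over those facts
    operational: Callable[[Fact], bool]  # which terms may appear in the output

@dataclass
class EBLResult:
    preconditions: List[Fact]   # generalized, operational description...
    target_concept: Fact        # ...sufficient to conclude this concept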


The safe-to-stack example

Input: Target concept: safe-to-stack(x,y)

Training example:

on(obj1, obj2)
isa(obj1, box), isa(obj2, endtable)
color(obj1, red), color(obj2, blue)
volume(obj1, 1), density(obj1, 0.1), …


The safe-to-stack example, cont.

Input: Domain theory:
not(fragile(y)) or lighter(x, y) => safe-to-stack(x, y)
volume(x, v) and density(x, d) => weight(x, v*d)
weight(x1, w1) and weight(x2, w2) and less(w1, w2) => lighter(x1, x2)
isa(x, endtable) => weight(x, 5)
less(0.1, 5), …

Operationality criterion: the learned description should use terms that describe objects directly, or that are ‘easy’ to evaluate, e.g. ‘less’
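
A minimal sketch of this training example and domain theory in Python (illustrative only, not Prodigy's representation). The proof that obj1 is safe to stack on obj2 is simply the chain of calls: weight from volume*density, the default weight of an endtable, then lighter, then safe_to_stack.

facts = {
    "isa":     {"obj1": "box", "obj2": "endtable"},
    "color":   {"obj1": "red", "obj2": "blue"},
    "volume":  {"obj1": 1},
    "density": {"obj1": 0.1},
}

def weight(x):
    # weight(x, v*d) if volume(x, v) and density(x, d)
    if x in facts["volume"] and x in facts["density"]:
        return facts["volume"][x] * facts["density"][x]
    # weight(x, 5) if isa(x, endtable)
    if facts["isa"].get(x) == "endtable":
        return 5
    return None

def lighter(x, y):
    wx, wy = weight(x), weight(y)
    return wx is not None and wy is not None and wx < wy

def safe_to_stack(x, y):
    # not(fragile(y)) or lighter(x, y) => safe-to-stack(x, y);
    # only the lighter(x, y) branch is exercised by this example.
    return lighter(x, y)

print(safe_to_stack("obj1", "obj2"))   # True: 1 * 0.1 < 5

Note that the color facts never enter the proof; this is exactly the sense in which the proof isolates the relevant features.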


The safe-to-stack example

Explain why obj1 is safe-to-stack on obj2:
Construct a proof
Do goal regression: regress target concept through the proof structure
Proof isolates relevant features


Generating operational knowledge

Generalize proof:
Sometimes, simply replace constants by variables
Prove that all identified relevant features are necessary in general

Output:
volume(x, v1) and density(x, d1) and isa(y, endtable) and less(v1*d1, 5) => safe-to-stack(x, y)


Using EBL to improve plan quality

Given: planning domain, evaluation function, planner’s plan, a better plan
Learn: control knowledge to produce the better plan

Explanation used: explain why the alternative plan is better

Target concept: control rules that make choices based on the planner state and meta-state
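
A hedged sketch of the kind of evaluation function involved: a linear additive cost over plan steps. The actual function, operators, and costs used in the Quality work are not given in the transcript; everything below is illustrative.

def plan_cost(plan, step_cost):
    """Linear additive evaluation: total cost is the sum of per-step costs."""
    return sum(step_cost[step] for step in plan)

# Invented operator names and costs, only to show the shape of the comparison.
step_cost = {"clamp-part": 1, "drill-with-spot-drill": 2, "drill-with-twist-drill": 1}

planner_plan = ["clamp-part", "drill-with-spot-drill"]
better_plan  = ["clamp-part", "drill-with-twist-drill"]

if plan_cost(better_plan, step_cost) < plan_cost(planner_plan, step_cost):
    # This is the situation the learner explains: why is the alternative plan
    # cheaper, and what choice should the planner have made to produce it?
    print("better plan found: learn control knowledge that reproduces it")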


EBL in Prodigy

Used by Minton (1988) to improve the efficiency of planning

A version was used in the Quality system (1995) to improve the quality of solutions


Architecture of Quality system


Explaining better plans recursively


Explaining better plans recursively: target concept: shared subgoal


Example from process planning


Learned rules


Discussion

EBL is always correct, but Quality isn’t: it only learns why plan B is better than plan A
No guarantee of optimality

Linear additive evaluation function – how well does this model metrics we care about?

Generality of control rules
