

Learning control knowledge and case-based planning

Jim Blythe, with additional slides from presentations by Manuela Veloso


Motivation

Planning is hard: PSPACE-hard in the general case.

But this is a worst-case result.

In many domains there may exist efficient strategies for planning

We may be able to derive them automatically from experience


Controlling search

Every planning algorithm does search

At each choice point, if the planner makes an incorrect choice, it must backtrack and try the other options

If we can make the right choice the first time…
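
To make the decision-point structure concrete, here is a minimal sketch (not Prodigy's actual algorithm) of backtracking search in Python. The helpers `choices`, `apply_choice`, and `heuristic` are hypothetical; the heuristic stands in for learned control knowledge that orders the options at each decision point.

def search(state, is_goal, choices, apply_choice, heuristic=None):
    """Depth-first search; returns a list of choices leading to a goal, or None."""
    if is_goal(state):
        return []
    options = choices(state)
    if heuristic is not None:
        # Good control knowledge puts the right choice first, so the
        # backtracking loop below rarely (ideally never) tries a second option.
        options = heuristic(state, options)
    for option in options:
        result = search(apply_choice(state, option), is_goal,
                        choices, apply_choice, heuristic)
        if result is not None:          # success: no need to backtrack
            return [option] + result
    return None                          # every option failed: backtrack

With perfect control knowledge the first option tried always succeeds, which is exactly the "right choice the first time" case above.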


Prodigy

Explicit search control rules can apply to any decision point

Many different learning approaches have been implemented

Relatively old planning approach


Learning methods in Prodigy


Overview of Prodigy planning algorithm


Prodigy algorithm


Prodigy algorithm, part II


Decision points in Prodigy


Example domain: process planning


Example control rules in Prodigy
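
The actual rules from this slide are not preserved in the transcript. As a hypothetical illustration only: a Prodigy-style "select" rule pairs a condition on the current goal and state with a commitment at one decision point. Sketched below as a Python predicate; the operator and predicate names are invented, merely in the spirit of the process-planning domain.

def select_operator(goal, state):
    """Return the operator to select at this decision point, or None to leave
    the choice to ordinary backtracking search. `goal` is a tuple such as
    ("has-hole", "part1") and `state` is a set of ground fact tuples."""
    # Invented rule: if the goal is to put a hole in a part and the part is
    # already clamped, commit to the drill-press operator.
    if goal[0] == "has-hole" and ("is-clamped", goal[1]) in state:
        return "drill-with-drill-press"
    return None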


Review of explanation-based learning

Inputs:
Target concept definition
Training example
Domain theory
Operationality criterion

Output:
A generalization of the training example that is sufficient to describe the target concept and satisfies the operationality criterion
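
As an illustrative sketch (the names are ours, not Prodigy's), the four inputs and the output can be written down as Python types:

from dataclasses import dataclass
from typing import Callable, List, Tuple

Fact = Tuple[str, ...]          # e.g. ("isa", "obj2", "endtable")
Rule = Tuple[List[Fact], Fact]  # (antecedent facts, consequent fact)

@dataclass
class EBLProblem:
    target_concept: Fact                 # e.g. ("safe-to-stack", "?x", "?y")
    training_example: List[Fact]         # ground facts describing one case
    domain_theory: List[Rule]            # inference rules over those facts
    operational: Callable[[Fact], bool]  # which terms may appear in the output

@dataclass
class EBLResult:
    preconditions: List[Fact]   # generalized, operational description...
    target_concept: Fact        # ...sufficient to conclude this concept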


The safe-to-stack example

Input: Target concept: safe-to-stack(x,y)

Training example:

on(obj1, obj2)
isa(obj1, box), isa(obj2, endtable)
color(obj1, red), color(obj2, blue)
volume(obj1, 1), density(obj1, 0.1), …


The safe-to-stack example, cont.

Input: Domain theory:
not(fragile(y)) or lighter(x, y) => safe-to-stack(x, y)
volume(x, v) and density(x, d) => weight(x, v*d)
weight(x1, w1) and weight(x2, w2) and less(w1, w2) => lighter(x1, x2)
isa(x, endtable) => weight(x, 5)
less(0.1, 5), …

Operationality criterion: the learned description should use terms that describe objects directly, or that are ‘easy’ to evaluate, e.g. ‘less’
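
A minimal sketch of this training example and domain theory in Python (illustrative only, not Prodigy's representation). The proof that obj1 is safe to stack on obj2 is simply the chain of calls: weight from volume*density, the default weight of an endtable, then lighter, then safe_to_stack.

facts = {
    "isa":     {"obj1": "box", "obj2": "endtable"},
    "color":   {"obj1": "red", "obj2": "blue"},
    "volume":  {"obj1": 1},
    "density": {"obj1": 0.1},
}

def weight(x):
    # weight(x, v*d) if volume(x, v) and density(x, d)
    if x in facts["volume"] and x in facts["density"]:
        return facts["volume"][x] * facts["density"][x]
    # weight(x, 5) if isa(x, endtable)
    if facts["isa"].get(x) == "endtable":
        return 5
    return None

def lighter(x, y):
    wx, wy = weight(x), weight(y)
    return wx is not None and wy is not None and wx < wy

def safe_to_stack(x, y):
    # not(fragile(y)) or lighter(x, y) => safe-to-stack(x, y);
    # only the lighter(x, y) branch is exercised by this example.
    return lighter(x, y)

print(safe_to_stack("obj1", "obj2"))   # True: 1 * 0.1 < 5

Note that the color facts never enter the proof; this is exactly the sense in which the proof isolates the relevant features.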


The safe-to-stack example

Explain why obj1 is safe-to-stack on obj2:
Construct a proof
Do goal regression: regress target concept through the proof structure
Proof isolates relevant features


Generating operational knowledge

Generalize proof:
Sometimes, simply replace constants by variables
Prove that all identified relevant features are necessary in general

Output:
volume(x, v1) and density(x, d1) and isa(y, endtable) and less(v1*d1, 5) => safe-to-stack(x, y)


Using EBL to improve plan quality

Given: planning domain, evaluation function, planner’s plan, a better plan
Learn: control knowledge to produce the better plan

Explanation used: explain why the alternative plan is better

Target concept: control rules that make choices based on the planner state and meta-state
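
A hedged sketch of the kind of evaluation function involved: a linear additive cost over plan steps. The actual function, operators, and costs used in the Quality work are not given in the transcript; everything below is illustrative.

def plan_cost(plan, step_cost):
    """Linear additive evaluation: total cost is the sum of per-step costs."""
    return sum(step_cost[step] for step in plan)

# Invented operator names and costs, only to show the shape of the comparison.
step_cost = {"clamp-part": 1, "drill-with-spot-drill": 2, "drill-with-twist-drill": 1}

planner_plan = ["clamp-part", "drill-with-spot-drill"]
better_plan  = ["clamp-part", "drill-with-twist-drill"]

if plan_cost(better_plan, step_cost) < plan_cost(planner_plan, step_cost):
    # This is the situation the learner explains: why is the alternative plan
    # cheaper, and what choice should the planner have made to produce it?
    print("better plan found: learn control knowledge that reproduces it")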


EBL in Prodigy

Used by Minton (1988) to improve the efficiency of planning

A version was used in the Quality system (1995) to improve the quality of solutions


Architecture of Quality system


Explaining better plans recursively


Explaining better plans recursively: target concept: shared subgoal


Example from process planning


Learned rules


Discussion

EBL is always correct, but Quality isn’t: it only learns why plan B is better than plan A
No guarantee of optimality

Linear additive evaluation function – how well does this model metrics we care about?

Generality of control rules
