Update on Learning By Observation: Learning from Positive Examples Only. Tolga Konik, University of Michigan

Post on 14-Dec-2015


Page 1

1

Update on Learning By Observation

Learning from Positive Examples Only

Tolga Konik
University of Michigan

Page 2

2

GOAL: Generate AI agents by observing expert task execution

Engineering Goal
Reduce the cost of agent development
Reduce the expertise required for agent development

AI Goal
Agents that improve themselves by observing experts

Page 3

3

Learning Framework

[Diagram. Components: Expert, Environment, Environmental Interface, Behavior Recorder, Behavior trace, Annotations, Annotated Behavior trace, Episodic Database, Training Set Generator, examples, Concept Learner (ILP), rules, Background Knowledge, Knowledge Generator, Agent Program, Agent Architecture. Region labels: external, internal.]

Page 4

4

Learning with Redux

[Diagram: the same components as the Learning Framework diagram, plus Redux.]

Page 5

5

Current Experiments

[Diagram: the same components as the Learning Framework diagram, plus an Expert Soar Agent.]

Page 6

7

Experiments in Haunt 2 Domain

Page 7

8

Move-to example

[Figure: map with labels r1, r2, r3, r4, d1, d2, d3, d4, d5, d6, d5b, d6b, i3, i4. Operators shown: move-to-area, move-to-via-node, move-to-connected-node.]

Page 8

9

An Example in Haunt Domain

[Figure: map with labels r1, r2, r3, r4, d1, d2, d3, d4, d5, d6. Operators shown: move-to-area(Area), move-to-via-node(Node), move-to-connected-node(Node).]

Page 9

10

An Example in Haunt Domain

[Figure: map with labels r1, r2, r3, r4, d1, d2, d3, d4, d5, d6. Operators shown: move-to-area(Area), move-to-via-node(Node), move-to-connected-node(Node).]

Page 10

11

An Example in Haunt Domain

Correct selection condition for move-to-via-node

[Figure: map with labels r1, r3, d1. Operators shown: move-to-area(Area), move-to-via-node(Node), move-to-connected-node(Node).]

Page 11

13

Example Generation: Operator Concepts

Termination(A)

[Figure: operator A, with regions labeled positive and negative.]

Page 12

14

Example Generation: Operator Concepts

Selection(A)

[Figure: operators A and B, with regions labeled positive and negative.]
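The two example-generation slides can be illustrated with a small sketch. The slides only show the labels "positive" and "negative" next to operators A and B, so the labeling rules below are assumptions: for Selection(A), a situation where A is newly selected is treated as positive and a situation where another operator is newly selected as negative; for Termination(A), a situation where A has just stopped is treated as positive and a situation where A is still active as negative. The `Step` type and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    situation: str  # situation identifier in the behavior trace
    operator: str   # operator the expert has active in that situation


def selection_examples(trace, target):
    """Label situations for the concept selection(target) (assumed rule)."""
    positives, negatives = [], []
    previous = None
    for step in trace:
        if step.operator != previous:  # a new operator was selected here
            if step.operator == target:
                positives.append(step.situation)
            else:
                negatives.append(step.situation)
        previous = step.operator
    return positives, negatives


def termination_examples(trace, target):
    """Label situations for the concept termination(target) (assumed rule)."""
    positives, negatives = [], []
    for before, after in zip(trace, trace[1:]):
        if before.operator != target:
            continue  # target was not active, so termination is not at issue
        if after.operator == target:
            negatives.append(after.situation)  # target is still active
        else:
            positives.append(after.situation)  # target has just stopped
    return positives, negatives


# Hypothetical trace: A is active in s1-s2, B in s3-s4, A again in s5.
trace = [Step("s1", "A"), Step("s2", "A"), Step("s3", "B"),
         Step("s4", "B"), Step("s5", "A")]
print(selection_examples(trace, "A"))
print(termination_examples(trace, "A"))
```

Other labeling schemes are equally compatible with the slides; the point is only that both concepts can be read off the annotated behavior trace.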

Page 13

15

Learning Examples

A Positive Example: selection(Sit20, move-to-via-node(d1))

[Figure: map with labels r1, r2, r3, r4, d1, d2, d3, d4, d5, d6, d5b, d6b, i3, i4.]

Page 14

16

General to Specific Search with positive and negative examples

Page 15

17

General to Specific Search with positive and negative examples

Page 16

18

General to Specific Search with positive and negative examples

Page 17

19

General to Specific Search with positive and negative examples

Page 18

20

General to Specific Search with positive and negative examples
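The figures for these slides did not survive, so the following is only a generic sketch of general-to-specific search, not the ILP learner used in the talk. It assumes examples are flat feature dictionaries and a hypothesis is a set of (feature, value) conditions; the search starts from the empty hypothesis, which covers everything, and greedily adds the condition that best separates positives from negatives. The feature names `in_room` and `door_visible` are hypothetical.

```python
def covers(hypothesis, example):
    """A hypothesis covers an example when all of its conditions hold."""
    return all(example.get(feature) == value for feature, value in hypothesis)


def score(hypothesis, positives, negatives):
    """Covered positives minus covered negatives."""
    pos = sum(covers(hypothesis, e) for e in positives)
    neg = sum(covers(hypothesis, e) for e in negatives)
    return pos - neg


def general_to_specific(positives, negatives):
    hypothesis = frozenset()  # most general hypothesis: no conditions
    # Candidate conditions are taken from the positive examples only.
    candidates = sorted({item for e in positives for item in e.items()}, key=repr)
    while any(covers(hypothesis, e) for e in negatives):
        refinements = [hypothesis | {c} for c in candidates if c not in hypothesis]
        if not refinements:
            break
        best = max(refinements, key=lambda h: score(h, positives, negatives))
        if score(best, positives, negatives) <= score(hypothesis, positives, negatives):
            break  # no refinement improves the score; stop specializing
        hypothesis = best
    return hypothesis


positives = [{"in_room": "r1", "door_visible": True},
             {"in_room": "r3", "door_visible": True}]
negatives = [{"in_room": "r1", "door_visible": False},
             {"in_room": "r2", "door_visible": False}]
print(general_to_specific(positives, negatives))
```

The negatives are what force specialization here: without them the empty hypothesis already scores best, which is the difficulty the positive-only slides address.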

Page 19

21

Problem in Choosing Parameters

Selection(move-to-via-node)

[Figure: map with labels r1, r2, r3, r4, d1, d2, d3, d4, d5, d6, d5b, d6b, i3, i4. Operators shown: move-to-via-node, move-to-connected-node.]

Page 20

22

Problem in Choosing Parameters

Selection(move-to-via-node)

[Figure: map with labels r1, r2, r3, r4, d1, d2, d3, d4, d5, d6, d5b, d6b, i3, i4, with regions labeled Positive and Negative. Operators shown: move-to-via-node, move-to-connected-node.]

Page 21

24

General to Specific Learning with Positive Examples Only

Positive

Page 22

25

General to Specific Learning with Positive Examples Only

d1

Positive

Page 23

26

Learning Examples

A Positive Example of move-to-via-node:

[Figure: map with labels r1, r2, r3, r4, d1, d2, d3, d4, d5, d6, d5b, d6b, i3, i4.]

Page 24

27

Learning Examples

Random Examples of move-to-via-node

For each positive example, use the same situation with parameters selected in other situations

[Figure: map with labels r1, r2, r3, r4, d1, d2, d3, d4, d5, d6, d5b, d6b, i3, i4.]
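The random-example rule on this slide can be sketched as follows. Only `Sit20` and `move-to-via-node(d1)` come from the slides; the other situation names and parameter choices are hypothetical, and skipping a parameter equal to the one the expert actually chose is an assumption of this sketch rather than something the slide states.

```python
import random


def random_examples(positives, seed=0):
    """Build random examples from (situation, parameter) positive examples.

    Each positive keeps its situation but receives a parameter that the
    expert selected in some other situation.
    """
    rng = random.Random(seed)  # seeded only to make the sketch repeatable
    randoms = []
    for index, (situation, parameter) in enumerate(positives):
        # Parameters observed in the other positive examples.
        # Assumption: drop parameters equal to the one chosen here.
        pool = sorted({p for j, (_, p) in enumerate(positives)
                       if j != index and p != parameter})
        if pool:
            randoms.append((situation, rng.choice(pool)))
    return randoms


# Hypothetical positive examples of selection(Situation, move-to-via-node(Node)).
positives = [("Sit20", "d1"), ("Sit31", "d2"), ("Sit45", "d3")]
for situation, parameter in random_examples(positives):
    print(f"selection({situation}, move-to-via-node({parameter}))")
```

A learner can then prefer hypotheses that cover the observed positives while covering few of these random examples, which stands in for the missing negative examples.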

Page 25

28

Nuggets

Move-to operators are learned in the Haunt domain:
~3 min of trace
~35000 situations
~10 min to prepare examples
~20 min for learning

Page 26

29

Coals

Missing Components
It is still research, not a tool