Ratbert: Nearest Sequence Memory Based Prediction Model Applied to Robot Navigation
by Sergey Alexandrov, iCML 2003



Page 1:

Ratbert: Nearest Sequence Memory Based Prediction Model Applied to Robot Navigation

by Sergey Alexandrov
iCML 2003

Page 2:

Defining the Problem
► Choosing a navigational action (simplified world: left, right, forward, back)
► The consequence of an action is unknown given the immediate state (expected observation)
► How can an unknown environment be learned well enough to accurately predict such consequences?

Approaches
► Learning the entire model (POMDP) – for example, Baum-Welch (problem: slow)
► Goal-finding tasks – learning a path to a specific state (a reinforcement problem) – for example, NSM (Nearest Sequence Memory)
► Generalized observation prediction – NSMP (Nearest Sequence Memory Predictor)

Page 3:

NSMP in Short
► Experience Seq_n = {(o_1, a_1) … (o_n, a_n)}
► NSMP(Seq_n) = observation predicted by executing a_n
► Derived by examining the k nearest matches (NNS)

[Figure – Example (k = 4): the four nearest sequence matches to the current step (o_i, a_i) lead to candidate next observations o_2, o_3, o_2, and o_1; the prediction o_{i+1} is the most heavily weighted of these.]

Page 4:

NSMP in Short (Cont.)
► Based on kNN applied to sequences of previous experience (NSM)
► Find the k nearest (here: longest) sequence matches to the immediately prior experience
► Calculate weights for each observation reached by the k sequence sections (a tradeoff between long matches and a high frequency of matches)
► Probability of each observation = normalized weight
► The predicted observation is the observation with the highest probability

neighbor(s_j, s_k) = 1 + neighbor(s_{j-1}, s_{k-1})   if a(s_j) = a(s_k) and o(s_j) = o(s_k)
                   = 0                                otherwise

weight(o_j | NNS, k) = count({s_k ∈ NNS : o_k = o_j}) · avg(metric(s_k) : s_k ∈ NNS, o_k = o_j)
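The matching-and-weighting scheme above can be sketched in Python. This is a minimal illustration under stated assumptions, not the author's implementation: the experience is encoded as a list of (observation, action) pairs, and ties between equally long matches are broken by recency.

```python
from collections import defaultdict

def match_length(seq, i, j):
    """neighbor(s_i, s_j): length of the identical (observation, action)
    suffix ending at positions i and j of the experience sequence."""
    n = 0
    while i - n >= 0 and j - n >= 0 and seq[i - n] == seq[j - n]:
        n += 1
    return n

def nsmp_predict(history, k=4):
    """history = [(o_1, a_1), ..., (o_n, a_n)], where a_n is the action
    about to be executed; returns the predicted next observation."""
    n = len(history) - 1
    # Score every past step (each has a known successor observation) by
    # its match length against the current end of the sequence.
    scored = sorted(((match_length(history, t, n), t) for t in range(n)),
                    reverse=True)
    nns = scored[:k]                      # k nearest = k longest matches
    # weight(o) = count of matches reaching o * average match length
    # (the tradeoff between frequency and length described above).
    by_obs = defaultdict(list)
    for m, t in nns:
        by_obs[history[t + 1][0]].append(m)
    weights = {o: len(ms) * sum(ms) / len(ms) for o, ms in by_obs.items()}
    if sum(weights.values()) == 0:
        return None                       # no informative match at all
    # Normalizing the weights gives the probabilities; the argmax is the
    # same either way, so we take the highest-weight observation directly.
    return max(weights, key=weights.get)
```

For example, after the alternating experience [('A','f'), ('B','f'), ('A','f'), ('B','f'), ('A','f')] the longest suffix matches all lead to 'B', so that is the prediction.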

Page 5:

Testing
► Ratbert: a Lego-based robot capable of simple navigation inside a small maze. It senses walls to the front, left, and right, plus a noisy distance.
► Software simulation based on Ratbert's sensor inputs (larger environment, greater number of runs, longer sequences)
► Actions: {left, right, forward, back}  Observations: {left, right, front, distance}
► For both trials, a training sequence was collected via random exploration, then a testing sequence was executed, comparing the predicted observation with the actual observation. For both, k was set to 4.
► Results were compared to bigrams.
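The slides do not spell out how the bigram baseline conditions its prediction; one plausible reading, assumed here for illustration, is to predict the observation that most often followed the current (observation, action) pair during training:

```python
from collections import Counter, defaultdict

def bigram_predict(history):
    """Hypothetical bigram baseline: predict the observation that most
    often followed the current (observation, action) pair in the
    experience sequence history = [(o_1, a_1), ..., (o_n, a_n)]."""
    counts = defaultdict(Counter)
    for t in range(len(history) - 1):
        # Tally the successor observation seen after the pair at step t.
        counts[history[t]][history[t + 1][0]] += 1
    followers = counts[history[-1]]
    return followers.most_common(1)[0][0] if followers else None
```

Unlike NSMP, this baseline ignores everything before the current step, which is why longer training sequences help it less.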

Page 6:

Results

[Figure: two charts plotting prediction rate (% correct) and cross-entropy against training sequence length – 0-50 for Ratbert, 5-200 for the software simulation – comparing the NSMP and bigram prediction rates and cross-entropies.]

► Plots: prediction rate vs. training sequence length.
► The first graph is for Ratbert; the second is for the software simulation.
► NSMP consistently produced a better, although not optimal, prediction rate.

Page 7:

Further Work
► Comparison to other probabilistic predictive models
► Determine the optimal exploration method
► Examine situations that trip up the algorithm
► Go beyond "gridworld" concepts of left/right/forward/back to more realistic navigation
► Work on mapping real sensor data to the discrete classes required by instance-based algorithms such as NSM/NSMP (for example, using single-linkage hierarchical clustering until the cluster distance <= sensor error)
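The clustering idea in the last bullet can be sketched as follows – a minimal single-linkage agglomerative pass over one-dimensional sensor readings, with the merge threshold standing in for the sensor error. The stopping rule and the scalar distance are assumptions for illustration, not the author's implementation:

```python
def single_linkage(points, max_dist):
    """Merge clusters of scalar sensor readings bottom-up, using the
    single-linkage distance (closest pair of members across clusters),
    and stop once the nearest pair of clusters is farther apart than
    max_dist (standing in for the sensor error)."""
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        # Find the closest pair of clusters under single linkage.
        d, i, j = min((min(abs(a - b) for a in ci for b in cj), i, j)
                      for i, ci in enumerate(clusters)
                      for j, cj in enumerate(clusters) if i < j)
        if d > max_dist:
            break          # every remaining gap exceeds the sensor error
        clusters[i].extend(clusters.pop(j))
    return clusters
```

Each resulting cluster then becomes one discrete observation class, so readings that differ by less than the sensor error map to the same symbol for NSM/NSMP.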

Page 8:

Thank You