Computed Prediction: So far, so good. What now?

Post on 26-Dec-2014


DESCRIPTION

Pier Luca Lanzi talks at NIGEL 2006 about computed prediction.

TRANSCRIPT

Computed Prediction: So far, so good. What now?

Pier Luca Lanzi

Politecnico di Milano, Italy
Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, USA

RL

What is the problem?

(diagram: agent-environment loop — the agent in state s_t takes action a_t, receives reward r_{t+1}, and moves to state s_{t+1})

Compute a value function Q(s_t, a_t) mapping state-action pairs into expected future payoffs.

How much future reward is received when action a_t is performed in state s_t? What is the expected payoff for s_t and a_t?

GOAL: maximize the amount of reward received in the long run
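The value function described above can be learnt with the standard tabular Q-learning update; the sketch below is illustrative (the function names and the step sizes alpha and gamma are assumptions, not from the talk):

```python
from collections import defaultdict

def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.95):
    """Nudge Q(s, a) toward the reward plus the discounted best next value."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]

# All entries start at 0; one step with reward -1 pulls Q(s0, a0) toward -1.
Q = defaultdict(float)
q_update(Q, 's0', 'a0', -1.0, 's1', ['a0', 'a1'])
```
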

Example: The Mountain Car

(figure: the mountain-car task, with the goal at the top of the right slope)

Task: drive an underpowered car up a steep mountain road.

a_t = accelerate left, accelerate right, or no acceleration

s_t = (position, velocity)

r_t = 0 when the goal is reached, -1 otherwise.
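As a minimal sketch, the classic mountain-car dynamics can be written as below; the constants are the standard ones from the reinforcement-learning literature, but treat them as assumptions here:

```python
import math

POS_MIN, POS_MAX = -1.2, 0.5
VEL_MIN, VEL_MAX = -0.07, 0.07

def step(position, velocity, action):
    """action: -1 = accelerate left, 0 = no acceleration, +1 = accelerate right."""
    velocity += 0.001 * action - 0.0025 * math.cos(3 * position)
    velocity = max(VEL_MIN, min(VEL_MAX, velocity))
    position += velocity
    position = max(POS_MIN, min(POS_MAX, position))
    if position == POS_MIN:
        velocity = 0.0  # inelastic collision with the left wall
    done = position >= POS_MAX
    reward = 0.0 if done else -1.0
    return position, velocity, reward, done
```

The car's engine is too weak to climb directly, so the agent must first reverse up the left slope to build momentum.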

Value Function Q(s_t, a_t)

What are the issues?

Exact representation is infeasible; approximation is mandatory. The function is unknown: it is learnt online from experience.

We are learning the unknown payoff function while also trying to approximate it. The approximator works on intermediate estimates, but it also tries to provide information for the learning.

Convergence is not guaranteed.

Classifiers

Learning Classifier Systems

Solve reinforcement learning problems

Represent the payoff function Q(s_t, a_t) as a population of rules, the classifiers.

Classifiers are evolved while Q(s_t, a_t) is learnt online.

(figure: payoff surface for action A)

What is a classifier?

IF condition C is true for input s, THEN the payoff of action a is p

(figure: constant payoff prediction p over the interval [l, u] of inputs s)

Condition C(s) = l ≤ s ≤ u

General conditions covering large portions of the problem space; accurate approximations.

Generalization depends on how well conditions can partition the problem space.
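An interval-condition classifier of this kind can be sketched in a few lines; the class name, the averaging of matching predictions, and the toy population are assumptions for illustration, not the talk's exact algorithm:

```python
class Classifier:
    """IF l <= s <= u THEN the payoff of `action` is the constant p."""
    def __init__(self, l, u, action, p):
        self.l, self.u, self.action, self.p = l, u, action, p

    def matches(self, s):
        return self.l <= s <= self.u

def predict(population, s, action):
    """Average the predictions of matching classifiers advocating `action`."""
    ps = [c.p for c in population if c.action == action and c.matches(s)]
    return sum(ps) / len(ps) if ps else None

# Two overlapping classifiers for the same action.
pop = [Classifier(0.0, 0.5, 'left', 10.0), Classifier(0.3, 1.0, 'left', 20.0)]
```
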

What is the best representation for the problem?

Several representations have been developed to improve generalization.

(figure: payoff landscape of action A)

What is computed prediction?

Replace the prediction p by a parametrized function p(x, w)

(figure: linear payoff prediction over the interval [l, u] of inputs x)

p(x, w) = w_0 + w_1 x

Condition C(s) = l ≤ s ≤ u

IF condition C is true for input s, THEN the value of action a is p(x, w)

Which Representation?

Which type of approximation?

Computed Prediction: Linear Approximation

Each classifier has a vector of parameters w. Classifier prediction is computed as

p(x, w) = w_0 + Σ_i w_i x_i

Classifier weights are updated using the Widrow-Hoff update,

w_i ← w_i + η (P − p(x, w)) x_i

(the equations, shown as images in the original slides, are reconstructed here to match p(x, w) = w_0 + w_1 x above).
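The two steps above — computing a linear prediction and adapting the weights with the Widrow-Hoff (delta) rule — can be sketched as follows; the learning rate eta and the toy training loop are assumptions for illustration:

```python
def predict(w, x):
    """Linear prediction p(x, w); x is augmented with a leading 1 for the bias w0."""
    return sum(wi * xi for wi, xi in zip(w, x))

def widrow_hoff(w, x, target, eta=0.2):
    """Delta rule: move each weight along the error times its input component."""
    error = target - predict(w, x)
    return [wi + eta * error * xi for wi, xi in zip(w, x)]

# Repeatedly presenting one point drives the prediction toward its target payoff.
w = [0.0, 0.0]
for _ in range(200):
    w = widrow_hoff(w, [1.0, 0.5], target=3.0)
```

Note that XCSF-style systems typically normalize this update by the squared input norm; the plain delta rule is used here for brevity.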

Summary

GOAL: learn the payoff function.

The typical RL approach asks: what is the best approximator?

The typical LCS approach asks: what is the best representation for the problem?

What are the differences?

(diagram: REPRESENTATION × APPROXIMATOR)

Representations: intervals, messy symbols, hulls, ellipsoids, ternary strings (0/1/#).

Approximators: gradient descent, radial basis functions, neural networks, tile coding.

Computed prediction combines the two axes, e.g.: Boolean representation with sigmoid prediction; Boolean representation with neural prediction (O'Hara & Bull 2004); real intervals with neural prediction; convex hulls with linear prediction.

To represent or to approximate?

Powerful representations allow the solution of difficult problems with basic approximators.

Powerful approximators may make the choice of the representation less critical.

Experiment

Consider a very powerful approximator that we know can solve a certain RL problem.

Use it to compute classifier prediction in an LCS, and apply the LCS to solve the same problem.

Does genetic search stillprovide an advantage?

Computed prediction with Tile Coding

Tile coding is a powerful approximator developed in the reinforcement learning community. It can solve the mountain car problem given an adequate parameter setting.

Here, classifier prediction is computed using tile coding, and each classifier's tile coding has a different parameter setting. When using tile coding to compute classifier prediction, one classifier can solve the whole problem.
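A rough sketch of 1-D tile coding: several slightly shifted tilings discretize the input, and the prediction is the sum of one weight per active tile. The parameters (number of tilings, tiles per tiling) are exactly the kind of settings each classifier could vary; the indexing scheme below is an illustrative assumption:

```python
def active_tiles(x, n_tilings=4, n_tiles=8, lo=0.0, hi=1.0):
    """Return one active tile index per tiling for scalar input x."""
    width = (hi - lo) / n_tiles
    tiles = []
    for t in range(n_tilings):
        offset = t * width / n_tilings           # each tiling is shifted slightly
        idx = int((x - lo + offset) / width)
        idx = min(idx, n_tiles)                  # clamp at the upper edge
        tiles.append(t * (n_tiles + 1) + idx)    # unique index per tiling
    return tiles

def predict(weights, x):
    """Prediction = sum of the weights of the active tiles."""
    return sum(weights[i] for i in active_tiles(x))
```
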

What should we expect?

The performance? Computed prediction can perform as well as the approximator with the most adequate configuration.

The evolution of a population of classifiers provides advantages over a single approximator, even if the same approximator alone might solve the whole problem.

How do parameters evolve?

What now?

(diagram: REPRESENTATION × APPROXIMATOR — given a problem, which representation? which approximator?)

Which approximator?

Let evolution decide!

A population of classifiers using different approximators to compute prediction.

The genetic algorithm selects the best approximators for each problem subspace.
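One way to picture "let evolution decide" is fitness-proportional (roulette-wheel) selection over classifiers tagged with different approximators: the most accurate approximator in a subspace tends to be reproduced most often. The fitness values below are illustrative, not measured results:

```python
import random

def roulette(population):
    """Fitness-proportional selection over (approximator, fitness) pairs."""
    total = sum(f for _, f in population)
    r = random.uniform(0, total)
    acc = 0.0
    for approximator, fitness in population:
        acc += fitness
        if acc >= r:
            return approximator
    return population[-1][0]

# Hypothetical fitnesses for three approximator types in one subspace.
pop = [('linear', 0.9), ('constant', 0.1), ('quadratic', 0.4)]
```
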

Evolving the best approximator

What next?

(diagram revisited: given a problem, which representation? which approximator?)

Which approximator? Let evolution decide: a population of classifiers using different approximators to compute prediction, even if the same approximator alone might solve the whole problem.

Evolving Heterogeneous Approximators

(plot: heterogeneous approximators vs. the single most powerful approximator)

What next?

Allow different representations in the same population.

Let evolution evolve the most adequate representation for each problem subspace.

Then, allow different representations and different approximators to evolve all together.

Probably already done for Boolean conditions.

Acknowledgements

Daniele Loiacono, Matteo Zanini, and all the current and former members of IlliGAL.

Thank you! Any questions?
