Mutation Operator Evolution for EA-Based Neural Networks
Mutation Operator Evolution for EA-Based Neural Networks
By Ryan Meuth
Reinforcement Learning
[Diagram: the Agent observes a State from the Environment, selects an Action via its Action Policy, and receives a Reward; the Agent maintains a State Value Estimate.]
Reinforcement Learning
Good for on-line learning where little is known about the environment.
Easy to implement in discrete environments: a value estimate can be stored for each state; given infinite time, an optimal policy is guaranteed.
Hard to implement in continuous environments: infinitely many states! The value function must be estimated. Neural networks can be used for function approximation.
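The discrete case above can be sketched as a tabular value update, with one table entry per state. This is an illustrative sketch, not from the slides; the step size `alpha` and discount `gamma` are assumed hyperparameters:

```python
# Tabular TD(0)-style value update: a value estimate stored per discrete state.
# alpha (step size) and gamma (discount factor) are illustrative choices.
alpha, gamma = 0.1, 0.9
V = {}  # state -> value estimate; one table entry per discrete state

def value_update(state, reward, next_state):
    """Move V[state] toward the one-step bootstrapped return."""
    v, v_next = V.get(state, 0.0), V.get(next_state, 0.0)
    V[state] = v + alpha * (reward + gamma * v_next - v)

# One transition: the agent sees state "A", receives reward 1, lands in "B".
value_update("A", 1.0, "B")
```

With infinitely many states this table is no longer possible, which is exactly where a neural network replaces `V` as a function approximator.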
Neural Network Overview
Feed-forward neural network: based on biological theories of neuron operation.
[Diagrams: a feed-forward neural network and a recurrent neural network.]
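A feed-forward network of the kind diagrammed above can be sketched as layered weighted sums followed by a squashing function. The layer sizes, fixed weights, and tanh activation here are assumptions for illustration, not details from the slides:

```python
import math

def forward(weights, biases, x):
    """One pass through a fully connected feed-forward net.
    weights[l][j][i] connects input i of layer l to neuron j of that layer."""
    for W, b in zip(weights, biases):
        x = [math.tanh(sum(w_i * x_i for w_i, x_i in zip(row, x)) + b_j)
             for row, b_j in zip(W, b)]
    return x

# A tiny 2-input, 2-hidden, 1-output network with hand-picked weights.
W = [[[0.5, -0.5], [0.3, 0.8]], [[1.0, -1.0]]]
b = [[0.0, 0.0], [0.0]]
y = forward(W, b, [1.0, 0.0])
```

A recurrent network differs only in that some outputs feed back in as inputs on the next step, giving the net a memory of past states.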
Neural Network Overview
Traditionally used with error back-propagation (BP). BP uses samples to generalize to the problem. Few "unsupervised" learning methods exist.
Problems with no samples: on-line learning.
Conjugate Reinforcement Back-Propagation.
EA-NN
Both a supervised and an unsupervised learning method.
Uses the weight set as the genome of an individual.
Fitness function is the mean-squared error over the target function.
Mutation operator is a sample from a Gaussian distribution.
It is possible that this mutation operator is not the best.
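The baseline mutation described above, perturbing each weight in the genome with a Gaussian sample, can be sketched as follows; the step size `sigma` is an assumed parameter, not taken from the slides:

```python
import random

def gaussian_mutate(genome, sigma=0.1):
    """Return a child genome: each weight perturbed by a N(0, sigma) sample."""
    return [w + random.gauss(0.0, sigma) for w in genome]

random.seed(42)
parent = [0.2, -0.7, 1.5]   # a weight-set genome
child = gaussian_mutate(parent)
```

This fixed operator is the thing the GP layer below is meant to improve upon.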
Uh… Why?
Could improve EA-NN efficiency.
Faster on-line learning.
A revamped tool for reinforcement learning.
Smarter robots.
Why Use an EA? Knowledge-independent.
Experimental Implementation
First tier, genetic programming: each individual is a parse tree representing a mutation operator; fitness is the inverse of the sum of MSEs from the EA testbed.
Second tier, EA testbed: 4 EAs spanning 2 classes of problems.
2 feed-forward non-linear approximations: 1 high-order, 1 low-order.
2 recurrent time-series predictions: 1 time-delayed, 1 not time-delayed.
GP Implementation
Function set: {+, -, *, /}
Terminal set: the weight to be modified, a random constant, a uniform random variable.
Over-selection: 80% of parents from the top 32%.
Rank-based survival.
Initialized by the grow method (max depth of 8).
Fitness: 1000 / AvgMSE - num_nodes.
P(recombination) = 0.5; P(mutation) = 0.5.
Repair function.
5 runs, 100 generations each.
Steady state: population of 1000 individuals, 20 children per generation.
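An evolved mutation operator can be represented as a parse tree over the function set {+, -, *, /} and the three terminals listed above. A minimal sketch of evaluating such a tree for one weight follows; the tuple encoding and the protected-division convention are assumptions, not details from the slides:

```python
import random

FUNCS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b if abs(b) > 1e-9 else 1.0,  # protected division
}

def evaluate(tree, w):
    """Evaluate a mutation-operator parse tree for one weight.
    Terminals: 'w' (the weight to be modified), 'U' (a uniform random
    variable), or a numeric random constant."""
    if isinstance(tree, tuple):              # internal node: (op, left, right)
        op, left, right = tree
        return FUNCS[op](evaluate(left, w), evaluate(right, w))
    if tree == "w":
        return w
    if tree == "U":
        return random.uniform(-1.0, 1.0)
    return tree                              # numeric constant

# Example tree encoding w + (0.5 * U): a small uniform perturbation.
op_tree = ("+", "w", ("*", 0.5, "U"))
new_w = evaluate(op_tree, 1.0)
```

The GP searches over trees like `op_tree`; each candidate is scored by plugging it into the EA testbed as the mutation step.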
EA-NN Implementation
Recombination: multi-point crossover.
Mutation: provided by the GP.
Fitness: MSE over the test function (minimized).
P(recombination) = 0.5; P(mutation) = 0.5.
Non-generational: population of 10 individuals, 10 children per generation.
50 runs of 50 generations.
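The EA-NN layer described above, multi-point crossover, MSE fitness, and a small non-generational population, can be sketched as below. The target function and the Gaussian mutation step are illustrative stand-ins (in the actual system the mutation operator is supplied by the GP):

```python
import random

def multipoint_crossover(a, b, points=2):
    """Swap segments between two weight genomes at random cut points."""
    cuts = sorted(random.sample(range(1, len(a)), points))
    child, src, prev = [], 0, 0
    for c in cuts + [len(a)]:
        child.extend((a if src == 0 else b)[prev:c])
        src, prev = 1 - src, c
    return child

def mse(genome, samples):
    """Fitness to minimize: mean-squared error over the target samples.
    Stand-in 'network': a single linear weight applied to the input."""
    return sum((genome[0] * x - y) ** 2 for x, y in samples) / len(samples)

random.seed(0)
samples = [(x, 2.0 * x) for x in range(5)]    # stand-in target: y = 2x
pop = [[random.uniform(-3, 3) for _ in range(4)] for _ in range(10)]
for _ in range(50):                           # 50 generations
    for _ in range(10):                       # 10 children per generation
        p1, p2 = random.sample(pop, 2)
        child = multipoint_crossover(p1, p2)
        child[0] += random.gauss(0.0, 0.2)    # mutation (Gaussian stand-in)
        pop.append(child)
    pop.sort(key=lambda g: mse(g, samples))   # keep the 10 fittest overall
    pop = pop[:10]
best = mse(pop[0], samples)
```

Swapping the `random.gauss` line for a call into an evolved parse tree is the only change needed to test a GP-produced operator in this loop.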
Results
This is where results would go.
Single uniform random variable: ~380.
Observed individuals: ~600.
Improvement! Just have to wait and see…
Conclusions
I don't know anything yet.
Questions?
Thank You!