Modeling with IRENE Integrated R-code for Engineered Neural Evolution Trevor Grant and Olcay Akman Department of Mathematics Illinois State University


Post on 23-Feb-2016


Page 1: Modeling with IRENE

Modeling with IRENE
Integrated R-code for Engineered Neural Evolution
Trevor Grant and Olcay Akman
Department of Mathematics
Illinois State University

Page 2: Modeling with IRENE

Overview

Neural Evolution
  What is a Neural Network?
  Using genetic algorithms to find optimal parameters to nonlinear functions
  Neural evolution

Special Population Attributes
  Jump Connections
  User defined libraries and learning functions
  Mutating learning functions

Engineered Genetic Algorithms

Page 3: Modeling with IRENE

What is a Neural Network?

Page 4: Modeling with IRENE

Starting out simple

We begin by modeling the data with a simple linear model. We then look at the sum of the squared residuals (SSR). A value is assigned to the model based on this SSR.

[Diagram: Inputs (X1, X2, …, Xn) connected to Output (Y) through coefficients β0, β1, β2, β3]

Page 5: Modeling with IRENE

Example

1974 Statistics regarding Income

Income: per capita income (1974)

Life Exp: life expectancy in years (1969–71)

Murder: murder and non-negligent manslaughter rate per 100,000 population (1976)

HS Grad: percent high-school graduates (1970)

Frost: mean number of days with minimum temperature below freezing (1931–1960) in capital or large city

[Diagram: the linear model with coefficients β0, β1, β2, β3, Inputs (X1, X2, …, Xn) and Output (Y), labeled with Income, Life Expectancy, Murder Rate, HS Grad % and Frost]

Page 6: Modeling with IRENE

Residuals

The difference between the observed value and the fitted value is known as the residual.

Page 7: Modeling with IRENE

Sum of Squared Residuals

[Plot: Height against Age with a fitted straight line]

A linear model is estimated which minimizes the sum of squared residuals (SSR), that is, the sum of the squared distances between the estimates and the actual data points.
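As an illustration of the SSR idea, here is a minimal Python sketch (not IRENE's own R code) that fits a straight line by ordinary least squares and computes its SSR. The age and height numbers are made up for the example, and the function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def ssr(x, y, b0, b1):
    """Sum of squared residuals (observed minus fitted) for a given line."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Made-up illustrative data, not taken from the slides.
ages = [2, 4, 6, 8, 10]
heights = [86, 102, 115, 128, 138]

b0, b1 = fit_line(ages, heights)
best = ssr(ages, heights, b0, b1)
```

Any other line through the same data, for example one with a shifted intercept, gives an SSR at least as large as `best`.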

Page 8: Modeling with IRENE

Relationship

Linear
  Traditionally we estimate linear relationships.

Nonlinear
  True relationship may be (often is) non-linear.
  Sometimes we know the relationship and can use nonlinear regression methods such as
    Neural Networks
    Nonlinear least squares
  Sometimes we don’t know the functional form of the relationship.
  IRENE explores functional forms while estimating parameters.

Page 9: Modeling with IRENE

Sum of Squared Residuals

[Plot: Height against Age with a fitted nonlinear curve]

A nonlinear model reduces the sum of squared residuals and better models the actual data.

Page 10: Modeling with IRENE

Anatomy of a neural network

[Diagram: a neural network with its Layers and Nodes labeled]

Page 11: Modeling with IRENE

What’s in a node?

A node contains a learning function. The learning function takes the inputs and the parameters and converts them to an output.
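A minimal Python sketch of a node (not IRENE's own R code). The slides do not spell out how inputs and parameters are combined, so the sketch assumes the learning function is applied to a weighted sum of the inputs. The names `node_output`, `exponential` and `sigmoid`, and the numbers used, are hypothetical.

```python
import math


def exponential(z):
    return math.exp(z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def node_output(inputs, params, learning_function):
    """Apply a learning function to the weighted sum of the inputs.

    Assumption: each input is paired with one parameter (alpha) and the
    node sees sum(alpha_i * x_i).
    """
    z = sum(a * x for a, x in zip(params, inputs))
    return learning_function(z)


# Illustrative values only: 0.2 * 1.0 + (-0.4) * 0.5 = 0.0
h_exp = node_output([1.0, 0.5], [0.2, -0.4], exponential)
h_sig = node_output([1.0, 0.5], [0.2, -0.4], sigmoid)
```

Swapping the learning function changes the node's output for the same inputs and parameters, which is what later distinguishes one species from another.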

Page 12: Modeling with IRENE

A model has parameter values

[Diagram: a node with parameters α11, α12, α13, α14]

Page 13: Modeling with IRENE

Let’s pretend the first observation contains these values

[Diagram: the inputs of the first observation. Values as extracted from the slide: 22, 2, -14, .1]

Page 14: Modeling with IRENE

Now say a model has these parameters:

[Diagram: the inputs (as extracted: 22, 2, -14, .1) and the model's parameters (as extracted: 5-4.12, 2)]

Page 15: Modeling with IRENE

And the learning function on this node is exponential

[Diagram: the inputs (as extracted: 1, 2, -14, .1) and the parameters (as extracted: 5-4.12, 2) feeding node h1]

Page 16: Modeling with IRENE

So the value for node h1 for the first observation is .1108

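[Diagram: the inputs (as extracted: 1, 2, -14, .1) and the parameters (as extracted: 5-4.12, 2) feeding node h1, which takes the value .1108]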

Page 17: Modeling with IRENE

This is repeated for each observation


Each model has its own unique set of α. The fitted values of the output are functions of

Observation   h1
1             .1108
2             1.524
3             .529
4             1.011
…             …
n             1.752

After this is complete a linear model is estimated. The output is regressed on the values of the nodes in the last layer. The sum of the squared residuals is assigned as the model’s value.
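The procedure above can be sketched in Python (not IRENE's own R code) for the simplest case: one node with an exponential learning function, whose value is computed for every observation before the output is regressed on it. The weighted-sum form of the node, the function name `model_value`, and the data are assumptions made for illustration.

```python
import math


def model_value(rows, y, params):
    """Value (SSR) of a one-node model with an exponential learning function."""
    # Node value for every observation.
    h = [math.exp(sum(a * x for a, x in zip(params, row))) for row in rows]
    # Simple linear regression of the output y on the node values h.
    n = len(h)
    mean_h = sum(h) / n
    mean_y = sum(y) / n
    shh = sum((hi - mean_h) ** 2 for hi in h)
    shy = sum((hi - mean_h) * (yi - mean_y) for hi, yi in zip(h, y))
    b1 = shy / shh
    b0 = mean_y - b1 * mean_h
    # The SSR of that regression is the model's value.
    return sum((yi - (b0 + b1 * hi)) ** 2 for hi, yi in zip(h, y))


# Made-up data generated from y = 3 + 2 * exp(0.5 * x).
rows = [[0.0], [1.0], [2.0], [3.0]]
y = [3 + 2 * math.exp(0.5 * row[0]) for row in rows]

good = model_value(rows, y, [0.5])  # parameters that match the data
poor = model_value(rows, y, [0.1])  # parameters that do not
```

A parameter set that matches the data-generating process gets a value near zero, while a mismatched set gets a larger SSR. This is the quantity the genetic algorithm later tries to minimize.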

Page 18: Modeling with IRENE

The linear model estimated

The sum of the squared residuals of the model (SSR) is referred to as the value of the model. We want a model that minimizes sum of squared residuals (or value).

Page 19: Modeling with IRENE

Linear model estimated in a more complex neural network

[Diagram: a network with first-layer nodes h11, h12, h13 and final-layer nodes h21, h22]

NOTE: h11, h12, h13 are not included in the final linear model. Only the nodes in the final layer are included in the linear model.

Page 20: Modeling with IRENE

Optimizing Parameters with Genetic Algorithms

Step 1: A population of models is created, each with randomly assigned parameters.

Step 2: Models ‘mate’ in the hope of creating ‘children’ models with better value (lower SSR).

From now on we will refer to each unique set of parameters in a model as a creature. A collection of creatures, models with identical topology but different parameters, is referred to as a species.

Page 21: Modeling with IRENE

Copy this model 200 times, each copy has randomly assigned parameter values

Each individual collection of parameters is referred to as a creature. The collection of creatures for a given topology (arrangement of layers and nodes) is referred to as a species.

[Diagram labels: Creature, Species]
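A minimal Python sketch of creating a species (not IRENE's own R code). The slides do not say how the random parameters are drawn, so a uniform draw over a hypothetical range is assumed. `new_species` and its arguments are illustrative names.

```python
import random


def new_species(n_creatures, n_params, low=-10.0, high=10.0, seed=None):
    """Create creatures that share a topology but differ in parameters.

    Assumption: each parameter is drawn uniformly from [low, high].
    """
    rng = random.Random(seed)
    return [
        [rng.uniform(low, high) for _ in range(n_params)]
        for _ in range(n_creatures)
    ]


# 200 copies of a model with four parameters (alpha11 .. alpha14).
species = new_species(200, 4, seed=1)
```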

Page 22: Modeling with IRENE

Species

A species has a unique arrangement of nodes, layers and learning functions. Even though these creatures have the same arrangement of layers and nodes, they have different learning functions and so they belong to different species.

[Diagram: a network with a Sigmoid Learning Function ≠ the same network with an Exponential Learning Function]

Page 23: Modeling with IRENE

Each creature then has its own computed value (SSR) and an assigned ID number. These are saved in a table.

Model ID   Sum Squared Resid. (SSR)
ID # 001   41,240
ID # 002   215,635
ID # 003   3,612

Page 24: Modeling with IRENE

Two creatures are selected with probability weighted according to model fitness.

Model ID   Sum Squared Resid. (SSR)
ID # 001   41,240
ID # 002   215,635
ID # 003   3,612
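A minimal Python sketch of fitness-weighted selection (not IRENE's own R code). The slides do not give the weighting formula, so weights proportional to 1/SSR are an assumption here, chosen so that a lower SSR means a higher chance of being picked. `select_parents` is a hypothetical name.

```python
import random


def select_parents(ssr_by_id, rng):
    """Pick two distinct creatures, favouring those with low SSR.

    Assumption: selection weight is 1 / SSR.
    """
    ids = list(ssr_by_id)
    weights = [1.0 / ssr_by_id[i] for i in ids]
    mother = rng.choices(ids, weights=weights, k=1)[0]
    father = mother
    while father == mother:  # redraw until the two parents differ
        father = rng.choices(ids, weights=weights, k=1)[0]
    return mother, father


# The table from the slide.
table = {"001": 41240, "002": 215635, "003": 3612}
mother, father = select_parents(table, random.Random(0))
```

Under this weighting, creature 003 (SSR 3,612) is drawn far more often than creature 002 (SSR 215,635).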

Page 25: Modeling with IRENE

Each creature can be represented by DNA

Model Structure   α11     α12    α13     α14
                  2.512   .105   51.25   -15.2

Page 26: Modeling with IRENE

Two methods of mating

Average
  Each parameter in the child’s DNA is the average of the corresponding parameters in the mother’s and father’s DNA.

Crossover
  A ‘cut point’ is randomly determined. Every parameter before the cut point is inherited from the father; every parameter after the cut point is inherited from the mother.

Page 27: Modeling with IRENE

DNA is selected from the two creatures chosen to mate.

Model ID   Sum Squared Resid. (SSR)
ID # 001   41,240
ID # 002   215,635
ID # 003   3,612

Mother: α11=2.512, α12=.105, α13=51.25, α14=-15.2

Father: α11=3.613, α12=26.252, α13=-25.12, α14=104.4

Page 28: Modeling with IRENE

Average Method

Father: α11=3.613, α12=26.252, α13=-25.12, α14=104.4

Mother: α11=2.512, α12=.105, α13=51.25, α14=-15.2

Child:
α11 = (3.613 + 2.512)/2 = 3.0625
α12 = (26.252 + .105)/2 = 13.1785
α13 = (-25.12 + 51.25)/2 = 13.065
α14 = (104.4 - 15.2)/2 = 44.6

Page 29: Modeling with IRENE

Average Method

Father: α11=3.613, α12=26.252, α13=-25.12, α14=104.4

Mother: α11=2.512, α12=.105, α13=51.25, α14=-15.2

Child: α11=3.0625, α12=13.1785, α13=13.065, α14=44.6
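The average method is a one-line computation. A minimal Python sketch (not IRENE's own R code), using the father and mother from the slide; `mate_average` is a hypothetical name.

```python
def mate_average(father, mother):
    """Child DNA: element-wise average of the parents' parameters."""
    return [(f + m) / 2 for f, m in zip(father, mother)]


father = [3.613, 26.252, -25.12, 104.4]
mother = [2.512, 0.105, 51.25, -15.2]
child = mate_average(father, mother)
```

Up to floating-point rounding, `child` holds the slide's values 3.0625, 13.1785, 13.065 and 44.6.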

Page 30: Modeling with IRENE

Crossover Method

A random number between one and the length of the parameter sequence is chosen. This is the ‘cut point’. The child inherits parameters from the father before this point, and from the mother after.

Page 31: Modeling with IRENE

Crossover Method: Cut point at position two

Father: α11=3.613, α12=26.252, α13=-25.12, α14=104.4

Mother: α11=2.512, α12=.105, α13=51.25, α14=-15.2

Child: α11=3.613, α12=26.252 (from the father), α13=51.25, α14=-15.2 (from the mother)
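A minimal Python sketch of the crossover method (not IRENE's own R code). Following the slide's example, a cut point of two means the first two parameters come from the father. `mate_crossover` and its arguments are hypothetical names.

```python
import random


def mate_crossover(father, mother, cut_point=None, rng=random):
    """Child DNA: father's parameters up to the cut point, mother's after.

    If no cut point is given, one is drawn between 1 and the DNA length.
    """
    if cut_point is None:
        cut_point = rng.randint(1, len(father))
    return father[:cut_point] + mother[cut_point:]


father = [3.613, 26.252, -25.12, 104.4]
mother = [2.512, 0.105, 51.25, -15.2]
child = mate_crossover(father, mother, cut_point=2)
```

With the cut point at the full DNA length the child is a copy of the father, so an implementation may want to restrict the range. The slides do not say how IRENE handles that case.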

Page 32: Modeling with IRENE

The least fit creatures are killed to make room for the new children

Model ID   Sum Squared Resid. (SSR)
ID # 001   41,240
ID # 002   3,289
ID # 003   215,635

Page 33: Modeling with IRENE

The least fit creatures are killed to make room for the new children

Model ID   Sum Squared Resid. (SSR)
ID # 001   41,240
ID # 002   3,289

Page 34: Modeling with IRENE

The least fit creatures are killed to make room for the new children

Model ID   Sum Squared Resid. (SSR)
ID # 001   41,240
ID # 002   3,289

New child (Model Structure): α11=3.0625, α12=13.1785, α13=13.065, α14=44.6

Page 35: Modeling with IRENE

The children are assigned new ID numbers and their value (SSR) is computed

Model ID   Sum Squared Resid. (SSR)
ID # 001   41,240
ID # 002   3,289
ID # 004   6,755
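The kill-and-replace step from the last few slides can be sketched in Python (not IRENE's own R code). The table is a dictionary from creature ID to SSR. `replace_least_fit` and the explicit `next_id` argument are hypothetical choices made for the sketch.

```python
def replace_least_fit(table, child_ssr, next_id):
    """Remove the creature with the highest SSR and add the child."""
    survivors = dict(table)
    worst_id = max(survivors, key=survivors.get)  # highest SSR = least fit
    del survivors[worst_id]
    survivors[next_id] = child_ssr
    return survivors


# Values from the slides: creature 003 is killed, child 004 takes its place.
table = {1: 41240, 2: 3289, 3: 215635}
table = replace_least_fit(table, child_ssr=6755, next_id=4)
```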

Page 36: Modeling with IRENE

This process repeats several times

Model ID   Sum Squared Resid. (SSR)
ID # 001   41,240
ID # 002   3,289
ID # 004   6,755

Page 37: Modeling with IRENE

This process repeats several times

Model ID   Sum Squared Resid. (SSR)
ID # 005   4,242
ID # 002   3,289
ID # 004   6,755

Page 38: Modeling with IRENE

This process repeats several times

Model ID   Sum Squared Resid. (SSR)
ID # 005   4,242
ID # 002   3,289
ID # 007   3,111

Page 39: Modeling with IRENE

This process repeats several times

Model ID   Sum Squared Resid. (SSR)
ID # 008   4,841
ID # 002   3,289
ID # 007   3,111

Page 40: Modeling with IRENE

Eventually there is convergence at an optimum (either local or global)

Model ID   Sum Squared Resid. (SSR)
ID # 239   2,015
ID # 159   2,015
ID # 412   2,015

Page 41: Modeling with IRENE

At convergence we kill all the extra creatures in the species (to free up memory)

Model ID   Sum Squared Resid. (SSR)
ID # 239   2,015
ID # 159   2,015
ID # 412   2,015

Page 42: Modeling with IRENE

What is neural evolution?

Neural evolution: simultaneously explore new topologies while optimizing existing topologies. New species are born out of old species.

Page 43: Modeling with IRENE

‘Growing’ new nodes

(We don’t always wait for convergence to add new layers and nodes…)

Page 44: Modeling with IRENE

We call each arrangement of layers, nodes and learning functions a species.

Page 45: Modeling with IRENE

Who lives? Who dies?

After each generation a roster of all creatures is created and ordered according to value.

Species ID   Creature ID   Value (SSR)
003          043           12123
003          021           12552
002          231           13241
003          054           15125
001          152           20150
005          024           25124
003          122           35102
002          105           53039
…            …             …
001          412           124310151

Page 46: Modeling with IRENE

Who lives? Who dies?

If at least one creature of a species is in the top 60%* of the list of all creatures, the species survives. Otherwise the entire species is eradicated.

*60% is arbitrary. We can set that to other proportions. We’ll talk about this more in engineered genetic algorithms.


Page 47: Modeling with IRENE

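Example:

Species of each creature in the roster, in order: 2, 2, 3, 3, 2, 2, 3, 1, 1, 3

[Diagram: the top 60% of the roster is marked off; pictures of Species 2, Species 1 and Species 3]

Survivors: No creature of Species 1 is among them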
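The survival rule can be sketched in a few lines of Python (not IRENE's own R code), using the species sequence from this example. `surviving_species` and `keep_fraction` are hypothetical names, and the roster is assumed to be ordered from best to worst value.

```python
def surviving_species(roster, keep_fraction=0.6):
    """Species with at least one creature in the top share of the roster.

    `roster` lists the species ID of every creature, best value first.
    """
    cutoff = round(len(roster) * keep_fraction)
    return set(roster[:cutoff])


roster = [2, 2, 3, 3, 2, 2, 3, 1, 1, 3]
survivors = surviving_species(roster)
extinct = set(roster) - survivors
```

The top six entries contain only Species 2 and 3, so Species 1 is eradicated, as on the slide.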

Page 48: Modeling with IRENE

While each species searches for optima, new ones are born and others die out.

Page 49: Modeling with IRENE

We could search forever, but we stop our search based on time or generations elapsed.

Page 50: Modeling with IRENE

Special Population Attributes

Jump connections, user defined libraries and learning functions, and mutating functional forms.

Page 51: Modeling with IRENE

Jump Connections

With jump connections, the output is regressed on all nodes and inputs.

In a standard neural network, the output is regressed only on the nodes in the final layer.

Page 52: Modeling with IRENE

Jump Connections

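[Diagram: a network with inputs x1, x2, x3, x4, first-layer nodes h11, h12 and second-layer nodes h21, h22, h23, all connected to the output]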

Page 53: Modeling with IRENE

Collinearity

If jump connections are used and the learning function is linear, then the final linear model will have perfect collinearity. The computer won’t be able to estimate the final model, so a failsafe is built in to prevent this from happening.
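To see why a linear learning function causes trouble with jump connections, this Python sketch (not IRENE's own R code) builds the columns of the final regression for a small made-up data set and checks their rank. The rank routine, the data and the node weights are all illustrative assumptions; IRENE's actual failsafe is not described on the slides.

```python
import math


def matrix_rank(matrix, tol=1e-9):
    """Rank by Gaussian elimination with partial pivoting."""
    rows = [list(map(float, r)) for r in matrix]
    found = 0
    for col in range(len(rows[0])):
        if found == len(rows):
            break
        pivot = max(range(found, len(rows)), key=lambda r: abs(rows[r][col]))
        if abs(rows[pivot][col]) < tol:
            continue  # no usable pivot: this column adds no new information
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found


inputs = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8)]  # made-up observations


def design(learning_function):
    # Columns: intercept, x1, x2 (via jump connections), and one node value.
    return [[1, x1, x2, learning_function(2 * x1 - 3 * x2)] for x1, x2 in inputs]


rank_linear = matrix_rank(design(lambda z: z))
rank_exponential = matrix_rank(design(lambda z: math.exp(0.1 * z)))
```

With a linear learning function the node column is an exact linear combination of the input columns, so the four columns only have rank 3 and the regression coefficients are not identifiable. With the exponential learning function the rank is the full 4.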

Page 54: Modeling with IRENE

Collinearity

Page 55: Modeling with IRENE

Libraries of Learning Functions

Each time a node is created a learning function is randomly selected from the library.

Function 1: Exponential

Function 2: Sigmoid

Function 3: Logit

Function 4: Step Function

Page 56: Modeling with IRENE

User Defined Functions

Suppose theory dictates that a particular nonlinear relationship possibly exists. For example, consider the Michaelis-Menten model of enzyme kinetics.

The researcher can add this functional form to the library to be selected as a possible learning function for nodes.

The standard library contains common functional forms. However, certain cases may require special functional forms, which can be added by the researcher as needed.
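A minimal Python sketch of a library extended with a user-defined function (not IRENE's own R code or its actual library format). The Michaelis-Menten rate law v = Vmax * S / (Km + S) is the standard textbook form. The dictionary layout and the names `library`, `michaelis_menten` and `pick_learning_function` are assumptions.

```python
import math
import random


def michaelis_menten(s, v_max=1.0, k_m=1.0):
    """Michaelis-Menten rate: v = v_max * s / (k_m + s)."""
    return v_max * s / (k_m + s)


# A hypothetical stand-in for a standard library of learning functions.
library = {
    "exponential": math.exp,
    "sigmoid": lambda z: 1.0 / (1.0 + math.exp(-z)),
}

# The researcher adds the theory-driven functional form.
library["michaelis_menten"] = michaelis_menten


def pick_learning_function(rng):
    """Randomly select a learning function name for a newly created node."""
    return rng.choice(sorted(library))


name = pick_learning_function(random.Random(0))
```

At a substrate level equal to `k_m` the rate is half of `v_max`, which gives a quick sanity check on the function.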

Page 57: Modeling with IRENE

Mutating learning functions

The researcher can also choose to allow for mutating learning functions.

Function 3: Sigmoid

Function 5: Exponential

New Function: Composite (new composite learning function)

[The formulas shown on the slide for these three functions did not survive extraction]

Page 58: Modeling with IRENE

Populations

Each collection of species is called a population.

Page 59: Modeling with IRENE

Population attributes

Max creatures in species
Library
Allow functional mutations
Maximum layers
Maximum nodes
Mutation rates
Allow jump connections
etc.

Page 60: Modeling with IRENE

Determining population attributes

How many generations should a population be allowed to run?
Should jump connections be allowed?
What portion of the roster should be the cut off point for determining species survival?
Should function mutations be allowed?
And other settable attributes…

Page 61: Modeling with IRENE

Populations can be represented with DNA too

Attribute                Population 1   Population 2
Max creatures            200            150
Library                  StdLib         UsrDef
Maximum Layers           3              4
Maximum Generations      5000           7000
Allow Jump Connections   YES            NO

Page 62: Modeling with IRENE

Engineered Genetic Algorithms

Engineered Genetic Algorithms refers to using genetic algorithms to find the optimal population settings for a neural evolution algorithm.

Page 63: Modeling with IRENE

Parsing the data set

Data set of observations

Page 64: Modeling with IRENE

Parsing the data set

[Diagram: the data set is split into a Training Data Set, a Validation Data Set and a Second Validation Data Set]
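A minimal Python sketch of the three-way split (not IRENE's own R code). The slides do not give the proportions or say whether rows are shuffled, so the 60/20/20 split, the shuffling and the name `split_three_ways` are assumptions.

```python
import random


def split_three_ways(observations, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle the observations and split them into three data sets.

    Assumption: 60% training, 20% validation, 20% second validation.
    """
    rows = list(observations)
    random.Random(seed).shuffle(rows)
    n_train = round(len(rows) * fractions[0])
    n_valid = round(len(rows) * fractions[1])
    training = rows[:n_train]
    validation = rows[n_train:n_train + n_valid]
    second_validation = rows[n_train + n_valid:]
    return training, validation, second_validation


# 50 observation indices standing in for a real data set.
training, validation, second_validation = split_three_ways(range(50))
```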

Page 65: Modeling with IRENE

Evaluating Populations

Creatures are evaluated on how well they fit the training data. Creatures that minimize SSE in the training data set are considered most fit.

Populations are evaluated on how well they predict out of sample. The best creature the population is able to produce is evaluated in the validation data set and its SSE is computed. The population that produces the creature that minimizes SSE in the validation data set is considered most fit.

Page 66: Modeling with IRENE

Example with 3 populations

[Diagram: three populations, Pop1, Pop2 and Pop3]

Each population chooses its champion.

Page 67: Modeling with IRENE

The Champion

Recall: each species is composed of several creatures. The champion is the optimal creature of the optimal species in the population, i.e. the creature that best minimizes SSE in the entire population.

Page 68: Modeling with IRENE

The Champions Compete

[Diagram: the champions of Pop1, Pop2 and Pop3 are evaluated on the Validation Data Set]

Page 69: Modeling with IRENE

Out of Sample Evaluation

[Plot: Real Data, Pop1 Prediction, Pop2 Prediction and Pop3 Prediction on the Validation Data Set; axis ticks from 0 to 6]

In this example, Pop3 performs best and Pop1 is worst. Pop3 and Pop2 are most likely to be selected for mating.
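The two-level evaluation can be sketched in Python (not IRENE's own R code): each population's champion is the creature with the lowest SSE on the training data, and populations are then ranked by their champion's SSE on the validation data. Creatures are represented as plain prediction functions, and all data and function names are made up for illustration.

```python
def sse(actual, predicted):
    """Sum of squared errors between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def champion_of(creatures, train_x, train_y):
    """The creature with the lowest SSE on the training data."""
    return min(creatures, key=lambda f: sse(train_y, [f(x) for x in train_x]))


def rank_populations(populations, train, valid):
    """Population names ordered from best to worst validation SSE."""
    train_x, train_y = train
    valid_x, valid_y = valid
    scores = {}
    for name, creatures in populations.items():
        champ = champion_of(creatures, train_x, train_y)
        scores[name] = sse(valid_y, [champ(x) for x in valid_x])
    return sorted(scores, key=scores.get)


# Made-up data from y = x**2, and made-up creatures for three populations.
train = ([0, 1, 2, 3], [0, 1, 4, 9])
valid = ([4, 5], [16, 25])
populations = {
    "Pop1": [lambda x: 3 * x, lambda x: 2 * x],
    "Pop2": [lambda x: x * x + 1, lambda x: x + 5],
    "Pop3": [lambda x: x * x, lambda x: x * x - 2],
}
ranking = rank_populations(populations, train, valid)
```

The made-up creatures are chosen so that the ranking mirrors the slide: Pop3 first, Pop2 second, Pop1 last.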

Page 70: Modeling with IRENE

Population parameters come in two varieties

Numerical Continuous
  Examples: Max Layers (3), Initial Species Population (300)
  Mating may be either crossover or averaging
  Need to round if averaging

Switches
  Examples: Allow Jump Connections (TRUE/FALSE), Mating Rule (Average / Crossover / Both)
  Mating must be crossover, with a higher probability of selecting the father’s (the better-performing model’s) traits.

Page 71: Modeling with IRENE

Population mating

Attribute                    Father (Pop3)   Mother (Pop2)   Child (Pop4)
Jump Connections             YES             NO
Initial Species Population   300             150             225
Max Layers                   3               2
Max nodes per layer          7               4
Mutation Rate                .15             .05             .10

[Additional values overlaid on the slide, as extracted: YES, 300, 150, 3, 4, .15, .05]
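A minimal Python sketch of mating two populations' settings (not IRENE's own R code). It averages every numerical setting, rounding integers half up, and inherits switches by a coin flip biased toward the father. The slides allow either crossover or averaging for numerical settings and do not give the bias, so the always-average rule, the 0.7 bias and the name `mate_populations` are assumptions.

```python
import math
import random


def mate_populations(father, mother, rng, father_bias=0.7):
    """Combine two populations' settings into a child's settings."""
    child = {}
    for key in father:
        f, m = father[key], mother[key]
        if isinstance(f, bool):
            # Switch: crossover, favouring the father's trait.
            child[key] = f if rng.random() < father_bias else m
        elif isinstance(f, int):
            # Integer setting: average, then round half up.
            child[key] = math.floor((f + m) / 2 + 0.5)
        else:
            # Continuous setting: plain average.
            child[key] = (f + m) / 2
    return child


father = {"jump_connections": True, "initial_species_population": 300,
          "max_layers": 3, "max_nodes_per_layer": 7, "mutation_rate": 0.15}
mother = {"jump_connections": False, "initial_species_population": 150,
          "max_layers": 2, "max_nodes_per_layer": 4, "mutation_rate": 0.05}
child = mate_populations(father, mother, random.Random(0))
```

The child's initial species population (225) and mutation rate (about .10) agree with the values legible on the slide. The other child settings on the slide may have been produced by crossover rather than averaging, so this sketch does not claim to reproduce them.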

Page 72: Modeling with IRENE

And then the new population (Pop4) searches for its champion who will then go compete

Page 73: Modeling with IRENE

Museums

Page 74: Modeling with IRENE

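Recall Previous Example:

Species of each creature in the roster, in order: 4, 2, 4, 2, 4, 4, 2, 3, 2, 3

[Diagram: the top 60% of the roster is marked off; pictures of Species 2, Species 4 and Species 3]

Survivors: No creature of Species 3 is among them

The optimal creature of the now extinct Species 3 is saved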

Page 75: Modeling with IRENE

Recall Previous Example:

[Diagram: Population 1 and its Museum of “Natural” History]

This creature is saved in the Museum. The optimal creature of each species as it goes extinct is also saved. When the population has completed its specified number of generations the optimal creature from each remaining species is also saved to the museum.
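A minimal Python sketch of culling with a museum (not IRENE's own R code). Each creature is a (species ID, creature ID, value) tuple. Before a species is eradicated, its best creature is appended to the museum. The function name, the tuple layout and the values are assumptions, and the end-of-run step (saving the best creature of each remaining species) is left out.

```python
def cull_with_museum(roster, museum, keep_fraction=0.6):
    """Apply the survival rule, archiving the best creature of each
    species that goes extinct. Returns the surviving creatures."""
    ordered = sorted(roster, key=lambda creature: creature[2])  # lowest value first
    cutoff = round(len(ordered) * keep_fraction)
    alive = {species for species, _, _ in ordered[:cutoff]}
    archived = set()
    for creature in ordered:
        species = creature[0]
        # The first creature seen for an extinct species is its best one.
        if species not in alive and species not in archived:
            museum.append(creature)
            archived.add(species)
    return [creature for creature in ordered if creature[0] in alive]


# Species sequence from the slide, with made-up creature IDs and values.
species_order = [4, 2, 4, 2, 4, 4, 2, 3, 2, 3]
roster = [(s, i + 1, 100 + i) for i, s in enumerate(species_order)]
museum = []
survivors = cull_with_museum(roster, museum)
```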

Page 76: Modeling with IRENE

[Diagram: Population 1, its Museum of “Natural” History and the Validation Data Set]

Why have a Museum?

Neural Networks may ‘over fit’ training data. A good predictive model may go extinct.

Evaluate models in the museum to make sure we didn’t miss a good predictive model.

Page 77: Modeling with IRENE

The End… (not of the slide show, don’t get up yet)

At a specified ‘end’ of the algorithm all creatures from all museums are collected into a master list.

Each creature in the list is evaluated on the validation data set.

[Table: Creature and Value columns for the master list, evaluated on the Validation Data Set; the individual entries run together in the extraction and are not recoverable]

Page 78: Modeling with IRENE

The End… (not of the slide show, don’t get up yet)

The best model on the Validation Data Set is chosen as the candidate. If it passes a second round of validation on the Second Validation Data Set, it is selected. If it doesn’t pass the second round of validation, the next best model is tried.

[Table: Creature and Value columns for the master list; the individual entries run together in the extraction and are not recoverable]

SUCCESS!
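The final selection can be sketched as a short loop in Python (not IRENE's own R code). The slides do not define what passing the second validation means, so the pass criterion is supplied by the caller. `select_final_model` and the example data are hypothetical.

```python
def select_final_model(master_list, validation_sse, passes_second_validation):
    """Try creatures from best to worst validation SSE and return the
    first one that passes the second round of validation."""
    for creature in sorted(master_list, key=validation_sse):
        if passes_second_validation(creature):
            return creature
    return None  # no creature passed


# Made-up master list with validation SSE values.
scores = {"a": 12, "b": 16, "c": 30}
final = select_final_model(
    master_list=["c", "a", "b"],
    validation_sse=scores.get,
    # Hypothetical criterion: pretend creature "a" fails the second round.
    passes_second_validation=lambda creature: creature != "a",
)
```

Here the best creature on the first validation set fails the second round, so the next best one is returned.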

Page 79: Modeling with IRENE

And so the final Model is selected.

This is the predictive model the algorithm returns.
