evolutionary algorithms: lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpge:...

29
Evolutionary Algorithms: Lecture 3 Jiˇ ı Kubal´ ık Department of Cybernetics, CTU Prague http://labe.felk.cvut.cz/~posik/xe33scp/

Upload: others

Post on 23-Aug-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

Evolutionary Algorithms: Lecture 3

Jirı KubalıkDepartment of Cybernetics, CTU Prague

http://labe.felk.cvut.cz/~posik/xe33scp/

Page 2: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pContents

� Genetic Programming cont.:

− Examples – trigonometric identity, symbolic regression.

− Closure property

∗ Simple blind variation operators tend to generate illegal trees.

∗ Strongly typed GP.

� Grammatical Evolution:

− Linear string (binary, integer) representation.

− Utilizes simple GA crossover and mutation operators.

− Use of a context-free grammar to generate trees.

− Examples.

� Application of GP to

− induction of Multivariate Decision Trees,

− construction of Forest of decision trees.

� � Evolutionary Algorithms: Lecture 3

Page 3: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pGP: Trigonometric Identity

:: Task is to find an equivalent expression to cos(2x).

:: GP implementation:

� Terminal set T = {x, 1.0}.

� Function set F = {+,−, ∗, %, sin}.

� Training cases: 20 pairs (xi, yi), where xi are values evenly distributed in interval (0, 2π).

� Fitness: Sum of absolute differences between desired yi and the values returned by generated

expressions.

� Stopping criterion: A solution found that gives the error less than 0.01.

� � � Evolutionary Algorithms: Lecture 3

Page 4: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pGP: Trigonometric Identity cont.

:: 1. run, 13th generation

(−(−1(∗(sinx)(sinx))))(∗(sinx)(sinx)))

which equals (after editing) to 1–2 ∗ sin2x

:: 2. run, 34th generation(−1(∗(∗(sinx)(sinx))2))

which is just another way of writing the same expression.

:: 3. run, 30th generation

(sin (−(−2(∗x2))

(sin(sin(sin(sin(sin(sin(∗(sin (sin1))

(sin1))

)))))))))

Note that the expression on the second and third row is almost equal to π/2 so the discovered

identity is

cos(2x) = sin(π/2–2x)

.

� � � � Evolutionary Algorithms: Lecture 3

Page 5: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pGP: Symbolic Regression

:: Task is to find a function that fits to training data evenly sampled from interval < −1.0, 1.0 >.

:: GP implementation:

� Terminal set T = {x}.� Function set F = {+,−, ∗, %, sin, cos}.� Training cases: 20 pairs (xi, yi), where xi are values evenly distributed in interval (−1, 1).

� Fitness: Sum of errors calculated over all (xi, yi) pairs.

� Stopping criterion: A solution

found that gives the error less than

0.01.

:: Besides the desired functionother three were found

� with a very strange behavior outside

the interval of training data,

� though optimal with respect to the

defined fitness.

� � � � � Evolutionary Algorithms: Lecture 3

Page 6: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pGP: Evolving Fuzzy-rule based Classifier

:: Classifier consists of fuzzy if-then rules of type

IF (x1 is medium) and (x3 is large) THEN class = 1 with cf = 0.73

:: Linguistic terms – small, medium small, medium, medium large, large.

:: Fuzzy membership functions – approximate the confidence in that the crisp value is

represented by the linguistic term.

� � � � � � Evolutionary Algorithms: Lecture 3

Page 7: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pRule Base Example

:: Three rules connected by OR.

� � � � � � � Evolutionary Algorithms: Lecture 3

Page 8: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pIllegal Tree

:: Tree does not represent a correct rule base.

� obviously due to the fact that the closure property does not hold here.

:: This might happen very often since crossover, mutation etc. are just blind operators.

What can we do?

� � � � � � � � Evolutionary Algorithms: Lecture 3

Page 9: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pStrongly typed GP

:: Strongly typed GP

� prevents generating illegal individuals,

� quite a big overhead =⇒ inefficient for large trees.

:: Can be done any better?

� � � � � � � � � Evolutionary Algorithms: Lecture 3

Page 10: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pGrammatical Evolution (GE)

:: Designed to evolve programs in any language, that can be described by a context free grammar.

:: Evolves tree structures, but operates on simple linear string chromosomes :-o

:: Backus-Naur Form (BNF)

� production rules P ,

� terminals T – non-expandable items,

� non-terminals N — can be expanded into one or more items ∈ N ∪ T .

� � � � � � � � � � Evolutionary Algorithms: Lecture 3

Page 11: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pGE: Representation

:: GE does not work with a natural tree representation.

� it operates with binary/integer strings.

:: Genotype -– phenotype mapping

1. Binary string is translated into a sequence of integers – codons

2. Each codon specifies the production rule to be applied for currently expanded non-terminal

3. Mapping finishes when all of the nonterminals have been expanded� multiple codon values can determine the same rule,

� useful redundancy in genetic code.

!!! Only syntactically correct programs can be generated !!!

� � � � � � � � � � � Evolutionary Algorithms: Lecture 3

Page 12: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pGE: Construction of Program Tree

:: Grammar in Backus-Naur Form :: Chromosome: 6 4 9 3 5 8 8 6 2

� � � � � � � � � � � � Evolutionary Algorithms: Lecture 3

Page 13: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pGE: Construction of Program Tree – Ant Example

� � � � � � � � � � � � � Evolutionary Algorithms: Lecture 3

Page 14: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pGE: Crossover

:: 1-point (riple) crossover

:: Effect: The head sequence of codons does not change its meaning, while the tale sequence

may or may not change its interpretation:

� good generative and explorative characteristics.

� on the other hand, most of the time it is just a big mutation!

� � � � � � � � � � � � � � Evolutionary Algorithms: Lecture 3

Page 15: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pGP: Decision Trees

:: Classical decision trees (DT)� Inner nodes represent simple decisions of type

att > const, att = const

⇒ axis-parallel splits, in some cases not very efficient way to partition the attribute space.

� Leaf node indicates a class to which the sample, which corresponds to the decisions made

along the branch from root to the leaf, belongs.

� Standard learning algorithms – Quinlan’s ID3 (Iterative Dichotomiser 3).

� � � � � � � � � � � � � � � Evolutionary Algorithms: Lecture 3

Page 16: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pGP: Decision Trees cont.

:: Oblique/Multivariate DT

� Inner (decision) nodes represent complex rules (functions).

� More flexible splits are possible.

� But may be hard to understand and interpret.

:: This looks much better, but how to find the rules?

� � � � � � � � � � � � � � � � Evolutionary Algorithms: Lecture 3

Page 17: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pMultivariate DT: Rule Structure

:: Root node returns boolean value {true, false}

� � � � � � � � � � � � � � � � � Evolutionary Algorithms: Lecture 3

Page 18: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pIntertwined Spirals Problem

� � � � � � � � � � � � � � � � � � Evolutionary Algorithms: Lecture 3

Page 19: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pMultivariate DT: Effect of Terminals and Functions Selection

:: Without any prior knowledge A

� Terminals: Ta = {x, y, R},

� Functions: Fa = {+,−, ?, (> 0)}.

:: With prior knowledge about the circular characteristics of data

� Polar coordinates.

� Terminals: Tb = {%, ϕ, R}, where % and ϕ are radius and phase of data points,

% =√

x2 + y2 and ϕ = arctan yx

� Functions: Fb = {+,−, ∗, (> 0), sin}.

� � � � � � � � � � � � � � � � � � � Evolutionary Algorithms: Lecture 3

Page 20: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pMultivariate DT: Classifiers Evolved with {Ta, Fa}

This does not look nice.

� � � � � � � � � � � � � � � � � � � � Evolutionary Algorithms: Lecture 3

Page 21: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pMultivariate DT: Classifiers Evolved with {Ta, Fa} cont.

Neither does this one.

� � � � � � � � � � � � � � � � � � � � � Evolutionary Algorithms: Lecture 3

Page 22: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pMultivariate DT: Classifiers Evolved with {Tb, Fb}

This is better already.

� � � � � � � � � � � � � � � � � � � � � � Evolutionary Algorithms: Lecture 3

Page 23: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pMultivariate DT: Classifiers Evolved with {Tb, Fb} cont.

Wow, this one is really nice!!!

� � � � � � � � � � � � � � � � � � � � � � � Evolutionary Algorithms: Lecture 3

Page 24: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pDecision Forests

:: Decision forest combines the predictions of multiple independent DT models

� How to make a collective decision to determine a class attributed to a new observed

data sample?

− Majority voting – a class that is preferred by the majority of DTs (with equal contributions)

is chosen.

− Weighted majority voting – number of mistakes accomplished by each DT on training

sample is taken into account. The fewer mistakes, the greater contribution of the DT.

� When models of similar predictive quality are combined using the Decision Forest method,

quality compared to the individual models is consistently and significantly improved on both

training and testing data.

� The stochastic element in the decision tree forest algorithm makes it highly resistant to

over fitting.

� The primary disadvantage of decision tree forests is that the model is complex and

cannot be visualized like a single tree.

� � � � � � � � � � � � � � � � � � � � � � � � Evolutionary Algorithms: Lecture 3

Page 25: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pGP and Decision Forests

:: Generally, EAs are able to evolve different trees on the same training data.

� However, without any support by designer the differences might be negligible.

� � � � � � � � � � � � � � � � � � � � � � � � � Evolutionary Algorithms: Lecture 3

Page 26: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pEvolution of Different Root Splits: Strategy

:: The most influential is the root decision/split.

:: Strategy for generating maximally different DTs:� First, tree is induced using pure information gain as fitness.

� Next, we want to produce a significantly different tree.

− The root is therefore constructed to maximize information gain on uncertainty other than

that eliminated by root tests of previous tree(s).

This can be interpreted as ”don’t do the same as the previous root test(s)”.

Approach: Fitness function is defined so that root nodes are evolved to eliminate uncer-

tainty not eliminated by either of the root tests of the previously evolved tree(s).

− The rest of the tree is produced again with standard information gain.

:: Fitness of i-th root node test ti is calculated as

J(ti) = minj≤i−1{I(MPj) + I(MNj)}, i = 2 . . . R,

� MPj, MNj are positive and negative sets defined by the root node test of j-th tree. Index j

runs over all tests evolved before ti.

� I(MPj) and I(MNj) is the information gain of test ti (i.e. the amount of uncertainty (entropy)

eliminated by the test) obtained on the set MPj and MNj, respectively.

� � � � � � � � � � � � � � � � � � � � � � � � � � Evolutionary Algorithms: Lecture 3

Page 27: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pEvolution of Different Root Splits: Illustration of Algorithm

:: Choosing the root split of the second tree given the partitions of the training set determined

by the root split of the first tree

Out of the three candidates the RS22 has been chosen since it eliminates the biggest amount of

uncertainty from the partition-pair (MP1, MN1).

� � � � � � � � � � � � � � � � � � � � � � � � � � � Evolutionary Algorithms: Lecture 3

Page 28: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pEvolved Root Splits

� � � � � � � � � � � � � � � � � � � � � � � � � � � � Evolutionary Algorithms: Lecture 3

Page 29: Evolutionary Algorithms: Lecture 3labe.felk.cvut.cz/~posik/xe33scp/xe33scp_ea3.pdfpGE: Representation:: GE does not work with a natural tree representation. it operates with binary/integer

pEvolved Forest of Decision Trees: Intertwined Spirals

� � � � � � � � � � � � � � � � � � � � � � � � � � � � � Evolutionary Algorithms: Lecture 3