iwlcs'2007: substructural surrogates for learning decomposable classification problems:...

Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results
Albert Orriols-Puig 1,2, Kumara Sastry 2, David E. Goldberg 2, Ester Bernadó-Mansilla 1
1 Research Group in Intelligent Systems, Enginyeria i Arquitectura La Salle, Ramon Llull University
2 Illinois Genetic Algorithms Laboratory, Department of Industrial and Enterprise Systems Engineering, University of Illinois at Urbana Champaign

Upload: albert-orriols-puig

Post on 24-Jan-2015


TRANSCRIPT

Page 1: IWLCS'2007: Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results

Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results

Albert Orriols-Puig 1,2, Kumara Sastry 2, David E. Goldberg 2, Ester Bernadó-Mansilla 1

1 Research Group in Intelligent Systems, Enginyeria i Arquitectura La Salle, Ramon Llull University

2 Illinois Genetic Algorithms Laboratory, Department of Industrial and Enterprise Systems Engineering, University of Illinois at Urbana Champaign

Page 2: IWLCS'2007: Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results

Motivation

Nearly decomposable functions in engineering systems
• (Simon, 69; Gibson, 79; Goldberg, 02)

Design decomposition principle in:
– Genetic Algorithms (GAs)
  • (Goldberg, 02; Pelikan, 05; Pelikan, Sastry & Cantú-Paz, 06)
– Genetic Programming (GP)
  • (Sastry & Goldberg, 03)
– Learning Classifier Systems (LCS)
  • (Butz et al., 06; Llorà et al., 06)

Slide 2 GRSI Enginyeria i Arquitectura la Salle

Page 3: IWLCS'2007: Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results

Aim

Our approach: build surrogates for classification.
– Probabilistic models: induce the form of a surrogate
– Regression: estimate the coefficients of the surrogate
– Use the surrogate to classify new examples

Surrogates used for efficiency enhancement of:
– GAs:
  • (Sastry, Pelikan & Goldberg, 2004)
  • (Pelikan & Sastry, 2004)
  • (Sastry, Lima & Goldberg, 2006)
– LCSs:
  • (Llorà, Sastry, Yu & Goldberg, 2007)

Page 4: IWLCS'2007: Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results

Outline

1. Methodology

2. Implementation

3. Test Problems

4. Results

5. Discussion

6. Conclusions

Page 5: IWLCS'2007: Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results

Overview of the Methodology

Structural Model Layer
• Extract dependencies between attributes
• Expressed as:
  • Linkage groups
  • Matrices
  • Bayesian networks
• Example: x-or

Surrogate Model Layer
• Infer structural form and the coefficients based on the structural model (Sastry, Lima & Goldberg, 2006)

Classification Model Layer
• Use the surrogate function to predict the class of new examples

Page 6: IWLCS'2007: Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results

Surrogate Model Layer

Input: Database of labeled examples with nominal attributes

Substructure identified by the structural model layer
– Example: [0 2] [1]

Surrogate form induced from the structural model:

$f(x_0, x_1, x_2) = c_I + c_0 x_0 + c_1 x_1 + c_2 x_2 + c_{0,2}\, x_0 x_2$

Coefficients of surrogates: linear regression methods

Page 7: IWLCS'2007: Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results

Surrogate Model Layer

Map input examples to schema-index matrix
– Consider all the schemas. For [0 2] [1]:
  {0*0, 0*1, 1*0, 1*1, *0*, *1*}
– In general, the number of schemas to be considered is:

$m_{sch} = \sum_{i=1}^{m} \prod_{j=1}^{k_i} \chi_{i,j}$

where $m$ is the number of linkage groups, $k_i$ is the number of variables in the $i$th linkage group, and $\chi_{i,j}$ is the cardinality of the $j$th variable of the $i$th linkage group. For [0 2] [1] with binary variables, $m_{sch} = 2 \cdot 2 + 2 = 6$.

Page 8: IWLCS'2007: Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results

Surrogate Model Layer

Map each example to a binary vector of size $m_{sch}$
– For each building block, the entry corresponding to the schema contained in the example is one. All the others are zero.

         [0 2]                [1]
         0*0  0*1  1*0  1*1   *0*  *1*
010       1    0    0    0     0    1
100       0    0    1    0     1    0
111       0    0    0    1     0    1
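The schema mapping above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`enumerate_schemas`, `schema_to_string`, `map_example`) are hypothetical and do not come from the slides, and variables are assumed to take integer values `0 .. cardinality-1`.

```python
from itertools import product

def enumerate_schemas(groups, cardinalities):
    """One schema per value combination of each linkage group (hypothetical helper)."""
    schemas = []
    for group in groups:
        value_ranges = [range(cardinalities[v]) for v in group]
        for values in product(*value_ranges):
            schemas.append((tuple(group), values))
    return schemas

def schema_to_string(schema, n_vars):
    """Render a schema, using '*' for variables outside its linkage group."""
    group, values = schema
    chars = ["*"] * n_vars
    for var, val in zip(group, values):
        chars[var] = str(val)
    return "".join(chars)

def map_example(example, schemas):
    """Binary vector with a 1 for every schema that the example matches."""
    return [int(all(example[v] == val for v, val in zip(group, values)))
            for group, values in schemas]

# Linkage groups [0 2] [1] over three binary variables, as on the slide.
groups = [[0, 2], [1]]
cardinalities = [2, 2, 2]
schemas = enumerate_schemas(groups, cardinalities)
names = [schema_to_string(s, 3) for s in schemas]
rows = {ex: map_example([int(b) for b in ex], schemas)
        for ex in ("010", "100", "111")}

print(names)
for ex, row in rows.items():
    print(ex, row)
```

The number of enumerated schemas is the sum over linkage groups of the product of the cardinalities in each group, so six schemas result for this example.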

Page 9: IWLCS'2007: Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results

Surrogate Model Layer

This results in the following matrix:

[Matrix of the n input examples mapped to the schemas]

Each class is mapped to an integer

Page 10: IWLCS'2007: Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results

Surrogate Model Layer

Solve the following linear system to estimate the coefficients of the surrogate model:

$A\,x = y$

– $A$: input instances mapped to the schemas matrix
– $x$: coefficients of the surrogate model
– $y$: class or label of each input instance

To solve it, we use a multi-dimensional least squares fitting approach:
– Minimize: $\lVert A\,x - y \rVert^2$
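As a rough illustration of the least squares step, the sketch below fits the coefficients with NumPy's `numpy.linalg.lstsq`. The slides do not say which solver the authors used, so the solver is a stand-in. The toy label (x0 XOR x2), the names `A`, `y`, `coeffs`, and the helper `map_example` are all assumptions made for this example.

```python
import numpy as np

def map_example(bits):
    """Map a 3-bit example to the schemas 0*0, 0*1, 1*0, 1*1, *0*, *1*."""
    x0, x1, x2 = bits
    row = [0] * 6
    row[2 * x0 + x2] = 1   # schema matched in linkage group [0 2]
    row[4 + x1] = 1        # schema matched in linkage group [1]
    return row

# All eight 3-bit examples and a toy class label (assumption): x0 XOR x2.
examples = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
A = np.array([map_example(e) for e in examples], dtype=float)
y = np.array([a ^ c for a, b, c in examples], dtype=float)

# Least squares fit: minimizes the squared norm of A @ coeffs - y.
# The schema matrix is rank deficient, so lstsq returns the minimum-norm solution.
coeffs, _, rank, _ = np.linalg.lstsq(A, y, rcond=None)

# Rounding the surrogate output recovers the integer class of each example.
predictions = np.rint(A @ coeffs).astype(int)
print(predictions)
```

Because the label depends only on the variables in the group [0 2], the surrogate with that linkage group can reproduce it exactly.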


Page 12: IWLCS'2007: Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results

Classification Model Layer

Use the surrogate to predict the class of a new input instance

New input example → Map the example to a vector e of length $m_{sch}$ according to the schemas → Output = round(x·e)

x: coefficients of the surrogate model

Example: new input 001

0*0  0*1  1*0  1*1  *0*  *1*
 0    0    1    0    1    0

Output = x·(001010)'
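The classification step can be sketched as a dot product followed by rounding. The coefficient values below are hypothetical (chosen so that the surrogate computes x0 XOR x2), and the helper names are not from the slides. Examples are assumed to be written with x0 as the left-most character.

```python
SCHEMAS = ["0*0", "0*1", "1*0", "1*1", "*0*", "*1*"]

def matches(example, schema):
    """True if every fixed position of the schema equals the example's bit."""
    return all(s == "*" or s == b for s, b in zip(schema, example))

def to_schema_vector(example):
    """Vector e: one entry per schema, 1 where the example matches."""
    return [int(matches(example, s)) for s in SCHEMAS]

def classify(example, coefficients):
    """Output = round(x . e), with x the surrogate coefficients."""
    e = to_schema_vector(example)
    return round(sum(c * v for c, v in zip(coefficients, e)))

# Hypothetical coefficients, one per schema in SCHEMAS.
coefficients = [0.0, 1.0, 1.0, 0.0, 0.0, 0.0]

print(to_schema_vector("100"), classify("100", coefficients))
print(to_schema_vector("111"), classify("111", coefficients))
```

Python's built-in `round` rounds exact halves to the nearest even integer, which may matter when a surrogate output falls exactly between two classes.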


Page 14: IWLCS'2007: Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results

How do we obtain the structural model?

Greedy extraction of the structural model for classification (gESMC): greedy search à la eCGA (Harik, 98)

1. Start considering all variables independent: sModel = [1] [2] … [n]
2. Divide the dataset into ten-fold cross-validation sets: train and test
3. Build the surrogate with the train set and evaluate it with the test set
4. While there is improvement:
   1. Create all combinations of two-subset merges
      E.g. { [1,2] … [n]; [1,n] … [2]; [1] … [2,n] }
   2. Build the surrogate for all candidates. Evaluate the quality with the test set
   3. Select the candidate with minimum error
   4. If the quality of the candidate is better than the parent, accept the model (update sModel) and go to 4.
   5. Otherwise, stop. sModel is optimal.
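The greedy loop above might look roughly like the following sketch. It is not the authors' implementation. It assumes binary attributes, uses NumPy's `lstsq` as the regression method, indexes variables from 0 rather than 1, and reuses the same random ten-fold split for every candidate. All function names are hypothetical.

```python
import itertools
import numpy as np

def schema_matrix(X, groups):
    """One-hot encode, per linkage group, the joint value of its binary variables."""
    blocks = []
    for g in groups:
        idx = X[:, g] @ (2 ** np.arange(len(g)))   # joint value as an integer
        blocks.append(np.eye(2 ** len(g))[idx])
    return np.hstack(blocks)

def cv_error(X, y, groups, folds):
    """Mean test error rate of the least squares surrogate over the folds."""
    errors = []
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        coeffs, *_ = np.linalg.lstsq(schema_matrix(X[train], groups),
                                     y[train], rcond=None)
        pred = np.rint(schema_matrix(X[test_idx], groups) @ coeffs)
        errors.append(np.mean(pred != y[test_idx]))
    return float(np.mean(errors))

def gesmc(X, y, n_folds=10, seed=0):
    """Greedy search over pairwise merges of linkage groups."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    model = [[v] for v in range(X.shape[1])]       # all variables independent
    best = cv_error(X, y, model, folds)
    while len(model) > 1:
        candidates = []
        for i, j in itertools.combinations(range(len(model)), 2):
            merged = [g for k, g in enumerate(model) if k not in (i, j)]
            merged.append(model[i] + model[j])
            candidates.append((cv_error(X, y, merged, folds), merged))
        err, cand = min(candidates, key=lambda c: c[0])
        if err >= best:                            # no improvement: stop
            break
        model, best = cand, err
    return model, best

# Toy problem (assumption): class = x0 XOR x1, with x2 and x3 irrelevant.
X = np.array(list(itertools.product([0, 1], repeat=4)) * 4)
y = X[:, 0] ^ X[:, 1]
model, err = gesmc(X, y)
print(model, err)
```

On this toy problem a single merge already exposes the interaction. When no single merge improves the error, the same loop stops immediately, which is the lack of guidance that the Discussion slides address.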

Page 15: IWLCS'2007: Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results

Test Problems

Two-level hierarchical problems (Butz, Pelikan, Llorà & Goldberg, 2006)

Page 16: IWLCS'2007: Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results

Test Problems

Problems in the lower level

– Position Problem: 000110 : 3
  • Condition of length l : position of the left-most one-valued bit
– Parity Problem: 01001010 : 1
  • Condition of length l : number of 1s mod 2

Page 17: IWLCS'2007: Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results

Test Problems

Problems in the higher level

– Decoder Problem: 000110 : 6
  • Condition of length l : integer value of the input
– Count-ones Problem: 000110 : 2
  • Condition of length l : number of ones of the input

Combinations of both levels:
– Parity + decoder (hParDec)
– Parity + count-ones (hParCount)
– Position + decoder (hPosDec)
– Position + count-ones (hPosCount)
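The four building-block problems can be written as small Python functions. This is a sketch with assumptions: inputs are bit strings, and the position is counted from the right starting at 1, with 0 when the string contains no 1. That convention is inferred from the slide examples rather than stated in them.

```python
def position(bits):
    """Position of the left-most 1, counted from the right starting at 1; 0 if none."""
    first = bits.find("1")
    return 0 if first < 0 else len(bits) - first

def parity(bits):
    """Number of ones modulo 2."""
    return bits.count("1") % 2

def decoder(bits):
    """Integer value of the binary input."""
    return int(bits, 2)

def count_ones(bits):
    """Number of ones of the input."""
    return bits.count("1")

print(position("000110"), parity("01001010"), count_ones("000110"))
```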

Page 18: IWLCS'2007: Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results

Results with 2-bit lower level blocks

First experiment:
– hParDec, hPosDec, hParCount, hPosCount
– Binary inputs of length 17
– The first 6 bits are relevant. The others are irrelevant
– 3 blocks of two-bit low-order problems
– E.g.: hPosDec

  10 01 00 00101010101      (the last 11 bits are irrelevant bits)
   2  1  0                  Output = 21

  Interacting variables: [x0 x1] [x2 x3] [x4 x5]
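The hPosDec example can be reproduced with the sketch below. One detail is an assumption inferred from the slide: the block outputs 2 1 0 give 21 only if the decoder reads them as digits in base `block_len + 1` (base 3 for two-bit position blocks). The function names are hypothetical.

```python
def position(block):
    """Position of the left-most 1, counted from the right starting at 1; 0 if none."""
    first = block.find("1")
    return 0 if first < 0 else len(block) - first

def h_pos_dec(bits, block_len, n_blocks):
    """Two-level problem: position on each block, then decode the block outputs."""
    relevant = bits[: block_len * n_blocks]        # trailing bits are irrelevant
    digits = [position(relevant[i:i + block_len])
              for i in range(0, len(relevant), block_len)]
    base = block_len + 1                           # assumed radix of the decoder
    value = 0
    for d in digits:
        value = value * base + d
    return digits, value

digits, output = h_pos_dec("10010000101010101", block_len=2, n_blocks=3)
print(digits, output)
```

The same function with `block_len=3, n_blocks=2` covers the three-bit blocks of the second experiment.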

Page 19: IWLCS'2007: Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results

Results with 2-bit lower level blocks

Performance of gESMC compared to SMO and C4.5:

[Performance comparison not preserved in the transcript]

Page 20: IWLCS'2007: Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results

Results with 2-bit lower level blocks

Form and coefficients of the surrogate models
– Interacting variables: [x0 x1] [x2 x3] [x4 x5]
– Linkages between variables are detected
– Easy visualization of the salient variable interactions
– All models only consist of the six relevant variables

Page 21: IWLCS'2007: Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results

Results with 2-bit lower level blocks

C4.5 created trees:
– HPosDec and HPosCount
  • Trees of 53 nodes and 27 leaves on average
  • Only the six relevant variables in the decision nodes
– HParDec
  • Trees of 270 nodes and 136 leaves on average
  • Irrelevant attributes in the decision nodes
– HParCount
  • Trees of 283 nodes and 142 leaves on average
  • Irrelevant attributes in the decision nodes

SMO built support vector machines:
– 351, 6, 6, and 28 machines with 16 weights each were created for HPosDec, HPosCount, HParDec, and HParCount

Page 22: IWLCS'2007: Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results

Results with 3-bit lower level blocks

Second experiment:
– hParDec, hPosDec, hParCount, hPosCount
– Binary inputs of length 17
– The first 6 bits are relevant. The others are irrelevant
– 2 blocks of three-bit low-order problems
– E.g.: hPosDec

  000 100 00101010101      (the last 11 bits are irrelevant bits)
   0   3                   Output = 3

  Interacting variables: [x0 x1 x2] [x3 x4 x5]

Page 23: IWLCS'2007: Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results

Results with 3-bit lower level blocks

Second experiment:
– 2 blocks of three-bit low-order problems

Interesting observations:
• As expected, the method stops the search when no improvement is found
• In HParDec, half of the times the system gets a model that represents the salient interacting variables due to the stochasticity of the 10-fold CV estimation. E.g. [0 1 2 8] [3 4 9 5 6]

M1: [0][1][2][3][4][5]
M2: [0 1][2][3][4][5]
M3: [0 1 2][3][4][5]

Page 24: IWLCS'2007: Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results

Discussion

Lack of guidance from low-order substructures
– Increase the order of substructural merges
– Select randomly one of the new structural models
  • Follow schemes such as simulated annealing (Korst & Aarts, 1997)

Page 25: IWLCS'2007: Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results

Discussion

Creating substructural models with overlapping structures
– Problems with overlapping linkages
– E.g.: multiplexer problems
– gESMC on the 6-bit mux results in: [0 1 2 3 4 5]
– What we would like to obtain is: [0 1 2] [0 1 3] [0 1 4] [0 1 5]
– Enhance the methodology to permit representing overlapping linkages, using methods such as the design structure matrix genetic algorithm (DSMGA) (Yu, 2006)
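A short sketch of the 6-bit multiplexer shows where the overlapping groups come from. It assumes that bits 0 and 1 are the address (bit 0 most significant) and bits 2 to 5 are the data, which matches the desired groups [0 1 2] [0 1 3] [0 1 4] [0 1 5]. The function names are hypothetical.

```python
from itertools import product

def multiplexer6(bits):
    """6-bit multiplexer: return the data bit selected by the two address bits."""
    address = 2 * bits[0] + bits[1]
    return bits[2 + address]

def relevant_data_bit(address_bits):
    """Index of the only data bit that matters for a given address."""
    return 2 + 2 * address_bits[0] + address_bits[1]

# For every input, flipping a data bit changes the class only if that bit is
# the addressed one, so each data bit interacts with both address bits.
for bits in product((0, 1), repeat=6):
    k = relevant_data_bit(bits[:2])
    for other in range(2, 6):
        flipped = list(bits)
        flipped[other] ^= 1
        changed = multiplexer6(flipped) != multiplexer6(list(bits))
        assert changed == (other == k)

print(multiplexer6([0, 1, 1, 0, 1, 1]))
```

Every data bit is tied to the same two address bits, so a partition into disjoint linkage groups can only capture all of these interactions by placing the six variables in one group.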

Page 26: IWLCS'2007: Substructural Surrogates for Learning Decomposable Classification Problems: Implementation and First Results

Conclusions

Methodology to learn from decomposable problems
– Structural model
– Surrogate model
– Classification model

First implementation approach
– Greedy search of the best structural model

Main advantage of gESMC over other learners
– It provides the structural model jointly with the classification model.

Some limitations of gESMC:
– It depends on the guidance on the low-order structure
– Several approaches outlined as further work.
