Knowledge Integration by Genetic Algorithms
Prof. Tzung-Pei Hong
Department of Electrical Engineering, National University of Kaohsiung

Upload: randell-dean

Post on 25-Dec-2015


Page 1: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung


Page 2: Outline
Introduction
Review
  GAs
  Fuzzy Sets
  Related Studies
Knowledge Integration Strategies
  Classification Rules
  Association Rules
Conclusions

Page 3: Why Knowledge Integration
Four reasons
  1. Knowledge is distributed among sources
  2. It increases the reliability of knowledge-based systems
  3. Knowledge can be reused
  4. It reduces the effort of developing an expert system or decision support system
[Figure: RB1, …, RBi, …, RBn → Integration → GRB; Expert System; User Interface]

Page 4: Why Use GAs?
Integration of RB1, …, RBi, …, RBn must satisfy
  1. Completeness
  2. Correctness
  3. Consistency
  4. Conciseness
This is a multi-objective optimization problem
GAs find optimal or nearly optimal solutions

Page 5: Vague Knowledge
In real-world applications
  Knowledge sources or data (RB1, …, RBi, …, RBn) contain linguistic or ambiguous information
  Vagueness greatly influences the resulting knowledge base

Page 6: Benefits
Medsker [95]
  Knowledge integrated from different sources has good validity
  Integrated knowledge can deal with more complex problems
  Knowledge integration may improve the performance of the knowledge base
  Integration facilitates building bigger and better systems cheaply

Page 7: Traditional Knowledge Integration
Problems
  When conflict occurs, domain experts must intervene in the integration process
    Subjective
    Time consuming
  Limited integration
    Only a small number of knowledge sources can be handled
    More knowledge sources make integration more difficult and complex

Page 8: Our Goals
Resolve potential conflicts and contradictions
Integrate knowledge without human experts' intervention
Improve the integration speed
Make the scale of knowledge sources

Page 9: History of GAs
GA: Genetic Algorithm
History
  John Holland, 1975
  K. A. De Jong
  D. E. Goldberg

Page 10: Idea of GA
Survival of the fittest
Iterative procedure
Genetic operators
  Reproduction
  Crossover
  Mutation
Near-optimal solution

Page 11: Simple Genetic Algorithms
Start
  1. Initialize a population of individuals
  2. Evaluate each individual's fitness value
  3. Select the superior individuals for reproduction
  4. Apply crossover and perhaps mutation
  5. Evaluate the new individuals' fitness values
  6. Quit? No: continue with the next generation. Yes: stop.
Quit if
  1) the maximum number of generations is reached
  2) the time limit is reached
  3) the population has converged

Page 12: An Example
A function
  $f(t) = e^{-3.125\,(\ln 2)\,(t - 0.1)^2}\,\sin^6(5\pi t), \quad t \in [0, 1]$
Find the maximum of f(t)

Page 13: Step 1
Define a suitable representation
Each chromosome: 12 bits, e.g.
  t = 0      000000000000
  t = 1      111111111111
  t = 0.680  101011100001

Page 14: Step 2
Create an initial population of N individuals
  N: population size
  Assume N = 40

Page 15: Step 3
Define a suitable fitness function f to evaluate the individuals
Fitness function: f(t)
e.g. the first six individuals
  No.  bit string    t      f(t)
  1    000110000001  0.094  0.974
  2    010011001101  0.300  0.917
  3    000111111100  0.124  0.644
  4    101101000111  0.705  0.444
  5    111011000100  0.923  0.154
  6    011100111111  0.453  0.125

Page 16: Step 4
Perform the crossover and mutation operations to generate the possible offspring

Page 17: Crossover
Offspring inherit some characteristics of their parents
e.g.
  Parent 1: 00011 0000001
  Parent 2: 01001 1001101
  Child 1:  000111001101
  Child 2:  010010000001

Page 18: Mutation
Offspring possess characteristics different from those of their ancestors
Preserves a reasonable level of population diversity
e.g. Bit change: 111100000100 → 111101000100
e.g. Inversion:  011100000100 → 111011000100

Page 19: New Offspring
The new offspring produced by the operators
  No.  bit string    t      f(t)
  1    000110011110  0.101  0.999
  2    000110000001  0.094  0.974
  3    010011001101  0.300  0.917
  4    000111111100  0.124  0.644
  5    101101000111  0.705  0.444

Page 20: Step 5
Replace the individuals
e.g. the first six individuals
  No.  bit string    t      f(t)
  1    000110011110  0.101  0.999  (NEW)
  2    000110000001  0.094  0.974
  3    010011001101  0.300  0.917
  4    000111111100  0.124  0.644
  5    101101000111  0.705  0.444
  6    011100111111  0.453  0.125

Page 21: Step 6
If the termination criteria are not satisfied, go to Step 4; otherwise, stop the genetic algorithm
The termination criteria
  The maximum number of generations
  The time limit
  The population has converged
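Steps 1 to 6 can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in the talk: the 12-bit encoding, N = 40, and the crossover example come from the slides, while the formula in `f` is a reading of the function on the example slide that reproduces the tabulated f(t) values. The truncation selection, the elitism, the mutation rate, the generation count, and the names `decode`, `crossover`, `mutate`, and `run_ga` are assumptions made for illustration.

```python
import math
import random

BITS = 12  # chromosome length used in the slides


def decode(bits: str) -> float:
    """Map a 12-bit string to t in [0, 1] (000000000000 -> 0, 111111111111 -> 1)."""
    return int(bits, 2) / (2**BITS - 1)


def f(t: float) -> float:
    """Fitness function of the example (reconstructed; matches the slide's table)."""
    return math.exp(-3.125 * math.log(2) * (t - 0.1) ** 2) * math.sin(5 * math.pi * t) ** 6


def crossover(p1: str, p2: str, point: int):
    """One-point crossover: swap the tails of two parents after `point` bits."""
    return p1[:point] + p2[point:], p2[:point] + p1[point:]


def mutate(bits: str, rate: float, rng: random.Random) -> str:
    """Flip each bit independently with probability `rate` (assumed scheme)."""
    return "".join(("1" if b == "0" else "0") if rng.random() < rate else b for b in bits)


def run_ga(pop_size=40, generations=50, mutation_rate=0.01, seed=0):
    rng = random.Random(seed)
    pop = ["".join(rng.choice("01") for _ in range(BITS)) for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):  # termination: maximum number of generations
        pop.sort(key=lambda c: f(decode(c)), reverse=True)
        best_history.append(f(decode(pop[0])))
        parents = pop[: pop_size // 2]  # assumption: truncation selection
        children = [pop[0]]             # assumption: keep the best individual unchanged
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            c1, c2 = crossover(p1, p2, rng.randint(1, BITS - 1))
            children.append(mutate(c1, mutation_rate, rng))
            if len(children) < pop_size:
                children.append(mutate(c2, mutation_rate, rng))
        pop = children
    best = max(pop, key=lambda c: f(decode(c)))
    best_history.append(f(decode(best)))
    return best, best_history


if __name__ == "__main__":
    best, history = run_ga()
    print(best, round(decode(best), 3), round(history[-1], 3))
```

Because the best individual is copied unchanged into each new generation, the best fitness recorded in `best_history` never decreases; a plain generational GA without elitism would not give that guarantee.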

Page 22: Experiment

Page 23: Fuzzy Sets
Traditional computer decisions
  Either true (1) or false (0)
  For example: if age 25 and above counts as young, is 26 then middle-aged?
  If 60 points and above is a pass, then anything below 60 is a fail
What is fuzziness?
  Add several more levels between true (1) and false (0)
    Almost true (0.8)
    Possibly true (0.6)
    Possibly false (0.4)
    Almost false (0.2)

Page 24: Fuzzy Sets
Question: does a height of 168 cm count as tall or not?
[Figure: membership degree versus height (cm), with the sets Short, Medium, and Tall and axis marks at 160, 170, and 180]
Dividing into more and more levels leads to a continuous membership degree

Page 25: Example: "Close to 0"
Define a membership function:
  $\mu_A(x) = \frac{1}{1 + 10x^2}$
e.g.
  μA(3) = 0.01
  μA(1) = 0.09
  μA(0.25) = 0.62
  μA(0) = 1

Page 26: Example: "Close to 0"
Very close to 0:
  $\mu_A(x) = \left(\frac{1}{1 + 10x^2}\right)^2$
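The two membership functions can be checked numerically. The sketch below assumes the reading μA(x) = 1 / (1 + 10x²) for "close to 0" and its square for "very close to 0"; the function names `close_to_zero` and `very` are illustrative only.

```python
def close_to_zero(x: float) -> float:
    """Membership degree of x in the fuzzy set 'close to 0'."""
    return 1.0 / (1.0 + 10.0 * x * x)


def very(mu: float) -> float:
    """The hedge 'very' as used on the slide: square the membership degree."""
    return mu ** 2


if __name__ == "__main__":
    for x in (3, 1, 0.25, 0):
        mu = close_to_zero(x)
        print(f"x={x}: close={mu:.2f}, very close={very(mu):.4f}")
```

Squaring keeps a degree of 1 at 1 and pushes every smaller degree further toward 0, which is why "very close to 0" is a stricter set than "close to 0".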

Page 27: Fuzzy Set (Cont.)
Membership function with values in [0, 1]
e.g. sunny: x → [0, 1]
[Figure: elements x with membership degrees such as 0.6 sunny, 0.8 sunny, and 0.1 sunny]

Page 28: Fuzzy Set
Simple
Intuitively pleasing
A generalization of the crisp set
  Crisp set: 0 or 1 (non-member or member)
  Fuzzy set: gradual transition from member to non-member, e.g. from sunny to not sunny: 1, 0.8, 0.6, 0.4, 0.2, 0

Page 29: Fuzzy Operations
Intersection (AND): take the smaller degree
  Example: a student is smart (0.8) and hardworking (0.6), so the student is a model student (0.6)
Union (OR): take the larger degree
  Example: a student is smart (0.8) or hardworking (0.6), so the student is a model student (0.8)
Complement (NOT): take the difference from 1
  Example: if the student is smart to degree 0.8, the student is not smart to degree 0.2

Page 30: Fuzzy Inference Example
Prof. Hong's criteria for choosing a mistress: (big eyes and small mouth) or good figure
Question: who is the best leading lady?
            Big eyes  Small mouth  Good figure
  陶晶瑩    0         0.8          0.3
  張惠妹    1         0.6          0.8
  李玟      0         0.3          0.9
  李心潔    0.7       0.1          0.5
  蔡依林    0.8       0.5          0.3

Page 31: Answer
  陶晶瑩 = (0 AND 0.8) OR 0.3 = 0 OR 0.3 = 0.3
  張惠妹 = (1 AND 0.6) OR 0.8 = 0.6 OR 0.8 = 0.8
  李玟 = (0 AND 0.3) OR 0.9 = 0 OR 0.9 = 0.9
  李心潔 = (0.7 AND 0.1) OR 0.5 = 0.1 OR 0.5 = 0.5
  蔡依林 = (0.8 AND 0.5) OR 0.3 = 0.5 OR 0.3 = 0.5
李玟 is the best choice! Thank you!

Page 32: Fuzzy Decision
A = {A1, A2, A3, A4, A5}: a set of alternatives
C = {C1, C2, C3}: a set of criteria
              C1 (big eyes)  C2 (small mouth)  C3 (good shape)
  A1 (Mary)   0              0.8               0.3
  A2 (Judy)   1              0.6               0.8
  A3 (Jan)    0              0.3               0.9
  A4 (Mandy)  0.7            0.1               0.5
  A5 (Nancy)  0.8            0.5               0.3

Page 33: Example (Cont.)
Assume: (C1 and C2) or C3
E(Ai): evaluation function
  E(A1) = (0 ∧ 0.8) ∨ 0.3 = 0 ∨ 0.3 = 0.3
  E(A2) = (1 ∧ 0.6) ∨ 0.8 = 0.6 ∨ 0.8 = 0.8
  E(A3) = (0 ∧ 0.3) ∨ 0.9 = 0 ∨ 0.9 = 0.9 (the best choice)
  E(A4) = (0.7 ∧ 0.1) ∨ 0.5 = 0.1 ∨ 0.5 = 0.5
  E(A5) = (0.8 ∧ 0.5) ∨ 0.3 = 0.5 ∨ 0.3 = 0.5
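The evaluation E(Ai) = (C1 and C2) or C3 is a direct application of the fuzzy operators: AND takes the minimum and OR takes the maximum. A minimal sketch using the ratings from the decision table follows; the names `ratings`, `evaluate`, and `scores` are illustrative.

```python
# Ratings of each alternative on the criteria (C1, C2, C3), taken from the slide.
ratings = {
    "A1 (Mary)": (0, 0.8, 0.3),
    "A2 (Judy)": (1, 0.6, 0.8),
    "A3 (Jan)": (0, 0.3, 0.9),
    "A4 (Mandy)": (0.7, 0.1, 0.5),
    "A5 (Nancy)": (0.8, 0.5, 0.3),
}


def evaluate(c1: float, c2: float, c3: float) -> float:
    """(C1 AND C2) OR C3 with AND = min and OR = max."""
    return max(min(c1, c2), c3)


scores = {name: evaluate(*r) for name, r in ratings.items()}
best = max(scores, key=scores.get)

if __name__ == "__main__":
    for name, score in scores.items():
        print(name, score)
    print("best choice:", best)
```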

Page 34: Review of Knowledge Integration
Knowledge integration
  Approaches: Cooperative Approach, Centralized Approach
  Techniques: Blackboard, LPC Model, Repertory Grid, Integrity Constraints, Decision Table, Genetic Algorithm

Page 35: GA-Based Classifier Systems
Michigan Approach: each individual is a single rule
  rule 1: xxxxxxx....
  rule 2: yyyyyyy....
  rule n: nnnnnn....
Pittsburgh Approach: each individual is a whole rule set
  rule set 1: rrrrrrrrr....
  rule set 2: zzzzzzzzzzzz....
  rule set m: mmmm.......

Page 36: Genetic Knowledge Integration
Approaches: TPGKI, TPGFKI, GKIDSO, GFKIGM, GFKILM, MGKI, MGFKI
[Figure: the approaches are grouped by Pittsburgh Approach, Michigan Approach, and Vague Knowledge]

Page 37: Integration of Classification Rules
Four methods
  GKIDSO: Genetic Knowledge-Integration approach with Domain-Specific Operators
  TPGKI: Two-Phase Genetic Knowledge Integration
  GFKILM: Genetic-Fuzzy Knowledge-Integration with several sets of Local Membership functions
  GFKIGM: Genetic-Fuzzy Knowledge-Integration with a set of Global Membership functions

Page 38: Genetic Knowledge-Integration Framework
[Figure: expert groups 1 to n (through K.A. tools 1 to n) and training data sets 1 to m (through M.L. methods 1 to m) each yield a rule set and a dictionary. Each rule set is translated into an intermediary representation. Encoding uses a global feature set and class set; integrating is done by GA-based knowledge integration with a case set. The result is a knowledge base with its dictionary.]

Page 39: Knowledge Integration
[Figure: rule sets → knowledge input → knowledge encoding → genetic knowledge integration → knowledge decoding → knowledge verification → knowledge base; a data set is a further input to the process]

Page 40: GKIDSO Approach
Genetic Knowledge-Integration approach with Domain-Specific Operators
Consists of two parts
  Encoding
  Integration
[Figure: knowledge encoding maps the rule sets RS1, RS2, RS3, …, RSm to chromosomes 1, 2, 3, …, m, which form the initial population; knowledge integration applies genetic operators from generation 0 to generation k]

Page 41: Knowledge Encoding
Rule set → intermediary rules → fixed-length rule strings → variable-length rule-set string

Page 42: Example: Brain Tumor
Two classes: {Adenoma, Meningioma}
Three features: {Location, Calcification, Edema}
  Feature values for Location: {brain surface, sellar, brain stem}
  Feature values for Calcification: {no, marginal, vascular-like, lumpy}
  Feature values for Edema: {no, < 2 cm, < 0.5 hemisphere}

Page 43: Intermediary Rules
Two rules
  R1: IF (Location = sellar) and (Calcification = no) then Adenoma
  R2: IF (Location = brain surface) and (Edema < 2 cm) then Meningioma
Intermediary rules, with a dummy test for each feature that a rule does not mention
  R1: IF (Location = sellar) and (Calcification = no) and (Edema = no, or < 2 cm, or < 0.5 hemisphere) then Adenoma
  R2: IF (Location = brain surface) and (Calcification = no or marginal or vascular-like or lumpy) and (Edema < 2 cm) then Meningioma

Page 44: Fixed-Length Rule String
R1: IF (Location = sellar) and (Calcification = no) and (Edema = no, or < 2 cm, or < 0.5 hemisphere) then Adenoma
R2: IF (Location = brain surface) and (Calcification = no or marginal or vascular-like or lumpy) and (Edema < 2 cm) then Meningioma
        Location  Calcification  Edema  Classes
  R1:   010       1000           111    10
  R2:   100       1111           010    01
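The encoding uses one bit per feature value and one bit per class, with a feature that a rule does not mention filled by an all-ones dummy test. The following sketch reproduces the two strings of the brain tumor example; the function name `encode_rule` and the dictionary layout are illustrative assumptions, not part of the original approach.

```python
# Feature values and classes in the order used on the slides.
FEATURES = {
    "Location": ["brain surface", "sellar", "brain stem"],
    "Calcification": ["no", "marginal", "vascular-like", "lumpy"],
    "Edema": ["no", "< 2 cm", "< 0.5 hemisphere"],
}
CLASSES = ["Adenoma", "Meningioma"]


def encode_rule(conditions: dict, cls: str) -> str:
    """Encode a rule as a fixed-length bit string, one bit per value.

    `conditions` maps a feature name to the set of values it may take.
    A feature missing from `conditions` gets a dummy test (all bits set).
    """
    parts = []
    for feature, values in FEATURES.items():
        allowed = conditions.get(feature, values)
        parts.append("".join("1" if v in allowed else "0" for v in values))
    parts.append("".join("1" if c == cls else "0" for c in CLASSES))
    return " ".join(parts)


if __name__ == "__main__":
    r1 = encode_rule({"Location": {"sellar"}, "Calcification": {"no"}}, "Adenoma")
    r2 = encode_rule({"Location": {"brain surface"}, "Edema": {"< 2 cm"}}, "Meningioma")
    print("R1:", r1)
    print("R2:", r2)
```

Every rule has the same length (3 + 4 + 3 + 2 = 12 bits here), so a rule set can be represented as a concatenation of such strings whose length varies only with the number of rules.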

Page 45: Knowledge Integration
Initial population: Rule Set 1, Rule Set 2, …, Rule Set n
Genetic operations, guided by the fitness function
  Crossover
  Mutation
  Fusion
  Fission
Generation 1: Rule Set 1, Rule Set 2, …, Rule Set n

Page 46: Fitness Function
Formally,
  $\mathrm{fitness}(RS) = \dfrac{\mathrm{Accuracy}(RS)}{[\mathrm{Complexity}(RS)]^{\alpha}}$
where α is a control parameter,
  $\mathrm{Accuracy}(RS) = \dfrac{\text{total number of measure instances correctly matched by } RS}{\text{total number of measure instances}}$
  $\mathrm{Complexity}(RS) = \dfrac{\text{number of rules within the integrated rule set } RS}{\left[\sum_{i=1}^{m} (\text{number of rules within initial } RS_i)\right] / m}$
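A small numeric sketch of this fitness measure follows. It assumes the reading fitness(RS) = Accuracy(RS) / [Complexity(RS)]^α, where Complexity is the number of rules in the integrated rule set divided by the average number of rules in the m initial rule sets. The default α = 0.125 is the value shown on the result plots later in the talk; the function names and the example rule counts are hypothetical.

```python
def complexity(n_rules_integrated: int, initial_rule_counts: list) -> float:
    """Rules in the integrated set relative to the average size of the initial sets."""
    average_initial = sum(initial_rule_counts) / len(initial_rule_counts)
    return n_rules_integrated / average_initial


def fitness(accuracy: float, cmplx: float, alpha: float = 0.125) -> float:
    """Accuracy penalized by complexity; alpha controls how strong the penalty is."""
    return accuracy / cmplx ** alpha


if __name__ == "__main__":
    # Hypothetical numbers: three initial rule sets of 8, 12, and 10 rules.
    c = complexity(15, [8, 12, 10])
    print(round(c, 3), round(fitness(0.90, c), 4))
```

With a small α the penalty is mild: an integrated rule set that is 50% larger than the average initial set loses only about 5% of its accuracy score, so accuracy dominates and conciseness acts as a tie-breaker.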

Page 47: Crossover
Parents (the crossover points cp1 and cp2 are marked with |; in both cut rules the segment after the point is 7 bits long)
  RS1: r11 = 100110110001, …, r1i = 01001|0101010, …, r1n = 001010101100
  RS2: r21 = 010011001100, …, r2j = 11011|1010101, …, r2m = 100011001101
Offspring after crossover
  O1: 100110110001, …, 01001 1010101, …, 100011001101
  O2: 010011001100, …, 11011 0101010, …, 001010101100
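The crossover on variable-length rule-set strings can be sketched as follows. The sketch assumes that the two crossover points may fall in different rules of the two parents but at the same bit offset within a rule, so that every offspring rule keeps the fixed rule length; this matches the example strings but is otherwise an assumption. The name `ruleset_crossover` and its parameters are illustrative.

```python
def ruleset_crossover(rs1, rs2, rule1, rule2, offset):
    """Cross two rule sets given as lists of equal-length rule strings.

    rule1 / rule2: index of the rule that is cut in each parent.
    offset: cut position inside the rule (the same for both parents),
            so offspring still consist of whole fixed-length rules.
    """
    length = len(rs1[0])
    s1, s2 = "".join(rs1), "".join(rs2)
    cp1 = rule1 * length + offset
    cp2 = rule2 * length + offset
    o1 = s1[:cp1] + s2[cp2:]
    o2 = s2[:cp2] + s1[cp1:]

    def split(s):
        return [s[i:i + length] for i in range(0, len(s), length)]

    return split(o1), split(o2)


if __name__ == "__main__":
    rs1 = ["100110110001", "010010101010", "001010101100"]
    rs2 = ["010011001100", "110111010101", "100011001101"]
    o1, o2 = ruleset_crossover(rs1, rs2, rule1=1, rule2=1, offset=5)
    print(o1)
    print(o2)
```

When the cut rules sit at different positions in the two parents, the offspring contain different numbers of rules than their parents, which is how the number of rules in a rule set can change during the search.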

Page 48: Fusion
Eliminate redundancy and subsumption
Redundancy
  R1: if A then B
  R2: if A then B
Subsumption
  R1: if A and C then B
  R2: if A then B

Page 49: Fusion (Cont.)
Eliminate redundancy
  RSk: rk1 = 100110110001, …, rki = 010010101010, …, rkj = 010010101010
  Fusion → Ok: 100110110001, …, 010010101010
Eliminate subsumption
  RSq: rq1 = 100110110001, …, rqi = 110010101010, …, rqj = 010010101010
  Fusion → Oq: 100110110001, …, 110010101010
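A fusion operator on bit-string rules can be sketched as below. It assumes the one-bit-per-value encoding of the brain tumor example, with the last two bits of each 12-bit rule encoding the class: a rule subsumes another when both conclude the same class and every condition bit set in the more specific rule is also set in the more general one. The names `subsumes` and `fusion` and the `class_bits` parameter are illustrative, not the authors' implementation.

```python
def subsumes(general: str, specific: str, class_bits: int) -> bool:
    """True if `general` covers every case that `specific` covers, with the same class."""
    if general[-class_bits:] != specific[-class_bits:]:
        return False
    g, s = general[:-class_bits], specific[:-class_bits]
    return all(gb == "1" or sb == "0" for gb, sb in zip(g, s))


def fusion(rule_set, class_bits: int = 2):
    """Remove redundant (duplicate) rules and rules subsumed by a more general rule."""
    unique = list(dict.fromkeys(rule_set))  # drops duplicates, keeps order
    return [
        r for r in unique
        if not any(o != r and subsumes(o, r, class_bits) for o in unique)
    ]


if __name__ == "__main__":
    print(fusion(["100110110001", "010010101010", "010010101010"]))  # redundancy
    print(fusion(["100110110001", "110010101010", "010010101010"]))  # subsumption
```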

Page 50: Fission
Eliminate misclassification and contradiction
Misclassification
  e: (A, C)
  R: if A then B
Contradiction
  R: if A then B or C
  is split into
  R1: if A then B
  R2: if A then C

Page 51: Fission (Cont.)
Eliminate misclassification
  Select the "closest" near-miss rule to the wrongly classified test instance for specializing
[Figure: fission on RSk = (rk1, …, rki, …, rkn) inserts a new rule I after the near-miss rule rki, giving Ok = (rk1, …, rki, I, …, rkn). In the example, rk1 = 1001101100 01 and rkn = 0010101011 00; the rule rki with conditions 1001001 100 is followed by an inserted rule with conditions 1001001 010.]

Page 52: Fission (Cont.)
Eliminate contradiction
  RSk: rk1 = 100110110001, …, rki = 100100110 110, …, rkn = 001010101100
  Fission → Ok: 100110110001, …, rki1 = 100100110 100, rki2 = 100100110 010, …, 001010101100
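Eliminating a contradiction by fission amounts to splitting a rule whose class part has more than one bit set into one rule per class. A minimal sketch follows, assuming a 3-bit class part as in the example strings; the function name `split_contradiction` is illustrative.

```python
def split_contradiction(rule: str, class_bits: int):
    """Split a rule that concludes several classes into one rule per class.

    The rule is a bit string whose last `class_bits` bits mark the classes.
    A rule with at most one class bit set is returned unchanged.
    """
    conditions, classes = rule[:-class_bits], rule[-class_bits:]
    if classes.count("1") <= 1:
        return [rule]
    return [
        conditions + "".join("1" if j == i else "0" for j in range(class_bits))
        for i, bit in enumerate(classes)
        if bit == "1"
    ]


if __name__ == "__main__":
    print(split_contradiction("100100110110", class_bits=3))
```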

Page 53: Experiments - Breast Cancer Diagnosis
Six knowledge sources are integrated
699 cases used in the experiment
  524 cases for integrating
  175 cases for testing
9 attributes and 2 classes
  Benign: 458 cases
  Malignant: 241 cases
Each rule is encoded into a bit string 92 bits long

Page 54: Result
  Generation  CPU Time  Accuracy  Fitness
  0           00:00:00  0.7720    0.7495
  1           00:00:01  0.8117    0.7856
  6           00:00:05  0.8824    0.8650
  23          00:00:22  0.9068    0.8719
  29          00:00:28  0.9228    0.8959
  32          00:00:31  0.9422    0.9237
  56          00:00:55  0.9487    0.9301
  59          00:00:58  0.9533    0.9346
  80          00:01:19  0.9556    0.9368
  97          00:01:36  0.9568    0.9473
  100         00:01:40  0.9619    0.9523

Page 55: Result (Cont.)
Test cases: 175
  Class      Case no.  Correctly classified  Misclassified  Unknown
  Benign     132       128                   2              2
  Malignant  43        41                    2              0

Page 56: Experiments - Breast Cancer Diagnosis
[Figure: fitness (%) versus generation (α = 0.125), plotted at generations 0, 1, 6, 23, 29, 32, 56, 59, 80, 97, and 100]

Page 57: Application - Brain Tumor Diagnosis
Ten knowledge sources are integrated
504 actual cases used in the application
  378 cases for integrating
  126 cases for testing
12 attributes and 6 classes
  Pituitary Adenoma: 85
  Meningioma: 119
  Medulloblastoma: 68
  Glioblastoma: 54
  Astrocytoma: 122
  Protoplasmic Astrocytoma: 56
Each rule is encoded into a bit string 105 bits long

Page 58: Application - Brain Tumor Diagnosis (Cont.)
  Generation  CPU Time  Accuracy  Fitness
  0           00:00:00  0.7981    0.5330
  150         00:19:05  0.8117    0.5701
  300         00:38:24  0.8264    0.6070
  450         00:57:22  0.8321    0.6125
  600         01:16:28  0.8523    0.6230
  750         01:35:31  0.8601    0.6337
  900         01:54:55  0.8703    0.6370
  1200        02:32:58  0.8791    0.6601
  1350        02:51:19  0.8798    0.6673
  1500        03:10:36  0.8830    0.6710
  1650        03:29:40  0.8877    0.7022
  1800        03:49:31  0.8907    0.7373
  2000        04:14:24  0.9142    0.7590

Page 59: Application - Brain Tumor Diagnosis
[Figure: fitness (%) versus generation (α = 0.125), generation axis from 0 to 3000]

Page 60: TPGKI Approach
TPGKI: Two-Phase Genetic Knowledge Integration
Consists of two phases
  Knowledge integration
  Knowledge refinement
Integrates multiple rule sets by pure genetic operators
Domain-specific genetic operators need not intervene in the integration

Page 61: Two Phases
Integration phase and refinement phase
[Figure: the rule sets RS1, RS2, RS3, …, RSm pass through the integration phase and then the refinement phase, which works on their individual rules (r11, …, r1x, …, rm1, …, rmy before refinement; r11, …, r1z, …, rm1, …, rmw after); finally the best rule set is selected]

Page 62: Knowledge-Integration Phase
Initial population: Rule Set 1, Rule Set 2, …, Rule Set n
Genetic operations, guided by the fitness function
  Crossover
  Mutation
Generation 1: Rule Set 1, Rule Set 2, …, Rule Set n

Page 63: Knowledge-Refinement Phase
Initial population: the rules (Rule 1, Rule 2, …, Rule m) of Rule Set 1, …, Rule Set i, …, Rule Set n
Genetic operations, guided by the fitness function
  Crossover
  Mutation
Problems addressed
  Redundancy
  Subsumption
  Contradiction
Generation 1: Rule 1, Rule 2, …

Page 64: Fitness Function
[Equations: definitions of Accuracy(ri), Necessity(ri), and Coverage(ri) for a rule ri over the object set U. The Necessity definition involves the rules r in RS, the objects e in U, and an indicator of (r, e) that equals 1 if e is correctly classified by the rule r, and 0 otherwise.]

Page 65: Evaluation Process
  1. Let U be the object set
  2. Calculate Accuracy(ri)
  3. Calculate Necessity(ri)
  4. Sort the rules by Accuracy * Necessity
  5. Calculate Coverage(rj) for the next rule rj
  6. Fitness = Accuracy * Necessity * Coverage
  7. Set U = U - U_rj and remove rj
  8. Empty? If yes, STOP; otherwise continue with the next rule

Page 66: Experiments - Breast Cancer Diagnosis
  Generation  CPU Time  Accuracy  Fitness
  0           00:00:00  0.7720    0.7495
  1           00:00:02  0.8191    0.7875
  4           00:00:10  0.8581    0.8250
  5           00:00:13  0.9206    0.9112
  8           00:00:20  0.9477    0.9119
  26          00:01:04  0.9483    0.9247
  34          00:01:25  0.9525    0.9281
  44          00:01:50  0.9560    0.9375
  45          00:01:52  0.9657    0.9428
  93          00:03:51  0.9659    0.9469
  95          00:03:57  0.9674    0.9484
  100         00:04:12  0.9793    0.9502

Page 67: Experiments - Breast Cancer Diagnosis
[Figure: fitness (%) versus generation (α = 0.125), plotted at generations 0, 1, 4, 5, 8, 26, 34, 44, 45, 93, 95, and 100]

Page 68: Application - Brain Tumor Diagnosis
  Generation  CPU Time  Accuracy  Fitness
  0           00:00:00  0.7981    0.5744
  150         00:31:05  0.8191    0.5801
  300         01:02:13  0.8296    0.6070
  450         01:33:42  0.8472    0.7245
  600         02:04:33  0.8583    0.8015
  750         02:35:31  0.8753    0.8178
  900         03:07:55  0.8903    0.8327
  1200        04:09:38  0.8989    0.8501
  1350        04:41:03  0.9012    0.8523
  1500        05:12:39  0.9057    0.8541
  1650        05:43:40  0.9107    0.8583
  1800        06:25:31  0.9162    0.8621
  2000        06:59:05  0.9257    0.8700

Page 69: Application - Brain Tumor Diagnosis
[Figure: fitness (%) versus generation (α = 0.125), generation axis from 0 to 3000]

Page 70: Comparison of GKIDSO and TPGKI
Experiment: Breast Cancer Diagnosis (100 generations)
  Approach  CPU Time (s)  Accuracy  Rule No.
  GKIDSO    100           96.10%    10
  TPGKI     252           97.93%    7
Application: Brain Tumor Diagnosis (2000 generations)
  Approach  CPU Time (s)  Accuracy  Rule No.
  GKIDSO    15264         91.42%    92
  TPGKI     25145         92.57%    86

Page 71: Genetic-Fuzzy Knowledge-Integration
GFKILM approach
  Genetic-Fuzzy Knowledge-Integration with several sets of Local Membership functions
  Associated with several sets of local membership functions
GFKIGM approach
  Genetic-Fuzzy Knowledge-Integration with a set of Global Membership functions
  Associated with a set of global membership functions

Page 72: Genetic-Fuzzy Knowledge-Integration Framework
[Figure: expert groups 1 to n (through K.A. tools 1 to n) and training sets 1 to m (through M.L. methods 1 to m) each yield a fuzzy rule set with its membership functions. Each is translated into an intermediary representation. Encoding and integrating are done by genetic-fuzzy knowledge integration, using test objects (instances, records). The result is a fuzzy rule set plus membership functions.]

Page 73: GFKILM Approach
GFKILM approach consists of two parts
  Encoding
  Integration
[Figure: knowledge encoding maps each fuzzy rule set with its membership function set (RS1 + MFS1, RS2 + MFS2, RS3 + MFS3, …, RSm + MFSm) to chromosomes 1, 2, 3, …, m, which form the initial population; knowledge integration applies genetic operators from generation 0 to generation k]

Page 74: Knowledge Encoding
Rule set + MFS → intermediary rules + MFS → fixed-length rule strings associated with MFS → variable-length rule-set string

Page 75: Examples: IRIS Flowers
Membership functions u for the four attributes
  Attribute            Linguistic terms      Axis values
  Sepal length (S.L.)  Short, Medium, Long   4.3, 5.2, 6.1, 7.0, 7.9
  Sepal width (S.W.)   Narrow, Medium, Wide  2.0, 2.6, 3.2, 3.8, 4.4
  Petal length (P.L.)  Short, Medium, Long   1.0, 2.4, 3.9, 5.4, 6.9
  Petal width (P.W.)   Narrow, Medium, Wide  0.1, 0.7, 1.3, 1.9, 2.5
Classes: Setosa = 1, Versicolor = 2, Virginica = 3

Page 76: Examples
IF P.L. = Short Then Setosa
Intermediary representation
  IF S.L. = (Short or Medium or Long) and S.W. = (Narrow or Medium or Wide) and P.L. = Short and P.W. = (Narrow or Medium or Wide) Then Setosa
Membership functions + fuzzy rules
  rq:  S.L.:  5.2, 0.9, 6.1, 0.9, 7.0, 0.9
       S.W.:  2.6, 0.6, 3.2, 0.6, 3.8, 0.6
       P.L.:  2.4, 1.5, 3.9, 1.5, 5.4, 1.5
       P.W.:  0.7, 0.6, 1.3, 0.6, 1.9, 0.6
       Class: 100
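The numbers attached to each attribute in the encoded rule come in pairs, for example 5.2, 0.9, then 6.1, 0.9, then 7.0, 0.9 for sepal length. One plausible reading, which is an assumption here and not stated in the transcript, is that each pair is the center and the spread of an isosceles triangular membership function for one linguistic term. Under that assumption, fuzzifying a measurement could look like this; the names `triangular`, `fuzzify`, and `SEPAL_LENGTH` are illustrative.

```python
def triangular(x: float, center: float, spread: float) -> float:
    """Isosceles triangle: 1 at the center, falling to 0 at center +/- spread (assumed shape)."""
    return max(0.0, 1.0 - abs(x - center) / spread)


# (center, spread) pairs for sepal length, as listed in the encoded rule.
SEPAL_LENGTH = {"Short": (5.2, 0.9), "Medium": (6.1, 0.9), "Long": (7.0, 0.9)}


def fuzzify(x: float, terms: dict) -> dict:
    """Membership degree of x in every linguistic term of one attribute."""
    return {name: triangular(x, c, w) for name, (c, w) in terms.items()}


if __name__ == "__main__":
    for name, degree in fuzzify(5.65, SEPAL_LENGTH).items():
        print(name, round(degree, 2))
```

Encoding the membership functions as numeric parameters next to the rule bits is what lets the genetic operators tune the fuzzy partitions and the rules at the same time.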

Page 77: Knowledge Integration
Initial population: RS1 + MFS, RS2 + MFS, …, RSn + MFS
Genetic operations, guided by the fitness function
  Crossover
  Mutation
  Fusion
Generation 1: RS1 + MFS, RS2 + MFS, …, RSn + MFS

Page 78: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Crossover

Parents (each attribute holds three membership functions as pairs of values; the class is a 3-bit string):

RS̃1:  S.L. 5.2,0.9, 6.1,0.9, 7.0,0.9 | S.W. 2.6,0.6, 3.2,0.6, 3.8,0.6 | P.L. 2.4,1.6, 3.9,1.4, 5.4,1.5 | P.W. 0.6,0.6, 1.2,0.6, 1.9,0.6 | Class 001
RS̃2:  S.L. 5.2,0.8, 6.1,0.7, 7.0,0.8 | S.W. 2.0,0.3, 2.5,0.4, 3.8,0.9 | P.L. 2.4,1.6, 4.0,1.4, 5.4,1.5 | P.W. 0.6,0.7, 1.2,0.6, 1.9,0.6 | Class 010

Crossover point: inside S.W., after the first membership function.

After crossover:

Õ1:  S.L. 5.2,0.9, 6.1,0.9, 7.0,0.9 | S.W. 2.6,0.6, 2.5,0.4, 3.8,0.9 | P.L. 2.4,1.6, 4.0,1.4, 5.4,1.5 | P.W. 0.6,0.7, 1.2,0.6, 1.9,0.6 | Class 010   (S.W. out of sequence)
Õ2:  S.L. 5.2,0.8, 6.1,0.7, 7.0,0.8 | S.W. 2.0,0.3, 3.2,0.6, 3.8,0.6 | P.L. 2.4,1.6, 3.9,1.4, 5.4,1.5 | P.W. 0.6,0.6, 1.2,0.6, 1.9,0.6 | Class 001

After rearrangement:

Õ1:  S.L. 5.2,0.9, 6.1,0.9, 7.0,0.9 | S.W. 2.5,0.4, 2.6,0.6, 3.8,0.9 | P.L. 2.4,1.6, 4.0,1.4, 5.4,1.5 | P.W. 0.6,0.7, 1.2,0.6, 1.9,0.6 | Class 010
Õ2:  unchanged
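The "out of sequence" and "rearrange" steps of this slide can be sketched in Python as follows. The sketch is restricted to the S.W. segment of the two parents and assumes that rearranging means sorting the value pairs of one attribute by their first value; `one_point_crossover` and `rearrange` are illustrative names, not the author's code.

```python
def one_point_crossover(parent1, parent2, point):
    """Exchange the tails of two equal-length lists at the given cut point."""
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2


def rearrange(mf_pairs):
    """Restore ascending order of the membership functions of one attribute.

    Assumption: pairs are ordered by their first value.
    """
    return sorted(mf_pairs, key=lambda pair: pair[0])


# S.W. segments of the two parents, as pairs of values.
sw_parent1 = [(2.6, 0.6), (3.2, 0.6), (3.8, 0.6)]
sw_parent2 = [(2.0, 0.3), (2.5, 0.4), (3.8, 0.9)]

# Cut after the first membership function.
child1, child2 = one_point_crossover(sw_parent1, sw_parent2, 1)

if __name__ == "__main__":
    print(child1)             # 2.6 precedes 2.5: out of sequence
    print(rearrange(child1))  # order restored
    print(rearrange(child2))  # already in order, so unchanged
```

Sorting after every crossover keeps the linguistic terms (Narrow, Medium, Wide) attached to increasing positions on the axis, so an offspring always decodes to a meaningful set of membership functions.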

Page 79: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Mutation

Before mutation:

RS̃1:  S.L. 5.2,0.9, 6.1,0.9, 7.0,0.9 | S.W. 2.6,0.6, 3.2,0.6, 3.8,0.6 | P.L. 2.4,1.6, 3.9,1.4, 5.4,1.5 | P.W. 0.6,0.6, 1.2,0.6, 1.9,0.6 | Class 001

Mutation point: the second membership function of S.W. (3.2 becomes 2.5).

After mutation:

Õ1:  S.L. 5.2,0.9, 6.1,0.9, 7.0,0.9 | S.W. 2.6,0.6, 2.5,0.6, 3.8,0.6 | P.L. 2.4,1.6, 3.9,1.4, 5.4,1.5 | P.W. 0.6,0.6, 1.2,0.6, 1.9,0.6 | Class 001   (S.W. out of sequence)

After rearrangement:

Õ1:  S.L. 5.2,0.9, 6.1,0.9, 7.0,0.9 | S.W. 2.5,0.6, 2.6,0.6, 3.8,0.6 | P.L. 2.4,1.6, 3.9,1.4, 5.4,1.5 | P.W. 0.6,0.6, 1.2,0.6, 1.9,0.6 | Class 001

Page 80: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Fusion

r̃_ki: IF (P.L.=Short) Then Class is Setosa
r̃_kj: IF (P.L.=Short) Then Class is Setosa

The two rules inside RS̃_k:

r̃_ki:  S.L. 5.1,0.8, 6.0,0.8, 7.1,1.0 | S.W. 2.5,0.7, 3.1,0.6, 3.9,0.7 | P.L. 2.3,1.4, 3.7,1.5, 5.3,1.6 | P.W. 0.7,0.8, 1.3,0.7, 1.8,0.6 | Class 100
r̃_kj:  S.L. 5.2,0.9, 6.1,1.0, 7.0,0.9 | S.W. 2.6,0.8, 3.2,0.7, 3.8,0.6 | P.L. 2.4,1.6, 3.9,1.4, 5.4,1.5 | P.W. 0.6,0.6, 1.2,0.6, 1.9,0.6 | Class 100

Page 81: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Fusion

[Figure: inside RS̃_k, r̃_ki and r̃_kj reduce to the same intermediary pattern, so they are redundant:

S.L. 1,1,1,1,1,1 | S.W. 1,1,1,1,1,1 | P.L. 1,1,1,0,1,0 | P.W. 1,1,1,1,1,1 | Class 100

After fusion, RS̃_k keeps a single rule with the membership functions of r̃_ki:

S.L. 5.1,0.8, 6.0,0.8, 7.1,1.0 | S.W. 2.5,0.7, 3.1,0.6, 3.9,0.7 | P.L. 2.3,1.4, 3.7,1.5, 5.3,1.6 | P.W. 0.7,0.8, 1.3,0.7, 1.8,0.6 | Class 100]

Fusion: if accuracy(r̃_ki) ≥ accuracy(r̃_kj), drop r̃_kj.

Page 82: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Fusion (Subsumption)

r̃_ki: IF (P.L.=Short) Then Class is Setosa
r̃_kj: IF (P.L.=Short) and (P.W.=Narrow) Then Class is Setosa

The two rules inside RS̃_k:

r̃_ki:  S.L. 5.1,0.8, 6.0,0.8, 7.1,1.0 | S.W. 2.5,0.7, 3.1,0.6, 3.9,0.7 | P.L. 2.3,1.4, 3.7,1.5, 5.3,1.6 | P.W. 0.7,0.8, 1.3,0.7, 1.8,0.6 | Class 100
r̃_kj:  S.L. 5.2,0.9, 6.1,1.0, 7.0,0.9 | S.W. 2.6,0.8, 3.2,0.7, 3.8,0.6 | P.L. 2.4,1.6, 3.9,1.4, 5.4,1.5 | P.W. 0.6,0.6, 1.2,0.6, 1.9,0.6 | Class 100

Page 83: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Fusion (Subsumption)

[Figure: intermediary patterns of the two rules inside RS̃_k:

r̃_ki:  S.L. 1,1,1,1,1,1 | S.W. 1,1,1,1,1,1 | P.L. 1,1,1,0,1,0 | P.W. 1,1,1,1,1,1 | Class 100
r̃_kj:  S.L. 1,1,1,1,1,1 | S.W. 1,1,1,1,1,1 | P.L. 1,1,1,0,1,0 | P.W. 1,1,1,0,1,0 | Class 100

After fusion, a single rule remains in RS̃_k, with the membership functions of r̃_ki:

S.L. 5.1,0.8, 6.0,0.8, 7.1,1.0 | S.W. 2.5,0.7, 3.1,0.6, 3.9,0.7 | P.L. 2.3,1.4, 3.7,1.5, 5.3,1.6 | P.W. 0.7,0.8, 1.3,0.7, 1.8,0.6 | Class 100]
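The redundancy and subsumption checks behind fusion can be sketched in Python as below. This is a simplified illustration under stated assumptions: rules are compared only on their linguistic conditions (unmentioned attributes accept every term), the more general rule is the one kept, and the accuracy comparison used on the slides to choose between redundant rules is left out. `TERMS`, `expand`, `subsumes` and `fuse` are hypothetical names.

```python
TERMS = {
    "S.L.": frozenset({"Short", "Medium", "Long"}),
    "S.W.": frozenset({"Narrow", "Medium", "Wide"}),
    "P.L.": frozenset({"Short", "Medium", "Long"}),
    "P.W.": frozenset({"Narrow", "Medium", "Wide"}),
}


def expand(conditions, cls):
    """Intermediary form: attributes not mentioned accept every term."""
    full = {a: frozenset(conditions.get(a, terms)) for a, terms in TERMS.items()}
    return full, cls


def subsumes(general, specific):
    """True if `general` covers everything `specific` covers, same class."""
    (g_if, g_cls), (s_if, s_cls) = general, specific
    return g_cls == s_cls and all(s_if[a] <= g_if[a] for a in TERMS)


def fuse(rules):
    """Drop rules that are redundant with, or subsumed by, another rule."""
    kept = []
    for rule in rules:
        if any(subsumes(k, rule) for k in kept):
            continue  # identical to, or more specific than, a kept rule
        # The new rule may itself be more general than rules kept so far.
        kept = [k for k in kept if not subsumes(rule, k)] + [rule]
    return kept


r_ki = expand({"P.L.": {"Short"}}, "Setosa")
r_kj = expand({"P.L.": {"Short"}, "P.W.": {"Narrow"}}, "Setosa")

if __name__ == "__main__":
    print(len(fuse([r_ki, r_kj])))  # one rule left after fusion
```

Treating redundancy as the special case where each rule subsumes the other keeps the two fusion situations in a single code path.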

Page 84: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Experiments - Hepatitis Diagnosis

• Ten knowledge sources are integrated
• 155 cases used in the experiment
• 19 attributes and 2 classes

Page 85: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Experiments - Hepatitis Diagnosis

Generation   CPU Time   Accuracy   Fitness
0            00:00:00   0.7688     0.7537
13           00:00:04   0.7844     0.7690
69           00:00:19   0.8132     0.7972
124          00:00:35   0.8328     0.8164
181          00:00:52   0.8432     0.8266
261          00:01:15   0.8525     0.8357
414          00:01:55   0.8688     0.8517
1401         00:06:07   0.8876     0.8701
2110         00:09:20   0.8949     0.8773
3110         00:13:42   0.8965     0.8789
3550         00:15:35   0.8977     0.8800
3817         00:16:45   0.9183     0.9002
4000         00:17:36   0.9290     0.9107

Page 86: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Experiments - Hepatitis Diagnosis

[Figure: fitness (%, axis from 75 to 93) versus generation (α = 0.125), with marks at generations 0, 69, 181, 414, 2110, 3550 and 4000]

Page 87: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Application: Sugar-Cane Breeding Prediction

• Four knowledge sources are integrated
• 699 actual cases used in the application
• 36 attributes and 2 classes

Page 88: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Application: Sugar-Cane Breeding Prediction

Each rule is encoded as a string 362 units long.

Generation   CPU Time   Accuracy   Fitness
0            00:00:00   0.5674     0.5562
2            00:00:02   0.6780     0.6647
9            00:00:08   0.6803     0.6669
17           00:00:17   0.6868     0.6733
29           00:00:29   0.6871     0.6742
37           00:00:37   0.6877     0.6748
76           00:01:16   0.6903     0.6766
230          00:03:53   0.6904     0.6768
290          00:04:53   0.6952     0.6815
392          00:06:36   0.6954     0.6817
498          00:08:22   0.7174     0.7033
990          00:16:36   0.7352     0.7207
1734         00:29:06   0.7378     0.7233
2052         00:34:26   0.7414     0.7268
2108         00:35:22   0.7416     0.7270
3341         00:55:58   0.7449     0.7302
5000         01:23:46   0.7602     0.7452

Page 89: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Application: Sugar-Cane Breeding Prediction

[Figure: fitness (%, axis from 55 to 76) versus generation (α = 0.125), with marks at generations 0, 2, 15, 20, 37, 230, 392, 990, 2052 and 3341]

Page 90: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

GFKIGM Approach

• Genetic-Fuzzy Knowledge-Integration with a set of Global Membership functions
• Consists of two parts:
  • Knowledge encoding
  • Knowledge integration
• Generates a fuzzy rule set associated with a global collection of membership functions for all fuzzy rules

Page 91: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Knowledge Encoding

[Figure: encoding steps. Rule Set + MFS → Intermediary Rule → Fixed-Length Rule String → Variable-Length Rule-Set String + MFS String]

Page 92: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Examples: IRIS Flowers

r̃_q1: IF P.L.=Short Then Setosa
r̃_q2: IF P.L.=Long Then Virginica
r̃_q3: IF P.W.=Medium Then Versicolor
r̃_q4: IF P.W.=Wide Then Virginica

Page 93: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Examples: IRIS Flowers

[Figure: membership functions u(·) of the four attributes]

Attribute             Linguistic terms        Axis marks
Sepal length (S.L.)   Short, Medium, Long     4.3, 5.2, 6.1, 7.0, 7.9
Sepal width (S.W.)    Narrow, Medium, Wide    2.0, 2.6, 3.2, 3.8, 4.4
Petal length (P.L.)   Short, Medium, Long     1.0, 2.4, 3.9, 5.4, 6.9
Petal width (P.W.)    Narrow, Medium, Wide    0.1, 0.7, 1.3, 1.9, 2.5

Classes: Setosa = 1, Versicolor = 2, Virginica = 3

Page 94: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Examples: IRIS Flowers

Rule: IF P.L.=Short Then Setosa

Intermediary representation: IF S.L.=(Short or Medium or Long) and S.W.=(Narrow or Medium or Wide) and P.L.=Short and P.W.=(Narrow or Medium or Wide) Then Setosa

Rule string r̃'_q1:

S.L.   S.W.   P.L.   P.W.   Class
111    111    100    111    100
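The mapping from a rule to its fixed-length rule string can be sketched in Python as follows. One bit is used per linguistic term and per class, and an attribute that the rule does not mention gets all of its bits set, as in the intermediary representation. `ATTRIBUTES`, `CLASSES` and `encode_rule` are illustrative names chosen for this sketch.

```python
ATTRIBUTES = [
    ("S.L.", ["Short", "Medium", "Long"]),
    ("S.W.", ["Narrow", "Medium", "Wide"]),
    ("P.L.", ["Short", "Medium", "Long"]),
    ("P.W.", ["Narrow", "Medium", "Wide"]),
]
CLASSES = ["Setosa", "Versicolor", "Virginica"]


def encode_rule(conditions, cls):
    """Encode IF-conditions and a class as space-separated bit groups."""
    groups = []
    for attribute, terms in ATTRIBUTES:
        # An attribute absent from the rule accepts every linguistic term.
        allowed = conditions.get(attribute, terms)
        groups.append("".join("1" if t in allowed else "0" for t in terms))
    groups.append("".join("1" if c == cls else "0" for c in CLASSES))
    return " ".join(groups)


if __name__ == "__main__":
    print(encode_rule({"P.L.": ["Short"]}, "Setosa"))
```

Because every rule has the same length, a rule set is simply the concatenation of its rule strings, which is what makes the rule-set string variable in length.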

Page 95: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Examples: IRIS Flowers (Cont.)

r̃_q1: IF P.L.=Short Then Setosa
r̃_q2: IF P.L.=Long Then Virginica
r̃_q3: IF P.W.=Medium Then Versicolor
r̃_q4: IF P.W.=Wide Then Virginica

Rule-set string (S.L., S.W., P.L., P.W., Class for each rule):

r̃'_q1:  111 111 100 111 100
r̃'_q2:  111 111 001 111 001
r̃'_q3:  111 111 111 010 010
r̃'_q4:  111 111 111 001 001

MFS string MFS_q (two values per linguistic term):

MF_S.L. (Short, Medium, Long):    5.2,0.9, 6.1,0.9, 7.0,0.9
MF_S.W. (Narrow, Medium, Wide):   2.6,0.6, 3.2,0.6, 3.8,0.6
MF_P.L. (Short, Medium, Long):    2.4,1.5, 3.9,1.5, 5.4,1.5
MF_P.W. (Narrow, Medium, Wide):   0.7,0.6, 1.3,0.6, 1.9,0.6

Page 96: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Knowledge Integration

[Figure: the initial population (RS1 + MFS, RS2 + MFS, ..., RSn + MFS) undergoes the genetic operations (crossover, mutation, fusion), guided by the fitness function, to produce Generation 1 (RS1 + MFS, RS2 + MFS, ..., RSn + MFS).]

Page 97: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Crossover

Parents (rule-set string followed by MFS string; the cut points rscp, mfcp, dcp and endcp mark where the strings are exchanged):

RS̃'1 + MFS1:  101001 ... 011 001 ... 110101 | 5.7,0.8, 6.9,1.1, ..., 1.6,0.7, 1.8,0.9
RS̃'2 + MFS2:  111001 ... 101 010 ... 010101 | 6.1,0.9, 7.0,0.9, ..., 1.3,0.6, 1.5,0.6

After crossover:

O1:  101001 ... 011 010 ... 010101 | 5.7,0.8, 6.9,1.1, ..., 1.6,0.6, 1.5,0.6   (out of sequence)
O2:  111001 ... 101 001 ... 110101 | 6.1,0.9, 7.0,0.9, ..., 1.6,0.7, 1.8,0.9

After rearrangement:

O1:  101001 ... 011 010 ... 010101 | 5.7,0.8, 6.9,1.1, ..., 1.5,0.6, 1.6,0.6
O2:  111001 ... 101 001 ... 110101 | 6.1,0.9, 7.0,0.9, ..., 1.6,0.7, 1.8,0.9

Page 98: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Mutation

[Figure: mutation of RS̃'1 + MFS1 (101001 ... 011 ... 110101 | 5.7,0.8, 6.9,1.1, ...). At the rule-set mutation point rsmp one bit changes from 0 to 1, giving a new rule. At the membership-function mutation point mfmp the value 1.8 changes to 1.2, which leaves the neighbouring values 1.6 and 1.2 out of sequence; rearrangement restores the order 1.2, 1.6.]

Page 99: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Fusion

r̃'_qi: IF (P.L.=Short) Then Class is Setosa
r̃'_qj: IF (P.L.=Short) Then Class is Setosa

r̃'_qi             r̃'_qj             MF_S.L. ... MF_P.W.
111111100111100   111111100100100   5.2,0.9, 6.1,0.9, 7.0,0.9, ..., 0.7,0.6, 1.3,0.6, 1.9,0.6
111111100111100   111111100111100   5.2,0.9, 6.1,0.9, 7.0,0.9, ..., 0.7,0.6, 1.3,0.6, 1.9,0.6

Page 100: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Fusion (Subsumption)

r̃'_qi: IF (P.L.=Short) Then Class is Setosa
r̃'_qj: IF (P.L.=Short) and (P.W.=Narrow) Then Class is Setosa

r̃'_qi             r̃'_qj             MF_S.L. ... MF_P.W.
111111100111100   111111100100100   5.2,0.9, 6.1,0.9, 7.0,0.9, ..., 0.7,0.6, 1.3,0.6, 1.9,0.6

Page 101: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Experiments - Hepatitis Diagnosis

Generation   CPU Time   Accuracy   Fitness
0            00:00:00   0.7688     0.7573
4            00:00:02   0.7867     0.7712
34           00:00:10   0.8228     0.8066
160          00:00:45   0.8450     0.8284
473          00:02:14   0.8542     0.8374
570          00:02:40   0.8554     0.8386
1057         00:04:51   0.8578     0.8409
1495         00:06:46   0.8633     0.8463
1791         00:08:03   0.8656     0.8486
2251         00:10:02   0.8721     0.8550
2580         00:11:27   0.8837     0.8663
2710         00:12:02   0.8895     0.8720
3062         00:13:13   0.8910     0.8735
3342         00:14:49   0.9049     0.8871
3756         00:16:44   0.9056     0.8878
3847         00:17:09   0.9068     0.8890
4000         00:17:51   0.9161     0.8981

Page 102: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Experiments - Hepatitis Diagnosis

[Figure: fitness (%, axis from 75 to 93) versus generation (α = 0.125), with marks at generations 0, 34, 473, 1057, 1791, 2580, 3062, 3756 and 4000]

Page 103: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Application: Sugar-Cane Breeding Prediction

Each knowledge source is encoded as a string 542 units long.

Generation   CPU Time   Accuracy   Fitness
0            00:00:00   0.5506     0.5345
2            00:00:02   0.6726     0.6530
3            00:00:03   0.6802     0.6603
9            00:00:08   0.6807     0.6608
13           00:00:12   0.6864     0.6664
16           00:00:15   0.6868     0.6665
17           00:01:16   0.6922     0.6720
227          00:03:47   0.6928     0.6726
308          00:03:53   0.6944     0.6741
493          00:08:15   0.7153     0.6944
1386         00:23:14   0.7201     0.6991
1637         00:27:27   0.7253     0.7041
2924         00:49:04   0.7267     0.7055
3151         00:52:52   0.7295     0.7082
3300         00:56:22   0.7362     0.7147
5000         01:24:37   0.7485     0.7266

Page 104: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Application: Sugar-Cane Breeding Prediction

[Figure: fitness (%, axis from 52 to 73) versus generation (α = 0.125), with marks at generations 0, 2, 9, 13, 17, 308, 1386, 2924 and 3300]

Page 105: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Comparison of GFKILM and GFKIGM

Experiment: Hepatitis Diagnosis (4000 generations)

Approach   CPU Time (s)   Accuracy   Rule No.
GFKILM     1056           92.90%     4
GFKIGM     1071           91.61%     4

Application: Sugar-Cane Breeding Prediction (5000 generations)

Approach   CPU Time (s)   Accuracy   Rule No.
GFKILM     5026           76.02%     2
GFKIGM     5077           74.85%     2

Page 106: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

ROADMAP

[Figure: roadmap of the approaches, with these labels]

• TPGKI Approach
• TPGFKI Approach
• GKIDSO Approach
• GFKIGM Approach
• GFKILM Approach
• MGKI Approach
• MGFKI Approach
• Pittsburgh Approach
• Michigan Approach
• Vague Knowledge

Page 107: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Why Data Mining?

[Figure: Simon, commodities, supermarket]

If a customer buys milk, then he is likely to buy bread, so...

Page 108: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Mining Association Rules

Bread

Milk

IF bread is bought then milk is bought

Page 109: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

The Role of Data Mining

Preprocess data

Useful patterns

Knowledge and strategy

Page 110: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Mining Steps

Step 1: Define minsup and minconf
        e.g. minsup = 50%, minconf = 50%

Step 2: Find large itemsets

Step 3: Generate association rules

Page 111: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Example: Large Itemsets

Database

TID   Items
100   A C D
200   B C E
300   A B C E
400   B E

Scan database → C1 → L1

C1: Itemset   Sup.        L1: Itemset   Sup.
    {A}       2               {A}       2
    {B}       3               {B}       3
    {C}       3               {C}       3
    {D}       1               {E}       3
    {E}       3

Scan database → C2 → L2

C2: Itemset   Sup.        L2: Itemset   Sup.
    {A B}     1               {A C}     2
    {A C}     2               {B C}     2
    {A E}     1               {B E}     3
    {B C}     2               {C E}     2
    {B E}     3
    {C E}     2

Scan database → C3 → L3

C3: Itemset   Sup.        L3: Itemset   Sup.
    {B C E}   2               {B C E}   2
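The level-wise search for large itemsets in this example can be sketched in Python as follows. It is a compact Apriori-style illustration on the four transactions of the slide with minsup = 50%, not the implementation used in the talk; `apriori` and `DATABASE` are illustrative names.

```python
from itertools import combinations


def apriori(transactions, minsup):
    """Return {itemset: support count} for every large itemset."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]  # C1
    large = {}
    k = 1
    while candidates:
        # Scan the database and keep the candidates that reach minsup (Lk).
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: s for c, s in counts.items() if s / n >= minsup}
        if not level:
            break
        large.update(level)
        k += 1
        # Join Lk-1 with itself, then prune candidates that have a
        # (k-1)-subset which is not large.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return large


DATABASE = [
    frozenset("ACD"),   # TID 100
    frozenset("BCE"),   # TID 200
    frozenset("ABCE"),  # TID 300
    frozenset("BE"),    # TID 400
]

if __name__ == "__main__":
    for itemset, count in apriori(DATABASE, 0.5).items():
        print(sorted(itemset), count)
```

The pruning step is why C3 contains only {B C E}: candidates such as {A B C} are discarded because {A B} is not large.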

Page 112: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Example

Association rule    Confidence
IF BC THEN E        S(BCE)/S(BC) = 2/2
IF BE THEN C        S(BCE)/S(BE) = 2/3
IF CE THEN B        S(BCE)/S(CE) = 2/2
IF B THEN CE        S(BCE)/S(B) = 2/3
IF C THEN BE        S(BCE)/S(C) = 2/3
IF E THEN BC        S(BCE)/S(E) = 2/3
IF A THEN C         S(AC)/S(A) = 2/2
IF C THEN A         S(AC)/S(C) = 2/3
IF B THEN C         S(BC)/S(B) = 2/3
IF C THEN B         S(BC)/S(C) = 2/3
IF B THEN E         S(BE)/S(B) = 3/3
IF E THEN B         S(BE)/S(E) = 3/3
IF C THEN E         S(CE)/S(C) = 2/3
IF E THEN C         S(CE)/S(E) = 2/3
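Rule generation from the large itemsets can be sketched in Python as below. The support counts are those of the example, and the confidence of IF X THEN Y is computed as S(X ∪ Y)/S(X); `SUPPORT` and `association_rules` are names introduced for this sketch only.

```python
from itertools import combinations

# Support counts of the large itemsets in the example.
SUPPORT = {
    frozenset(k): v
    for k, v in {
        "A": 2, "B": 3, "C": 3, "E": 3,
        "AC": 2, "BC": 2, "BE": 3, "CE": 2,
        "BCE": 2,
    }.items()
}


def association_rules(support, minconf):
    """List (antecedent, consequent, confidence) with confidence >= minconf."""
    found = []
    for itemset, count in support.items():
        # Every non-empty proper subset of the itemset can be an antecedent.
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                confidence = count / support[frozenset(antecedent)]
                if confidence >= minconf:
                    consequent = tuple(sorted(itemset - set(antecedent)))
                    found.append((antecedent, consequent, confidence))
    return found


if __name__ == "__main__":
    for antecedent, consequent, confidence in association_rules(SUPPORT, 0.5):
        print("IF", "".join(antecedent), "THEN", "".join(consequent),
              round(confidence, 2))
```

With minconf = 50% all fourteen candidate rules of the example qualify, since the lowest confidence is 2/3.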

Page 113: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Integrating Mined Knowledge

[Figure: three branches send their mined association rules to the headquarters]

Branch 1: A → B. If customers buy A, then they will buy B.
Branch 2: B, C → D. If customers buy B and C, then they will buy D.
Branch 3: A, C → E. If customers buy A and C, then they will buy E.

Headquarters: association rules A → B; B, C → D; A, C → E; ...

Page 114: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Integration of Association Rules

Xindong Wu and Shichao Zhang (2003), Synthesizing High-Frequency Rules from Different Data Sources

Known data sources

[Figure: databases DB1, DB2, ..., DBn yield rule sets RD1, RD2, ..., RDn (for example AB→C, A→D, B→E), which are synthesized into the high-frequency rules of the GRB by:

• Weighting
• Ranking]

Page 115: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Integration of Association Rules (Cont.)

Xindong Wu and Shichao Zhang (2003), Synthesizing High-Frequency Rules from Different Data Sources

Unknown data sources

[Figure: the same rule collected from the Internet (Web, books, journals) with different confidences (X→Y conf=0.7, X→Y conf=0.72, X→Y conf=0.68) is synthesized into X→Y conf=? by:

• Clustering method]

Page 116: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Integration of Association Rules

Framework

[Figure: for each source i = 1, ..., n, a data mining method is applied to Transaction database i and produces Fuzzy Rule Set i with Membership Functions i, which is converted to an intermediary representation. Genetic-fuzzy knowledge integration (encoding, then integration using sample data) combines the intermediary representations into a fuzzy rule set + membership functions.]

Page 117: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Data Mining Method

Mining Fuzzy Association Rules and Membership Functions

[Figure, part 1, Mining Membership Functions: a population of chromosomes (Chromosome 1, Chromosome 2, Chromosome 3, ..., Chromosome q) encodes Membership Function Set 1, 2, 3, ..., q. The genetic-fuzzy MF acquisition process evaluates each chromosome by fuzzy mining for large 1-itemsets on the transaction database, using the minimum support, and outputs the final membership function set.

Part 2, Mining Fuzzy Association Rules: fuzzy mining takes the transaction database, the final membership function set, the minimum support and the minimum confidence, and outputs the fuzzy association rules.]

Page 118: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Mining Membership Functions

Example

[Figure: membership value versus quantity for each item, with linguistic terms Low, Middle and High]

Item       Low   Middle   High
milk       5     10       15
bread      6     12       18
cookies    3     6        9
beverage   4     8        12

Chromosome encoding (one pair of values R_jk per linguistic term):

MF1 (milk):       R11 = 5, 5    R12 = 10, 5   R13 = 15, 5
MF2 (bread):      R21 = 6, 6    R22 = 12, 6   R23 = 18, 6
MF3 (cookies):    R31 = 3, 3    R32 = 6, 3    R33 = 9, 3
MF4 (beverage):   R41 = 4, 4    R42 = 8, 4    R43 = 12, 4
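A minimal Python sketch of this chromosome layout, assuming that each R_jk is a (center, spread) pair and that the pairs of all items are concatenated in item order. `MEMBERSHIP`, `encode` and `decode` are illustrative names, not part of the original slides.

```python
TERMS = ("Low", "Middle", "High")

# Assumed (center, spread) pairs per item, in the order Low, Middle, High.
MEMBERSHIP = {
    "milk": [(5, 5), (10, 5), (15, 5)],
    "bread": [(6, 6), (12, 6), (18, 6)],
    "cookies": [(3, 3), (6, 3), (9, 3)],
    "beverage": [(4, 4), (8, 4), (12, 4)],
}


def encode(membership):
    """Flatten all pairs into one chromosome: MF1, MF2, MF3, MF4."""
    return [value for pairs in membership.values() for pair in pairs for value in pair]


def decode(chromosome, items, terms=TERMS):
    """Rebuild the per-item membership functions from a chromosome."""
    pairs = list(zip(chromosome[0::2], chromosome[1::2]))
    n = len(terms)
    return {item: pairs[i * n:(i + 1) * n] for i, item in enumerate(items)}


if __name__ == "__main__":
    chromosome = encode(MEMBERSHIP)
    print(chromosome)
    print(decode(chromosome, list(MEMBERSHIP)))
```

A flat list of numbers is convenient for genetic operators, because crossover and mutation can then work on positions without knowing which item or term a gene belongs to.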

Page 119: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

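Fitness Function

Formally:

$$f(C_q) = \frac{|L_1|}{\mathrm{Suitability}(C_q)}$$

where $|L_1|$ is the number of large 1-itemsets obtained with chromosome $C_q$.

The two bad kinds of membership functions:

(a) Low, Middle and High at quantities 5, 8 and 9
(b) Low, Middle and High at quantities 5, 20 and 25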

Page 120: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Mining Fuzzy Association Rules

Our fuzzy mining algorithm (2001)

Trade-off between time complexity and number of rules for fuzzy mining from quantitative data

Page 121: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Conclusions

Classification Rules

• A genetic knowledge-integration framework and four knowledge-integration methodologies are proposed:
  • GKIDSO Approach
  • TPGKI Approach
  • GFKILM Approach
  • GFKIGM Approach
• Two real-world applications have been developed with our approaches:
  • A self-integrating knowledge-based brain tumor diagnostic system
  • A sugar-cane breeding prediction system

Page 122: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Conclusions (Cont.)

Advantages

• Only a little computation time is needed
• A large number of rule sets can be effectively integrated
• It is objective
• It may find new knowledge
• Domain experts need not intervene when conflicts occur

Page 123: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Conclusions (Cont.)

Disadvantages

• All knowledge sources must be pre-processed so that they can be represented as rule strings
• A set of data must be collected to evaluate the resulting knowledge
• If too few knowledge sources are available, some dummy knowledge sources are inserted into the initial population

Page 124: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Conclusions (Cont.)

Fuzzy Association Rules

• Fuzzy mining + GA-based evolution

Page 125: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Future Work

• Heterogeneous knowledge representation
• Vocabulary

Page 126: Knowledge Integration by Genetic Algorithms Prof. Tzung-Pei Hong Department of Electrical Engineering National University of kaohsiung

Thank You