elnaz delpisheh york university department of computer science and engineering april 13, 2015...

Elnaz Delpisheh York University Department of Computer Science and Engineering July 4, 2022 Identifying Interesting Association Rules with Genetic Algorithms


Page 1: Elnaz Delpisheh York University Department of Computer Science and Engineering April 13, 2015 Identifying Interesting Association Rules with Genetic Algorithms

Elnaz Delpisheh

York University

Department of Computer Science and Engineering

April 21, 2023

Identifying Interesting Association Rules with Genetic Algorithms


Data mining

[Diagram: Data (too much data) → Data Mining → Association rules]

• I = {i1, i2, ..., in} is a set of items.
• D = {t1, t2, ..., tm} is a transactional database.
• Each transaction ti is a nonempty subset of I.
• An association rule is of the form A → B, where A and B are itemsets, A ⊂ I, B ⊂ I, and A ∩ B = ∅.
• The Apriori algorithm is the one most commonly used for association rule mining.
• Example: {milk, eggs} → {bread}.


Apriori Algorithm

TID     List of item IDs
T100    I1, I2, I3
T200    I2, I4
T300    I2, I3
T400    I1, I2, I4
T500    I1, I3
T600    I2, I3
T700    I1, I3
T800    I1, I2, I3, I5
T900    I1, I2, I3
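As an illustration of how Apriori might process the nine transactions listed above, the following Python sketch counts candidate itemsets level by level and keeps those that reach a minimum support count. The function name `apriori` and the threshold of 2 are assumptions made for this example; the slides do not state a threshold.

```python
from itertools import combinations

# Transactions copied from the table above (TID -> set of item IDs).
TRANSACTIONS = {
    "T100": {"I1", "I2", "I3"},
    "T200": {"I2", "I4"},
    "T300": {"I2", "I3"},
    "T400": {"I1", "I2", "I4"},
    "T500": {"I1", "I3"},
    "T600": {"I2", "I3"},
    "T700": {"I1", "I3"},
    "T800": {"I1", "I2", "I3", "I5"},
    "T900": {"I1", "I2", "I3"},
}


def apriori(transactions, min_count):
    """Return {frozenset(itemset): support count} for all frequent itemsets."""
    baskets = list(transactions.values())
    items = sorted(set().union(*baskets))
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for b in baskets if c <= b) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-item subset of a candidate must be frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


frequent_itemsets = apriori(TRANSACTIONS, min_count=2)  # threshold is assumed
```

With this assumed threshold, I5 is dropped after the first pass because it occurs in only one transaction.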


Apriori Algorithm (Cont.)


Association rule mining

[Diagram: Data (too much data) → Data Mining → Association rules (too many association rules)]


Interestingness criteria

• Comprehensibility
• Conciseness
• Diversity
• Generality
• Novelty
• Utility
• ...


Interestingness measures

Subjective measures

• Data and the user’s prior knowledge are considered.
• Comprehensibility, novelty, surprisingness, utility.

Objective measures

• The structure of an association rule is considered.
• Conciseness, diversity, generality, peculiarity.
• Example: Support
  o It represents the generality of a rule.
  o It counts the number of transactions containing both A and B.
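The support measure described above can be sketched in a few lines of Python. The function name `support_count` and the shopping baskets are hypothetical and serve only to illustrate the definition.

```python
def support_count(transactions, antecedent, consequent):
    """Number of transactions containing every item of both A and B."""
    both = set(antecedent) | set(consequent)
    return sum(1 for t in transactions if both <= set(t))


# Hypothetical baskets, used only to illustrate the definition.
baskets = [
    {"milk", "eggs", "bread"},
    {"milk", "bread"},
    {"eggs"},
    {"milk", "eggs", "bread", "jam"},
]
count = support_count(baskets, {"milk", "eggs"}, {"bread"})
```

Dividing the count by the number of transactions gives support as a fraction, which is the form usually compared against a threshold.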


Drawbacks of objective measures

• Database dependence
  o Lack of knowledge about the database
• Threshold dependence

Solution

• Multiple database reanalysis

Problem

  o Large number of disk I/O operations

Database independence


Genetic algorithm-based learning (ARMGA)

1. Initialize the population.
2. Evaluate the individuals in the population.
3. Repeat until a stopping criterion is met:
   A. Select individuals from the current population.
   B. Recombine them to obtain more individuals.
   C. Evaluate the new individuals.
   D. Replace some or all of the individuals of the current population with offspring.
4. Return the best individual seen so far.
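The loop above can be sketched generically in Python. This is not the ARMGA implementation: the function `genetic_search`, the tournament selection, the fixed generation budget, and the toy bit-counting problem are all assumptions chosen to keep the example small and runnable.

```python
import random


def genetic_search(fitness, random_individual, recombine,
                   pop_size=20, generations=40, seed=0):
    """Generic GA loop; returns (best_fitness, best_individual)."""
    rng = random.Random(seed)
    # 1. Initialize population.
    population = [random_individual(rng) for _ in range(pop_size)]
    # 2. Evaluate individuals in population.
    scored = [(fitness(ind), ind) for ind in population]
    best = max(scored, key=lambda pair: pair[0])

    def pick():
        # A. Select: binary tournament (an assumed selection scheme).
        first, second = rng.sample(scored, 2)
        return first[1] if first[0] >= second[0] else second[1]

    # 3. Repeat until a stopping criterion is met (here: a generation budget).
    for _ in range(generations):
        # B. Recombine selected parents to obtain new individuals.
        offspring = [recombine(pick(), pick(), rng) for _ in range(pop_size)]
        # C. Evaluate new individuals.
        # D. Replace the whole current population with the offspring.
        scored = [(fitness(ind), ind) for ind in offspring]
        generation_best = max(scored, key=lambda pair: pair[0])
        if generation_best[0] > best[0]:
            best = generation_best
    # 4. Return the best individual seen so far.
    return best


# Toy problem (hypothetical): maximise the number of 1-bits in a 12-bit list.
def random_bits(rng, length=12):
    return [rng.randint(0, 1) for _ in range(length)]


def one_point_with_mutation(parent_a, parent_b, rng, rate=0.05):
    cut = rng.randrange(1, len(parent_a))
    child = parent_a[:cut] + parent_b[cut:]
    # Flip each bit with a small probability.
    return [bit ^ 1 if rng.random() < rate else bit for bit in child]


best_fitness, best_bits = genetic_search(sum, random_bits, one_point_with_mutation)
```

Replacing the entire population in step D is one of the options the outline allows; keeping some parents would be an equally valid reading.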


ARMGA Modeling

Given an association rule X → Y

Requirement: Conf(X → Y) > Supp(Y)

Aim is to maximise [formula not preserved in the transcript]
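The requirement Conf(X → Y) > Supp(Y) can be checked directly from transaction counts. The sketch below uses the standard definitions, with Supp as a fraction of transactions and Conf(X → Y) = Supp(X ∪ Y) / Supp(X). The helper names and the sample baskets are hypothetical, and the quantity ARMGA maximises is not reproduced here.

```python
def count(transactions, itemset):
    """Number of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t))


def supp(transactions, itemset):
    """Support as a fraction of all transactions."""
    return count(transactions, itemset) / len(transactions)


def conf(transactions, x, y):
    """Confidence of X -> Y: share of X-transactions that also contain Y."""
    count_x = count(transactions, x)
    if count_x == 0:
        return 0.0
    return count(transactions, set(x) | set(y)) / count_x


def meets_requirement(transactions, x, y):
    """True when Conf(X -> Y) > Supp(Y)."""
    return conf(transactions, x, y) > supp(transactions, y)


# Hypothetical baskets, used only to exercise the functions.
baskets = [
    {"milk", "eggs", "bread"},
    {"milk", "eggs", "bread"},
    {"milk", "eggs"},
    {"bread"},
    {"jam"},
]
ok = meets_requirement(baskets, {"milk", "eggs"}, {"bread"})
```

In these baskets the rule {milk, eggs} → {bread} has confidence 2/3 against a consequent support of 3/5, so it satisfies the requirement.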


ARMGA Encoding

Michigan Strategy

Given an association k-rule X → Y, where X, Y ⊂ I, I = {i1, i2, ..., in} is a set of items, and X ∩ Y = ∅.

For example: {A1, ..., Aj} → {Aj+1, ..., Ak}


ARMGA Encoding (Cont.)

The encoding above depends heavily on the length of the chromosome.

We use another type of encoding:

Given a set of items {A, B, C, D, E, F}, the association rule ACF → B is encoded as follows:

00  11  00  01  01  00
A   B   C   D   E   F

• 00: item is in the antecedent
• 11: item is in the consequent
• 01/10: item is absent
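A minimal Python sketch of this two-bit encoding follows. The function names `encode_rule` and `decode_rule` are hypothetical, and the encoder always writes "01" for an absent item, which is an arbitrary choice since "10" carries the same meaning.

```python
ANTECEDENT, CONSEQUENT, ABSENT = "00", "11", "01"


def encode_rule(items, antecedent, consequent):
    """Encode a rule as two bits per item, in the fixed order of items."""
    antecedent, consequent = set(antecedent), set(consequent)
    if antecedent & consequent:
        raise ValueError("antecedent and consequent must be disjoint")
    genes = []
    for item in items:
        if item in antecedent:
            genes.append(ANTECEDENT)
        elif item in consequent:
            genes.append(CONSEQUENT)
        else:
            genes.append(ABSENT)  # "10" would mean the same thing
    return "".join(genes)


def decode_rule(items, chromosome):
    """Recover (antecedent, consequent) item lists from the bit string."""
    antecedent, consequent = [], []
    for index, item in enumerate(items):
        gene = chromosome[2 * index: 2 * index + 2]
        if gene == ANTECEDENT:
            antecedent.append(item)
        elif gene == CONSEQUENT:
            consequent.append(item)
        # "01" and "10" both mark the item as absent.
    return antecedent, consequent


ITEMS = ["A", "B", "C", "D", "E", "F"]
chromosome = encode_rule(ITEMS, {"A", "C", "F"}, {"B"})
```

Because every item always occupies exactly two bits, the chromosome length depends only on the number of items and not on the length of the rule.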


ARMGA Operators

• Select
• Crossover
• Mutation


ARMGA Operators-Select

Select(c, ps): acts as a filter on the chromosome.

• c: chromosome
• ps: pre-specified probability
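The slide gives only the signature of Select, so the following Python sketch rests on an assumption: a chromosome passes the filter when a uniform random draw falls below ps, without looking at the chromosome's contents. The actual ARMGA rule may differ, for instance by also taking fitness into account. The helper `filter_population` is hypothetical.

```python
import random


def select(chromosome, ps, rng=random):
    """Assumed acceptance rule: keep the chromosome with probability ps."""
    return rng.random() < ps


def filter_population(population, ps, rng=random):
    """Apply select to every chromosome and keep the ones that pass."""
    return [c for c in population if select(c, ps, rng)]


rng = random.Random(0)
kept = filter_population(list(range(1000)), ps=0.5, rng=rng)
```

Under this assumption, ps = 0 rejects every chromosome and ps = 1 accepts every chromosome.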


ARMGA Operators-Crossover

This operation uses a two-point strategy.
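A two-point strategy can be sketched in Python as follows. This is the textbook form of two-point crossover, in which the parents exchange the segment between two cut points; it is not necessarily identical to the ARMGA operator. The function name and the optional `cuts` parameter are assumptions added so the behaviour can be reproduced.

```python
import random


def two_point_crossover(parent_a, parent_b, rng=random, cuts=None):
    """Swap the segment between two cut points; returns two children."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if cuts is None:
        # Draw two distinct cut positions in 0..len and order them.
        low, high = sorted(rng.sample(range(len(parent_a) + 1), 2))
    else:
        low, high = cuts
    child_a = parent_a[:low] + parent_b[low:high] + parent_a[high:]
    child_b = parent_b[:low] + parent_a[low:high] + parent_b[high:]
    return child_a, child_b


# Works on any sliceable sequence, for example bit strings.
children = two_point_crossover("AAAAAA", "BBBBBB", cuts=(2, 4))
```

With cut points 2 and 4, the children are "AABBAA" and "BBAABB": the middle segment is exchanged and the outer segments stay with their original parent.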


ARMGA Operators-Mutate


ARMGA Initialization


ARMGA Algorithm


Empirical studies and Evaluation

• Implement the entire procedure using Visual C++.
• Use WEKA to produce interesting association rules.
• Compare the results.
