Chapter 10
ASSOCIATION RULES
Presentation slides on Association Rules (Decision Support System)
By:
Aris D. (13406054)
Ricky A. (13406058)
Nadia FR. (13406069)
Amirah K. (13406070)
Paramita AW. (13406091)
Bahana W. (13406102)
Introduction
• Affinity Analysis
The study of attributes or characteristics that “go together”.
• Market Basket Analysis
The method uncovers rules for quantifying the relationship between two or more attributes:
“If antecedent, then consequent”
Affinity Analysis & Market Basket Analysis
• Example: A supermarket may find that of the 1,000 customers shopping on a Thursday night, 200 bought diapers, and of the 200 who bought diapers, 50 also bought beer.
The association rule: “If buy diapers, then buy beer”, with support of 50/1000 = 5% and confidence of 50/200 = 25%.
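A quick check of these figures, as a minimal sketch (the counts come from the example above):

```python
# Counts from the supermarket example above
total = 1000           # customers shopping on Thursday night
diapers = 200          # of those, bought diapers
diapers_and_beer = 50  # of those, also bought beer

support = diapers_and_beer / total       # P(diapers AND beer)
confidence = diapers_and_beer / diapers  # P(beer | diapers)
print(f"support = {support:.0%}, confidence = {confidence:.0%}")  # 5%, 25%
```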
Affinity Analysis & Market Basket Analysis (2)
Examples in business & research:
• Investigating the proportion of subscribers to your company’s cell phone plan that respond positively to an offer of a service upgrade
• Examining the proportion of children whose parents read to them who are themselves good readers
• Predicting degradation in telecommunications networks
• Finding out which items in a supermarket are purchased together and which are never purchased together
• Determining the proportion of cases in which a new drug will exhibit dangerous side effects
Affinity Analysis & Market Basket Analysis (3)
• The number of possible association rules grows exponentially in the number of attributes.
• With k binary (yes/no) attributes, there are k·[2^(k−1)] possible association rules.
• Example: a convenience store that sells 100 items. Possible association rules = 100·[2^99] ≈ 6.4 × 10^31 (see the computation below).
• The a priori algorithm reduces the search problem to a more manageable size.
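A quick computation of the k·2^(k−1) count, as a minimal sketch (the function name is ours):

```python
def possible_rules(k: int) -> int:
    """Number of possible association rules over k binary attributes."""
    return k * 2 ** (k - 1)

print(possible_rules(10))            # 5120
print(f"{possible_rules(100):.2e}")  # ~6.34e+31, the convenience-store example
```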
Notation for Data Representation in Market Basket Analysis
• A farmer sells the item set I = {asparagus, beans, broccoli, corn, green peppers, squash, tomatoes}.
• A customer puts items in a basket: a subset of I, e.g., {broccoli, corn}.
• The subset does not keep track of how much of each item is purchased, only which items are present.
Transactional Data Format
Tabular Data Format
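The two layouts can be sketched as follows, as a minimal illustration using the vegetable items from the farmer example above (the exact figures from the original slides are not reproduced here):

```python
# Transactional format: one (transaction_id, item) pair per row
transactional = [
    (1, "broccoli"), (1, "corn"),
    (2, "asparagus"), (2, "corn"), (2, "squash"),
]

# Tabular (flag) format: one row per transaction, one 0/1 flag per item
items = ["asparagus", "beans", "broccoli", "corn",
         "green peppers", "squash", "tomatoes"]
baskets = {}
for tid, item in transactional:
    baskets.setdefault(tid, set()).add(item)

for tid in sorted(baskets):
    flags = [1 if it in baskets[tid] else 0 for it in items]
    print(tid, flags)
```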
Support, Confidence, Frequent Itemsets, & the Apriori Property
• Example:
D: the set of transactions represented in Table 10.1
T: a transaction in D, representing a set of items
I: the set of all items
Set of items A: {beans, squash}
Set of items B: {asparagus}
THEN an association rule takes the form “if A, then B” (A ⇒ B), where A and B are PROPER subsets of I and are mutually exclusive.
Table of Transactions Made
Support and Confidence
• Support, s, is the proportion of transactions in D that contain both A and B:
support = P(A ∩ B) = (number of transactions containing both A and B) / (total number of transactions)
• Confidence, c, is a measure of the accuracy of the rule:
confidence = P(B|A) = P(A ∩ B) / P(A) = (number of transactions containing both A and B) / (number of transactions containing A)
• Analysts prefer RULES with HIGH support AND HIGH confidence.
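These definitions translate directly into code; a minimal sketch (the baskets below are illustrative, not the transactions of Table 10.1):

```python
def support_confidence(baskets, antecedent, consequent):
    """Support and confidence of the rule: if antecedent, then consequent."""
    A, B = set(antecedent), set(consequent)
    n_A  = sum(1 for t in baskets if A <= t)          # transactions containing A
    n_AB = sum(1 for t in baskets if (A | B) <= t)    # transactions containing A and B
    return n_AB / len(baskets), (n_AB / n_A if n_A else 0.0)

baskets = [{"beans", "squash", "asparagus"},
           {"beans", "squash"},
           {"corn", "tomatoes"}]
s, c = support_confidence(baskets, {"beans", "squash"}, {"asparagus"})
print(f"support = {s:.2f}, confidence = {c:.2f}")  # support = 0.33, confidence = 0.50
```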
Frequent Itemset Definitions
• An itemset is a set of items contained in I; a k-itemset contains k items, e.g., {beans, squash} is a 2-itemset.
• The itemset frequency is the number of transactions that contain the particular itemset.
• A frequent itemset is an itemset that occurs at least a certain minimum number of times, i.e., has itemset frequency ≥ φ.
• Example: set φ = 4; then itemsets that occur four or more times are said to be frequent.
• Mining Association Rules is a two-step process:
1. Find all frequent itemsets (all itemsets with frequency ≥ φ).
2. From the frequent itemsets, generate association rules satisfying the minimum support and confidence conditions.
The Apriori Property
• The Apriori property states that if an itemset Z is not frequent, then adding another item A to the itemset Z will not make Z more frequent. This helpful property significantly reduces the search space for the a priori algorithm.
How does the Apriori Algorithm Work?
• Part 1: Generating Frequent Itemsets
• Part 2: Generating Association Rules
Generating Frequent Itemsets
• Example: let φ = 4, so that an itemset is frequent if it occurs four or more times in D.
• F1 = {asparagus, beans, broccoli, corn, green peppers, squash, tomatoes}
• To generate Fk, the algorithm first constructs a set Ck of candidate k-itemsets by joining Fk−1 with itself, then prunes Ck using the a priori property. For k = 2, Ck consists of all the combinations of vegetables in Table 10.4.
• Generating F3 is not much different from the steps for F2, but uses k = 3.
Table 10.3 (pg.183)
Table 10.4 (pg. 185)
• However, consider s = {beans, corn, squash}: the subset {corn, squash} has frequency 3 < 4 = φ, so {corn, squash} is not frequent.
• By the a priori property, {beans, corn, squash} therefore cannot be frequent; it is pruned and does not appear in F3.
• The same holds for s = {beans, squash, tomatoes}, since the frequency of one of its subsets is < 4. (A runnable sketch of the whole join-and-prune procedure follows below.)
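Putting the join and prune steps together, a minimal sketch of frequent-itemset generation (the baskets below are illustrative stand-ins for Table 10.3, so the demo uses φ = 2 rather than 4):

```python
from itertools import combinations

def apriori_itemsets(baskets, phi):
    """Return {frequent itemset: frequency} for itemsets occurring >= phi times."""
    def freq(s):
        return sum(1 for t in baskets if s <= t)

    # F1: frequent 1-itemsets
    level = {frozenset([i]) for t in baskets for i in t}
    level = {s for s in level if freq(s) >= phi}
    frequent = {s: freq(s) for s in level}
    k = 2
    while level:
        # Join step: candidate k-itemsets from frequent (k-1)-itemsets
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step (a priori property): every (k-1)-subset must be frequent
        candidates = {c for c in candidates
                      if all(frozenset(sub) in level
                             for sub in combinations(c, k - 1))}
        level = {c for c in candidates if freq(c) >= phi}
        frequent.update({c: freq(c) for c in level})
        k += 1
    return frequent

baskets = [{"beans", "squash"}, {"beans", "squash", "asparagus"},
           {"beans", "corn"}, {"beans", "squash"}, {"corn", "squash"}]
print(apriori_itemsets(baskets, phi=2))
```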
Generating Association Rules
For each frequent itemset s:
1. Generate all subsets ss of s.
2. For the association rule R: ss ⇒ (s − ss), generate R if it fulfills the minimum confidence requirement. Here (s − ss) is the set s without ss.
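A minimal sketch of this rule-generation step, assuming the frequent itemsets and their frequencies come from a procedure like the one above:

```python
from itertools import combinations

def generate_rules(frequent, n_transactions, min_confidence):
    """Return rules ss => (s - ss) meeting the minimum confidence requirement."""
    rules = []
    for s, f_s in frequent.items():
        for size in range(1, len(s)):
            for ss in map(frozenset, combinations(s, size)):
                # By the a priori property, every subset ss of a frequent s is
                # itself frequent, so its frequency is already in the table.
                confidence = f_s / frequent[ss]
                if confidence >= min_confidence:
                    rules.append((set(ss), set(s - ss),
                                  f_s / n_transactions, confidence))
    return rules
```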
Example: Two Antecedents
• Total transactions = 14
• Transactions including asparagus and beans = 5
• Transactions including asparagus and squash = 5
• Transactions including beans and squash = 6
• Rules ranked by support × confidence
• Minimum confidence: 80%
Clementine Generating Association Rules

Clementine Generating Association Rules (2)
• In Clementine, “support” means the number of occurrences of the antecedent, which differs from the definition given earlier.
• The first column indicates the number of times the antecedent occurs.
• To find the actual “support” as defined earlier, multiply Clementine’s support by the confidence, as in the example below.
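For instance, with illustrative numbers (these are not taken from the chapter’s Clementine output):

```python
clementine_support = 5 / 14  # proportion of transactions containing the antecedent
confidence = 0.80            # P(consequent | antecedent)
actual_support = clementine_support * confidence  # P(antecedent AND consequent)
print(round(actual_support, 4))  # 0.2857
```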
Extension from Flag Data to General Categorical Data
• Association rules are not only for flag (Boolean) data.
• The a priori algorithm can also be applied to categorical data.
Example using Clementine
• Recall the normalized adult data set from Chapters 6 and 7.
Information-Theoretic Approach: Generalized Rule Induction (GRI) Method
Why GRI?
• The a priori algorithm is not well equipped to handle numerical attributes; they must first be discretized.
• Discretization can lead to loss of information.
• GRI can handle both categorical and numerical variables as inputs, but still requires a categorical variable as the output.
Generalized Rule Induction Method (2)
J-Measure
• p(x): probability of the value of x (the antecedent)
• p(y): probability of the value of y (the consequent)
• p(y|x): conditional probability of y given that x has occurred
J = p(x) · [ p(y|x) · ln( p(y|x) / p(y) ) + (1 − p(y|x)) · ln( (1 − p(y|x)) / (1 − p(y)) ) ]
Generalized Rule Induction Method (3)
• The J-measure quantifies a rule’s “interestingness”.
• In GRI, the user specifies how many association rules should be reported.
• If the “interestingness” of a new rule exceeds the current minimum J in the rule table, the new rule is inserted and the rule with the minimum J is eliminated.
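The J-measure translates directly into code; a minimal sketch (the function name is ours):

```python
import math

def j_measure(p_x, p_y, p_y_given_x):
    """J-measure ('interestingness') of the rule x => y.
    Assumes 0 < p(y) < 1 and 0 < p(y|x) < 1, so both logarithms are defined."""
    return p_x * (p_y_given_x * math.log(p_y_given_x / p_y)
                  + (1 - p_y_given_x) * math.log((1 - p_y_given_x) / (1 - p_y)))
```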
Application of GRI
p(x): female, never married
p(x) = 0.1463
Application of GRI (2)
p(y) : work class = private
p(y) = 0.6958
Application of GRI (3)
p(y|x): work class = private, given female, never married
p(y|x) = conditional probability = 0.763
Application of GRI (4)
Calculation:
J = p(x) · [ p(y|x) · ln( p(y|x) / p(y) ) + (1 − p(y|x)) · ln( (1 − p(y|x)) / (1 − p(y)) ) ]
  = 0.1463 · [ 0.763 · ln( 0.763 / 0.6958 ) + 0.237 · ln( 0.237 / 0.3042 ) ]
  = 0.1463 · [ 0.763 · ln(1.0966) + 0.237 · ln(0.7791) ]
  = 0.001637
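Plugging the slide’s numbers into the formula reproduces the result:

```python
import math

p_x, p_y, p_y_given_x = 0.1463, 0.6958, 0.763
J = p_x * (p_y_given_x * math.log(p_y_given_x / p_y)
           + (1 - p_y_given_x) * math.log((1 - p_y_given_x) / (1 - p_y)))
print(f"{J:.6f}")  # ~0.001636, matching the slide's 0.001637 up to rounding
```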
When not to use Association Rules
• Association rules chosen a priori can be selected based on:
▫ Confidence
▫ Confidence Difference
▫ Confidence Ratio
• Association Rules need to be applied with care because the results are sometimes unreliable.
When not to use Association Rules (2)
Association rules chosen a priori, based on confidence
• Applying this association rule reduces the probability of randomly selecting the desired data.
• Even though the rule is useless, the software still reports it, probably because the default ranking mechanism for the a priori algorithm is confidence.
• We should never simply believe computer output without making the effort to understand the models and mechanisms underlying the results.
When not to use Association Rules (3)
Association rules chosen a priori, based on confidence

When not to use Association Rules (4)
Association rules chosen a priori, based on confidence difference
• A random selection from the database would have provided more effective results (with no useless rules reported) than applying the association rule.
• The confidence difference criterion favors the rule providing the greatest increase in confidence from the prior to the posterior.
• The evaluation measures the absolute difference between the prior and posterior confidences.
When not to use Association Rules (5)
Association rules chosen a priori, based on confidence difference

When not to use Association Rules (6)
Association rules chosen a priori, based on confidence ratio
• Some analysts prefer to use the confidence ratio to evaluate potential rules.
• Here, the confidence difference criterion yielded the very same rules as the confidence ratio criterion.
When not to use Association Rules (7)
Association rules chosen a priori, based on confidence ratio
• Example: If Marital_Status = Divorced, then Sex = Female, with p(y) = 0.3317 and p(y|x) = 0.60.
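For this example, the two evaluation measures can be computed as follows; a sketch, where the confidence ratio formula shown, 1 − min(p(y|x)/p(y), p(y)/p(y|x)), is our assumption about the form Clementine uses:

```python
p_y, p_y_given_x = 0.3317, 0.60  # prior and posterior confidence from the example

# Confidence difference: absolute difference between posterior and prior
conf_difference = abs(p_y_given_x - p_y)                    # 0.2683

# Confidence ratio (assumed definition)
conf_ratio = 1 - min(p_y_given_x / p_y, p_y / p_y_given_x)  # ~0.4472
print(round(conf_difference, 4), round(conf_ratio, 4))
```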
Do Association Rules Represent Supervised or Unsupervised Learning?
• Supervised learning:
▫ The target variable is prespecified.
▫ The algorithm is provided with a rich collection of examples where possible associations between the target variable and the predictor variables may be uncovered.
• Unsupervised learning:
▫ No target variable is identified explicitly.
▫ The algorithm searches for patterns and structure among all the variables.
• Association rules are generally used for unsupervised learning, but can also be applied in supervised learning for classification tasks.
Local Patterns Versus Global Models
• Model: a global description or explanation of a data set.
• Pattern: an essential local feature of the data.
• Association rules are well suited to uncovering local patterns in data: applying the “if” clause drills down deep into the data set, uncovering a hidden local pattern that might be relevant.
• Finding local patterns is one of the most important goals of data mining; it can lead to profitable new initiatives.