Apriori.rb - LA Ruby Presentation

Apriori: A Ruby wrapper for Christian Borgelt’s implementation of Agrawal et al.’s algorithm

Upload: nate-murray

Post on 22-Apr-2015

DESCRIPTION

Nate Murray's talk on Apriori.rb, a Ruby gem that wraps Christian Borgelt’s apriori.c software for finding frequently purchased itemsets

TRANSCRIPT

Page 1: Apriori.rb - LA Ruby Presentation

Apriori

A Ruby wrapper for Christian Borgelt’s implementation of Agrawal et al.’s algorithm

Page 2: Apriori.rb - LA Ruby Presentation

what does it do?

Page 3: Apriori.rb - LA Ruby Presentation

picture of a grocery store

Page 7: Apriori.rb - LA Ruby Presentation

overview

• Find regularities in shopping behavior

• Market Basket Analysis

• Sets of products

Page 8: Apriori.rb - LA Ruby Presentation

suggest items to a customer

Page 12: Apriori.rb - LA Ruby Presentation

association rules

“A customer who buys apples buys cheese with 30% certainty”

Confidence

Page 13: Apriori.rb - LA Ruby Presentation

why would we want to do this?

Page 14: Apriori.rb - LA Ruby Presentation

picture of “buy this too”

Page 15: Apriori.rb - LA Ruby Presentation

Example

Page 20: Apriori.rb - LA Ruby Presentation

[Figure: grid of ✓ marks showing which items were purchased by which customers; labels: Items, Customers, purchased]

Page 21: Apriori.rb - LA Ruby Presentation

problem:

Page 22: Apriori.rb - LA Ruby Presentation

too many possible rules

Page 23: Apriori.rb - LA Ruby Presentation
Page 24: Apriori.rb - LA Ruby Presentation
Page 25: Apriori.rb - LA Ruby Presentation
Page 26: Apriori.rb - LA Ruby Presentation
Page 27: Apriori.rb - LA Ruby Presentation
Page 28: Apriori.rb - LA Ruby Presentation
Page 29: Apriori.rb - LA Ruby Presentation
Page 30: Apriori.rb - LA Ruby Presentation
Page 31: Apriori.rb - LA Ruby Presentation
Page 32: Apriori.rb - LA Ruby Presentation

solution:

Page 33: Apriori.rb - LA Ruby Presentation

Don’t look at all the rules (which is how Apriori works)

Page 34: Apriori.rb - LA Ruby Presentation

term:

Itemset: a combination of one or more items
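To make the definition concrete, and to show why "too many possible rules" is a real problem, here is a small sketch that enumerates every itemset for a tiny catalogue. The four item names are assumptions made up for illustration; they are not taken from the slides.

```ruby
# Hypothetical catalogue; the item names are illustrative only.
items = %w[apples bread cheese milk]

# Every non-empty combination of items is an itemset.
itemsets = (1..items.size).flat_map { |k| items.combination(k).to_a }

itemsets.first(5).each { |itemset| puts itemset.join(", ") }
puts itemsets.size # 15, i.e. 2**4 - 1
```

With n items there are 2**n - 1 itemsets, so the count doubles with every item added to the catalogue.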

Page 35: Apriori.rb - LA Ruby Presentation

examples of itemsets

Page 40: Apriori.rb - LA Ruby Presentation

Step 1) Build a prefix tree
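The figures for this step did not survive in the transcript, so the following is only a minimal sketch of the general idea of a prefix tree of itemset counters. It is written in plain Ruby and is not Borgelt's C implementation. The `PrefixNode` class, the helper names, and the baskets are all assumptions for illustration.

```ruby
# Each node counts the itemset spelled out by the path from the root.
class PrefixNode
  attr_accessor :count
  attr_reader :children

  def initialize
    @count = 0
    @children = {}
  end
end

# Walk (and create) the path for a sorted itemset, then bump its counter.
def add_itemset(root, itemset)
  node = root
  itemset.sort.each { |item| node = (node.children[item] ||= PrefixNode.new) }
  node.count += 1
end

# Return the stored count for an itemset, or 0 if its path does not exist.
def count_for(root, itemset)
  node = itemset.sort.reduce(root) { |n, item| n && n.children[item] }
  node ? node.count : 0
end

# Hypothetical baskets, one array per customer.
baskets = [
  %w[apples cheese bread],
  %w[apples bread],
  %w[apples milk],
  %w[cheese bread],
  %w[milk]
]

root = PrefixNode.new
baskets.each do |basket|
  # Count every itemset of size 1 and 2 contained in the basket.
  (1..2).each do |k|
    basket.combination(k) { |itemset| add_itemset(root, itemset) }
  end
end

puts count_for(root, %w[apples])       # 3
puts count_for(root, %w[cheese bread]) # 2
```

Itemsets that start with the same items share a path: in this sketch {apples, bread}, {apples, cheese} and {apples, milk} all hang off the single "apples" node.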

Page 41: Apriori.rb - LA Ruby Presentation

[Figures, pages 41-56: prefix tree illustrations; the only surviving slide text is the label "prefix"]

Page 57: Apriori.rb - LA Ruby Presentation

Step 2) Prune statistically insignificant rules

Page 58: Apriori.rb - LA Ruby Presentation

statistically significant

Page 59: Apriori.rb - LA Ruby Presentation

term:

Support: the percentage of transactions that a rule/itemset can be applied to
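As a worked example of this definition, the sketch below computes support over five hypothetical baskets. The data and the `support` helper are assumptions for illustration; the slides' actual grid is not recoverable from the transcript.

```ruby
require "set"

# Hypothetical baskets, one per customer.
baskets = [
  %w[apples cheese bread],
  %w[apples bread],
  %w[apples milk],
  %w[cheese bread],
  %w[milk]
].map(&:to_set)

# Fraction of baskets that contain every item of the itemset.
def support(baskets, itemset)
  wanted = itemset.to_set
  hits = baskets.count { |basket| wanted.subset?(basket) }
  hits.to_f / baskets.size
end

puts support(baskets, %w[apples])       # 0.6 (3 of 5 baskets)
puts support(baskets, %w[cheese bread]) # 0.4 (2 of 5 baskets)
```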

Page 65: Apriori.rb - LA Ruby Presentation

[Figure: purchase grid, ✓ = purchased]

3/5 = 60% Support

Page 71: Apriori.rb - LA Ruby Presentation

[Figure: purchase grid, ✓ = purchased]

2/5 = 40% Support

Page 73: Apriori.rb - LA Ruby Presentation

support of: [itemset image] = 40%

Page 76: Apriori.rb - LA Ruby Presentation

The key optimization:

Page 82: Apriori.rb - LA Ruby Presentation

support of: [itemset image] = 40%

support of: [itemset image] + [item image] <= 40%
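Adding an item to an itemset can never raise its support, so an itemset that falls below the threshold cannot have a frequent superset and need not be extended. The following is a simplified level-wise sketch of that pruning idea in plain Ruby. It does not reproduce the gem's API or Borgelt's C code, and the baskets and the 40% threshold are assumptions for illustration.

```ruby
require "set"

# Hypothetical baskets, one per customer.
baskets = [
  %w[apples cheese bread],
  %w[apples bread],
  %w[apples milk],
  %w[cheese bread],
  %w[milk]
].map(&:to_set)

min_count = 2 # 40% of the 5 baskets

# Number of baskets containing every item of the itemset.
count = ->(itemset) { baskets.count { |basket| itemset.subset?(basket) } }

items = baskets.reduce(Set.new, :|).to_a.sort
level = items.map { |item| Set[item] }.select { |s| count.call(s) >= min_count }
frequent = []

until level.empty?
  frequent.concat(level)
  size = level.first.size + 1
  # Only frequent itemsets are extended; supersets of pruned itemsets
  # are never generated from them, so they are never counted here.
  candidates = level.flat_map { |s| items.map { |item| s | [item] } }
                    .select { |c| c.size == size }
                    .uniq
  level = candidates.select { |c| count.call(c) >= min_count }
end

frequent.each do |s|
  puts "#{s.to_a.sort.join(', ')}: #{count.call(s)}/#{baskets.size}"
end
```

On this data the loop keeps four single items and two pairs, and only three of the four possible triples are ever counted.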

Page 86: Apriori.rb - LA Ruby Presentation

Step 3) Find “good” rules

Page 88: Apriori.rb - LA Ruby Presentation

need a way to calculate “goodness”

Page 90: Apriori.rb - LA Ruby Presentation

term:

Confidence: the number of cases in which the rule is correct relative to the number of cases in which it is applicable
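The definition translates directly into a ratio of two counts. The sketch below is a minimal illustration over hypothetical baskets. The `confidence` helper and the data are assumptions, not the gem's API.

```ruby
require "set"

# Hypothetical baskets, one per customer.
baskets = [
  %w[apples cheese bread],
  %w[apples bread],
  %w[apples milk],
  %w[cheese bread],
  %w[milk]
].map(&:to_set)

# Confidence of the rule antecedent -> consequent:
# baskets where the rule is correct / baskets where it is applicable.
def confidence(baskets, antecedent, consequent)
  lhs = antecedent.to_set
  both = lhs | consequent
  applicable = baskets.count { |basket| lhs.subset?(basket) }
  return 0.0 if applicable.zero?

  correct = baskets.count { |basket| both.subset?(basket) }
  correct.to_f / applicable
end

puts confidence(baskets, %w[cheese], %w[bread])  # 1.0 (2 of 2)
puts confidence(baskets, %w[apples], %w[cheese]) # 1 of 3, about 0.33
```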

Page 95: Apriori.rb - LA Ruby Presentation

[Figure: purchase grid with a rule, item image -> item image]

2/2 = 100% Confidence

Page 101: Apriori.rb - LA Ruby Presentation

[Figure: purchase grid with a second rule, item image -> item image]

1/3 = 33%

Page 102: Apriori.rb - LA Ruby Presentation

intermission

Page 103: Apriori.rb - LA Ruby Presentation

Apriori (in Ruby)

Page 104: Apriori.rb - LA Ruby Presentation

code example

Page 105: Apriori.rb - LA Ruby Presentation

available today

Page 106: Apriori.rb - LA Ruby Presentation

gem install apriori

Page 107: Apriori.rb - LA Ruby Presentation

requires: RubyGems >= 1.2.0

gem update --system
