Integration of association rules into WUM

Bastian Germershaus

Seminar Webmining, Institut für Wirtschaftsinformatik

contents

introduction
WUM web usage miner in general
association rules – a brief example
association rules – the theory
association rules in WUM
association rules in WUM – a demo

the problem

we have a large amount of data (e.g. from the server log of our website)

we would like to know if there are any rules in the behavior of our customers

we could use these rules later on to optimize our business

an example (Amazon.COM, as usual)

what Amazon.COM does

looking at their sales data

retrieving all orders by customer and current item

building rules: if a customer bought book A → he also bought books X and/or Y and/or Z

WUM – web usage miner (1)

main goal: navigation pattern discovery
  sequence of pages through the website
  typical patterns
  optimization of site navigation

three steps
  log file cleaning
  pattern analysis
  visualization

WUM – web usage miner (2)

Source: Myra Spiliopoulou: “Web Usage Mining for Web Site Evaluation” in Communications of the ACM, August 2000, Vol. 43

WUM – web usage miner (3)

special requirements
  miner should understand abstract pattern descriptions: ‘MINT’ (SQL-like query language)
  usage patterns should be more than a sequence of frequently accessed pages
  integration of statistics about the routes connecting pages frequently accessed together

WUM – web usage miner (4)

Source: Myra Spiliopoulou: “Web Usage Mining for Web Site Evaluation” in Communications of the ACM, August 2000, Vol. 43

WUM – web usage miner (5)

evaluation of discovered patterns is needed
  statistical testing
  semantic evaluation

discovered navigation patterns may help in restructuring the site
  redesigning pages, inserting links
  restructuring may confuse some users

association rules (1)

example: we sell cell-phones, gadgets and accessories

Homepage (H)
  cell-phones (C1)
    Nokia (C21)
      3110 (C211)
      8110 (C212)
    Siemens (C22)
      C35 (C221)
      S 45 (C222)
  gadgets (G1)
    Palm (G11)
      Palm III (G111)
      Palm V (G112)
    Compaq (G12)
      Ipaq (G121)
  accessories (A1)
    Nokia (A11)
      hands-free kit (A111)
    Siemens (A12)
      hands-free kit (A121)

association rules (2)

possible association rule

G121 & A111 (30) → C212 (13)

Support: 0.065 (6.5 %) Confidence: 0.433 (43.3 %)

200 different orders in database

13 of the 30 users who bought a Compaq Ipaq and a hands-free kit for Nokia phones also bought a Nokia 8110.
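
The two parameters of this rule follow directly from the three counts on the slide. A minimal sketch of the arithmetic (the variable names are chosen here for illustration and do not come from WUM):

```python
# Counts taken from the example above.
total_orders = 200       # different orders in the database
antecedent_orders = 30   # orders containing G121 and A111
rule_orders = 13         # of those, orders that also contain C212

# Support: share of all orders in which the whole rule applies.
support = rule_orders / total_orders

# Confidence: share of the antecedent orders in which the consequent also holds.
confidence = rule_orders / antecedent_orders

print(f"support    = {support:.3f} ({support:.1%})")        # 0.065 (6.5%)
print(f"confidence = {confidence:.3f} ({confidence:.1%})")  # 0.433 (43.3%)
```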

association rules (3)

sequence of pages does NOT matter

‘if – then’ condition

parameters (support, confidence)

useful rules
  apply reasonably often (support)
  are unusually reliable (confidence)
  make interesting predictions

association rules in general (1)

“if a customer came to our website through a banner and it is not his first visit then he buys an article”

this object has three attributes:
  came through banner
  at least second visit
  buys an article

association rules in general (2)

binary attributes (0 or 1; yes or no)

rules should have the form: if attribute X then attribute Y (X→Y)

the attribute sets X and Y should be disjoint (X∩Y=Ø)

association rules in general (3)

parameters for association rules:

confidence: “in 60% of the cases where attribute X is true, attribute Y is also true”

support: “in 40% of the cases where attribute X is true, attribute Y is also true; that applies to 10% of all cases in the database”
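
Both parameters can be computed by simple counting once every object is stored as the set of attributes that are true for it. The sketch below assumes that representation; the ten visits and the attribute names `banner`, `repeat` and `buys` are made-up illustration data, not results from the seminar.

```python
from typing import Iterable, List, Set


def count(transactions: List[Set[str]], items: Iterable[str]) -> int:
    """Number of objects in which all given attributes are true."""
    wanted = frozenset(items)
    return sum(1 for t in transactions if wanted <= t)


def support(transactions: List[Set[str]], x: Set[str], y: Set[str]) -> float:
    """Share of all objects in which X and Y are true together."""
    return count(transactions, x | y) / len(transactions)


def confidence(transactions: List[Set[str]], x: Set[str], y: Set[str]) -> float:
    """Share of the objects with X true in which Y is also true."""
    x_count = count(transactions, x)
    if x_count == 0:
        return 0.0  # rule never applies
    return count(transactions, x | y) / x_count


# Hypothetical visits: each set lists the attributes that are true.
visits = [
    {"banner", "repeat", "buys"},
    {"banner", "repeat", "buys"},
    {"banner", "repeat"},
    {"banner"},
    {"repeat", "buys"},
    {"repeat"},
    {"buys"},
    {"banner", "repeat", "buys"},
    set(),
    {"banner", "buys"},
]

x, y = {"banner", "repeat"}, {"buys"}
print(support(visits, x, y))     # 3 of 10 visits
print(confidence(visits, x, y))  # 3 of the 4 visits with banner and repeat
```

Counting first and dividing once keeps the two values exact for small data sets; dividing two support values would give the same result up to floating-point rounding.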

association rules in general (4)

the goal of the Apriori algorithm used here is: “find all rules for which both minimum support and minimum confidence hold”

two iterative steps
  find ‘large item sets’ with minimum support
  candidate generation and pruning

association rules in general (5)

find large item sets

support calculation for every candidate (support means the number of occurrences of the candidate relative to the total number of objects)

remove every candidate whose support is smaller than the ‘minimum support’

save the candidates with high incidence
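
One pass of this step can be sketched as follows. The function name `large_item_sets` and the five orders are assumptions made for illustration (the product codes are borrowed from the shop example); this is not the WUM implementation.

```python
from typing import Dict, FrozenSet, Iterable, List, Set


def large_item_sets(
    transactions: List[Set[str]],
    candidates: Iterable[FrozenSet[str]],
    min_support: float,
) -> Dict[FrozenSet[str], float]:
    """Keep the candidates whose support reaches min_support."""
    total = len(transactions)
    kept = {}
    for candidate in candidates:
        # Support: occurrences of the candidate relative to all objects.
        occurrences = sum(1 for t in transactions if candidate <= t)
        candidate_support = occurrences / total
        if candidate_support >= min_support:
            kept[candidate] = candidate_support
    return kept


# Hypothetical orders, using product codes from the shop example.
orders = [
    {"C212", "G121", "A111"},
    {"C212", "A111"},
    {"G121", "A111"},
    {"C212"},
    {"A111", "C221"},
]

# First pass: every single item is a candidate.
singletons = {frozenset([item]) for order in orders for item in order}
large = large_item_sets(orders, singletons, min_support=0.4)

for item_set, value in sorted(large.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(item_set), value)
```

With a minimum support of 0.4, the item C221 (one of five orders) is dropped, while A111, C212 and G121 are saved for the next step.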

association rules in general (6)

candidate generation and pruning

temporary candidates: for two sets X, Y of cardinality n that have n-1 attributes in common, build a temporary candidate X ∪ Y

pruning: eliminate every temporary candidate that contains a subset of cardinality n whose support is lower than the minimum support
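
The join and the pruning can be sketched as below. The pruning rule is read here in the usual Apriori sense: a temporary candidate survives only if every one of its subsets of cardinality n is itself a large item set. The function name `generate_candidates` and the example sets are assumptions for illustration.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, Set


def generate_candidates(large_sets: Iterable[FrozenSet[str]]) -> Set[FrozenSet[str]]:
    """Build candidates of cardinality n+1 from large item sets of cardinality n."""
    large = set(large_sets)

    # Join: two sets of cardinality n with n-1 common attributes
    # have a union of cardinality n+1.
    temporary = set()
    for x, y in combinations(large, 2):
        union = x | y
        if len(union) == len(x) + 1:
            temporary.add(union)

    # Pruning: drop a temporary candidate if any subset of cardinality n
    # is not large, because the candidate cannot reach minimum support then.
    return {
        candidate
        for candidate in temporary
        if all(
            frozenset(subset) in large
            for subset in combinations(candidate, len(candidate) - 1)
        )
    }


# Hypothetical large item sets of cardinality 2.
large_pairs = [
    frozenset(["C212", "G121"]),
    frozenset(["C212", "A111"]),
    frozenset(["G121", "A111"]),
    frozenset(["G121", "A121"]),
]

for candidate in generate_candidates(large_pairs):
    print(sorted(candidate))
```

In this example the join produces three temporary candidates, but only {A111, C212, G121} survives: the other two contain a pair ({C212, A121} or {A111, A121}) that is not among the large item sets.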

association rules in WUM (1)

association rules in WUM (2)
