Integration of association rules into WUM
Bastian Germershaus
Seminar Webmining, Institut für Wirtschaftsinformatik
contents
  introduction
  WUM (web usage miner) in general
  association rules – a brief example
  association rules – the theory
  association rules in WUM
  association rules in WUM – a demo
the problem
we have a large amount of data (e.g. from the server log of our website)
we would like to know whether there are any rules in the behavior of our customers
we could use these rules later on to optimize our business
an example (Amazon.COM, as usual)
what Amazon.COM does
  looking at their sales data
  retrieving all orders by customer and current item
  building rules:
    if a customer bought book A → he also bought books X and/or Y and/or Z
WUM - web usage miner (1)
main goal: navigation pattern discovery
  sequence of pages through the website
  typical patterns
  optimization of site navigation
three steps
  log file cleaning
  pattern analysis
  visualization
WUM – web usage miner (2)
Source: Myra Spiliopoulou: “Web Usage Mining for Web Site Evaluation” in Communications of the ACM, August 2000, Vol. 43
WUM – web usage miner (3)
special requirements
  miner should understand abstract pattern descriptions
    ‘MINT’ (SQL-like query language)
  usage patterns should be more than a sequence of frequently accessed pages
    integration of statistics about the routes connecting pages frequently accessed together
WUM – web usage miner (4)
Source: Myra Spiliopoulou: “Web Usage Mining for Web Site Evaluation” in Communications of the ACM, August 2000, Vol. 43
WUM – web usage miner (5)
evaluation of discovered patterns is needed
  statistical testing
  semantic evaluation
discovered navigation patterns may help restructuring the site
  redesign pages, inserting links
  restructuring may confuse some users
association rules (1)
example: we sell cell-phones, gadgets and accessories
Homepage (H)
  cell-phones (C1)
    Nokia (C21)
      3110 (C211)
      8110 (C212)
    Siemens (C22)
      C35 (C221)
      S 45 (C222)
  gadgets (G1)
    Palm (G11)
      Palm III (G111)
      Palm V (G112)
    Compaq (G12)
      Ipaq (G121)
  accessories (A1)
    Nokia (A11)
      hands-free kit (A111)
    Siemens (A12)
      hands-free kit (A121)
association rules (2)
possible association rule
  G121 & A111 (30) → C212 (13)
200 different orders in database
Support: 13/200 = 0.065 (6.5 %)
Confidence: 13/30 = 0.433 (43.3 %)
13 of the 30 users who bought a Compaq Ipaq and a hands-free kit for Nokia phones also bought a Nokia 8110.
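
The two parameters of this rule can be recomputed from the three counts on the slide. The following is a minimal sketch; the function and variable names are illustrative assumptions, not part of WUM.

```python
def support(count_rule: int, total: int) -> float:
    """Share of all orders that contain every item of the rule."""
    return count_rule / total


def confidence(count_rule: int, count_premise: int) -> float:
    """Share of the orders matching the premise that also match the conclusion."""
    return count_rule / count_premise


# Counts taken from the slide; the variable names are hypothetical.
total_orders = 200    # all orders in the database
premise_orders = 30   # orders containing G121 and A111
rule_orders = 13      # orders containing G121, A111 and C212

print(f"support    = {support(rule_orders, total_orders):.3f}")       # 0.065
print(f"confidence = {confidence(rule_orders, premise_orders):.3f}")  # 0.433
```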
association rules (3)
sequence of pages does NOT matter
‘if – then – condition’
parameters (support, confidence)
useful rules
  apply reasonably often (support)
  are unusually reliable (confidence)
  make interesting predictions
association rules in general (1)
“if a customer came to our website through a banner and it is not his first visit then he buys an article”
this object has three attributes:
  came through banner
  at least second visit
  buys an article
association rules in general (2)
binary attributes (0 or 1; yes or no)
rules should have the form
  if attribute X then attribute Y (X→Y)
X and Y should be disjoint (X∩Y=Ø)
association rules in general (3)
parameters for association rules:
  confidence
    “in 60% of the cases where attribute X is true, attribute Y is also true”
  support
    “in 40% of the cases where attribute X is true, attribute Y is also true; that applies to 10% of all cases in the database”
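
As a rough illustration of both parameters, the sketch below evaluates the banner rule on invented data. The 20 visits and the attribute names are assumptions, chosen only so that the result matches the "40% / 10%" wording of the support example.

```python
# Hypothetical rule: came through banner AND at least second visit -> buys an article.
PREMISE = frozenset({"banner", "repeat_visit"})
CONCLUSION = frozenset({"buys"})

# Hypothetical data: 20 visits, each given as the set of attributes that are true.
visits = (
    [{"banner", "repeat_visit", "buys"}] * 2
    + [{"banner", "repeat_visit"}] * 3
    + [{"banner"}] * 5
    + [{"repeat_visit", "buys"}] * 4
    + [set()] * 6
)


def rule_stats(objects, premise, conclusion):
    """Return (support, confidence) of the rule premise -> conclusion."""
    if premise & conclusion:
        raise ValueError("premise and conclusion must be disjoint")
    n_premise = sum(1 for o in objects if premise <= o)
    n_both = sum(1 for o in objects if (premise | conclusion) <= o)
    support = n_both / len(objects)
    confidence = n_both / n_premise if n_premise else 0.0
    return support, confidence


support, confidence = rule_stats(visits, PREMISE, CONCLUSION)
print(support, confidence)  # 0.1 0.4
```

Here the premise holds for 5 of the 20 visits and the whole rule for 2 of them, which gives a confidence of 2/5 and a support of 2/20.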
association rules in general (4)
the goal of the used Apriori algorithm is:
  “find all rules where minimum support and minimum confidence holds true”
two iterative steps
  find ‘large item sets’ with minimum support
  candidate-generating and pruning
association rules in general (5)
find large item sets
  support-calculation for every candidate (support means the number of occurrences of the candidate in relation to the total number of objects)
  remove every candidate with smaller support than ‘minimum support’
  save candidates with high incidence
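
The filtering step above might look roughly like this. It is a sketch, not WUM's implementation: the five orders, the helper names and the threshold of 0.4 are made up for illustration, and the item codes are borrowed from the shop example.

```python
def support_of(candidate, objects):
    """Occurrences of the candidate in relation to the total number of objects."""
    return sum(1 for o in objects if candidate <= o) / len(objects)


def keep_large(candidates, objects, min_support):
    """Keep only candidates whose support reaches min_support."""
    large = {}
    for c in candidates:
        s = support_of(c, objects)
        if s >= min_support:
            large[c] = s
    return large


# Hypothetical orders, each a set of item codes.
orders = [
    {"C212", "G121", "A111"},
    {"C212", "A111"},
    {"G121", "A111"},
    {"C211"},
    {"C212", "G121"},
]

# First round: every single item is a candidate of cardinality 1.
candidates = [frozenset({item}) for item in sorted(set().union(*orders))]
large_1 = keep_large(candidates, orders, min_support=0.4)

for item_set, s in large_1.items():
    print(sorted(item_set), s)
```

With these orders C211 occurs only once (support 0.2) and is removed, while A111, C212 and G121 each reach 0.6 and are saved.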
association rules in general (6)
candidate-generating and pruning
  temporary candidates: for two sets X, Y of cardinality n, which have n-1 attributes in common, build a temporary candidate X ∪ Y
  pruning: eliminate every temporary candidate that contains a subset of cardinality n whose support is lower than min. support
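
A possible reading of these two steps is sketched below. The pruning shown is the usual Apriori subset check (a temporary candidate survives only if all of its subsets of cardinality n are large), which is an assumption about what the slide intends. The item sets and function names are hypothetical.

```python
from itertools import combinations


def generate_candidates(large_sets):
    """Unite every two large sets of cardinality n that share n-1 attributes."""
    candidates = set()
    for x, y in combinations(set(large_sets), 2):
        union = x | y
        if len(union) == len(x) + 1:  # x and y differ in exactly one attribute
            candidates.add(union)
    return candidates


def prune(candidates, large_sets):
    """Drop candidates that have a subset of cardinality n which is not large."""
    large_sets = set(large_sets)
    return {
        c
        for c in candidates
        if all(frozenset(sub) in large_sets for sub in combinations(c, len(c) - 1))
    }


# Hypothetical large item sets of cardinality 2.
large_2 = {frozenset(p) for p in [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D")]}

temporary = generate_candidates(large_2)
survivors = prune(temporary, large_2)

print(sorted(sorted(c) for c in temporary))  # [['A', 'B', 'C'], ['A', 'B', 'D'], ['B', 'C', 'D']]
print(sorted(sorted(c) for c in survivors))  # [['A', 'B', 'C']]
```

{A, B, D} and {B, C, D} are eliminated because {A, D} and {C, D} are not among the large sets, so neither candidate can reach minimum support.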
association rules in WUM (1)
association rules in WUM (2)