TRANSCRIPT
DAWN - A System for Context-based Link Recommendation in Web Navigation
Sebastian Stober and Andreas Nürnberger
Institute of Language and Knowledge Engineering
Otto-von-Guericke University Magdeburg, Germany
http://irgroup.cs.uni-magdeburg.de - e-mail: {stober,nuernb}@iws.cs.uni-magdeburg.de
KES 2006 - 10th International Conference on Knowledge-Based & Intelligent Information & Engineering Systems
October 9, 2006
9 oct 2006 kes 2006 - sebastian stober : dawn 2
Outline
- Motivation
- Underlying Model
  - General Assumptions
  - Problems using Markov Models
  - Generalization
  - Graph Representation
- Recommender Algorithm
- Test Results, Work in Progress, Future Work
- Summary
"DAWN" (Direction Anticipation in Web Navigation)
Motivation
Goal: a system that provides support for navigating the World Wide Web
- by recommending outgoing links of a web page
- without confinement to a specific web site or domain
Approaches made so far include:
- a look-ahead crawler finding pages with content similar to the last (n) visited pages [Lieberman '95: Letizia]
- recommending pages (within the web site) that users with similar interests/information needs have visited [Joachims et al. '97: WebWatcher]
Idea: users may (unconsciously) develop navigational patterns for web browsing; learn these patterns and use them as a heuristic for recommendation
Modeling Navigational Patterns
Assumption: the browsing process has the Markov property, i.e.:
- the (conditional) probability for the next link chosen depends only on the last n visited web pages
- web pages visited further back in the history have no impact on the choice of the next link
Successfully applied in:
- web-cache optimization (predictive prefetching)
- web-usage mining
Problems using Markov Models
Exponential space complexity:
- for each possible sequence of n pages, a probability distribution for the next page has to be stored: complexity O(|S|^(n+1)), where S is the state space of the model
- the WWW contains too many web pages and links; the state space would become far too big to handle
- very sparse training data, even if the training data set is huge (sparsity problem)
If we don't want to reduce the order of the model, there are only two ways to reduce a model's space complexity:
- reduce the size of the state space
- find an efficient way to store the model (compression)
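The blow-up can be made concrete with a few lines (the function name is ours): an order-n model over |S| states stores one probability per (length-n history, next state) pair.

```python
def markov_param_count(num_states: int, order: int) -> int:
    """One probability per (length-n history, next state) pair,
    i.e. |S|**(n+1) parameters for an order-n model over |S| states."""
    return num_states ** (order + 1)

# even a modest 10,000-page state space explodes with the order:
for n in (1, 2, 3):
    print(n, markov_param_count(10_000, n))
```

Already at order 2 a 10,000-state model would need a trillion parameters, which is why both state-space reduction and compression are needed.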
Reducing the Size of the State Space
... by grouping similar pages into "contexts" via clustering (k-means on standard TF-IDF document representations)
[Figure: pages P1...P5 grouped into contexts C1, C2, C3]
Example:
- sequences of web pages: P1 → P3 → P5 and P1 → P2 → P4 → P5
- become sequences of contexts: C1 → C2 → C3 and C1 → C1 → C2 → C3
This:
- reduces the size of the state space S
- generalizes the model (it works on unseen pages as well)
- reduces the sparsity problem
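A minimal sketch of the grouping step, assuming a plain bag-of-words TF-IDF and a cosine k-means (the deck only says "k-means on standard TFiDF doc representations"; the helper names and toy pages below are ours):

```python
import math
from collections import Counter

def tfidf(docs):
    """Plain TF-IDF vectors (term -> weight dicts) for token lists."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    vecs = []
    for d in docs:
        tf = Counter(d)
        vecs.append({t: tf[t] / len(d) * math.log(n / df[t]) for t in tf})
    return vecs

def cos(a, b):
    """Cosine similarity of two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def kmeans(vecs, k, iters=10):
    """Cosine k-means; the first k vectors seed the centroids."""
    centroids = list(vecs[:k])
    labels = [0] * len(vecs)
    for _ in range(iters):
        # assign each page to the most similar context centroid
        labels = [max(range(k), key=lambda j: cos(v, centroids[j]))
                  for v in vecs]
        # recompute each centroid as the mean of its member vectors
        for j in range(k):
            members = [v for v, l in zip(vecs, labels) if l == j]
            if members:
                acc = Counter()
                for v in members:
                    acc.update(v)
                centroids[j] = {t: w / len(members) for t, w in acc.items()}
    return labels

# toy pages: two "music" pages and two "sports" pages
docs = ["jazz guitar music".split(), "football goal score".split(),
        "music guitar band".split(), "goal football league".split()]
labels = kmeans(tfidf(docs), 2)
```

Pages about the same topic end up in the same context label, which is exactly the page → context mapping the model works on.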
Model Representation (1st order)
A 1st-order Markov model can be represented as a weighted directed graph:
- contexts → vertices
- for all ci, cj with P(cj|ci) > 0, insert a directed edge from ci to cj with weight P(cj|ci)
Question: How can this representation be extended to higher-order models? Idea: context sequences → vertices
Note: this is not to be confused with a Markov network! Vertices here refer to states of a system, not to random variables!
[Figure: example graph over contexts c1...c5 with transition probabilities as edge weights]
Model Representation (nth order)
Idea: context sequences (n-grams) → vertices, i.e. for each sequence of contexts of length n occurring in the data, create a vertex.
Observation: vertices can be merged if their probability distributions for the next context are (nearly) identical.
[Figure: the 1st-order graph over c1...c5 transformed into a 2nd-order graph with 2-gram vertices such as c1c2, c4c1, c5c4, c1c3, c2c3, c4c3, c5c3, c2c5 and c3c5; the former vertices c3 and c5 are split into several 2-gram vertices]
Model Representation (compression)
Idea [Borges and Levene '05]: construct the graph incrementally (i.e. start with the 1st-order model and construct the nth-order model from the (n-1)th-order model)
- clone a vertex only if the nth-order probability distribution differs from the one in the (n-1)th-order model
- if a vertex needs to be cloned, merge clones with (nearly) identical probability distributions
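The clone-or-not test can be sketched as a comparison of next-context distributions. A simple L-infinity tolerance stands in here for whatever divergence criterion the actual implementation uses; the function name and tolerance are our assumptions:

```python
def needs_clone(dists, tol=0.1):
    """dists maps each in-path (n-gram) of a vertex to the next-state
    probability distribution it induces. The vertex must be cloned if
    any two in-paths induce distributions that differ by more than tol
    in some component (simple L-infinity test; tol is our assumption)."""
    states = set().union(*(d.keys() for d in dists.values()))
    ds = list(dists.values())
    return any(
        abs(ds[i].get(s, 0.0) - ds[j].get(s, 0.0)) > tol
        for i in range(len(ds))
        for j in range(i + 1, len(ds))
        for s in states
    )

# distributions induced by the in-paths of B in the deck's example:
b_dists = {("A", "B"): {"C": 0.5, "D": 0.5},
           ("E", "B"): {"C": 0.2, "D": 0.8},
           ("G", "B"): {"C": 0.4, "D": 0.6}}
print(needs_clone(b_dists))  # B needs to be cloned
```

With a looser tolerance, the nearly identical (E,B) and (G,B) distributions pass the test and their clones can stay merged, which is the second rule above.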
Overview
"DAWN" (Direction Anticipation in Web Navigation)
[Figure: system overview - what has been covered so far and what comes next]
Recommendation Algorithm
Each time the history is updated (i.e. the user clicks a link):
- map the history onto the model by finding similar paths (context sequences) and computing the path weights (minimum of the context-page similarities)
- compute the probability distribution for the next context by overlaying all similar paths (weighted)
- for each candidate page (outgoing link of the current page):
  - map it onto the model (find similar contexts)
  - recommend it if the probability for at least one of the most similar contexts exceeds a threshold θ
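The steps above can be sketched roughly as follows; the data layout and function name are our assumptions (the real system works on context sequences of length n, not just the last context):

```python
def recommend(similar_paths, trans, cand_contexts, theta=0.4):
    """similar_paths: {context path (tuple): [context-page similarity
    per history step]}; trans: {context: {next context: probability}};
    cand_contexts: contexts similar to the candidate page.
    Returns True if the candidate page should be recommended."""
    # path weight = minimum of the context-page similarities on the path
    weights = {p: min(sims) for p, sims in similar_paths.items()}
    total = sum(weights.values())
    # weighted overlay of the next-context distributions of all paths
    overlay = {}
    for path, w in weights.items():
        for nxt, prob in trans.get(path[-1], {}).items():
            overlay[nxt] = overlay.get(nxt, 0.0) + (w / total) * prob
    # recommend if any similar context of the candidate is probable enough
    return any(overlay.get(c, 0.0) >= theta for c in cand_contexts)
```

A toy call with made-up numbers: `recommend({('A','B'): [0.7, 0.9]}, {'B': {'D': 0.6, 'F': 0.4}}, {'D'})` returns True, since with a single matched path the overlay is just P(·|B) and P(D|B) = 0.6 clears θ = 0.4.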
Ex: Recommendation Algorithm
Map the history (Pt-1, Pt) onto the model (similarity threshold λ = 0.7):
- find contexts similar to Pt-1: A, C, C'
- find successors of A, C and C' that are similar to Pt: B, E
- compute the path weights (minimum of all context-page similarities):
  wA,B = min(0.7, 0.9) = 0.7
  wC,B = min(0.9, 0.9) = 0.9
  wC',E = min(0.9, 0.8) = 0.8
  wB = 0.9, wE = 0.8
[Figure: model graph over contexts A, B, C, C', D, E, F, G, H with edge frequencies/probabilities and the context-page similarities of the mapping]
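The numbers from this slide can be reproduced directly. Taking wB as the best weight among the paths ending in B is our reading of the slide, but it matches the stated values:

```python
# per-step context-page similarities of the matched paths (from the slide)
sims = {("A", "B"): (0.7, 0.9),
        ("C", "B"): (0.9, 0.9),
        ("C'", "E"): (0.9, 0.8)}

# path weight = minimum of the context-page similarities along the path
weights = {path: min(s) for path, s in sims.items()}

# per-vertex weight = best weight among the paths ending in that vertex
end_weight = {}
for (_, end), w in weights.items():
    end_weight[end] = max(end_weight.get(end, 0.0), w)

print(weights)     # wA,B = 0.7, wC,B = 0.9, wC',E = 0.8
print(end_weight)  # wB = 0.9, wE = 0.8
```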
Ex: Recommendation Algorithm
Compute the probability distribution for the next state as a weighted overlay of the probability distributions induced by the different similar paths in the model.
Processing of candidate page Y:
- identify contexts similar to Y (with similarity threshold λ = 0.7): D
- if P(D | Ct-1, Ct) ≥ θ, then Y is recommended (e.g. for θ = 0.4)
Outlook
Test results on server logs indicate that DAWN's recommendations might be useful: in about 30% of all test cases, the actually chosen link was amongst the 3 highest-ranked recommendations.
Work in progress:
- development of a browser front-end
- incorporation of further information
- history visualization
Future work: user study (usability, helpfulness)
Summary
- DAWN uses a higher-order Markov model to capture a user's browsing behavior and to predict which page the user might want to access next
- Assumption: the probability for the next link chosen depends only on the last n visited web pages
- The model size is reduced by grouping similar pages into contexts and by using a special graph representation
- Preliminary test results on server logs indicate the usefulness of the recommendations
Thank You for Your Attention
Sebastian Stober and Andreas Nürnberger http://irgroup.cs.uni-magdeburg.de
e-mail: {stober,nuernb}@iws.cs.uni-magdeburg.de
Detailed Test Results
Number of sessions, requests and unique URLs in the data used for training and evaluation: [table]
Predicted ranks of the actually followed links: [table]
The number of candidates (number of different outbound links in a web page) ranged from 1 to 607 with a mean of 10.77.
Browser Integration
Recommendations are displayed in an overlay <div> element within the web page
Browser Integration
[Screenshot: a recommendation entry showing thumbnail, favicon, page title, URI and probability]
Visual information: an impression of the page layout, and the favicon that the user might remember
Content information: page title, URI
Ranking information: the list of recommendations is sorted by the visit probability predicted by the model
Ex: Inducing 1st-order Model
Training sequences and their frequencies:
  sequence   #
  A,B,C      3
  A,B,D      2
  E,B,C      1
  E,B,D      1
  G,B,C      2
  G,B,D      1
  A,B,D,H    1
  E,B,D,H    3
  G,B,D,H    2
Edge labels denote the absolute transition frequency (in the data); S and F are artificial states for start and end.
[Figure: first construction step - partial graph with states S, A, B, C and F]
Example from: José Borges and Mark Levene. Generating Dynamic Higher-Order Markov Models in Web Usage Mining, PKDD, 2005.
Ex: Inducing 1st-order Model
[Figures: intermediate construction steps - the remaining sequences are processed one by one, adding further states and updating the edge frequencies]
Ex: 1st-order Model
[Figure: complete 1st-order graph over S, A, B, C, D, E, F, G, H; edge labels give the absolute frequency and the transition probability, e.g. B → C is labeled 6 (0.38), i.e. P(C|B) = 0.38, and B → D is labeled 10 (0.62)]
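The 1st-order model of this example can be induced in a few lines from the sequence table; the resulting probabilities match the figure (e.g. P(C|B) = 6/16 ≈ 0.38):

```python
from collections import Counter, defaultdict

# training sequences and their frequencies (table from the slides)
sequences = {("A","B","C"): 3, ("A","B","D"): 2,
             ("E","B","C"): 1, ("E","B","D"): 1,
             ("G","B","C"): 2, ("G","B","D"): 1,
             ("A","B","D","H"): 1, ("E","B","D","H"): 3,
             ("G","B","D","H"): 2}

# absolute transition frequencies, with artificial start/end states S, F
counts = defaultdict(Counter)
for seq, n in sequences.items():
    path = ("S",) + seq + ("F",)
    for a, b in zip(path, path[1:]):
        counts[a][b] += n

# normalize to transition probabilities
probs = {a: {b: c / sum(nxt.values()) for b, c in nxt.items()}
         for a, nxt in counts.items()}

print(round(probs["B"]["C"], 2), round(probs["B"]["D"], 2))  # 0.38 0.62
print(round(probs["D"]["H"], 2), round(probs["D"]["F"], 2))  # 0.6 0.4
```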
Ex: Inducing 2nd-order Model
Check the transition probability distributions induced by the different in-paths (2-grams) of B:

          C    D
  A,B    0.5  0.5
  E,B    0.2  0.8
  G,B    0.4  0.6

The distributions for E,B and G,B are similar, but the one for A,B differs - B needs to be cloned!
[Figure: the 1st-order graph with the three in-paths of B highlighted]
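The distribution table on this slide can be recomputed from the sequence counts (2-grams ending in B, names as in the figure):

```python
from collections import Counter, defaultdict

sequences = {("A","B","C"): 3, ("A","B","D"): 2,
             ("E","B","C"): 1, ("E","B","D"): 1,
             ("G","B","C"): 2, ("G","B","D"): 1,
             ("A","B","D","H"): 1, ("E","B","D","H"): 3,
             ("G","B","D","H"): 2}

# next-state frequencies induced by every 2-gram in-path
by_ingram = defaultdict(Counter)
for seq, n in sequences.items():
    for i in range(len(seq) - 2):
        by_ingram[seq[i:i+2]][seq[i+2]] += n

# normalized distributions for the in-paths of B
dist = {g: {s: c / sum(nxt.values()) for s, c in nxt.items()}
        for g, nxt in by_ingram.items() if g[1] == "B"}

print(dist[("A", "B")])  # {'C': 0.5, 'D': 0.5}
print(dist[("E", "B")])  # {'C': 0.2, 'D': 0.8}
print(dist[("G", "B")])  # {'C': 0.4, 'D': 0.6}
```

Since (E,B) and (G,B) induce similar distributions while (A,B) differs, B is cloned for histories arriving via A, as the next slide shows.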
Ex: 2nd-order Model
Insert a clone B' of B for the 2-gram "SA" and adjust all incoming and outgoing edges.
[Figure: resulting 2nd-order graph with states S, A, B, B', C, D, E, F, G, H and adjusted edge frequencies and probabilities]