TRANSCRIPT
DAWN - A System for Context-based Link Recommendation in Web Navigation
Sebastian Stober and Andreas Nürnberger
Institute of Language and Knowledge Engineering
Otto-von-Guericke University Magdeburg, Germany
http://irgroup.cs.uni-magdeburg.de - e-mail: {stober,nuernb}@iws.cs.uni-magdeburg.de
KES 2006 - 10th International Conference on Knowledge-Based & Intelligent Information & Engineering Systems
October 9, 2006
9 oct 2006 kes 2006 - sebastian stober : dawn 2
Outline
- Motivation
- Underlying Model
  - General Assumptions
  - Problems using Markov Models
  - Generalization
  - Graph Representation
- Recommender Algorithm
- Test Results, Work in Progress, Future Work
- Summary
"DAWN" (Direction Anticipation in Web Navigation)
Motivation
Goal: a system that provides support for navigating the World Wide Web
- by recommending outgoing links of a web page
- without confinement to a specific web site or domain
Approaches made so far include:
- a look-ahead crawler finding pages with content similar to the last (n) visited pages [Lieberman '95: Letizia]
- recommending pages (within the web site) that users with similar interests/information needs have visited [Joachims et al. '97: WebWatcher]
Idea: users may (unconsciously) develop navigational patterns for web browsing; learn these patterns and use them as a heuristic for recommendation
Modeling Navigational Patterns
Assumption: the browsing process has the Markov property, i.e.:
- the (conditional) probability for the next link chosen depends only on the last n visited web pages
- web pages visited further back in the history have no impact on the choice of the next link
Successfully applied in:
- web-cache optimization (predictive prefetching)
- web-usage mining
Problems using Markov Models
Exponential space complexity:
- for each possible sequence of n pages, a probability distribution for the next page has to be stored: complexity O(|S|^(n+1)), where S is the state space of the model
- the WWW contains too many web pages and links; the state space would become far too big to handle
- very sparse training data, even if the training data set is huge (sparsity problem)
If we don't want to reduce the order of the model, there are only two ways to reduce a model's space complexity:
- reduce the size of the state space
- find an efficient way to store the model (compression)
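The blow-up can be made concrete with a few lines (the function name is ours): an order-n model over |S| states stores one probability per (length-n history, next state) pair.

```python
def markov_param_count(num_states: int, order: int) -> int:
    """One probability per (length-n history, next state) pair,
    i.e. |S|**(n+1) parameters for an order-n model over |S| states."""
    return num_states ** (order + 1)

# even a modest 10,000-page state space explodes with the order:
for n in (1, 2, 3):
    print(n, markov_param_count(10_000, n))
```

Already at order 2 a 10,000-state model would need a trillion parameters, which is why both state-space reduction and compression are needed.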
Reducing the Size of the State Space
... by grouping similar pages into "contexts" via clustering (k-means on standard TF-IDF document representations)
[Figure: pages P1...P5 grouped into contexts C1, C2, C3]
Example:
- sequences of web pages: P1 → P3 → P5 and P1 → P2 → P4 → P5
- become sequences of contexts: C1 → C2 → C3 and C1 → C1 → C2 → C3
This:
- reduces the size of the state space S
- generalizes the model (it works on unseen pages as well)
- reduces the sparsity problem
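A minimal sketch of the grouping step, assuming a plain bag-of-words TF-IDF and a cosine k-means (the deck only says "k-means on standard TFiDF doc representations"; the helper names and toy pages below are ours):

```python
import math
from collections import Counter

def tfidf(docs):
    """Plain TF-IDF vectors (term -> weight dicts) for token lists."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    vecs = []
    for d in docs:
        tf = Counter(d)
        vecs.append({t: tf[t] / len(d) * math.log(n / df[t]) for t in tf})
    return vecs

def cos(a, b):
    """Cosine similarity of two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def kmeans(vecs, k, iters=10):
    """Cosine k-means; the first k vectors seed the centroids."""
    centroids = list(vecs[:k])
    labels = [0] * len(vecs)
    for _ in range(iters):
        # assign each page to the most similar context centroid
        labels = [max(range(k), key=lambda j: cos(v, centroids[j]))
                  for v in vecs]
        # recompute each centroid as the mean of its member vectors
        for j in range(k):
            members = [v for v, l in zip(vecs, labels) if l == j]
            if members:
                acc = Counter()
                for v in members:
                    acc.update(v)
                centroids[j] = {t: w / len(members) for t, w in acc.items()}
    return labels

# toy pages: two "music" pages and two "sports" pages
docs = ["jazz guitar music".split(), "football goal score".split(),
        "music guitar band".split(), "goal football league".split()]
labels = kmeans(tfidf(docs), 2)
```

Pages about the same topic end up in the same context label, which is exactly the page → context mapping the model works on.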
Model Representation (1st order)
A 1st-order Markov model can be represented as a weighted directed graph:
- contexts → vertices
- for all ci, cj with P(cj|ci) > 0, insert a directed edge from ci to cj with weight P(cj|ci)
Question: How can this representation be extended to higher-order models? Idea: context sequences → vertices
Note: this is not to be confused with a Markov network! Vertices here refer to states of a system, not to random variables!
[Figure: example graph over contexts c1...c5 with transition probabilities as edge weights]
Model Representation (nth order)
Idea: context sequences (n-grams) → vertices, i.e. for each sequence of contexts of length n occurring in the data, create a vertex.
Observation: vertices can be merged if their probability distributions for the next context are (nearly) identical.
[Figure: the 1st-order graph over c1...c5 transformed into a 2nd-order graph with 2-gram vertices such as c1c2, c4c1, c5c4, c1c3, c2c3, c4c3, c5c3, c2c5 and c3c5; the former vertices c3 and c5 are split into several 2-gram vertices]
Model Representation (compression)
Idea [Borges and Levene '05]: construct the graph incrementally (i.e. start with the 1st-order model and construct the nth-order model from the (n-1)th-order model)
- clone a vertex only if the nth-order probability distribution differs from the one in the (n-1)th-order model
- if a vertex needs to be cloned, merge clones with (nearly) identical probability distributions
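The clone-or-not test can be sketched as a comparison of next-context distributions. A simple L-infinity tolerance stands in here for whatever divergence criterion the actual implementation uses; the function name and tolerance are our assumptions:

```python
def needs_clone(dists, tol=0.1):
    """dists maps each in-path (n-gram) of a vertex to the next-state
    probability distribution it induces. The vertex must be cloned if
    any two in-paths induce distributions that differ by more than tol
    in some component (simple L-infinity test; tol is our assumption)."""
    states = set().union(*(d.keys() for d in dists.values()))
    ds = list(dists.values())
    return any(
        abs(ds[i].get(s, 0.0) - ds[j].get(s, 0.0)) > tol
        for i in range(len(ds))
        for j in range(i + 1, len(ds))
        for s in states
    )

# distributions induced by the in-paths of B in the deck's example:
b_dists = {("A", "B"): {"C": 0.5, "D": 0.5},
           ("E", "B"): {"C": 0.2, "D": 0.8},
           ("G", "B"): {"C": 0.4, "D": 0.6}}
print(needs_clone(b_dists))  # B needs to be cloned
```

With a looser tolerance, the nearly identical (E,B) and (G,B) distributions pass the test and their clones can stay merged, which is the second rule above.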
Overview
"DAWN" (Direction Anticipation in Web Navigation)
[Figure: system overview - what has been covered so far and what comes next]
Recommendation Algorithm
Each time the history is updated (i.e. the user clicks a link):
- map the history onto the model by finding similar paths (context sequences) and computing the path weights (minimum of the context-page similarities)
- compute the probability distribution for the next context by overlaying all similar paths (weighted)
- for each candidate page (outgoing link of the current page):
  - map it onto the model (find similar contexts)
  - recommend it if the probability for at least one of the most similar contexts exceeds a threshold θ
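The steps above can be sketched roughly as follows; the data layout and function name are our assumptions (the real system works on context sequences of length n, not just the last context):

```python
def recommend(similar_paths, trans, cand_contexts, theta=0.4):
    """similar_paths: {context path (tuple): [context-page similarity
    per history step]}; trans: {context: {next context: probability}};
    cand_contexts: contexts similar to the candidate page.
    Returns True if the candidate page should be recommended."""
    # path weight = minimum of the context-page similarities on the path
    weights = {p: min(sims) for p, sims in similar_paths.items()}
    total = sum(weights.values())
    # weighted overlay of the next-context distributions of all paths
    overlay = {}
    for path, w in weights.items():
        for nxt, prob in trans.get(path[-1], {}).items():
            overlay[nxt] = overlay.get(nxt, 0.0) + (w / total) * prob
    # recommend if any similar context of the candidate is probable enough
    return any(overlay.get(c, 0.0) >= theta for c in cand_contexts)
```

A toy call with made-up numbers: `recommend({('A','B'): [0.7, 0.9]}, {'B': {'D': 0.6, 'F': 0.4}}, {'D'})` returns True, since with a single matched path the overlay is just P(·|B) and P(D|B) = 0.6 clears θ = 0.4.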
Ex: Recommendation Algorithm
Map the history (Pt-1, Pt) onto the model (similarity threshold λ = 0.7):
- find contexts similar to Pt-1: A, C, C'
- find successors of A, C and C' that are similar to Pt: B, E
- compute the path weights (minimum of all context-page similarities):
  wA,B = min(0.7, 0.9) = 0.7
  wC,B = min(0.9, 0.9) = 0.9
  wC',E = min(0.9, 0.8) = 0.8
  wB = 0.9, wE = 0.8
[Figure: model graph over contexts A, B, C, C', D, E, F, G, H with edge frequencies/probabilities and the context-page similarities of the mapping]
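The numbers from this slide can be reproduced directly. Taking wB as the best weight among the paths ending in B is our reading of the slide, but it matches the stated values:

```python
# per-step context-page similarities of the matched paths (from the slide)
sims = {("A", "B"): (0.7, 0.9),
        ("C", "B"): (0.9, 0.9),
        ("C'", "E"): (0.9, 0.8)}

# path weight = minimum of the context-page similarities along the path
weights = {path: min(s) for path, s in sims.items()}

# per-vertex weight = best weight among the paths ending in that vertex
end_weight = {}
for (_, end), w in weights.items():
    end_weight[end] = max(end_weight.get(end, 0.0), w)

print(weights)     # wA,B = 0.7, wC,B = 0.9, wC',E = 0.8
print(end_weight)  # wB = 0.9, wE = 0.8
```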
Ex: Recommendation Algorithm
Compute the probability distribution for the next state as a weighted overlay of the probability distributions induced by the different similar paths in the model.
Processing of candidate page Y:
- identify contexts similar to Y (with similarity threshold λ = 0.7): D
- if P(D | Ct-1, Ct) ≥ θ, then Y is recommended (e.g. for θ = 0.4)
Outlook
Test results on server logs indicate that DAWN's recommendations might be useful: in about 30% of all test cases, the actually chosen link was amongst the 3 highest-ranked recommendations.
Work in progress:
- development of a browser front-end
- incorporation of further information
- history visualization
Future work: user study (usability, helpfulness)
Summary
- DAWN uses a higher-order Markov model to capture a user's browsing behavior and to predict which page the user might want to access next
- Assumption: the probability for the next link chosen depends only on the last n visited web pages
- The model size is reduced by grouping similar pages into contexts and by using a special graph representation
- Preliminary test results on server logs indicate the usefulness of the recommendations
Thank You for Your Attention
Sebastian Stober and Andreas Nürnberger http://irgroup.cs.uni-magdeburg.de
e-mail: {stober,nuernb}@iws.cs.uni-magdeburg.de
Detailed Test Results
Number of sessions, requests and unique URLs in the data used for training and evaluation: [table]
Predicted ranks of the actually followed links: [table]
The number of candidates (number of different outbound links in a web page) ranged from 1 to 607 with a mean of 10.77.
Browser Integration
Recommendations are displayed in an overlay <div> element within the web page
Browser Integration
[Screenshot: a recommendation entry showing thumbnail, favicon, page title, URI and probability]
Visual information: an impression of the page layout, and the favicon that the user might remember
Content information: page title, URI
Ranking information: the list of recommendations is sorted by the visit probability predicted by the model
Ex: Inducing 1st-order Model
Training sequences and their frequencies:
  sequence   #
  A,B,C      3
  A,B,D      2
  E,B,C      1
  E,B,D      1
  G,B,C      2
  G,B,D      1
  A,B,D,H    1
  E,B,D,H    3
  G,B,D,H    2
Edge labels denote the absolute transition frequency (in the data); S and F are artificial states for start and end.
[Figure: first construction step - partial graph with states S, A, B, C and F]
Example from: José Borges and Mark Levene. Generating Dynamic Higher-Order Markov Models in Web Usage Mining, PKDD, 2005.
Ex: Inducing 1st-order Model
[Figures: intermediate construction steps - the remaining sequences are processed one by one, adding further states and updating the edge frequencies]
Ex: 1st-order Model
[Figure: complete 1st-order graph over S, A, B, C, D, E, F, G, H; edge labels give the absolute frequency and the transition probability, e.g. B → C is labeled 6 (0.38), i.e. P(C|B) = 0.38, and B → D is labeled 10 (0.62)]
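The 1st-order model of this example can be induced in a few lines from the sequence table; the resulting probabilities match the figure (e.g. P(C|B) = 6/16 ≈ 0.38):

```python
from collections import Counter, defaultdict

# training sequences and their frequencies (table from the slides)
sequences = {("A","B","C"): 3, ("A","B","D"): 2,
             ("E","B","C"): 1, ("E","B","D"): 1,
             ("G","B","C"): 2, ("G","B","D"): 1,
             ("A","B","D","H"): 1, ("E","B","D","H"): 3,
             ("G","B","D","H"): 2}

# absolute transition frequencies, with artificial start/end states S, F
counts = defaultdict(Counter)
for seq, n in sequences.items():
    path = ("S",) + seq + ("F",)
    for a, b in zip(path, path[1:]):
        counts[a][b] += n

# normalize to transition probabilities
probs = {a: {b: c / sum(nxt.values()) for b, c in nxt.items()}
         for a, nxt in counts.items()}

print(round(probs["B"]["C"], 2), round(probs["B"]["D"], 2))  # 0.38 0.62
print(round(probs["D"]["H"], 2), round(probs["D"]["F"], 2))  # 0.6 0.4
```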
Ex: Inducing 2nd-order Model
Check the transition probability distributions induced by the different in-paths (2-grams) of B:

          C    D
  A,B    0.5  0.5
  E,B    0.2  0.8
  G,B    0.4  0.6

The distributions for E,B and G,B are similar, but the one for A,B differs - B needs to be cloned!
[Figure: the 1st-order graph with the three in-paths of B highlighted]
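The distribution table on this slide can be recomputed from the sequence counts (2-grams ending in B, names as in the figure):

```python
from collections import Counter, defaultdict

sequences = {("A","B","C"): 3, ("A","B","D"): 2,
             ("E","B","C"): 1, ("E","B","D"): 1,
             ("G","B","C"): 2, ("G","B","D"): 1,
             ("A","B","D","H"): 1, ("E","B","D","H"): 3,
             ("G","B","D","H"): 2}

# next-state frequencies induced by every 2-gram in-path
by_ingram = defaultdict(Counter)
for seq, n in sequences.items():
    for i in range(len(seq) - 2):
        by_ingram[seq[i:i+2]][seq[i+2]] += n

# normalized distributions for the in-paths of B
dist = {g: {s: c / sum(nxt.values()) for s, c in nxt.items()}
        for g, nxt in by_ingram.items() if g[1] == "B"}

print(dist[("A", "B")])  # {'C': 0.5, 'D': 0.5}
print(dist[("E", "B")])  # {'C': 0.2, 'D': 0.8}
print(dist[("G", "B")])  # {'C': 0.4, 'D': 0.6}
```

Since (E,B) and (G,B) induce similar distributions while (A,B) differs, B is cloned for histories arriving via A, as the next slide shows.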
Ex: 2nd-order Model
Insert a clone B' of B for the 2-gram "SA" and adjust all incoming and outgoing edges.
[Figure: resulting 2nd-order graph with states S, A, B, B', C, D, E, F, G, H and adjusted edge frequencies and probabilities]