Adaptive Web Navigation for Wireless Devices
Corin Anderson
Pedro Domingos
Dan Weld
2
Web personalization
• Web sites designed “one size fits all”
• But one size does not fit all
– Visitors will use the site in unforeseen ways
– Browser may be resource-constrained
– Unintuitive or impossible to use
• Personalization adapts and customizes content for each visitor
• In most need: mobile web visitors
3
c|net on the desktop
• 1024x768 screen
– Multi-column okay
– Many links easily visible
• Fast network
– Hierarchical link structure okay
– Large images, HTML pages
4
c|net on a wireless Palm
• Very small screen
– Only a few lines of text
– Lots of scrolling
• Slow net connectivity
– Following links is costly
Challenge: automatically improve PC-centric web site for mobile browsing
5
Web site personalizers
• An intermediary between server and visitor
• Adapts and customizes site for each visitor
• Personalizations include:
– Add shortcut links to shorten long paths
– Rearrange content to increase salience
– Elide content, replacing it with a link
[Diagram: Visitor ↔ Personalizer ↔ Web server]
6
Old and new
• 590AI – Autumn 2000
– Proteus
– Shortcut links and content elision
– Key ideas: expected utility and search
• 590AI – Spring 2001
– MinPath
– Shortcut links
– Key ideas: clustering and predictive models
7
Trails
• A trail is a sequence of page requests…
• …coherent in time…
• …and coherent in space
8
Shortcuts
• Connect two previously unconnected pages
• Savings of a shortcut is the number of links skipped
– Don’t forget the link you still follow: the shortcut itself!
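The savings rule above can be illustrated with a small sketch. The function name `shortcut_savings` and the example trail are hypothetical, chosen only for illustration; the arithmetic follows the slide: links skipped, minus the one link spent on the shortcut itself.

```python
def shortcut_savings(trail, src_index, dst_index):
    """Links saved by a shortcut from trail[src_index] to trail[dst_index].

    Without the shortcut the visitor follows (dst_index - src_index) links.
    With it, the visitor still follows one link: the shortcut itself.
    """
    if not 0 <= src_index < dst_index < len(trail):
        raise ValueError("need 0 <= src_index < dst_index < len(trail)")
    links_without_shortcut = dst_index - src_index
    links_with_shortcut = 1
    return links_without_shortcut - links_with_shortcut


# Hypothetical trail of five pages (four links end to end).
trail = ["home", "research", "ai", "papers", "minpath.pdf"]
print(shortcut_savings(trail, 0, 4))  # 4 links become 1, so 3 are saved
```

A "shortcut" between adjacent pages saves nothing under this rule, which matches the slide's point that shortcuts connect previously unconnected pages.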
9
Shortcut link selection problem
• Given:
– visitor $V$
– trail prefix $\langle p_0, \ldots, p_i \rangle$
– maximum number of shortcuts $m$
• Output:
– list of shortcuts $p_i \rightarrow q_1, \ldots, p_i \rightarrow q_m$ that minimize the number of links the visitor must follow to reach the visitor’s destination
10
Finding shortcuts
• If we know the whole trail $\langle p_0, \ldots, p_i, \ldots, p_n \rangle$
• finding the right shortcut is easy: $p_i \rightarrow p_n$
• Unfortunately, omniscience is hard to come by
11
Expectation
• All we really know is the prefix
• We must guess the rest of the trail
• Idea:
– For each suffix of the trail on the site
    • Calculate the probability of that suffix
    • Add that probability to the shortcut to the end of that suffix
– Return top m shortcuts
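A minimal sketch of this enumeration follows, under several assumptions that are not in the slides: the model is a plain dictionary `next_probs` mapping a page to next-page probabilities, suffixes are cut off at `max_depth` links or when their probability drops below `min_prob`, and each suffix's probability is weighted by the links its shortcut would save (so the score is an expected savings). All names here are hypothetical.

```python
from collections import defaultdict


def rank_shortcuts(current, next_probs, m, max_depth=4, min_prob=1e-4):
    """Return the top-m (destination, score) pairs for shortcuts from `current`.

    next_probs: dict page -> {next_page: probability}; a stand-in for
    whatever predictive model is in use (assumption).
    """
    scores = defaultdict(float)

    def explore(page, prob, depth):
        # `page` ends a suffix that is `depth` links long with probability `prob`.
        if depth >= 2:
            # A shortcut replaces `depth` links with one (assumed weighting).
            scores[page] += prob * (depth - 1)
        if depth == max_depth:
            return
        for nxt, p in next_probs.get(page, {}).items():
            if prob * p >= min_prob:
                explore(nxt, prob * p, depth + 1)

    explore(current, 1.0, 0)
    scores.pop(current, None)  # a shortcut back to the current page is useless
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:m]


# Hypothetical toy model of a small site.
next_probs = {
    "home": {"research": 0.6, "people": 0.4},
    "research": {"ai": 0.5, "systems": 0.5},
    "ai": {"papers": 1.0},
    "people": {"faculty": 1.0},
}
ranked = rank_shortcuts("home", next_probs, m=2, max_depth=3)
print([destination for destination, _ in ranked])  # papers first, then faculty
```

The depth and probability cutoffs are there only to keep the enumeration finite on a cyclic site graph; the sketch does not filter out destinations that the current page already links to directly.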
12
Predictive models
• We don’t really guess the suffix – we try all of them on the site
• We calculate each probability using a model of behavior
• [[ predict the next request given the past requests, the position in the trail, and the visitor’s identity ]]
• We’ve tried a number of variations…
13
Unconditional model
• Ignore visitor identity, trail prefix, position
• [[equation 1 from paper]]
• Of course, the visitor is bound to the links actually on the page; MinPath thus uses:
• [[next equation from paper]]
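The slide leaves both equations as placeholders, so the sketch below is only one plausible reading and not the paper's formula: estimate each page's probability from its overall request frequency, then renormalize over the links actually present on the current page. The function names and the uniform fallback are assumptions.

```python
from collections import Counter


def train_unconditional(trails):
    """Estimate P(page) from request frequencies, ignoring all context."""
    counts = Counter(page for trail in trails for page in trail)
    total = sum(counts.values())
    return {page: count / total for page, count in counts.items()}


def next_request_probs(model, links_on_page):
    """Restrict the unconditional model to the links on the current page."""
    mass = sum(model.get(q, 0.0) for q in links_on_page)
    if mass == 0:
        # Assumed fallback: no training data for any link, so guess uniformly.
        return {q: 1.0 / len(links_on_page) for q in links_on_page}
    return {q: model.get(q, 0.0) / mass for q in links_on_page}


# Hypothetical training trails.
trails = [["a", "b", "c"], ["a", "c"], ["b", "c"]]
model = train_unconditional(trails)
probs = next_request_probs(model, ["b", "c"])
print(probs)
```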
14
Naïve Bayes mixture
• Unconditional builds one model for everyone
• Intuitively, we suspect that not everyone behaves the same
• Cluster visitors, and condition prediction on cluster identity
15
Clustering
• Use Expectation-Maximization
– Simultaneously cluster, build models
• Cluster sequences, not visitors
• …
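As an illustration of clustering sequences with EM, here is a minimal sketch of a mixture of unconditional (page-frequency) models fitted over trails. It is an assumption-laden toy, not the MinPath implementation: the function name `em_cluster`, the random soft initialization, the additive `smoothing`, and the fixed iteration count are all choices made for this example.

```python
import math
import random


def em_cluster(trails, k, iterations=20, seed=0, smoothing=0.1):
    """Fit k clusters; returns (priors, per-cluster page models, responsibilities)."""
    rng = random.Random(seed)
    pages = sorted({page for trail in trails for page in trail})
    # Start from random soft assignments of trails to clusters.
    resp = []
    for _ in trails:
        row = [rng.random() + 1e-3 for _ in range(k)]
        total = sum(row)
        resp.append([value / total for value in row])
    for _ in range(iterations):
        # M-step: re-estimate cluster priors and page distributions.
        priors = [sum(r[c] for r in resp) / len(trails) for c in range(k)]
        models = []
        for c in range(k):
            counts = {page: smoothing for page in pages}
            for trail, r in zip(trails, resp):
                for page in trail:
                    counts[page] += r[c]
            total = sum(counts.values())
            models.append({page: v / total for page, v in counts.items()})
        # E-step: recompute each trail's cluster responsibilities in log space.
        resp = []
        for trail in trails:
            logs = [
                math.log(max(priors[c], 1e-12))
                + sum(math.log(models[c][page]) for page in trail)
                for c in range(k)
            ]
            top = max(logs)
            weights = [math.exp(value - top) for value in logs]
            total = sum(weights)
            resp.append([w / total for w in weights])
    return priors, models, resp


# Hypothetical trails from two kinds of visits.
trails = [
    ["courses", "cse142", "homework"],
    ["courses", "homework", "cse142"],
    ["cse142", "courses", "homework"],
    ["research", "ai", "papers"],
    ["research", "papers", "ai"],
    ["ai", "research", "papers"],
]
priors, models, resp = em_cluster(trails, k=2)
labels = [max(range(2), key=lambda c: r[c]) for r in resp]
print(labels)
```

Each trail, rather than each visitor, receives its own cluster responsibilities, which is how the sketch reflects "cluster sequences, not visitors".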
16
Clusters of unconditional models
17
Markov models
• Condition on past history
• First order: one step of history (the previous request)
• [[ equation ]]
• Also build mixtures of Markov models
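Since the equation is elided on the slide, the following is only a sketch of a standard first-order Markov estimate: the probability of the next request given the previous one, computed from transition counts in the training trails. The function name and the example data are hypothetical.

```python
from collections import Counter, defaultdict


def train_first_order(trails):
    """Estimate P(next page | previous page) from observed transitions."""
    transitions = defaultdict(Counter)
    for trail in trails:
        for prev, nxt in zip(trail, trail[1:]):
            transitions[prev][nxt] += 1
    model = {}
    for prev, counter in transitions.items():
        total = sum(counter.values())
        model[prev] = {nxt: count / total for nxt, count in counter.items()}
    return model


# Hypothetical training trails.
trails = [["a", "b", "c"], ["a", "b", "d"], ["a", "c"]]
markov = train_first_order(trails)
print(markov["a"])
print(markov["b"])
```

A mixture of Markov models would keep one such transition table per cluster; that extension is not shown here.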
18
Request position
• Training data suggest ordinal position important
• We’ll try adding position to unconditional and Markov models
19
Experiments
• Train models on 20 days of logs (35,000 trails)
• Test on 1.5 days (2,500 trails)
• All trails have length >= 3 (3 pages, 2 links)
• Measure performance as # links followed to reach destination
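The performance measure can be sketched as a replay of each test trail. The sketch assumes, beyond what the slide states, that the visitor always takes the offered shortcut leading farthest along the trail and otherwise follows the next ordinary link; `links_followed` and `suggest` are hypothetical names.

```python
def links_followed(trail, suggest):
    """Count the links needed to reach the end of `trail`.

    suggest(prefix) returns the shortcut destinations offered after the
    visitor has seen `prefix` (a list of pages ending at the current page).
    """
    position = 0
    links = 0
    while position < len(trail) - 1:
        offered = set(suggest(trail[: position + 1]))
        # Later trail positions that an offered shortcut reaches directly.
        jumps = [j for j in range(position + 1, len(trail)) if trail[j] in offered]
        # Assumed visitor behavior: take the farthest useful shortcut.
        position = max(jumps) if jumps else position + 1
        links += 1
    return links


# Hypothetical test trail: 4 pages, 3 links without any shortcuts.
trail = ["home", "research", "ai", "papers"]
print(links_followed(trail, lambda prefix: []))
print(links_followed(trail, lambda prefix: ["papers"] if prefix[-1] == "home" else []))
```

Comparing the count with and without shortcuts on each test trail gives the links saved for that trail.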
20
Results
• Compare different models
• Compare different cluster sizes
• Compare cluster assignment method
• MinPath does save navigational effort
• Markov models offer most savings
• Clustering helps
21
[Figure 1 from IJCAI paper]
22
[Figure 2 from IJCAI paper]
23
Clusters at www.cs
Anecdotal clusters from www.cs data
24
Summary
• Shortcuts for navigation
• cluster
• model
• save 40%
25
Ongoing work
• Proteus, MinPath feed on HTML and web graph
• But many sites are built from databases and templates
• What adaptations are possible given a declarative model of the site?