Ed H. Chi IMA Digital Library Workshop 2001-02-23
Ed H. Chi
www.geekbiker.com
U of Minnesota
Ph.D.: Visualization Spreadsheets
M.S.: Computational Biology
Expertise: InfoVis, Study of the Web, TaeKwonDo, Poetry, Motorcycling, Pottery

Information Scent
Modeling User Browsing Strategies on the Web
Ed H. Chi, Peter Pirolli
User Interface Research Group
This research was supported in part by Office of Naval Research contract number 'N00014-96-C-007'.

Comparison to Library
• Experience tells us:
  • general layout of content
    – which floor, which section.
  • which books are of greatest interest
    – by the wear on the spines.
  • which information is timely or deadwood
    – by looking at the circulation check-out stamps inside the book covers.

Trends and Problems
• 200M Web users, 6M web sites
• Web design process ad-hoc, not optimal
• Some tools extract behaviors and correlations but not intentionally
• Being successful requires making the Web more useful and usable to a broader audience

Information Foraging
[Figure: plot with axes labeled "Amount of Accessible Knowledge" and "Cost [Time]"]

Underlying Concept
• Users seeking information follow strategies similar to the optimization strategies of hunter-gatherers.

Underlying Concept
• Information Scent is the user's perception of the cost and value of information.
  – Similar to hunters following animal footprints.

[Figure: surfers flowing through a Web site from Enter to Exit, with pages labeled p1, p2, p3]
(a) Users enter a website at various pages and begin surfing
(b) Continuing surfers distribute themselves down various paths
(c) Surfers arrive at pages having traveled different paths
(d) After some number of page visits surfers leave the website

Information Scent
Users forage by surfing along links
Foragers use proximal cues (text snippets or graphics) to access distal content (destination page)
Scent is the proximal perception of value and cost of distal content
[Figure labels: snippet, link, content]

Assumptions
Users have information goals, and their surfing patterns are guided by information scent.
Two questions:
– Given an information goal and a starting point, where do users go? (Behavior)
– Given some surfing pattern, what is the user's goal? (Need)

WUFIS: Web User Flow by Information Scent
[Diagram labels: user information goal; Web site (Web page content, links); Web user flow simulation; predicted paths]

How does it work?
Start users at page with some goal
Flow users through the network
Examine user patterns
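
The three steps above can be sketched as a simple flow simulation. This is a minimal illustration, not the authors' implementation: the function name `flow_users`, the damping factor `alpha`, and the 3-page site with its link probabilities are all hypothetical. It assumes each column of the matrix holds the fractions of users on one page who follow each outgoing link.

```python
def flow_users(scent, start, steps, alpha=1.0):
    """Flow users through a site for a fixed number of clicks.

    scent[to][frm] is the fraction of users on page `frm` who move to
    page `to` (hypothetical values). alpha is an assumed damping factor:
    the share of users who keep surfing after each click.
    Returns (users per page after the last step, cumulative visits).
    """
    n = len(scent)
    current = list(start)
    visits = list(start)  # cumulative visits per page, including the start
    for _ in range(steps):
        current = [
            alpha * sum(scent[to][frm] * current[frm] for frm in range(n))
            for to in range(n)
        ]
        visits = [v + c for v, c in zip(visits, current)]
    return current, visits


# Hypothetical site: page 0 links to pages 1 and 2, page 1 links to page 2.
scent = [
    [0.0, 0.0, 0.0],
    [0.3, 0.0, 0.0],
    [0.7, 1.0, 0.0],
]

# Start 100 users at page 0 and let them click twice.
current, visits = flow_users(scent, start=[100.0, 0.0, 0.0], steps=2)
print("users per page after 2 clicks:", current)
print("cumulative visits per page:", visits)
```

Examining `visits` afterwards corresponds to the last step on the slide: pages with high cumulative counts are the ones the simulated users reach most often.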
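
WUFIS Algorithm
Step 1: R = W^T Q
W = Weight Matrix (rows: word, columns: document)
1 0 0 0 0 0 0
0 0 1 0 0 0 0
0 0 0 0 1 0 0
0 1 0 0 0 0 0
0 0 0 0 0 1 1
0 0 0 0 0 0 1
0 0 0 1 0 0 0
0 1 0 1 1 1 0
Q = Query (one entry per word): (0, 0, 0, 0, 0, 0, 1, 1)
R = Relevant Documents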
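
As a worked example of step 1, the sketch below multiplies the transposed weight matrix by the query vector, using the 8 x 7 matrix and the query shown on the slide. Reading the garbled equation as R = W^T Q is an assumption, as is the helper name `relevant_documents`. The resulting vector is computed here and does not appear on the slide.

```python
# Word x document weight matrix W from the slide (8 words, 7 documents).
W = [
    [1, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 1, 0, 1, 1, 1, 0],
]

# Query vector Q from the slide: the user's goal mentions words 7 and 8.
Q = [0, 0, 0, 0, 0, 0, 1, 1]


def relevant_documents(weights, query):
    """Compute R = W^T Q: one relevance score per document (assumed reading)."""
    n_words = len(weights)
    n_docs = len(weights[0])
    return [
        sum(weights[w][d] * query[w] for w in range(n_words))
        for d in range(n_docs)
    ]


R = relevant_documents(W, Q)
print(R)  # [0, 1, 0, 2, 1, 1, 0]
```

Document 4 scores highest because it contains both query words.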

WUFIS Algorithm (cont.)
R = Relevant documents
T = Topology matrix
Step 2: Scent Matrix S, derived from T and R (columns: from, rows: to)
0      0.269  0.269  0      0      0      0
1      0      0.731  0      0      0.212  0
0      0      0      0      0      0      0
0      0      0      0      1      0.576  0
0      0      0      0      0      0.212  0
0      0.731  0      0      0      0      1
0      0      0      0      0      0      0
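
The slides do not state how the scent matrix is computed from T and R. The sketch below is one reading that happens to reproduce the nonzero values shown on the slide (0.269, 0.731, 0.212, 0.576): each link is weighted by the exponential of the destination's relevance and each "from" column is normalized to sum to 1. That weighting is inferred from the numbers and should be treated as an assumption. `scent_matrix` is a hypothetical helper. T is the 7 x 7 topology matrix shown on the IUNIS slide, and R is the vector obtained from W^T Q with the example weight matrix and query.

```python
import math

# Topology matrix from the slides, read as T[to][frm] = 1 when page `frm`
# links to page `to` (this orientation is an assumption).
T = [
    [0, 1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0],
]

# Relevance of each document to the query (W^T Q for the slide's example).
R = [0, 1, 0, 2, 1, 1, 0]


def scent_matrix(topology, relevance):
    """Column-normalized scent: S[to][frm] ~ exp(relevance[to]) over links.

    The exponential weighting is inferred from the slide's numbers,
    not taken from a stated formula.
    """
    n = len(topology)
    scent = [[0.0] * n for _ in range(n)]
    for frm in range(n):
        total = sum(topology[to][frm] * math.exp(relevance[to]) for to in range(n))
        if total == 0:
            continue  # page with no outgoing links
        for to in range(n):
            scent[to][frm] = topology[to][frm] * math.exp(relevance[to]) / total
    return scent


S = scent_matrix(T, R)
for row in S:
    print(" ".join(f"{value:5.3f}" for value in row))
```

Under this reading, a user on a page is more likely to follow links whose destinations are more relevant to the query.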

Prelim. Evaluation of WUFIS
Show that WUFIS generates good URL destinations based on information need.
19 Websites
– Size: 27-12,000 pages
– Info Provider, eCommerce, Large Corp.
– Info Need from very general (product info) to very specific (migraine headaches)
The top ten simulated URL positions are extracted.
Each URL is blindly rated for relevancy.

WUFIS Evaluation
570 ratings are collected = 3 variations of the algorithm x 10 URLs x 19 sites
Tabulated, Averaged.
Result = 7.54 (out of 10)
[Table: 19 Websites; columns: Website Info, Algorithm Performance]

IUNIS: Inferring User Need by Info Scent
[Diagram labels: user information goal; Web site (Web page content, links); Web user flow simulation; observed paths]

Extracting Paths
Longest Repeating Sequence (LRS)
– New path mining technique
– Extracts significant surfing paths
– Reduces the complexity of path model
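
The slides do not define LRS precisely. The sketch below assumes one common reading: a sequence of consecutive page visits is kept if it occurs at least `threshold` times across sessions and at least one of its occurrences cannot be extended into a longer repeating sequence. The function name, the `threshold` and `min_len` parameters, and the example sessions are all hypothetical.

```python
from collections import Counter


def longest_repeating_sequences(sessions, threshold=2, min_len=2):
    """Return the set of longest repeating sequences (assumed definition)."""
    # Count every run of consecutive pages in every session.
    counts = Counter()
    for session in sessions:
        for i in range(len(session)):
            for j in range(i + 1, len(session) + 1):
                counts[tuple(session[i:j])] += 1

    def repeats(seq):
        return counts[tuple(seq)] >= threshold

    result = set()
    for session in sessions:
        for i in range(len(session)):
            for j in range(i + min_len, len(session) + 1):
                if not repeats(session[i:j]):
                    continue
                # This occurrence is "longest" if neither one-step
                # extension (left or right) is itself repeating.
                extends_left = i > 0 and repeats(session[i - 1:j])
                extends_right = j < len(session) and repeats(session[i:j + 1])
                if not extends_left and not extends_right:
                    result.add(tuple(session[i:j]))
    return result


# Hypothetical click paths through pages A, B, C, D.
sessions = [["A", "B", "C"], ["A", "B", "D"], ["A", "B", "C"]]
lrs = longest_repeating_sequences(sessions)
print(sorted(lrs))  # [('A', 'B'), ('A', 'B', 'C')]
```

Here A-B-C repeats, and A-B is also kept because in the second session it cannot be extended into a repeating path. B-C is dropped because every occurrence sits inside A-B-C, which is how the technique reduces the number of paths to model.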
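
IUNIS
P = observed user path
T = topology matrix
W = word x document weights
K = relevant keywords
Step 1: Topology Path (P, T)
(0, 2, 0, 2, 2, 1, 0) = T x (0, 0, 0, 3, 0, 2, 1)
T (columns: from, rows: to)
0 1 1 0 0 0 0
1 0 1 0 0 1 0
0 0 0 0 0 0 0
0 0 0 0 1 1 0
0 0 0 0 0 1 0
0 1 0 0 0 0 1
0 0 0 0 0 0 0
Step 2: Weight Path, K = W P
P = (0, 2, 0, 2, 2, 1, 0)
W (rows: word, columns: document)
1 0 0 0 0 0 0
0 0 1 0 0 0 0
0 0 0 0 1 0 0
0 1 0 0 0 0 0
0 0 0 0 0 1 1
0 0 0 0 0 0 1
0 0 0 1 0 0 0
0 1 0 1 1 1 0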
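
The two IUNIS steps can be sketched with the matrices from the slide. Treating (0, 0, 0, 3, 0, 2, 1) as the observed path and (0, 2, 0, 2, 2, 1, 0) as the result of applying T is an assumption based on the arithmetic (the second vector equals T times the first). The keyword vector K is computed by this sketch and is not shown on the slide. `matvec` is a hypothetical helper.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


# Topology matrix T from the slide (7 pages).
T = [
    [0, 1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0],
]

# Word x document weight matrix W from the slide (8 words, 7 documents).
W = [
    [1, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 1, 0, 1, 1, 1, 0],
]

# Assumed observed path vector (visit counts per page).
observed_path = [0, 0, 0, 3, 0, 2, 1]

# Step 1 (Topology Path): spread the path through the link structure.
topology_path = matvec(T, observed_path)

# Step 2 (Weight Path): K = W P, one score per keyword.
keywords = matvec(W, topology_path)

print(topology_path)  # [0, 2, 0, 2, 2, 1, 0]
print(keywords)       # [0, 0, 2, 2, 1, 0, 2, 7]
```

The highest-scoring entries of `keywords` would then serve as the summary of the user's inferred need. In this toy example word 8 dominates because it appears in most of the pages along the path.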

Evaluation of IUNIS
Goal:
Show that keyword summaries produced by IUNIS are good at communicating the content of the user paths.
Dataset:
– 8 participants
– 10 random paths from (5/18/1998, xerox.com, path length=6)
– booklets of pages on paths (in order)

Evaluation of IUNIS
Procedure:
– A single rating sheet lists the ten 20-word summaries. Beside each summary, users are asked to rate it on a 5-point Likert scale. A copy of this rating sheet is attached to each of the ten path booklets.
– Users are asked to read through each booklet and rate each of the path summaries.
– Users are also asked to identify which of the ten summaries was the best match.

Evaluation of IUNIS
Results:
– Matching summary mean = 4.58 (median=5)
– Non-matching summary mean = 1.97 (median=1)
– Difference highly significant (p < .001)
– Best match summary: 5.6 out of 10 (Cohen Kappa=0.51)
The evaluation yields strong evidence that IUNIS generates good summaries of the Web paths.

ScentViz Tasks
Overall site
– High-level traffic flow and routes?
– Ease of access and costs?
Given a specific Web page
– Where do users come from?
– Where do they go?
– What other pages are related?
Users
– What are interests of the users?
– Where should they go based on their need?
– Do observed data match simulation?

Visualization Demo
Dome Tree
Usage Based Layout
Path Embedding

Scenario 1: Page Types
Multi-way branching point
investor/sitemap.htm

Scenario 1: Drill-down
Few well-traveled future paths
– shareholder info
– 1998 fact book
– financial doc order
Conclusion: good local sitemap

Scenario 2: Well-traveled
Related information all over the site
One well-worn path on the left relating to product tutorial
Scansoft/tbpro98win/index.htm

Scenario 3: Identify Need
Need of path from shareinfo to orderdoc
Keywords: reinvestment, stock, brochure, dividend, shareholder
investor/sitemap.htm

Scenario 4: Scent Predict
Scent computed based on “pagis” need
Good match between scent and LRS paths
Scansoft/pagis/index.html

InfoScent Summary
The overall goal is to model Web user information needs
– Bridge gap between clicks and information needs
– Predict user navigation behavior
– Develop new applications and Web usability metrics

Questions?
Ed H. Chi
[email protected]
http://www.geekbiker.com