WWW and Internet
CS 7450 - Information Visualization
March 4, 2004
John Stasko
Spring 2004 CS 7450 2
Internet and WWW
• By nature, abstract, so good target for visualization
• Often described in terms of metaphors
− “Information Superhighway”
Agenda
• Two main topics
− Presentations of the Internet and WWW
Focus on topology and navigation, similar to the graph visualization work
− Visual aids for browsing and using the WWW and the Internet
Assistive visualizations not focusing on presenting net structure and connectivity
1. Internet and WWW Topology
• Fundamentally, the Internet is a graph with some existing physical topology, though that is often not how we want to conceptualize it
− Might think of it as having a structure
• Our discussions from graph visualization are germane here
Mukherjea & Foley WWW ‘95
The Problem
The Problem
• Websites simply are too big
• Huge graphs
• Layout is challenging
Step Back
• Why would someone want to visualize the WWW?
Some Reasons
• Aid authors and webmasters with production and organization of content
• Assist Web surfers in making sense of the information
• Help researchers understand the Web
Depictions of the Web
• Great web site that presents many different conceptualizations of cyberspace
− Atlas of Cyberspace
http://www.cybergeography.org/atlas/
• Let’s take a few minutes to browse...
Mapping the Internet
• Bill Cheswick at ATT
• Interesting visualizations plus the data sets are available
• www.cs.bell-labs.com/who/ches/map/index.html
Internet Traffic Paths
www.caida.org/tools/measurement/skitter/
MboneMap
www.cs.berkeley.edu/~elan/mbone/map.html
Immersive Systems
www.pnl.gov/remote/projects/starlight/
View of Web Site’s Pages
www.dynamicdiagrams.com/
Web Site
www.mos.ics.keio.ac.jp/NattoView
Web Site Visitations
www.inventix.com
Task Analysis
• Potential web-related tasks
− How and when has info been accessed?
− Where do people enter and spend time?
− How do they move about?
− What paths aren’t traversed?
− Where are they coming from?
− What has been added, changed, deleted?
− Do changes affect navigation patterns?
− Do we need to do a redesign?
Data Set
• Each server request is a data case
• Example variables
− IP Address/Client host
− Timestamp
− URL requested
− HTTP status (success, not found, …)
− Bytes delivered
− Referencing URL (HTTP Referer header)
− User agent (browser and OS info)
− ...
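To make "each server request is a data case" concrete, here is a minimal sketch that turns one log line into a record with the variables listed above. It assumes the server writes the Apache/NCSA Combined Log Format, which the slides do not specify; the function name `parse_request`, the field names, and the sample line are illustrative only.

```python
import re
from typing import Optional

# Assumption: Apache/NCSA Combined Log Format. The slides name the
# variables but not the log format, so this pattern is illustrative.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_request(line: str) -> Optional[dict]:
    """Turn one log line into a data case, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    case = match.groupdict()
    case["status"] = int(case["status"])
    # A "-" in the size field means no response body was sent.
    case["bytes"] = 0 if case["bytes"] == "-" else int(case["bytes"])
    return case


# Hypothetical sample request, not taken from any real log.
SAMPLE_LINE = (
    '192.0.2.1 - - [04/Mar/2004:10:15:32 -0500] '
    '"GET /products/index.html HTTP/1.0" 200 2326 '
    '"http://www.example.com/" "Mozilla/4.0 (compatible; MSIE 6.0)"'
)

if __name__ == "__main__":
    print(parse_request(SAMPLE_LINE))
```

A list of such records is the kind of table that could be loaded into a general-purpose InfoVis tool.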
One Approach
• Use existing InfoVis tool (Eureka, Spotfire, InfoZoom, etc.), load the data set, and analyze it
• Get all the strengths and weaknesses of the InfoVis tool for supporting particular analysis tasks
Web Ecology
• Problem: Most visualizations of the web fail to present the dynamically changing ecology of users and documents on the web
• What do we mean by ecology metaphor?
Chi, et al CHI ‘98
Web Ecology
• By understanding set of relationships (ecology) among users and their information environment, and its change through time (evolution), individuals can better understand
− Web Content
− Layout of physical and topological space
− Usage through time
Existing Visualizations
• Despite useful functions, problems
− Difficulty visualizing large number of documents
− Considerable amount of screen real-estate used
− Only permits the visualization of a site at a particular point in time, very difficult to make comparisons across times
− No mechanisms provided that allow differences in usage to be identified
Techniques
• Disk Tree
− Center-rooted tree that represents the hyperlink structure of a web site
• Time Tube
− Set of disk trees that organizes and visualizes the evolution of web sites
Task Application
• Visualizations designed to be useful for
− Local - Finding specific content
− Comparison - Comparing info at two places
− Global - Discovering a trend or pattern in the site
Analysis Domain
• www.xerox.com, April ‘97
− 7,588 items across a 30-day period
− 889 new items
− Daily log kept of additions, modifications, and deletions of content
− Base data comes from link info, usage log from web servers
− Topological info from custom hyperlink database
Disk Trees
• Interested in shortest number of hops from one document to another
• Breadth-first traversal transforms the web graph into a tree by placing each node as close to the root node as possible
• After obtaining this tree we then visualize the structure using the Disk Tree technique
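The breadth-first step can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the link graph is assumed to be a dictionary from page to outgoing links, and `bfs_tree`, `depth`, and the tiny `SITE` graph are hypothetical names and data.

```python
from collections import deque


def bfs_tree(links: dict, root: str) -> dict:
    """Return {page: parent} for a breadth-first spanning tree of the link graph.

    Each page is attached to the first page that reaches it, so its depth
    in the tree equals the fewest hops needed to reach it from the root.
    """
    parent = {root: None}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in parent:  # first visit is the shortest path
                parent[target] = page
                queue.append(target)
    return parent


def depth(parent: dict, page: str) -> int:
    """Count hops from the root to the given page in the tree."""
    hops = 0
    while parent[page] is not None:
        page = parent[page]
        hops += 1
    return hops


# Hypothetical four-page site with cycles and cross links.
SITE = {
    "home": ["products", "news"],
    "products": ["drivers", "news"],
    "news": ["drivers"],
    "drivers": ["home"],
}

if __name__ == "__main__":
    tree = bfs_tree(SITE, "home")
    print(tree, depth(tree, "drivers"))
```

Links that are not part of the tree (here, products to news and drivers back to home) are simply dropped from the structure that gets drawn.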
Disk Tree
Lines - tree links
Line size & brightness - page access frequency
Color - page lifecycle stage
new: red
continued: green
deleted: yellow
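One plausible way to compute a center-rooted layout is sketched below. It assumes each tree level sits on its own concentric ring and each subtree gets an angular wedge proportional to its number of leaves; the slides do not spell out the layout rule, so treat that rule, `disk_layout`, `leaf_count`, and `ring_gap` as assumptions. The color table restates the legend above.

```python
import math

# Color legend from the slide: page lifecycle stage to link color.
LIFECYCLE_COLOR = {"new": "red", "continued": "green", "deleted": "yellow"}


def leaf_count(children: dict, node: str) -> int:
    """Number of leaves under a node (a leaf counts as one)."""
    kids = children.get(node, [])
    if not kids:
        return 1
    return sum(leaf_count(children, kid) for kid in kids)


def disk_layout(children: dict, root: str, ring_gap: float = 1.0) -> dict:
    """Return {node: (x, y)} with the root at the center.

    Assumed rule: level k lies on a circle of radius k * ring_gap, and a
    subtree's angular wedge is proportional to its leaf count.
    """
    positions = {}

    def place(node, level, start, end):
        angle = (start + end) / 2.0
        radius = level * ring_gap
        positions[node] = (radius * math.cos(angle), radius * math.sin(angle))
        total = leaf_count(children, node)
        cursor = start
        for kid in children.get(node, []):
            span = (end - start) * leaf_count(children, kid) / total
            place(kid, level + 1, cursor, cursor + span)
            cursor += span

    place(root, 0, 0.0, 2.0 * math.pi)
    return positions


if __name__ == "__main__":
    # Hypothetical tree: home has two children, one of which has two leaves.
    tree = {"home": ["a", "b"], "a": ["a1", "a2"]}
    for node, (x, y) in disk_layout(tree, "home").items():
        print(f"{node}: ({x:.2f}, {y:.2f})")
```

Because every node ends up on a flat disk, the third dimension stays free for other information such as time.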
Advantages
• Structure is compact, with pattern easily recognizable
• When viewed straight on or at slight angles, no occlusion problems, since entire layout is on a 2-D plane
• Unlike cone trees, this 2-D representation can utilize a third dimension for other information, such as time
• Circularity pleasing to the eye
Time Tubes
• Time Tubes are multiple disk trees laid out along a spatial axis
• Advantages
− By using a spatial axis to represent time, we see information space-time in a single visualization
− Focus and Context
− Possibility for Animation
Time Tubes
Key Point
• Pages that exist at any time during the studied period are shown in all disk trees for that period, even if they didn’t exist yet
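A minimal sketch of this idea: take the union of pages across all time slices so every disk tree shares one node set, and mark pages that are absent in a given slice. The function name `time_tube_slices`, the use of `None` for "not present", and the weekly counts are illustrative assumptions, not the system's actual data model.

```python
def time_tube_slices(snapshots: list) -> tuple:
    """Align per-period usage snapshots on a shared set of pages.

    snapshots: list of {page: access_count}, one dict per time period.
    Returns (all_pages, slices), where every slice has an entry for every
    page seen in any period; None marks a page absent in that period.
    """
    all_pages = sorted(set().union(*snapshots))
    slices = [
        {page: snapshot.get(page) for page in all_pages}
        for snapshot in snapshots
    ]
    return all_pages, slices


if __name__ == "__main__":
    # Hypothetical weekly access counts; "press" appears in week 2.
    weeks = [
        {"home": 120, "drivers": 40},
        {"home": 110, "drivers": 35, "press": 1},
        {"home": 130, "press": 871},
    ]
    pages, slices = time_tube_slices(weeks)
    print(pages)
    for week in slices:
        print(week)
```

Keeping the node set fixed means a page occupies the same spot in every disk tree, which is what makes comparison across slices possible.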
Real Use
• Time Tube answers following questions:
− What devolved into dead wood? When did it? Was there a correlation with the restructuring of the Web site?
Product safety pages got darker and darker, indicating lower usage
Doesn’t tell why page is less popular, just raises a flag to explore page further
Real Use
• What evolved into a popular page? When did it? Was there a correlation with the restructuring of the Web site?
− Redesign of site called attention to Fact Book page
− Became more popular and the corresponding Disk Trees become greener and greener in successive weeks
Real Use
• How was usage affected by items added over time?
− Press release issued for new family of products, shown as red links
− Usage in the third week jumped from 1 access to 871 accesses; this example helps us understand that this was probably a well received product line
Real Use
• How was usage affected by items deleted over time?
− Change in removing direct link from home page to main driver page did not negatively affect the overall use of driver information
− Info stayed green indicating usage, but link from home page was black, showing not much traffic
E-Commerce Applications
• What if your focus is on understanding user access patterns for web sites selling products to consumers?
• What tasks are important?
One Approach
• Blue Martini Software
• Aggregate web data and visualize simplified graph of user movements through web site
• Highlight places where people leave before purchasing
• ...
Brainerd & Becker InfoVis ‘01
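The aggregation idea can be illustrated with a small sketch. It is not the Blue Martini method: it assumes each session is already available as an ordered list of page names and that a purchase is signalled by a page called "checkout"; `aggregate_paths` and the sample sessions are hypothetical.

```python
from collections import Counter


def aggregate_paths(sessions: list, purchase_page: str = "checkout") -> tuple:
    """Collapse many sessions into edge counts and pre-purchase exit counts.

    sessions: list of ordered page lists, one per visitor session.
    Returns (transitions, exits): transitions counts each (from, to) step,
    and exits counts the last page of sessions that never reached
    purchase_page.
    """
    transitions = Counter()
    exits = Counter()
    for pages in sessions:
        transitions.update(zip(pages, pages[1:]))
        if pages and purchase_page not in pages:
            exits[pages[-1]] += 1
    return transitions, exits


if __name__ == "__main__":
    # Hypothetical sessions; only the first one ends in a purchase.
    sessions = [
        ["home", "catalog", "cart", "checkout"],
        ["home", "catalog", "cart"],
        ["home", "catalog"],
        ["home", "catalog", "cart"],
    ]
    transitions, exits = aggregate_paths(sessions)
    print(transitions.most_common())
    print(exits.most_common())
```

The transition counts give the edge weights of a simplified movement graph, and the exit counts point at pages where visitors leave before purchasing.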
Different icons represent different kinds of pages
Only show most-used pages
E-Commerce mimics mall shopping :^)
Gender differences in purchase paths at websites
2. Aiding WWW Browsing
• Can we utilize information visualization techniques to help people interact with the WWW and the Internet?
• Battle “lost in hyperspace” problem
• Help us know what’s there
• Help us find things
WebBook and Web Forager
• Personal computers viewed as knowledge processors before
− Spreadsheets and calculators
• Now viewed as knowledge sources, portals to vast information worlds
− Networking and WWW
Card, Robertson and York CHI ‘96
WWW Problems
• Pages are hard to find
• Users get lost, can’t relocate pages
• Difficulty organizing things once found
• Difficulty doing knowledge processing on found thing
• Interacting with web is too slow to incorporate gracefully into other activities
Information Foraging Theory
• From Ecological Biology
• Idea: user stalks certain types of information
• Users have tendency to interact repeatedly with small clusters of information (locality of reference)
• Information encountered at certain rate
− Users evolve to increase finding rate
− Sources evolve to be more attractive
Mechanisms Evolved
• 3 mechanisms in the evolution of the web on the server side
− Indexes - Lycos search
− Table of contents - Yahoo
− Home pages provided by users with big lists of related links
Assisting People
• To provide insight
− must support sensemaking
− restructuring
− recoding
• Hotlists are one mechanism in this direction
Improvements
• WebBook and Web Forager try to do two things to foster information sensemaking
− Move away from a single web page, and group and manipulate related pages
− Move from a work environment containing a single element to a workspace in which the page is contained with multiple other entities, including Web Books
WebBook
Features
• WebBook allows for rapid interaction with objects at a higher level of aggregation than pages
• 3D book representation, uses animation
• Can ruffle through pages, leave bookmarks
Applications
• Hot List books
• Topic books
• Search reports
• Book books
• ...
Web Forager
Web Forager
• Application that embeds the WebBook and other objects in a hierarchical 3D information workspace
• Workspace is intended to create patches from the web where a high density of relevant pages (grouped together in Web Books) can be combined with rapid access
Constituents
• Hierarchical Workspace - 3 levels
− Focus Place - full page shown, direct interaction
− Intermediate memory space - books or pages placed when they are in use but not immediate focus
− Tertiary space - Storage (bookcase)
Video
Discussion
• Strengths/Weaknesses
Data Mountain
• 3D document management system
• Prototype is an alternative to web browser “bookmarks” or “favorites”
• Could be used for any kind of document management
Robertson, et al UIST ‘98
Make-Up
• 3D inclined plane in which thumbnails of web pages are placed to serve as favorites
• User is responsible for organization
• Uses smooth animation and audio to assist interaction
Video
User Study
• Data Mountain versus IE4 “Favorites”
• Experienced IE4 users
• Stored 100 pages, then retrieved them
• DM fared about as well with “title” cue
• DM fared better for all other cues
Leveraging Human Capabilities
• Spatial memory: analogy with paper placed on a pile on your desk
− User is responsible for personal organization
• 3D perception: minimal cognitive load, good utilization of screen space
Interaction Techniques
• Placing pages: confinement to inclined plane makes normal 2D drag-and-drop sufficient; no unfamiliar 3D navigation needed
• Continuous feedback: both audio and visual feedback are natural; minimized unexpected interactions/surprises
Limitations/Future
• Limits number of pages stored
• No explicit support for grouping
• Landmarks/contours as helpers
Discussion
• Strengths/Weaknesses
• Could it be used elsewhere?
Upcoming
• Spring Break
− Woo-woo
• Text & documents (2 days)
− Reading
Chapter 10
Salton et al
• Mid-project reports due March 25
References
• Spence and CMS texts
• All referred to papers and websites
• McNamara & Defnet and Craighill, Robeson & Sheridan F ‘99 slides