Online Information Visualization of Huge Data Spaces


M L Huang / 9 May, 2000

Internetworking Research 1

Online Information Visualization of Huge Data Spaces

by Mao Lin Huang


Application of graph drawing methods in information visualization to solve the problem of navigating large information spaces.

The thesis covers three areas:

1. Information visualization: the “small window” problem

2. Graph drawing: the “drawing partially unknown graphs” problem, the “online graph drawing” problem, and the “preserving the mental map” problem

3. Information discovery (browsing & navigation): the “lost in hyperspace” problem


1) The “small window” problem

Information visualization must allow the user to view and browse information spaces and to focus quickly on items of interest.

However, the limited number of pixels on the screen makes it difficult to completely display large information spaces in detail. This is known as the “small window” problem.


“Static layout + dynamic viewing”

The most common solution for viewing moderately sized data (hundreds up to thousands of nodes) and addressing the “small window” problem is the “static layout + dynamic viewing” approach: build a static global context of the graph, then allow the user to navigate through it.

Since the amount of data that can be effectively displayed at one time is limited, and the whole global context may not be displayed in detail at one time, these approaches always involve a mechanism to change the view (dynamic viewing). The user sees only a small area of the whole visualisation at any one time and moves through it by changing the viewing area, the zoomed focus point, or the view point of the visualisation.


Using a very large virtual page

The virtual page technique predefines the drawing of the whole graph, and then provides a small window and scroll bar to allow the user to navigate through it (by changing the viewing area).


Fish-eye views

The fish-eye technique can keep a detailed picture of a part of a graph as well as the global context of the graph. It changes the zoomed focus point.


Hyperbolic tree

The hyperbolic browser technique performs fish-eye viewing with animated transitions to preserve the user’s mental map. It changes both the viewing area and the zoomed focus point.


3D Cone trees

3D methods, such as cone trees, enlarge the immediate UI workspace and increase the apparent density of information on the screen. They change the view point.


A summary of previous visualisation techniques

While these techniques deal with graphs of moderate size, they do not handle huge graphs (with millions or perhaps billions of nodes). The major problems are outlined below:

• These techniques predefine the layout. In most cases, the whole graph may not be known. In some cases, the local node in a distributed system may know only a small subgraph of the graph. It may be impossible to pre-compute the layout of the whole graph.

• Pre-computation of the overall geometrical structure of a huge graph is computationally very expensive. Most layout algorithms have super-linear time complexity, and in practice are too slow for interactive graphics if the number of nodes is larger than a few hundred.

• The layout is predefined and views are extracted from this layout. The user is unable to navigate logically through the graph, yet users naturally think in terms of logical relations, not in terms of the synthetic geometrical mapping onto the screen.


The “static layout + dynamic viewing” method is the traditional solution to the “small window” problem.


Online Information Visualisation

This thesis proposes a new approach that addresses the above three problems by using a sequence of dynamic visual frames called “logical frames”:

• Online Navigational Visualisation: The user sees a tiny subset of the graph at any one time. The user changes view by traversing the graph logically.


The Online Graph Model


Let’s imagine that we are exploring a large snowfield.

We are unable to see the entire snowfield, and the limited things we can see are those that are located within our current field of vision.


Online Navigational Visualisation (OFDAV)

OFDAV is a major departure from traditional methods. We visualise a tiny part (a “frame” F_i) of a huge graph at time t. We change from F_i to F_{i+1} by user interaction.

OFDAV does not need to know the whole graph, it does not predefine the geometry (so the user can navigate logically), and it is user-oriented.


Online Navigational Visualisation

In OFDAV, the view of the user focuses on a small subgraph of a large graph G at any point in time.

• The subgraph is defined by its focus nodes.

• Conceptually, the focus nodes form a FIFO queue. We allow the user to change the set of focus nodes by selecting another node on the screen.

• We use a force-directed graph drawing algorithm to draw the subgraph of G and a logical neighbourhood of this subgraph.

• We use animation to guide the user between views, reduce the cognitive effort and preserve the mental map.

• We also keep a history that traces the subgraphs that the user has visited. This assists in backtracking through the graph.


Transitions

To change from one logical key frame F_i to the next, F_{i+1}, the user selects a node v_{i+1} in F_i with a mouse click. The node v_{i+1} is appended to the queue of focus nodes, and the oldest node is deleted from the queue in a FIFO manner.
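The transition rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `select_focus` and the fixed `capacity` (the number of focus nodes kept in the queue) are hypothetical, and the sketch ignores the case where the selected node is already a focus node.

```python
from collections import deque

def select_focus(queue, new_focus, capacity):
    """Append the clicked node to the focus queue (FIFO).

    If the queue then holds more than `capacity` focus nodes, the oldest
    one is removed and returned; otherwise None is returned.
    `capacity` is an assumed parameter, not a value taken from the slides.
    """
    queue.append(new_focus)
    if len(queue) > capacity:
        return queue.popleft()
    return None

# Focus queue of frame F_i, oldest focus node first.
focus_queue = deque(["a", "b", "c"])
# The user clicks node "d": this yields the focus queue of frame F_{i+1}.
dropped = select_focus(focus_queue, "d", capacity=3)
print(list(focus_queue), dropped)
```

After the click the queue holds "b", "c", "d", and "a" is the node that leaves the view.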


2) The “partially unknown graphs” problem

The whole graph that we want to draw may not be known. In some cases, only a small subgraph is known at the time of viewing, so it is impossible to predefine a drawing of the whole graph.

We solve this problem by incrementally calculating and maintaining a small local visualization on-line, instead of predefining the overall visual structure of the graph at once.


The graph is partially unknown

The graph is supplied to the system by a series of requests for neighbourhoods of focus nodes.

[Figure labels: huge graph; new focus node v; neighbourhood of v; small local graph]
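A minimal sketch of how a local graph might be grown from neighbourhood requests. The function `expand`, the callback `fetch_neighbours` and the toy dictionary `HUGE` are hypothetical names introduced for illustration; in a real system the callback would query whatever source holds the huge graph.

```python
def expand(local_graph, focus, fetch_neighbours):
    """Request the neighbourhood of `focus` and merge it into the local graph.

    `local_graph` maps each known node to the set of its known neighbours.
    `fetch_neighbours` is an assumed callback standing in for the request
    sent to the (partially unknown) huge graph.
    """
    local_graph.setdefault(focus, set())
    for u in fetch_neighbours(focus):
        local_graph[focus].add(u)
        local_graph.setdefault(u, set()).add(focus)
    return local_graph

# Toy stand-in for the huge graph; the viewer never reads it directly.
HUGE = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b", "e"], "e": ["d"]}

local = {}
expand(local, "a", HUGE.__getitem__)   # first focus node
expand(local, "b", HUGE.__getitem__)   # user selects "b" as new focus node
print(sorted(local))                   # nodes known so far
```

Only nodes reached through a request are ever known locally: node "e" stays invisible until "d" becomes a focus node.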


3) The “online graph drawing” problem

We address the general graph drawing problem, that is, to make the layout of the graph comprehensible and easy to read.

We also address some specific criteria for online graph drawing. We achieve this by using a “modified spring algorithm” for graph drawing.


The “online graph drawing” problem

The specific criteria for online drawing:

• The layout of a logical frame must show the direction of the exploration.

• Reduce the overlaps among the local regions.

• The sequence of drawings preserves the mental map.

The general criteria for graph drawing:

• Reduce the edge crossings.

• …


The layout of F_i must show the direction of the exploration.

[Figure: spring model vs. modified spring model]


Reducing the overlaps among the local regions.

[Figure: spring model vs. modified spring model]


Reducing the number of edge crossings

[Figure: spring model vs. modified spring model]


Spring model

In the spring model, each node is replaced by a steel ring and each edge by a Hooke’s law spring. The rings have a gravitational repulsion acting between them, and we look for a drawing that minimizes the energy of the system.
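To make “minimizes the energy” concrete, here is a small sketch of an energy function for a drawing. The functional forms are assumptions, since the slides do not give them: a quadratic Hooke potential per edge and a 1/d potential (the potential of an inverse-square force) per pair of nodes. The constants `rest_length`, `k_spring` and `k_repulse` are hypothetical.

```python
import math

def spring_energy(pos, edges, rest_length=1.0, k_spring=1.0, k_repulse=1.0):
    """Energy of a drawing under an assumed spring model.

    pos   : dict mapping node -> (x, y); node positions must be distinct
    edges : iterable of (u, v) pairs
    """
    energy = 0.0
    # Hooke's law springs: one per edge.
    for u, v in edges:
        d = math.dist(pos[u], pos[v])
        energy += 0.5 * k_spring * (d - rest_length) ** 2
    # Repulsion between every pair of rings (1/d potential).
    nodes = sorted(pos)
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            energy += k_repulse / math.dist(pos[u], pos[v])
    return energy

edges = [("a", "b"), ("b", "c")]
cramped = {"a": (0.0, 0.0), "b": (0.1, 0.0), "c": (0.2, 0.0)}
spread = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (2.0, 0.0)}
print(spring_energy(cramped, edges) > spring_energy(spread, edges))
```

The spread-out drawing of the path a, b, c has the lower energy of the two, which is the sense in which a low-energy drawing is the more readable one here.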


The physical forces

The Modified Spring Algorithm has many forces, including:

• Hooke’s law springs for all edges, with varying strengths depending on whether the endpoints are focus nodes or not.

• Gravitational repulsion forces for all nonedges.

• Special gravitational forces between nodes in each neighbourhood.

• Some further forces.

The effect of these forces is to:

• try to keep the queue of focus nodes in a left-right line

• keep node images disjoint

• radially display neighbourhoods around each focus node


Modified spring algorithm

In order to address the specific criteria of on-line drawing, we add extra forces among the neighbourhoods N(v_i), N(v_{i+1}), …, N(v_{i+B-1}) of the focus nodes. These extra forces separate the neighbourhoods so that the user can visually identify the changes. Each extra force is also a Newtonian gravitational force.


The force model

Suppose that F_i = (G_i, Q_i) is the logical frame which is currently being viewed on the screen, and G_i = (V_i, E_i).

The total force applied on node v is:

$$f(v) = \sum_{u \in N(v)} f_{uv} + \sum_{u \in V_i} g_{uv} + \sum_{u \in Q_i} h_{uv} \qquad (1)$$

where f_{uv} is the force exerted on v by the spring between u and v, and g_{uv} and h_{uv} are the gravitational repulsions exerted on v by another node u in F_i.
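The three sums in the total force can be evaluated directly once a form is chosen for each term. The sketch below is an illustration only: the slides do not give the formulas for the individual forces, so it assumes a linear Hooke spring for the spring term and inverse-square repulsions for the two gravitational terms. The constants `k_f`, `rest`, `k_g` and `k_h` and the helper names are hypothetical.

```python
import math

def _direction(p_from, p_to):
    """Unit vector from p_from to p_to, plus the distance (points must differ)."""
    dx, dy = p_to[0] - p_from[0], p_to[1] - p_from[1]
    d = math.hypot(dx, dy)
    return dx / d, dy / d, d

def total_force(v, pos, adj, focus_queue, k_f=1.0, rest=1.0, k_g=1.0, k_h=1.0):
    """Total force on node v: springs over N(v), repulsion over V_i and over Q_i."""
    fx = fy = 0.0
    # Spring term: sum over neighbours u in N(v); a stretched spring pulls v toward u.
    for u in adj[v]:
        ux, uy, d = _direction(pos[v], pos[u])
        pull = k_f * (d - rest)
        fx += pull * ux
        fy += pull * uy
    # Repulsion term: sum over every other node u in V_i; pushes v away from u.
    for u in pos:
        if u != v:
            ux, uy, d = _direction(pos[u], pos[v])
            fx += k_g / d ** 2 * ux
            fy += k_g / d ** 2 * uy
    # Extra repulsion term: sum over the focus nodes u in Q_i.
    for u in focus_queue:
        if u != v:
            ux, uy, d = _direction(pos[u], pos[v])
            fx += k_h / d ** 2 * ux
            fy += k_h / d ** 2 * uy
    return fx, fy

pos = {"x": (0.0, 0.0), "v": (2.0, 0.0)}
adj = {"x": ["v"], "v": ["x"]}
force_on_v = total_force("v", pos, adj, focus_queue=["x"])
print(force_on_v)
```

In this two-node example the stretched spring pulls v toward the focus node x with strength 1, while the two repulsions push it away with strength 0.25 each, so the net force on v points toward x.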


An example of the modified spring algorithm.

In this frame, there are two focus nodes, x and y. The total force on node v is:

$$f(v) = f_{xv} + \sum_{u \in \{1, \ldots, 4, x, y\}} g_{uv} + \sum_{u \in \{x, y\}} h_{uv}$$


4) The “preserving the mental map” problem

Preserving the mental map is a key quality issue in information visualization. When the user moves the focus around the huge graph by changing views, it is difficult to grasp the underlying structure of the current view quickly. The user has to spend time re-forming the mental map and understanding the changes and relationships between the previous view and the current view.


The mental map

Our goal is to preserve the user’s mental map, while taking best advantage of the view screen.

In OFDAV, we use three types of animation to assist the user in understanding the change in view.

• Fade Animation: We use shrinking/growing to help the user identify nodes that are disappearing/appearing.

• Camera Animation: This moves the whole drawing so that the new focus node moves toward the centre of the screen.

• Layout Animation: We use a complex system of forces based on Hooke’s law springs to adjust the layout between logical key frames.
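Of the three, camera animation is the simplest to sketch. The example below assumes plain linear interpolation of a translation applied to every node; the function name `camera_frames` and the parameters `centre` and `steps` are hypothetical, and the slides do not say which interpolation OFDAV actually uses.

```python
def camera_frames(pos, focus, centre, steps):
    """Translate the whole drawing in `steps` equal increments.

    After the last increment the node `focus` sits at `centre`.
    All nodes move by the same offset, so the layout itself is unchanged.
    """
    fx, fy = pos[focus]
    dx, dy = centre[0] - fx, centre[1] - fy
    frames = []
    for s in range(1, steps + 1):
        t = s / steps  # fraction of the camera move completed
        frames.append({v: (x + t * dx, y + t * dy) for v, (x, y) in pos.items()})
    return frames

pos = {"a": (0.0, 0.0), "b": (10.0, 20.0)}
frames = camera_frames(pos, focus="b", centre=(50.0, 50.0), steps=4)
print(frames[-1])
```

Because every node is shifted by the same amount, relative positions are preserved during the move, which is what lets the user keep track of the drawing.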


Layout Animation

For each logical key frame F_i there is a graph drawing D(F_i), which consists of a sequence D_1, D_2, …, D_k of drawings of F_i; each is a screen of F_i.

We use the Spring Algorithm to achieve the layout animation. The algorithm creates the in-between sequence of screens that smoothly revises the layout from the old key frame D(F_i) to the new key frame D(F_{i+1}).

The change from one screen D_j to the next screen D_{j+1} is computed by a numerical method which converges to a stable configuration of the force system.

Layout animation is the most important mechanism that we provide to achieve the smooth transition between views and preservation of the mental map.
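The in-between screens can be pictured as the successive iterates of a relaxation loop. The sketch below is a hedged illustration: the slides only say “a numerical method”, so the simple fixed-step update, the stopping tolerance `tol` and the toy single-spring force `toy_force` are all assumptions made for this example.

```python
import math

def inbetween_screens(pos, force_fn, step=0.1, tol=1e-3, max_screens=1000):
    """Return the list of screens produced while relaxing the layout.

    Each screen moves every node a small step along the force acting on it.
    Iteration stops once the largest node displacement falls below `tol`.
    """
    screens = [dict(pos)]
    for _ in range(max_screens):
        current = screens[-1]
        nxt, biggest = {}, 0.0
        for v, (x, y) in current.items():
            fx, fy = force_fn(v, current)
            nxt[v] = (x + step * fx, y + step * fy)
            biggest = max(biggest, math.hypot(step * fx, step * fy))
        screens.append(nxt)
        if biggest < tol:
            break
    return screens

def toy_force(v, pos):
    """Assumed force system: one spring of rest length 1 between nodes a and b."""
    other = "b" if v == "a" else "a"
    dx, dy = pos[other][0] - pos[v][0], pos[other][1] - pos[v][1]
    d = math.hypot(dx, dy)
    return (d - 1.0) * dx / d, (d - 1.0) * dy / d

screens = inbetween_screens({"a": (0.0, 0.0), "b": (3.0, 0.0)}, toy_force)
final = screens[-1]
print(len(screens), math.dist(final["a"], final["b"]))
```

Playing the returned screens in order gives the smooth transition: the two nodes drift toward each other until the spring is close to its rest length.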


5) The “lost in hyperspace” problem

“Lost in hyperspace” is a well-known problem of navigating a huge data space, in which users become disoriented with respect to a complex system of hypertext links.

When users move around a large information space as much as they do in hypertext, there is a real risk that they may become disoriented or have trouble finding the information they need [Nielsen, 1990].


5) The “lost in hyperspace” problem

Even in this small document, which could be read in one hour, users experienced the ‘lost in hyperspace’ phenomenon as exemplified by the following user comment: ‘I soon realized that if I did not read something when I stumbled across it, then I would not be able to find it later.’ Of the respondents, 56% agreed fully or partly with the statement, ‘When reading the report, I was often confused about where I was.’ [Nielsen, 1990].


Overview Diagrams

This is the real problem that arises when reading text in a nonsequential way with too many cross-references.

A number of researchers have noted that overview diagrams provide a reasonable solution to the “lost in hyperspace” problem. Our system can dynamically generate a sequence of such diagrams.

Other overview diagram systems have been proposed.


Overview diagrams using the biform views

However, these systems all predefine the layout, and they only visualise the history within very limited context levels (e.g. 4 levels).

(In contrast, OFDAV provides an on-line browsing environment in which we can navigate through unlimited context levels.)


Focus+Context views

Other researchers have developed new dynamic methods to visualise the query results of web searches.

Mukherjea proposes a dynamic focus+context view technique that shows the focus node, the immediate neighbourhood of that node and some landmark nodes in a web site. This helps users quickly understand where they are.


Focus+Context views

However, from the visualisation and navigation points of view, this technique has a number of weaknesses:

• The mental map is broken when jumping from one view to another. (OFDAV adopts three types of animation to transform smoothly from one view to another.)

• The user understands where they are, but has no guide for returning to where they have been. (OFDAV adopts a “history” tail that traces the previous focus nodes that the user has visited. This assists the user in backtracking through the graph.)


The current Web browsing technique

The amount of information now available through the WWW has grown explosively. An increasing number of tools are also available to assist the user to find and access information on the WWW. One of the key requirements for a WWW navigator is to maintain the user’s sense of orientation and facilitate navigation within the context of the total information space.


Web browser

The current generation of Web browsers, such as Netscape and Microsoft Internet Explorer, provides users with an effective and convenient way to move in cyberspace. This is done by clicking on a series of hyperlinks embedded in Web pages.

However, this arrangement does not give users a visual “map” to guide them in their Web journey. It does not provide a sense of “space” while the user is exploring the (cyber) space; instead it only gives a series of linear lists.


Web browser

This is mainly because of the difficulty of constructing such a huge, complex, and dynamic map with a (virtually) unlimited number of hyper-documents (nodes) and hyperlinks (edges).

Most existing visualisation techniques and current research interests emphasise “site mapping”. That is, they try to find an effective way of constructing a structured geometrical map for one Web site. This can only guide the user through a very limited region of cyberspace, and does not help users in their overall journey through cyberspace.


Graphic Web browser

Graphic Web Browser - mapping and browsing the entire Cyberspace.

We look at the whole of Cyberspace as one graph; a huge and partially unknown graph. We use on-line visualisation to maintain and display a subset of this huge graph incrementally.


Graphic Web Browser addresses the problem of “lost in hyperspace” with a sense of “space”.

Graphic Web Browser addresses the fundamental problem of “lost in hyperspace” by displaying a sequence of logical visual frames with a graphic “history tail” to track the user’s current location and keep records of his previous locations in the huge information space.

The logical neighborhood of the focus nodes indicates the current location of the user, and the tail of history indicates the path of the past locations during the navigation.


Visualising the history of exploration

The queue of focus nodes is the recent history of the exploration. The force model that we use tends to keep this history in a horizontal line; the new nodes are added to one end of the line and the nodes which disappear are at the other end.

As well as the queue of focus nodes, OFDAV keeps track of past history: a queue of all previous focus nodes. An option in OFDAV is to show the past history.
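A small sketch of how the two histories might be kept together. It is an illustration under assumptions: the slides do not say how the past history is filled or how backtracking is carried out, so here a node enters the past history when it drops out of the focus queue, and `back` simply undoes the latest selection. The class name `ExplorationHistory` and its methods are hypothetical.

```python
from collections import deque

class ExplorationHistory:
    """Recent history (focus queue) plus past history of earlier focus nodes."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed maximum number of focus nodes
        self.focus = deque()      # recent history, oldest focus node first
        self.past = []            # focus nodes that have left the queue

    def visit(self, node):
        """Add a new focus node; the oldest one moves to the past history."""
        self.focus.append(node)
        if len(self.focus) > self.capacity:
            self.past.append(self.focus.popleft())

    def back(self):
        """Undo the latest visit; return False if there is nothing to undo."""
        if len(self.focus) <= 1:
            return False
        self.focus.pop()
        if self.past:
            self.focus.appendleft(self.past.pop())
        return True

    def trail(self):
        """Whole exploration path: past history followed by the focus queue."""
        return self.past + list(self.focus)

history = ExplorationHistory(capacity=2)
for node in ["a", "b", "c", "d"]:
    history.visit(node)
full_trail = history.trail()
history.back()
print(full_trail, list(history.focus), history.past)
```

Drawing `trail()` as a line of nodes gives one possible reading of the history tail, with the focus queue at its head.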


Conclusion

More sophisticated filtering strategies and rules should be created. Existing filtering rules may sometimes make us lose useful information.

The labelling problem has not been completely solved yet. If we put the entire long URL string into a box as its label, the box is enlarged and takes up more display space. The issues are: 1) how to shorten the labels, and 2) how to keep these short labels unique. The investigation of these issues is proceeding.
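To illustrate the two issues, here is one naive possibility, not the approach under investigation in the thesis. It assumes that the last path segment of a URL (or the host name when the path is empty) makes a reasonable short label, and it appends a counter when two labels clash. The function name `short_labels`, the `max_len` limit and the `~n` suffix are hypothetical choices.

```python
from urllib.parse import urlsplit

def short_labels(urls, max_len=12):
    """Map each distinct URL to a short label, adding a suffix on clashes.

    Naive sketch: a suffixed label could still collide with a label that
    naturally ends in "~n"; a real system would need to check for that.
    """
    labels = {}
    seen = {}  # how many times each base label has been used so far
    for url in urls:
        parts = urlsplit(url)
        segments = [s for s in parts.path.split("/") if s]
        base = (segments[-1] if segments else parts.netloc)[:max_len]
        seen[base] = seen.get(base, 0) + 1
        labels[url] = base if seen[base] == 1 else f"{base}~{seen[base]}"
    return labels

urls = [
    "http://a.example/docs/index.html",
    "http://b.example/index.html",
    "http://a.example/",
]
labels = short_labels(urls)
print(labels)
```

The first two URLs share the base label `index.html`, so the second receives a suffix; the third has no path segment and falls back to its host name.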