1

Large Scale Information Visualization

Jing Yang
Fall 2007

2

Pre-requisite

Information Visualization
Course slides: www.cs.uncc.edu/~jyang13
Basic techniques will be covered in the first two weeks
Please study the remaining material on your own

3

Introduction

Class 1, Part A

4

Motivation - Data Explosion

Society is more complex: computers, internet, web, ...
How much data?

Between 1 and 2 exabytes of unique info produced per year

1 exabyte = 1,000,000,000,000,000,000 bytes
250 megabytes for every man, woman and child
Printed documents are only .003% of the total

Peter Lyman and Hal Varian, 2000
Cal-Berkeley, Info Mgmt & Systems

www.sims.berkeley.edu/how-much-info

Slide courtesy of John Stasko

5

Data Overload

Confound: How to make use of the data
How do we make sense of the data?
How do we harness this data in decision-making processes?
How do we avoid being overwhelmed?

Slide courtesy of John Stasko

6

Human Vision

Highest bandwidth sense
Fast, parallel
Pattern recognition
Pre-attentive
Extends memory and cognitive capacity
People think visually

Slide courtesy of John Stasko

7

The Challenge

Transform the data into information (understanding, insight) thus making it useful to people

Slide courtesy of John Stasko

8

Example

Questions:
Which state has the highest income?
Is there a relationship between income and education?
Are there any outliers?

Example courtesy of Chris North

9

Visualize the Data

10

Illustrate Our Approach

Provide tools that present data in a way to help people understand and gain insight from it
Cliches
“Seeing is believing”
“A picture is worth a thousand words”

Slide courtesy of John Stasko

11

Don’t you believe it?

This graphic is worth at least 700 words

12

Visualization

Visualization - the use of computer-supported, interactive visual representations of data to amplify cognition

From [Card, Mackinlay, Shneiderman ’98]

Often thought of as the process of making a graphic or an image
Really is a cognitive process

Form a mental image of something
Internalize an understanding

“The purpose of visualization is insight, not pictures”
Insight: discovery, decision making, explanation

Slide courtesy of John Stasko

13

What is Information Visualization

Information Visualization is NOT Scientific Visualization

Scientific Visualization is primarily related to and represents something physical or geometric

Examples:
Air flow over a wing
Stresses on a girder
Weather over Pennsylvania

14

What is Information Visualization

It is about abstract data

15

InfoVis Is About Numerical Data

16

InfoVis Is About Ordinal Data

ThemeRiver: Visualizing Theme Changes Over Time [Havre et al. Infovis 00]

17

InfoVis Is About Nominal Data

Swiss mountain map, L. Matterhorn, 1983

18

InfoVis Is About Structured Data

19

InfoVis Is About Space

Statistics Bureau, Tokyo, 1985

20

InfoVis Is About Time

New York Times, January 1982

21

InfoVis Is About Change

22

InfoVis Is About Motion and Process

Illustration of magic turning a silver coin into a copper coin

23

InfoVis Is About All Kinds of Information

24

What is Information Visualization

It is about analyzing, communicating, and decision making

“Of all methods for analyzing and communicating statistical information, well-designed data graphics are usually the simplest and at the same time the most powerful” - E. Tufte

25

What is Information Visualization

It is about scale and dimensionality

Scale
Essential problem in reasoning is comparison
Comparisons must be enforced within scope of eyespan

Dimensionality
The world is multi-dimensional
Paper and computer screens are 2-dimensional
Escape from the flatland

26

What is Information Visualization

It is about interactive exploration
Want to show multiple different perspectives on the data
A way to increase scalability and dimensionality

27

What is Information Visualization

It is about applications
Documents, images, videos, multimedia
Financial/business data
Internet information
Software

28

The Need is There

In five years, 100 million people will be using an information-visualization tool on a near-daily basis. And products that have visualization as one of their top three features will earn $1 billion per year.

-- Ramana Rao, founder and chief technology officer, Inxight Software Inc., Sunnyvale, Calif.

http://www.computerworld.com/databasetopics/data/story/0,10801,80243,00.html

Slide courtesy of John Stasko

29

Basic Techniques 1: Multi-dimensional Visualization

Class 1, Part B

30

Multi-dimensional (Multivariate) Dataset

Slide courtesy of John Stasko

31

Data Item (Object, Record, Case)

32

Dimension (Variable, Attribute)

33

Multidimensional Data Example: Iris Data

Scientists measured the sepal length, sepal width, petal length, petal width of many kinds of iris...

34

Multidimensional Data Example: Iris Data

sepal length   sepal width   petal length   petal width
5.1            3.5           1.4            0.2
4.9            3             1.4            0.2
...            ...           ...            ...
5.9            3             5.1            1.8
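As a minimal sketch of the terminology, the three rows shown in the table can be held as data items (rows) with four dimensions (columns). The names DIMENSIONS, ITEMS, column, and dimension_ranges are illustrative and not part of the course material.

```python
# Dimensions (columns) and data items (rows) taken from the table above.
DIMENSIONS = ["sepal length", "sepal width", "petal length", "petal width"]
ITEMS = [
    (5.1, 3.5, 1.4, 0.2),
    (4.9, 3.0, 1.4, 0.2),
    (5.9, 3.0, 5.1, 1.8),
]

def column(items, dim_index):
    """Return all values of one dimension, one per data item."""
    return [item[dim_index] for item in items]

def dimension_ranges(items):
    """Return the (min, max) pair of every dimension."""
    n_dims = len(items[0])
    return [(min(column(items, d)), max(column(items, d))) for d in range(n_dims)]

ranges = dimension_ranges(ITEMS)
for name, (lo, hi) in zip(DIMENSIONS, ranges):
    print(f"{name}: min={lo}, max={hi}")
```

Most of the techniques that follow start from exactly this kind of per-dimension range, because each value has to be scaled onto an axis, a color, or a glyph feature.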

35

Classification

Table-based techniques
HeatMap, Table Lens

Geometric techniques
Scatterplot matrices, parallel coordinates, landscapes, Dust & Magnet, …

Icon-based techniques
Glyphs, shape coding, color icons, …

Hierarchical techniques
Dimensional stacking, worlds-within-worlds, …

Pixel-oriented techniques
Recursive pattern, circle segments, spiral, axes techniques, …

36

Table-Based Techniques

Basic idea: Visualization that improves the existing spreadsheet table format

HeatMap
Table Lens [RC 94]
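A heat map keeps the spreadsheet layout but replaces each cell with a shade. The sketch below is one possible reading of that idea, not the implementation behind the figures: it scales every column independently and uses text characters (SHADES, an assumption) in place of a real color map.

```python
SHADES = " .:-=+*#%@"  # light to dark; an illustrative stand-in for a color map

def heatmap_rows(rows):
    """Map each cell to a shade character, scaling every column independently."""
    n_cols = len(rows[0])
    lows = [min(r[c] for r in rows) for c in range(n_cols)]
    highs = [max(r[c] for r in rows) for c in range(n_cols)]
    out = []
    for r in rows:
        line = ""
        for c, v in enumerate(r):
            span = highs[c] - lows[c]
            t = 0.0 if span == 0 else (v - lows[c]) / span  # 0..1 within column
            line += SHADES[round(t * (len(SHADES) - 1))]
        out.append(line)
    return out

rows = [(5.1, 3.5, 1.4, 0.2), (4.9, 3.0, 1.4, 0.2), (5.9, 3.0, 5.1, 1.8)]
for line in heatmap_rows(rows):
    print(line)
```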

37-44

These figures are used with Dr. M. Ward’s permission

45

Eureka (Table Lens)

Rao R., Card S. K.: ‘The Table Lens: Merging Graphical and Symbolic Representation in an Interactive Focus+Context Visualization for Tabular Information’, Proc. Human Factors in Computing Systems CHI ’94 Conf., Boston, MA, 1994, pp. 318-322.

46

Eureka (Table Lens)

47

Geometric Techniques

Basic idea: Visualization of geometric transformations and projections of data

Scatterplot matrices [And 72, Cle 93]
Parallel coordinates [Ins 85, ID 90]
Parallel Glyphs [Fanea:05]
Parallel Sets [Bendix:05]
Star coordinates [Kan 2000]
Landscapes [Wis 95]
Dust & Magnet [Yi 2005]
Projection Pursuit Techniques [Hub 85]
Prosection Views [FB 94, STDS 95]
Hyperslice [WL 93]

48

Recall 1-Dimensional Visualization

[Figure: the value 1.6 marked on a single axis running from 1 to 2]

49

Parallel Coordinates

sepal length   sepal width   petal length   petal width
5.1            3.5           1.4            0.2
4.9            3             1.4            0.2
...            ...           ...            ...
5.9            3             5.1            1.8

A. Inselberg. The Plane with Parallel Coordinates. Special Issue on Computational Geometry, The Visual Computer, 69-97

50

[Figure: the data item (5.1, 3.5, 1.4, 0.2) drawn as a polyline across four parallel axes: sepal length, sepal width, petal length, petal width]
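The construction of one polyline can be sketched as follows: each dimension gets its own vertical axis, the axes are spaced evenly, and the item's value fixes the height on each axis. The axis ranges in RANGES and the drawing size are assumptions chosen for illustration, not values from the slides.

```python
def polyline(item, ranges, width=300.0, height=100.0):
    """Return the (x, y) vertices of one data item on evenly spaced vertical axes.

    ranges holds one (min, max) pair per dimension; y grows upward from 0.
    """
    n = len(item)
    gap = width / (n - 1)              # horizontal distance between adjacent axes
    points = []
    for i, (value, (lo, hi)) in enumerate(zip(item, ranges)):
        t = (value - lo) / (hi - lo)   # position along the axis, 0..1
        points.append((i * gap, t * height))
    return points

# Assumed axis ranges for the four iris dimensions.
RANGES = [(4.0, 8.0), (2.0, 4.5), (1.0, 7.0), (0.0, 2.5)]
vertices = polyline((5.1, 3.5, 1.4, 0.2), RANGES)
print(vertices)
```

Drawing every data item this way and overlaying the polylines gives the parallel coordinates display.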

51

Cluster and Outlier

Cluster
A group of data items that are similar in all dimensions.

Outlier
A data item that is similar to few or no other data items.
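One way to make the outlier definition concrete is to count, for each data item, how many other items lie within some distance of it. The sketch below assumes Euclidean distance as the similarity measure; the radius, the max_neighbors threshold, and the sample rows are illustrative choices.

```python
import math

def neighbors_within(items, index, radius):
    """Count the other items whose Euclidean distance to items[index] is <= radius."""
    target = items[index]
    return sum(
        1
        for j, other in enumerate(items)
        if j != index and math.dist(target, other) <= radius
    )

def find_outliers(items, radius, max_neighbors=0):
    """Return indices of items that have at most max_neighbors similar items."""
    return [
        i for i in range(len(items))
        if neighbors_within(items, i, radius) <= max_neighbors
    ]

# Illustrative rows: three items close together and one far away.
data = [
    (5.1, 3.5, 1.4, 0.2),
    (4.9, 3.0, 1.4, 0.2),
    (5.0, 3.4, 1.5, 0.2),
    (5.9, 3.0, 5.1, 1.8),
]
print(find_outliers(data, radius=1.0))
```

In a parallel coordinates display the first three items would form a tight bundle of polylines (a cluster), while the last one would run apart from them.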

52

53

54

Parallel Sets: Visual Analysis of Categorical Data. F. Bendix, R. Kosara, H. Hauser. Infovis 2005

Parallel Coordinates Parallel Sets

55

Scatterplot Matrix

56

Scatterplot Matrix

57

VaR Display

J. Yang et al. Value and Relation Display: Interactive Visual Exploration of Large Data Sets with Hundreds of Dimensions. IEEE Trans. Vis. Comput. Graph. 13(3): 494-507 (2007)

58

Projection Pursuit Techniques

Idea: For a high-dimensional dataset
1. Locate projections to low-dimensional space that reveal most details about dataset structure
2. Extract and analyze projection structures from projections

Two general approaches: manual and automatic.

59

Automatic Projection Pursuit

Friedman and Tukey’s method

1. Characterize a given projection by a numerical index that indicates the amount of structure that is present.
2. Heuristically search to locate the “interesting” projections using the index.
3. Remove detected structures from the data.
4. Iterate until there is no remaining structure detectable within the data.
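Steps 1 and 2 can be sketched with a deliberately simple stand-in. The index below is the absolute excess kurtosis of a 1-D projection, which is not Friedman and Tukey's index; it only plays the same role of scoring a projection. The search is plain random sampling of directions. All function names, the number of trials, and the sample data are assumptions; steps 3 and 4 are not covered.

```python
import math
import random

def project(items, direction):
    """Project every item onto a 1-D direction (dot product)."""
    return [sum(x * w for x, w in zip(item, direction)) for item in items]

def structure_index(values):
    """Stand-in index: |excess kurtosis| of the projected values."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0:
        return 0.0
    fourth = sum((v - mean) ** 4 for v in values) / n
    return abs(fourth / var ** 2 - 3.0)

def random_unit_vector(dim, rng):
    """Draw a random direction of unit length."""
    vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]

def search(items, trials=200, seed=0):
    """Heuristic search: keep the best-scoring of `trials` random directions."""
    rng = random.Random(seed)
    dim = len(items[0])
    best_dir, best_score = None, -1.0
    for _ in range(trials):
        d = random_unit_vector(dim, rng)
        score = structure_index(project(items, d))
        if score > best_score:
            best_dir, best_score = d, score
    return best_dir, best_score

# Illustrative data: two groups separated along the first dimension.
data = [(0.0, 0.1, 0.2), (0.1, 0.0, 0.1), (0.2, 0.2, 0.0),
        (5.0, 0.1, 0.1), (5.1, 0.2, 0.2), (4.9, 0.0, 0.0)]
best_dir, best_score = search(data)
print(best_dir, best_score)
```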

60

Landscapes

Hot topics of a news collection

L. Nowell, E. Hetzler, and T. Tanasse. “Change Blindness in Information Visualization: A Case Study”. Infovis 2001

61

Landscapes

How was the figure generated?

Documents (data items)

Keywords (dimensions)

N-d vector for each document

Projection from N-d space to 2-d space

Landscape view

Wise, J., Thomas, J., et al. “Visualizing the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents”, Infovis 95
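The pipeline from documents to 2-d positions can be sketched as below. The documents, the keyword list, and especially the fixed PROJECTION matrix are hypothetical: a real system would derive the projection from the data rather than hard-code it, and the final step of turning document density into landscape height is not shown.

```python
KEYWORDS = ["election", "vote", "market", "stock"]   # dimensions (hypothetical)
DOCS = [                                              # data items (hypothetical)
    "election vote vote recount",
    "stock market stock rally",
    "market vote election stock",
]

def keyword_vector(text, keywords):
    """Count how often each keyword occurs in a document (the N-d vector)."""
    words = text.lower().split()
    return [words.count(k) for k in keywords]

# Assumed 2 x N projection matrix: x sums the first two keywords, y the last two.
PROJECTION = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]

def project_2d(vector, matrix):
    """Linear projection from N-d keyword space to a 2-d position."""
    return tuple(sum(m * v for m, v in zip(row, vector)) for row in matrix)

vectors = [keyword_vector(d, KEYWORDS) for d in DOCS]
positions = [project_2d(v, PROJECTION) for v in vectors]
print(positions)
```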

62

Dust & Magnet

Dust & Magnet: Multivariate Information Visualization using a Magnet Metaphor, Yi et al. Information Visualization (2005)

Video

63

Hierarchical Techniques

Basic ideas: Visualization of the data using a hierarchical partitioning into subspaces

Dimensional Stacking [LWW90]
Worlds-within-Worlds [FB 90a/b]
Treemap [Shn 92, Joh 93]
Cone Trees [RMC 91]
InfoCube [RG93]

64

Dimensional Stacking

Imagine each data item (4 attributes) as a small block. We place all blocks on a table.

65

Dimensional Stacking

Add grids on the table. Place the blocks in the grids according to their values of attribute 1.

According to values of attribute 1

66

Dimensional Stacking

Add grids in grids. Place the blocks in the grids according to their values of attribute 2.

According to values of attribute 2

67

Dimensional Stacking

Add grids in grids. Place the blocks in the grids according to their values of attribute 3.

According to values of attribute 3

68

Dimensional Stacking

Add grids in grids. Place the blocks in the grids according to their values of attribute 4.

According to values of attribute 4

69

Dimensional Stacking

Fix one block!

Expand the block

70

Dimensional Stacking

Fix another block

71

Dimensional Stacking

Dimensional stacking!

72

Dimensional Stacking

Visualization of oil mining data with longitude and latitude mapped to the outer x- and y-axes and ore grade and depth mapped to the inner x- and y-axes

M. Ward, Worcester Polytechnic Institute
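The nesting of grids can be expressed as simple index arithmetic: once each attribute is discretized into bins, the outer bin selects a block of cells and the inner bin selects a cell inside that block. The sketch assumes four attributes assigned as outer x, outer y, inner x, inner y; the bin counts and function names are illustrative.

```python
def bin_index(value, lo, hi, n_bins):
    """Discretize a value in [lo, hi] into one of n_bins equal-width bins."""
    if hi == lo:
        return 0
    i = int((value - lo) / (hi - lo) * n_bins)
    return min(max(i, 0), n_bins - 1)   # clamp so value == hi lands in the last bin

def stacked_cell(bins, counts):
    """Return the (column, row) of a data item in the stacked grid.

    bins   = (outer_x, outer_y, inner_x, inner_y) bin indices
    counts = number of bins for the same four attributes
    """
    outer_x, outer_y, inner_x, inner_y = bins
    _, _, n_inner_x, n_inner_y = counts
    return (outer_x * n_inner_x + inner_x, outer_y * n_inner_y + inner_y)

counts = (4, 4, 3, 3)   # assumed bin counts per attribute
cell = stacked_cell((2, 1, 0, 2), counts)
print(cell)
```

With more than four attributes the same rule is applied recursively, one further level of grids per pair of attributes.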

73

Worlds within Worlds

Feiner S., Beshers C.: ‘Worlds within Worlds: Metaphors for Exploring n-dimensional Virtual Worlds’, Proc. UIST, 1990, pp. 76-83.

74

Icon-Based Techniques

Basic idea: visualization of the data values as features of icons

Chernoff-Faces [Che 73, Tuf 83]
Stick Figures [Pic 70, PG88]
Shape Coding [Bed 90]
Color Icons [Lev 91, KK94]
TileBars [Hea 95]

75

Glyphs

Profile Glyphs
Each bar encodes a variable’s value

[Figure: 1 data item with 4 attributes drawn as bars v1, v2, v3, v4, each scaled between Min and Max; 3 data items in the iris dataset shown as profile glyphs]

76

Chernoff faces (1973, Herman Chernoff)

77

Dr. Eugene Turner, 1979

78

Star Glyphs

Space out variables at equal angles around a circle
Each arm encodes a variable’s value

[Figure: 1 data item with 4 attributes drawn as a star with arms v1, v2, v3, v4; 4 data items in the iris dataset shown as star glyphs]
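The geometry of a star glyph can be sketched in a few lines: arm i points at angle 2πi/n, and its length is the variable's value scaled into the axis range. The ranges in RANGES and the unit radius are assumptions for illustration.

```python
import math

def star_glyph(values, ranges, radius=1.0):
    """Return one (x, y) arm endpoint per variable, spaced at equal angles."""
    n = len(values)
    points = []
    for i, (v, (lo, hi)) in enumerate(zip(values, ranges)):
        length = radius * (v - lo) / (hi - lo)   # arm length encodes the value
        angle = 2 * math.pi * i / n              # equal angular spacing
        points.append((length * math.cos(angle), length * math.sin(angle)))
    return points

RANGES = [(4.0, 8.0), (2.0, 4.5), (1.0, 7.0), (0.0, 2.5)]  # assumed axis ranges
arms = star_glyph((5.9, 3.0, 5.1, 1.8), RANGES)
print(arms)
```

Connecting the endpoints in order gives the outline that makes glyphs of similar data items look alike.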

79

Glyphs

80

Stick Figures

The mapping
Two attributes - display axes
Others - angle and/or length of limbs

Texture patterns in the visualization show certain data characteristics

Stick figure icon

Pickett R. M., Grinstein G. G.: ‘Iconographic Displays for Visualizing Multidimensional Data’, Proc. IEEE Conf. on Systems, Man and Cybernetics, 1988, pp. 514-519.

81

Stick Figures

http://ivpr.cs.uml.edu/gallery/
G. Grinstein, University of Massachusetts at Lowell

5-dim. image data from the Great Lakes region

82

Stick Figures

Stick figure icon family
Try all different mappings

12 × 5! = 1440 pictures
Movie
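The count can be checked by enumeration. The sketch reads 12 as the number of icons in the stick figure family and 5! as the number of ways to order five attributes; that reading is an interpretation of the slide, and the attribute names are placeholders.

```python
import math
from itertools import permutations, product

N_ICONS = 12                                 # assumed: icons in the stick figure family
ATTRIBUTES = ["a1", "a2", "a3", "a4", "a5"]  # hypothetical attribute names

# Every picture pairs one icon with one ordering of the five attributes.
mappings = list(product(range(N_ICONS), permutations(ATTRIBUTES)))
print(len(mappings), N_ICONS * math.factorial(len(ATTRIBUTES)))
```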

83

Stick Figures

The same dataset, different mapping

http://ivpr.cs.uml.edu/gallery/
G. Grinstein, University of Massachusetts at Lowell

84

Stick Figures

Black background White background

http://ivpr.cs.uml.edu/gallery/
G. Grinstein, University of Massachusetts at Lowell

85

More

http://ivpr.cs.uml.edu/gallery/
G. Grinstein, University of Massachusetts at Lowell

86

Shape Coding

Data item - small array of fields
Each field - one attribute value

Arrangement of attribute fields (e.g., 12-dimensional data):

The figure is taken from Dr. D. Keim’s tutorial notes in Infovis 00

87

Shape Coding

Beddow J.: ‘Shape Coding of Multidimensional Data on a Microcomputer Display’, Visualization ’90, 1990, pp. 238-246.

88

Color Icons [Lev 91, KK94]

Data item - color icon (arrays of color fields)
Each field - one attribute value
Arrangement is query-dependent (e.g., spiral)

6-dimensional dataset

The figure is taken from Dr. D. Keim’s tutorial notes in Infovis 00.

89

Color Icons

The figure is taken from Dr. D. Keim’s tutorial notes in Infovis 00

90

Pixel-Oriented Techniques

Basic idea

Each value - one colored pixel (value ranges -> fixed colormap)

Values for each attribute are presented in separate subwindows
Values of the same data item are at the same positions in all subwindows

Keim’s tutorial notes in Infovis 00
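A minimal sketch of the basic idea follows: one grid of pixels per attribute, and data item k occupies the same cell in every grid. The grayscale mapping and the row-by-row ordering are the simplest possible choices and are assumptions here; choosing a better ordering is the subject of the techniques that follow.

```python
def to_gray(value, lo, hi):
    """Map a value range onto a fixed colormap, here 0 (black) to 255 (white)."""
    if hi == lo:
        return 0
    return round(255 * (value - lo) / (hi - lo))

def subwindows(items, width):
    """Build one pixel grid per attribute; item k sits at the same cell in each."""
    n_attrs = len(items[0])
    height = -(-len(items) // width)          # ceiling division
    grids = []
    for a in range(n_attrs):
        values = [item[a] for item in items]
        lo, hi = min(values), max(values)
        grid = [[None] * width for _ in range(height)]   # None marks an unused cell
        for k, value in enumerate(values):
            grid[k // width][k % width] = to_gray(value, lo, hi)
        grids.append(grid)
    return grids

items = [(5.1, 3.5), (4.9, 3.0), (5.9, 3.0), (5.0, 3.4)]
grids = subwindows(items, width=2)
for grid in grids:
    print(grid)
```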

91

Major Challenge

Patterns <- the texture of the subwindows

How to order the pixels to get informative textures?

92

Query-Independent Techniques

Recursive pattern arrangements

The figure is taken from Dr. D. Keim’s tutorial notes in Infovis 00

93

Query-Independent Techniques

Recursive pattern arrangements

The figure is taken from Dr. D. Keim’s tutorial notes in Infovis 00

94

Query-Dependent Techniques

Basic idea:
Data items (a1, a2, ..., am) & query (q1, q2, ..., qm)
Distances (d1, d2, ..., dm)
Extend distances by overall distance (dm+1)
Determine data items with lowest overall distances
Map distances to color (for each attribute)
Visualize each distance value di by one colored pixel

The slide is taken from Dr. D. Keim’s tutorial notes in Infovis 00
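The first steps of this pipeline can be sketched as follows. The per-attribute distance is taken as the absolute difference and the overall distance as their plain sum; both are assumptions (a real system would likely normalize or weight the attributes), and the query values are hypothetical. The mapping of distances to color is not shown.

```python
def attribute_distances(item, query):
    """d_1..d_m: per-attribute distance between a data item and the query."""
    return [abs(a - q) for a, q in zip(item, query)]

def with_overall(distances):
    """Append d_(m+1), here the plain sum of the per-attribute distances."""
    return distances + [sum(distances)]

def rank_by_relevance(items, query):
    """Return (item, distances) pairs, lowest overall distance first."""
    scored = [(item, with_overall(attribute_distances(item, query))) for item in items]
    return sorted(scored, key=lambda pair: pair[1][-1])

items = [(5.1, 3.5, 1.4, 0.2), (4.9, 3.0, 1.4, 0.2), (5.9, 3.0, 5.1, 1.8)]
query = (5.0, 3.0, 1.5, 0.2)   # hypothetical query values
ranked = rank_by_relevance(items, query)
for item, distances in ranked:
    print(item, distances)
```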

95

Query-Dependent Techniques

The figure is taken from Dr. D. Keim’s tutorial notes in Infovis 00

Spiral technique

Arrangement
The m+1 dimension (overall distance)
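A spiral arrangement can be sketched as a generator of grid cells that winds outward from the center, so that items sorted by overall distance place the closest matches in the middle. This is a rectangular spiral written for illustration; the exact pixel layout used in the figures may differ.

```python
def spiral_positions(count):
    """Yield `count` (x, y) grid cells spiralling outward from (0, 0)."""
    x = y = 0
    dx, dy = 1, 0                 # start by moving right
    step_len, steps_done, turns = 1, 0, 0
    for _ in range(count):
        yield (x, y)
        x, y = x + dx, y + dy
        steps_done += 1
        if steps_done == step_len:
            steps_done = 0
            dx, dy = -dy, dx      # rotate the direction by 90 degrees
            turns += 1
            if turns % 2 == 0:    # the run length grows every second turn
                step_len += 1

cells = list(spiral_positions(9))
print(cells)
```

Pairing the k-th cell with the k-th item of a relevance-sorted list gives the position of that item's pixel in every subwindow.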

96

Query-Dependent Techniques

The figure is taken from Dr. D. Keim’s tutorial notes in Infovis 00

Spiral technique

97

Major References

Colin Ware. Information Visualization, 2004
Daniel Keim. Tutorial notes in InfoVis 2000
John Stasko. Course slides, Fall 2005
Edward Tufte:

The Visual Display of Quantitative Information
Envisioning Information
Visual Explanations

98

Homework
1. Read referred papers
2. Download XMDV from http://davis.wpi.edu/~xmdv/downloadxmdv.html (Binary Release, Microsoft Windows, Xmdv 7.0) and play with it.
3. Prepare for the next class by reading papers

The paper list and course slides will be published on my webpage

www.cs.uncc.edu/~jyang13

99
