16.1 Vis_2002 Data Visualization Lecture 14 Information Visualization Part 1

Upload: ross-lane

Post on 17-Jan-2016


Page 1: 16.1 Vis_2002 Data Visualization Lecture 14 Information Visualization Part 1

Data Visualization

Lecture 14: Information Visualization, Part 1

Page 2

What is Visualization?

Generally:
– The use of computer-supported, interactive, visual representations of data to amplify cognition (Card, Mackinlay and Shneiderman)

Two 'branches':
– Scientific Visualization
– Information Visualization

... But first, an experiment

Page 3

The Experiment

You need a watch with a second hand

Without using pencil and paper (or a calculator!!), multiply 72 by 34

How long did it take?

Now you need pencil and paper as well as a watch

Multiply 47 by 54

How long did it take?

Conclusion?

Page 4

Visualization – Twin Subjects

Scientific Visualization
– Visualization of physical data
– [Image: ozone layer around earth]

Information Visualization
– Visualization of abstract data
– [Image: automobile web site, visualizing links]

… but this is only one characterisation

Page 5

Scientific Visualization – Another Characterisation

Focus is on visualizing an entity measured in a multi-dimensional space
– 1D
– 2D
– 3D
– Occasionally nD

Underlying field is recreated from the sampled data

Relationship between variables well understood – some independent, some dependent

http://pacific.commerce.ubc.ca/xr/plot.html

Image from D. Bartz and M. Meissner
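As a minimal sketch of what "recreating the underlying field from the sampled data" can mean in 1D, the following uses piecewise-linear interpolation. The sample positions and values are hypothetical, and linear interpolation is only one possible choice of model, not one named by the lecture.

```python
from bisect import bisect_right

# Hypothetical samples of a 1D field, e.g. a quantity measured along a line.
xs = [0.0, 1.0, 2.0, 3.0]      # sample positions (independent variable, sorted)
fs = [10.0, 12.0, 11.0, 15.0]  # measured values (dependent variable)

def reconstruct(x):
    """Piecewise-linear estimate of the field at x, for x within the sampled range."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the sampled range")
    # Index of the sample just to the right of x (clamped at the last sample).
    i = min(bisect_right(xs, x), len(xs) - 1)
    x0, x1 = xs[i - 1], xs[i]
    t = (x - x0) / (x1 - x0)
    return (1 - t) * fs[i - 1] + t * fs[i]

print(reconstruct(0.5))  # halfway between the first two samples
```

The point of the sketch is that the field can be queried between the samples, which is what distinguishes this setting from a plain table of observations.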

Page 6

Scientific Visualization Model

Visualization represented as pipeline:
– Read in data
– Build model of underlying entity
– Construct a visualization in terms of geometry
– Render geometry as image

Realised as modular visualization environment
– IRIS Explorer
– IBM Open Visualization Data Explorer (DX)
– AVS

data → model → visualize → render
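The four pipeline stages can be sketched as composed functions. This is only an illustration of the dataflow idea: the function names, the sample data and the text "renderer" are all assumptions, and it does not reproduce the module interfaces of IRIS Explorer, DX or AVS.

```python
def read_data():
    # Stand-in for a file reader: hypothetical (position, value) samples.
    return [(1.0, 3.0), (0.0, 1.0), (2.0, 2.0)]

def build_model(samples):
    # Model of the underlying entity: samples ordered by position,
    # to be joined by straight lines.
    return sorted(samples)

def visualize(model):
    # Geometry: one line segment between each pair of consecutive samples.
    return list(zip(model, model[1:]))

def render(segments):
    # Stand-in renderer: describe each segment as text instead of drawing it.
    return [f"line {a} -> {b}" for a, b in segments]

# data -> model -> visualize -> render
image = render(visualize(build_model(read_data())))
for line in image:
    print(line)
```

Each stage only depends on the output of the previous one, which is what lets modular environments swap individual modules in and out.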

Page 7

Information Visualization

Focus is on visualizing a set of observations that are multi-variate

Example of iris data set
– 150 observations of 4 variables (length and width of petal and sepal)
– Techniques aim to display relationships between variables

Page 8

Dataflow for Information Visualization

Again we can express as a dataflow – but emphasis now is on data itself rather than underlying entity

First step is to form the data into a table of observations, each observation being a set of values of the variables

Then we apply a visualization technique as before

data → data table → visualize → render

Data table (rows are observations, columns are variables):

    A   B   C
1   ..  ..  ..
2   ..  ..  ..
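A minimal sketch of the "data table" step, assuming the raw data arrives as one record per observation. The variable names A, B, C follow the slide; the numeric values are made up for illustration.

```python
variables = ["A", "B", "C"]

# Hypothetical raw records, one per observation.
raw_records = [
    {"A": 5.1, "B": 3.5, "C": 1.4},
    {"A": 4.9, "B": 3.0, "C": 1.4},
]

# The data table: one row per observation, one column per variable.
table = [[record[v] for v in variables] for record in raw_records]

def column(name):
    """All observed values of one variable."""
    j = variables.index(name)
    return [row[j] for row in table]

print(table[1])     # second observation
print(column("B"))  # every value of variable B
```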

Page 9

Applications of Information Visualization

Data Collections
– Census data
– Astronomical data
– Bioinformatics data
– Supermarket checkout data
– and so on
– Can relationships be discovered amongst the variables?

Networks of Information
– E-mail traffic
– Web documents
– Hierarchies of information (e.g. filestores)

We shall see that all can be described as data tables

Page 10

Multivariate Visualization

Software:
– Xmdvtool (Matthew Ward)

Techniques designed for any number of variables
– Scatter plot matrices
– Parallel co-ordinates
– Glyph techniques

Acknowledgement: Many of the images in the following slides are taken from Ward's work ... and also IRIS Explorer!

Page 11

Scatter Plot

Simple technique for 2 variables is the scatter plot

This example from NIST shows linear correlation between the variables

www.itl.nist.gov/div898/handbook/eda/section3/scatterp.htm
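The linear correlation that a scatter plot shows visually can also be quantified. The sketch below computes the Pearson correlation coefficient, which the slide does not name but which is the usual measure; the data values are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data lying exactly on a straight line.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
print(pearson(xs, ys))
```

Values near +1 or -1 correspond to points falling close to a rising or falling straight line in the plot; values near 0 indicate no linear relationship.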

Page 12

3D Scatter Plots

There has been some success at extending concept to 3D for visualizing 3 variables

XRT/3d

Page 13

Extending to Higher Numbers of Variables

Additional variables can be visualized by colour and shape coding

IRIS Explorer used to visualize data from BMW

– Five variables displayed using spatial arrangement for three, colour and object type for others

– Notice the clusters…

Kraus & Ertl
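A minimal sketch of colour and shape coding for five variables: three give the position, the fourth is mapped to colour and the fifth to object type. The colour ramp, the shape names and the data are assumptions for illustration and are not taken from the BMW example.

```python
SHAPES = ["sphere", "cube", "cone"]  # hypothetical object types

def encode(obs, v4_range):
    """Map a 5-variable observation to position, colour and shape."""
    x, y, z, v4, v5 = obs
    lo, hi = v4_range
    t = (v4 - lo) / (hi - lo)    # normalise the fourth variable to [0, 1]
    colour = (t, 0.0, 1.0 - t)   # RGB ramp from blue (low) to red (high)
    shape = SHAPES[v5]           # fifth variable: small integer category
    return {"position": (x, y, z), "colour": colour, "shape": shape}

print(encode((1, 2, 3, 10.0, 2), (0.0, 10.0)))
```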

Page 14

IRIS Explorer 3D Scatter Plots

Try this….

Thanks to: http://www.mpa-garching.mpg.de/MPA-GRAPHICS/scatter3d.html

Page 15

Scatter Plots for M variables

For table data of M variables, we can look at pairs in 2D scatter plots

The pairs can be juxtaposed:

[Figure: matrix of scatter plots, with variables A, B and C labelling the rows and columns]

With luck, you may spot correlations between pairs as linear structures.
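The panels of a scatter plot matrix can be generated by enumerating the pairs of variables. This sketch only builds the point lists for each panel (drawing is left out), and the table values are hypothetical. For M variables there are M(M-1)/2 distinct pairs; a full matrix shows each pair twice, once with the axes swapped.

```python
from itertools import combinations

def scatter_panels(variables, table):
    """Return, for each pair of variables, the 2D points of its scatter plot."""
    panels = {}
    for i, j in combinations(range(len(variables)), 2):
        panels[(variables[i], variables[j])] = [(row[i], row[j]) for row in table]
    return panels

variables = ["A", "B", "C"]
table = [[1, 2, 3], [4, 5, 6]]  # hypothetical observations
panels = scatter_panels(variables, table)
print(sorted(panels))
```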

Page 16

Scatter Plot

Data represents 7 aspects of cars: what relationships can we notice?

For example, what correlates with high MPG?

Pictures from Xmdvtool developed by Matthew Ward: davis.wpi.edu/~xmdv

Page 17

Parallel Coordinates: Visualizing M variables on one chart

A B C D E F

- create M equidistant vertical axes, each corresponding to a variable
- each axis scaled to [min, max] range of the variable
- each observation corresponds to a line drawn through point on each axis corresponding to value of the variable
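The three construction steps above can be sketched as follows. The function returns the vertices of one polyline per observation rather than drawing anything; the `width` and `height` parameters and the treatment of a constant variable (placed mid-axis) are assumptions of this sketch.

```python
def parallel_coordinates(table, width=500.0, height=100.0):
    """Return one polyline (list of (x, y) points) per observation.

    Requires at least two variables (columns).
    """
    m = len(table[0])
    columns = list(zip(*table))
    lo = [min(c) for c in columns]
    hi = [max(c) for c in columns]
    # M equidistant vertical axes across the chart width.
    axis_x = [j * width / (m - 1) for j in range(m)]
    polylines = []
    for row in table:
        points = []
        for j, value in enumerate(row):
            span = hi[j] - lo[j]
            # Scale each axis to the [min, max] range of its variable.
            t = (value - lo[j]) / span if span else 0.5
            points.append((axis_x[j], t * height))
        polylines.append(points)
    return polylines

lines = parallel_coordinates([[1, 10, 5], [3, 20, 5]], width=100.0, height=10.0)
print(lines)
```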

Page 18

Parallel Coordinates

A B C D E F

- correlations may start to appear as the observations are plotted on the chart
- here there appears to be negative correlation between values of A and B for example
- this has been used for applications with thousands of data items

Page 19

Parallel Coordinates Example

Detroit homicide data
– 7 variables
– 13 observations

Page 20

The Screen Space Problem

All techniques, sooner or later, run out of screen space

Parallel co-ordinates

– Usable for up to 150 variates

– Unworkable greater than 250 variates

(Remote sensing: 5 variates, 16,384 observations)

Page 21

Brushing as a Solution

Brushing selects a restricted range of one or more variables

Selection then highlighted
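A minimal sketch of brushing as a selection over the data table: each brushed variable is given an inclusive range, and an observation is selected when it lies inside every range. The car data and thresholds are hypothetical, and this is not the Xmdvtool implementation.

```python
def brush(table, variables, ranges):
    """Return the row indices whose values fall inside every brushed range.

    ranges maps a variable name to an inclusive (low, high) interval.
    """
    cols = {name: variables.index(name) for name in ranges}
    return [
        i for i, row in enumerate(table)
        if all(lo <= row[cols[name]] <= hi for name, (lo, hi) in ranges.items())
    ]

variables = ["MPG", "Weight"]
table = [[38.0, 1800], [18.0, 4200], [33.0, 2100], [36.0, 2000]]  # hypothetical cars

selected = brush(table, variables, {"MPG": (35.0, 50.0)})
print(selected)  # indices of the observations to highlight
```

The selected indices are then drawn in a highlight colour in every linked view, while the remaining observations stay in the background colour.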

Page 22

Scatter Plot

Use of a 'brushing' tool can highlight subsets of data

... now we can see what correlates with high MPG

Page 23

Parallel Coordinates

Brushing picks out the high MPG data

Can you observe the same relations as with scatter plots?

More or less easy?

Page 24

Parallel Coordinates

Here we highlight high MPG and not 4 cylinders
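A combined brush such as "high MPG and not 4 cylinders" can be thought of as set operations on simpler selections. The records and the MPG threshold below are hypothetical.

```python
# Hypothetical car records: (name, mpg, cylinders).
cars = [("a", 38.0, 4), ("b", 36.0, 6), ("c", 18.0, 8), ("d", 35.5, 5)]

high_mpg = {name for name, mpg, _ in cars if mpg >= 35.0}  # threshold is an assumption
four_cyl = {name for name, _, cyl in cars if cyl == 4}

# high MPG AND NOT 4 cylinders
highlighted = high_mpg - four_cyl
print(sorted(highlighted))
```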

Page 25

Clustering as a Solution

Success has been achieved through clustering of observations

Hierarchical parallel co-ordinates

– Cluster by similarity

– Display using translucency and proximity-based colour
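As a rough sketch of "cluster by similarity", the following merges the two closest clusters repeatedly until a requested number remains. Euclidean distance and single linkage are assumptions of this sketch; the slide does not say which similarity measure or linkage hierarchical parallel co-ordinates use, and the display step (translucency, colour) is not covered.

```python
from math import dist

def agglomerate(points, k):
    """Merge the two closest clusters (single linkage) until k clusters remain.

    Returns clusters as lists of indices into points.
    """
    clusters = [[i] for i in range(len(points))]

    def gap(a, b):
        # Single linkage: distance between the closest members of two clusters.
        return min(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > k:
        pairs = [
            (gap(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        ]
        _, a, b = min(pairs)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

points = [(0, 0), (0, 1), (10, 10), (10, 11)]  # hypothetical observations
print(agglomerate(points, 2))
```

Recording the order of the merges, rather than stopping at a fixed k, would give the hierarchy that lets a viewer choose the level of detail.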

Page 26

Hierarchical Parallel Co-ordinates

Page 27

Reduction of Dimensionality of Variable Space

Reduce number of variables, preserve information

Principal Component Analysis
– Transform to new co-ordinate system
– Hard to interpret

Hierarchical reduction of variable space
– Cluster variables where distance between observations is typically small
– Choose representative for each cluster
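To make "transform to new co-ordinate system" concrete, here is a sketch of Principal Component Analysis restricted to two variables, where the leading axis has a closed form. It reduces each 2D observation to a single co-ordinate along that axis. The data is hypothetical, and a general implementation for M variables would use an eigen-decomposition of the covariance matrix instead.

```python
from math import atan2, cos, sin

def first_principal_axis(points):
    """Return the mean and the unit direction of greatest variance of 2D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    # Angle of the leading eigenvector of the 2x2 covariance matrix.
    theta = 0.5 * atan2(2 * sxy, sxx - syy)
    return (mx, my), (cos(theta), sin(theta))

def project(points):
    """Reduce each 2D observation to one co-ordinate along the principal axis."""
    (mx, my), (ux, uy) = first_principal_axis(points)
    return [(x - mx) * ux + (y - my) * uy for x, y in points]

pts = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]  # hypothetical, perfectly correlated
coords = project(pts)
print(coords)
```

The new co-ordinate is a mixture of both original variables, which illustrates why the slide calls the result hard to interpret.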

Page 28

Glyph Techniques – Star Plots

Star plots
– Each observation represented as a 'star'
– Each spike represents a variable
– Length of spike indicates the value

Crime in Detroit
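The geometry of a single star can be sketched as below: one spike per variable, evenly spaced in angle, with length proportional to the normalised value. The `radius` parameter and the sample values are assumptions of this sketch.

```python
from math import cos, sin, pi

def star_points(values, lo, hi, radius=1.0):
    """Return the spike end points of the star glyph for one observation.

    lo and hi give the minimum and maximum of each variable over the data set.
    """
    m = len(values)
    points = []
    for j, v in enumerate(values):
        t = (v - lo[j]) / (hi[j] - lo[j])  # normalised value in [0, 1]
        angle = 2 * pi * j / m             # spikes evenly spaced round the circle
        points.append((t * radius * cos(angle), t * radius * sin(angle)))
    return points

# Hypothetical observation of four variables, each ranging over [0, 10].
pts = star_points([10, 5, 0, 5], [0] * 4, [10] * 4)
print(pts)
```

Joining consecutive end points gives the outline of the star; drawing a line from the centre to each end point gives the spikes.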

Page 29

Chernoff Faces

Chernoff suggested use of faces to encode a variety of variables
- can map to size, shape, colour of facial features
- human brain rapidly recognises faces
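The mapping from variables to facial features can be sketched as a simple assignment of normalised values. The feature names here are hypothetical and chosen only for illustration; drawing the face is not covered.

```python
FEATURES = ("face_width", "eye_size", "mouth_curve")  # hypothetical feature names

def face_for(obs, lows, highs):
    """Normalise each variable to [0, 1] and assign it to one facial feature."""
    return {
        feature: (v - lo) / (hi - lo)
        for feature, v, lo, hi in zip(FEATURES, obs, lows, highs)
    }

print(face_for((5, 0, 10), (0, 0, 0), (10, 10, 10)))
```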

Page 30

Chernoff Faces

Here are some of the facial features you can use

http://www.bradandkathy.com/software/faces.html#chernoff

Page 31

Chernoff Faces

Demonstration applet at:
– http://www.hesketh.com/schampeo/projects/Faces/

Page 32

Chernoff's Face

.. And here is Chernoff’s face

http://www.fas.harvard.edu/~stats/Chernoff/Hcindex.htm

Page 33

Daisy Charts

[Figure: daisy chart with the values Dry, Wet, Showery; Saturday, Sunday; Leeds, Sahara, Amazon placed around a circle]

– variables and their values placed around circle
– lines connect the values for one observation

This item is { wet, Saturday, Amazon }

http://www.daisy.co.uk
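A rough sketch of the daisy chart layout: every value of every variable gets a position on a circle, and an observation is drawn as lines connecting its values. The variable names (weather, day, place) and the even spacing are assumptions; the values are those shown on the slide. This is not the Daisy product's own algorithm.

```python
from math import cos, sin, pi

# Values from the slide; the variable names are assumptions.
AXES = {
    "weather": ["Dry", "Wet", "Showery"],
    "day": ["Saturday", "Sunday"],
    "place": ["Leeds", "Sahara", "Amazon"],
}

def layout(axes):
    """Place every (variable, value) pair evenly around the unit circle."""
    labels = [(var, val) for var, vals in axes.items() for val in vals]
    n = len(labels)
    return {
        label: (cos(2 * pi * k / n), sin(2 * pi * k / n))
        for k, label in enumerate(labels)
    }

def item_polygon(item, positions):
    """Points to connect, in order, for one observation."""
    return [positions[(var, val)] for var, val in item.items()]

positions = layout(AXES)
item = {"weather": "Wet", "day": "Saturday", "place": "Amazon"}
polygon = item_polygon(item, positions)
print(polygon)
```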

Page 34

Daisy Charts - Underground Problems

Page 35

Scientific Visualization – Information Visualization

Scientific Visualization
– Focus is on visualizing an entity measured in a multi-dimensional space
– Underlying field is recreated from the sampled data
– Relationship between variables well understood

Information Visualization
– Focus is on visualizing a set of observations that are multi-variate
– There is no underlying field – it is the data itself we want to visualize
– The relationship between variables is not well understood

Page 36

Further Reading

Information Visualization
– Robert Spence
– published 2000 by Addison Wesley

See also resources section of the module web site