13.1 vis_04 data visualization lecture 13 information visualization part 1
13.1Vis_04
Data Visualization
Lecture 13: Information Visualization
Part 1
13.2Vis_04
What is Visualization?
Generally:
– The use of computer-supported, interactive, visual representations of data to amplify cognition (Card, Mackinlay and Shneiderman)
Two ‘branches’:
– Scientific Visualization
– Information Visualization
... But first, an experiment
13.3Vis_04
The Experiment
You need a watch with a second hand
Without using pencil and paper (or a calculator!), multiply 72 by 34
How long did it take?
Now you need pencil and paper as well as the watch
Multiply 47 by 54
How long did it take?
Conclusion?
13.4Vis_04
Visualization – Twin Subjects
Scientific Visualization
– Visualization of physical data
Information Visualization
– Visualization of abstract data
[Figure: Ozone layer around the Earth]
[Figure: Automobile web site, visualizing links]
… but this is only one characterisation
13.5Vis_04
Scientific Visualization – Another Characterisation
Focus is on visualizing an entity measured in a multi-dimensional space
– 1D
– 2D
– 3D
– Occasionally nD
Underlying field is recreated from the sampled data
Relationship between variables well understood: some independent, some dependent
Image from D. Bartz and M. Meissner
13.6Vis_04
Scientific Visualization Model
Visualization represented as a pipeline:
– Read in data
– Build model of underlying entity
– Construct a visualization in terms of geometry
– Render geometry as image
Realised as modular visualization environment
– IRIS Explorer
– IBM Open Visualization Data Explorer (DX)
– AVS
data → model → visualize → render
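The four pipeline stages can be read as a chain of functions, each consuming the output of the previous stage. The sketch below is purely illustrative: the function names, the tiny 1D sample set and the text "image" are all assumptions made for this example, and none of it reflects the actual APIs of IRIS Explorer, DX or AVS.

```python
# Hypothetical pipeline: data -> model -> visualize -> render.
# All names and values here are illustrative assumptions.

def read_data():
    """Read in data: (position, value) samples of a 1D field."""
    return [(0.0, 1.0), (1.0, 3.0), (2.0, 2.0)]

def build_model(samples):
    """Build a model of the underlying entity by linear interpolation."""
    def field(x):
        for (x0, y0), (x1, y1) in zip(samples, samples[1:]):
            if x0 <= x <= x1:
                t = (x - x0) / (x1 - x0)
                return y0 + t * (y1 - y0)
        raise ValueError("x lies outside the sampled range")
    return field

def visualize(field, xs):
    """Construct the visualization as geometry: a list of (x, y) vertices."""
    return [(x, field(x)) for x in xs]

def render(polyline):
    """Render the geometry as a crude text image, one bar per vertex."""
    return ["*" * int(2 * y) for _, y in polyline]

image = render(visualize(build_model(read_data()), [0.0, 0.5, 1.0, 1.5, 2.0]))
print("\n".join(image))
```

Note that the model stage recreates values between the samples (at 0.5 and 1.5), which is the sense in which the underlying field is rebuilt from sampled data.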
13.7Vis_04
Information Visualization
Focus is on visualizing a set of observations that are multivariate
Example: the iris data set
– 150 observations of 4 variables (length and width of petal and sepal)
– Techniques aim to display relationships between variables
13.8Vis_04
Dataflow for Information Visualization
Again we can express as a dataflow – but emphasis now is on data itself rather than underlying entity
First step is to form the data into a table of observations, each observation being a set of values of the variables
Then we apply a visualization technique as before
data → data table → visualize → render
Data table (columns are variables, rows are observations):
     A   B   C
1    ..  ..  ..
2    ..  ..  ..
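A data table of this kind is easy to hold in code: one row per observation, one column per variable. The following is a minimal sketch; the variable names A, B, C follow the slide, while the numeric values and the helper `column` are made up for illustration.

```python
# One row per observation, one column per variable.
# The values are placeholders, not real measurements.
variables = ["A", "B", "C"]
observations = [
    (5.1, 3.5, 1.4),
    (4.9, 3.0, 1.4),
    (6.3, 3.3, 6.0),
]

def column(table, names, name):
    """Return the values of one variable across all observations."""
    i = names.index(name)
    return [row[i] for row in table]

print(column(observations, variables, "B"))
```

Most visualization techniques work column-wise (one axis per variable), so a helper that pulls out a single variable is usually the first thing needed.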
13.9Vis_04
Applications of Information Visualization
Data Collections
– Census data
– Astronomical data
– Bioinformatics data
– Supermarket checkout data
– and so on
– Can relationships be discovered amongst the variables?
Networks of Information
– E-mail traffic
– Web documents
– Hierarchies of information (e.g. filestores)
We shall see that all can be described as data tables
13.10Vis_04
Multivariate Visualization
Software:
– Xmdvtool (Matthew Ward)
Techniques designed for any number of variables
– Scatter plot matrices
– Parallel co-ordinates
– Glyph techniques
Acknowledgement: Many of the images in the following slides are taken from Ward’s work ... and also IRIS Explorer!
13.11Vis_04
Scatter Plot
Simple technique for 2 variables is the scatter plot
This example from NIST shows linear correlation between the variables
www.itl.nist.gov/div898/handbook/eda/section3/scatterp.htm
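What the eye picks out as a linear structure in a scatter plot can also be quantified. One common measure (an addition here, not something the slides specify) is the Pearson correlation coefficient, sketched below with made-up data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data lying exactly on a rising and a falling straight line.
rising = pearson([1, 2, 3, 4], [2, 4, 6, 8])
falling = pearson([1, 2, 3, 4], [8, 6, 4, 2])
print(rising, falling)
```

Values near +1 or -1 correspond to points close to a straight line; values near 0 indicate no linear relationship, although other structure may still be visible in the plot.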
13.12Vis_04
3D Scatter Plots
There has been some success at extending the concept to 3D for visualizing 3 variables
XRT/3d
http://www.ist.co.uk/XRT/xrt3d.html
13.13Vis_04
Extending to Higher Numbers of Variables
Additional variables can be visualized by colour and shape coding
IRIS Explorer used to visualize data from BMW
– Five variables displayed using spatial arrangement for three, colour and object type for others
– Notice the clusters…
Kraus & Ertl
13.14Vis_04
IRIS Explorer 3D Scatter Plots
Try this….
Thanks to: http://www.mpa-garching.mpg.de/MPA-GRAPHICS/scatter3d.html
13.15Vis_04
Scatter Plots for M variables
For table data of M variables, we can look at pairs in 2D scatter plots
The pairs can be juxtaposed:
[Figure: 3 × 3 scatter plot matrix with variables A, B, C on both axes]
With luck, you may spot correlations between pairs as linear structures.
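The panels of a scatter plot matrix are simply the pairs of variables. A short sketch, assuming variables are identified by name, shows how quickly the number of distinct pairs grows: M variables give M(M-1)/2 pairs.

```python
from itertools import combinations

def scatter_pairs(names):
    """List every distinct pair of variables, one scatter plot per pair."""
    return list(combinations(names, 2))

pairs = scatter_pairs(["A", "B", "C"])
print(pairs)

# A table with 7 variables needs 7 * 6 / 2 distinct panels.
print(len(scatter_pairs(list("ABCDEFG"))))
```

A full matrix shows each pair twice, once in each orientation, so it has even more panels than this count.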
13.16Vis_04
Scatter Plot
Data represents 7 aspects of cars: what relationships can we notice?
For example, what correlates with high MPG?
Pictures from Xmdvtool, developed by Matthew Ward: davis.wpi.edu/~xmdv
13.17Vis_04
Parallel Coordinates: Visualizing M variables on one chart
[Figure: six parallel vertical axes labelled A B C D E F]
- create M equidistant vertical axes, each corresponding to a variable
- each axis scaled to [min, max] range of the variable
- each observation corresponds to a line drawn through the point on each axis corresponding to the value of the variable
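The three construction steps can be sketched directly: one axis index per variable, each value rescaled to the [min, max] range of its variable, and one polyline per observation. The function name, the (axis, height) output format and the sample table are assumptions for illustration; a real tool would go on to draw the polylines.

```python
def parallel_coordinates(table):
    """Map each observation to a polyline of (axis_index, height) points.

    Heights are scaled so that the minimum of each variable maps to 0.0
    and the maximum to 1.0.
    """
    columns = list(zip(*table))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    polylines = []
    for row in table:
        line = []
        for i, value in enumerate(row):
            span = highs[i] - lows[i]
            # A constant variable has no range; place it mid-axis.
            height = (value - lows[i]) / span if span else 0.5
            line.append((i, height))
        polylines.append(line)
    return polylines

# Made-up table: 3 observations of 2 variables.
lines = parallel_coordinates([(1, 10), (3, 30), (2, 20)])
print(lines)
```

Scaling each axis independently is what lets variables with very different units share one chart.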
13.18Vis_04
Parallel Coordinates
[Figure: parallel axes labelled A B C D E F]
- correlations may start to appear as the observations are plotted on the chart
- here there appears to be negative correlation between values of A and B, for example
- this has been used for applications with thousands of data items
13.19Vis_04
Parallel Coordinates Example
Detroit homicide data
– 7 variables
– 13 observations, 1961-1973
13.20Vis_04
The Screen Space Problem
All techniques, sooner or later, run out of screen space
Parallel co-ordinates
– Usable for up to 150 variates
– Unworkable greater than 250 variates
Example: remote sensing data (5 variates, 16,384 observations)
13.21Vis_04
Brushing as a Solution
Brushing selects a restricted range of one or more variables
The selection is then highlighted
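In code, a brush amounts to a range test per variable, producing a flag for each observation that the display can use for highlighting. The sketch below is a minimal illustration; the function name, the `ranges` dictionary format and the car-like values are assumptions, not Xmdvtool's interface.

```python
def brush(table, names, ranges):
    """Flag each observation that lies inside every brushed range.

    ranges maps a variable name to an inclusive (low, high) interval.
    """
    flags = []
    for row in table:
        inside = all(
            low <= row[names.index(var)] <= high
            for var, (low, high) in ranges.items()
        )
        flags.append(inside)
    return flags

# Made-up observations: (mpg, cylinders).
names = ["mpg", "cylinders"]
table = [(35, 4), (18, 8), (31, 4), (33, 6)]

high_mpg = brush(table, names, {"mpg": (30, 50)})
high_mpg_six = brush(table, names, {"mpg": (30, 50), "cylinders": (6, 6)})
print(high_mpg, high_mpg_six)
```

Because the result is only a list of flags, the same brush can highlight the selected observations in several linked views at once.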
13.22Vis_04
Scatter Plot
Use of a ‘brushing’ tool can highlight subsets of data
... now we can see what correlates with high MPG
13.23Vis_04
Parallel Coordinates
Brushing picks out the high MPG data
Can you observe the same relations as with scatter plots?
More or less easy?
13.24Vis_04
Parallel Coordinates
Here we highlight high MPG and not 4 cylinders
13.25Vis_04
Clustering as a Solution
Success has been achieved through clustering of observations
Hierarchical parallel co-ordinates
– Cluster by similarity
– Display using translucency and proximity-based colour
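Clustering by similarity can be sketched as a bottom-up (agglomerative) merge: start with every observation in its own cluster and repeatedly join the two closest clusters. The slides do not say which similarity measure or linkage is used, so Euclidean distance and single linkage are assumptions here, as are the function name and the data.

```python
import math

def agglomerate(points, k):
    """Merge the two closest clusters repeatedly until k clusters remain.

    Closeness is the smallest Euclidean distance between any two members
    (single linkage), an assumption made for this sketch.
    """
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(math.dist(a, b)
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Made-up observations forming two obvious groups.
groups = agglomerate([(0, 0), (0, 1), (10, 10), (10, 11)], k=2)
print(groups)
```

Recording the order of the merges gives the hierarchy; each cluster can then be drawn as a single band instead of many overlapping lines.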
13.26Vis_04
Comparison
One of 3 clusters
13.27Vis_04
Hierarchical Parallel Co-ordinates
13.28Vis_04
Reduction of Dimensionality of Variable Space
Reduce number of variables, preserve information
Principal Component Analysis
– Transform to new co-ordinate system
– Hard to interpret
Hierarchical reduction of variable space
– Cluster variables where distance between observations is typically small
– Choose representative for each cluster
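The "new co-ordinate system" of Principal Component Analysis is built from the directions of greatest variance in the data. As a rough sketch, the first such direction can be found by power iteration on the covariance matrix. The function name, the iteration count and the data are assumptions for illustration, and a numerical library would normally be used instead.

```python
import math

def first_principal_component(table, iterations=200):
    """Estimate the direction of greatest variance by power iteration.

    Assumes the start vector (all ones) is not orthogonal to that
    direction; otherwise the iteration fails.
    """
    n = len(table)
    m = len(table[0])
    means = [sum(row[j] for row in table) / n for j in range(m)]
    centred = [[row[j] - means[j] for j in range(m)] for row in table]
    # Sample covariance matrix of the variables.
    cov = [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(m)]
           for a in range(m)]
    v = [1.0] * m
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Made-up data in which the second variable is twice the first.
direction = first_principal_component([(1, 2), (2, 4), (3, 6)])
print(direction)
```

The resulting axis mixes the original variables (here roughly one part of the first to two parts of the second), which is why the transformed co-ordinates can be hard to interpret.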
13.29Vis_04
Further Reading
Information Visualization
– Robert Spence
– published 2000 by Addison Wesley
See also resources section of the module web site