16.1 vis_2002 data visualization lecture 14 information visualization part 1
Post on 17-Jan-2016
16.1Vis_2002
Data Visualization
Lecture 14: Information Visualization, Part 1
16.2Vis_2002
What is Visualization?
Generally:
– The use of computer-supported, interactive, visual representations of data to amplify cognition (Card, Mackinlay and Shneiderman)
Two ‘branches’:
– Scientific Visualization
– Information Visualization
… But first, an experiment
16.3Vis_2002
The Experiment
You need a watch with a second hand
Without using pencil and paper (or a calculator!!), multiply 72 by 34
How long did it take?
Now you need pencil and paper as well as the watch
Multiply 47 by 54
How long did it take?
Conclusion?
16.4Vis_2002
Visualization – Twin Subjects
Scientific Visualization
– Visualization of physical data
– Example image: ozone layer around the Earth
Information Visualization
– Visualization of abstract data
– Example image: automobile web site, visualizing links
… but this is only one characterisation
16.5Vis_2002
Scientific Visualization – Another Characterisation
Focus is on visualizing an entity measured in a multi-dimensional space
– 1D
– 2D
– 3D
– Occasionally nD
Underlying field is recreated from the sampled data
Relationship between variables is well understood – some independent, some dependent
http://pacific.commerce.ubc.ca/xr/plot.html
Image from D. Bartz and M. Meissner
16.6Vis_2002
Scientific Visualization Model
Visualization represented as a pipeline:
– Read in data
– Build model of underlying entity
– Construct a visualization in terms of geometry
– Render geometry as image
Realised as a modular visualization environment
– IRIS Explorer
– IBM Open Visualization Data Explorer (DX)
– AVS
data → model → visualize → render
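The four pipeline stages can be sketched as plain functions chained together. This is only an illustration of the dataflow idea, not how IRIS Explorer, DX or AVS work internally; the function names, the 1D sample data and the choice of linear interpolation as the model are all assumptions made for the example.

```python
# Hypothetical sketch of the data -> model -> visualize -> render pipeline.
# All names and the sample data are illustrative assumptions.

def read_data():
    # Sampled values of a 1D field: (position, value) pairs.
    return [(0.0, 1.0), (1.0, 3.0), (2.0, 2.0)]

def build_model(samples):
    # Recreate the underlying field by linear interpolation between samples.
    def field(x):
        for (x0, y0), (x1, y1) in zip(samples, samples[1:]):
            if x0 <= x <= x1:
                t = (x - x0) / (x1 - x0)
                return y0 + t * (y1 - y0)
        raise ValueError("x is outside the sampled range")
    return field

def visualize(field, x_min, x_max, steps):
    # Construct geometry: a polyline of points along the modelled field.
    dx = (x_max - x_min) / steps
    return [(x_min + i * dx, field(x_min + i * dx)) for i in range(steps + 1)]

def render(polyline):
    # "Render" the geometry as text, one line per vertex.
    return ["x={:.1f} y={:.2f}".format(x, y) for x, y in polyline]

image = render(visualize(build_model(read_data()), 0.0, 2.0, 4))
```

Each stage consumes only the output of the previous one, which is what lets a modular environment swap one module (say, a different model) without touching the others.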
16.7Vis_2002
Information Visualization
Focus is on visualizing a set of observations that are multi-variate
Example: the iris data set
– 150 observations of 4 variables (length and width of petal and sepal)
– Techniques aim to display relationships between variables
16.8Vis_2002
Dataflow for Information Visualization
Again we can express this as a dataflow – but the emphasis now is on the data itself rather than an underlying entity
First step is to form the data into a table of observations, each observation being a set of values of the variables
Then we apply a visualization technique as before
data → data table → visualize → render
Data table (columns are variables, rows are observations):
     A   B   C
1    ..  ..  ..
2    ..  ..  ..
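A minimal sketch of the first step, forming raw records into a table of observations. The variable names A, B, C follow the slide; the numeric values and the helper names to_table and column are made up for illustration.

```python
# Illustrative only: the values and helper names are assumptions.
raw_records = [
    {"A": 5.1, "B": 3.5, "C": 1.4},
    {"A": 4.9, "B": 3.0, "C": 1.4},
]

def to_table(records):
    # Columns are the variables; each row is one observation.
    variables = sorted(records[0])
    rows = [[rec[v] for v in variables] for rec in records]
    return variables, rows

def column(variables, rows, name):
    # Extract all observed values of a single variable.
    j = variables.index(name)
    return [row[j] for row in rows]

variables, rows = to_table(raw_records)
```

Here column(variables, rows, "B") gives every observation's value of B, which is the form most of the following techniques start from.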
16.9Vis_2002
Applications of Information Visualization
Data Collections
– Census data
– Astronomical data
– Bioinformatics data
– Supermarket checkout data
– and so on
– Can relationships be discovered amongst the variables?
Networks of Information
– E-mail traffic
– Web documents
– Hierarchies of information (e.g. filestores)
We shall see that all can be described as data tables
16.10Vis_2002
Multivariate Visualization
Software:
– Xmdvtool (Matthew Ward)
Techniques designed for any number of variables
– Scatter plot matrices
– Parallel co-ordinates
– Glyph techniques
Acknowledgement: many of the images in the following slides are taken from Ward’s work .. and also IRIS Explorer!
16.11Vis_2002
Scatter Plot
Simple technique for 2 variables is the scatter plot
This example from NIST shows linear correlation between the variables
www.itl.nist.gov/div898/handbook/eda/section3/scatterp.htm
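The linear correlation that a scatter plot shows visually can also be quantified. The sketch below computes the Pearson correlation coefficient, which is +1 for a perfect increasing linear relation and -1 for a perfect decreasing one. The sample values are invented for illustration and are not the NIST data.

```python
import math

def pearson_r(xs, ys):
    # Pearson correlation coefficient of two equal-length value lists.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up data lying close to a straight line.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
r = pearson_r(xs, ys)
```

A value of r near zero does not rule out a relationship; it only says there is no linear one, which is one reason to look at the plot as well.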
16.12Vis_2002
3D Scatter Plots
There has been some success at extending the concept to 3D for visualizing 3 variables
XRT/3d
16.13Vis_2002
Extending to Higher Numbers of Variables
Additional variables can be visualized by colour and shape coding
IRIS Explorer used to visualize data from BMW
– Five variables displayed using spatial arrangement for three, colour and object type for others
– Notice the clusters…
Kraus & Ertl
16.14Vis_2002
IRIS Explorer 3D Scatter Plots
Try this….
Thanks to: http://www.mpa-garching.mpg.de/MPA-GRAPHICS/scatter3d.html
16.15Vis_2002
Scatter Plots for M variables
For table data of M variables, we can look at pairs in 2D scatter plots
The pairs can be juxtaposed:
[Figure: matrix of scatter plots, with variables A, B, C labelling both the rows and the columns]
With luck, you may spot correlations between pairs as linear structures.
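Juxtaposing the pairs amounts to enumerating every pair of variables; for M variables there are M(M-1)/2 distinct pairs (a full matrix shows each pair twice, with the axes swapped). A minimal sketch, with a made-up three-variable table and a hypothetical helper name scatter_pairs:

```python
from itertools import combinations

def scatter_pairs(table):
    # table maps each variable name to its list of observed values.
    # Returns the 2D point set for every distinct pair of variables.
    panels = {}
    for a, b in combinations(sorted(table), 2):
        panels[(a, b)] = list(zip(table[a], table[b]))
    return panels

# Illustrative data: B is a linear function of A, C is not.
table = {"A": [1, 2, 3], "B": [2, 4, 6], "C": [9, 7, 8]}
panels = scatter_pairs(table)
```

The number of panels grows quadratically with M, which is part of why screen space becomes a problem for larger tables.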
16.16Vis_2002
Scatter Plot
Data represents 7 aspects of cars: what relationships can we notice?
For example, what correlates with high MPG?
Pictures from Xmdvtool, developed by Matthew Ward: davis.wpi.edu/~xmdv
16.17Vis_2002
Parallel Coordinates: Visualizing M variables on one chart
[Figure: parallel vertical axes labelled A B C D E F]
– Create M equidistant vertical axes, each corresponding to a variable
– Each axis is scaled to the [min, max] range of its variable
– Each observation corresponds to a line drawn through the point on each axis corresponding to the value of that variable
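The construction above can be sketched as follows: each axis is scaled to the [min, max] range of its variable, and each observation becomes a polyline with one vertex per axis. The function name and sample rows are assumptions for illustration; placing a constant variable at mid-axis is a choice made here, not something the slides specify.

```python
def parallel_coordinates(rows):
    # rows: list of observations, each a list of M numeric values.
    m = len(rows[0])
    lows = [min(row[j] for row in rows) for j in range(m)]
    highs = [max(row[j] for row in rows) for j in range(m)]
    polylines = []
    for row in rows:
        line = []
        for j, value in enumerate(row):
            span = highs[j] - lows[j]
            # Scale to [0, 1] on this axis; a constant variable sits mid-axis.
            y = (value - lows[j]) / span if span else 0.5
            line.append((j, y))  # axis j is drawn at horizontal position j
        polylines.append(line)
    return polylines

# Made-up data: the second variable falls as the first rises.
rows = [[1.0, 30.0, 5.0], [2.0, 20.0, 5.0], [3.0, 10.0, 5.0]]
lines = parallel_coordinates(rows)
```

In this example the lines between the first two axes cross one another, which is the visual signature of negative correlation between adjacent variables.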
16.18Vis_2002
Parallel Coordinates
[Figure: parallel vertical axes labelled A B C D E F, with observations plotted]
– Correlations may start to appear as the observations are plotted on the chart
– Here there appears to be negative correlation between values of A and B, for example
– This has been used for applications with thousands of data items
16.19Vis_2002
Parallel Coordinates Example
Detroit homicide data: 7 variables, 13 observations
16.20Vis_2002
The Screen Space Problem
All techniques, sooner or later, run out of screen space
Parallel co-ordinates
– Usable for up to 150 variates
– Unworkable beyond 250 variates
(Image: remote sensing data, 5 variates, 16,384 observations)
16.21Vis_2002
Brushing as a Solution
Brushing selects a restricted range of one or more variables
The selection is then highlighted
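In data terms, a brush is a set of ranges, and the selection is the set of observations that fall inside all of them. The sketch below is a minimal illustration; the function name, the variable names and the car values are made up, not taken from Xmdvtool or its data set.

```python
def brush(rows, ranges):
    # rows: list of dicts, one per observation.
    # ranges: {variable: (low, high)}; an observation is selected only if
    # it falls inside the range of every brushed variable.
    selected = []
    for i, row in enumerate(rows):
        if all(low <= row[var] <= high for var, (low, high) in ranges.items()):
            selected.append(i)
    return selected

# Illustrative observations only.
cars = [
    {"mpg": 36.0, "cylinders": 4},
    {"mpg": 15.0, "cylinders": 8},
    {"mpg": 31.0, "cylinders": 5},
]
high_mpg = brush(cars, {"mpg": (30.0, 50.0)})
```

The returned indices are what a display would draw in the highlight colour; the same indices can be highlighted in every linked view of the table.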
16.22Vis_2002
Scatter Plot
Use of a ‘brushing’ tool can highlight subsets of data
.. now we can see what correlates with high MPG
16.23Vis_2002
Parallel Coordinates
Brushing picks out the high MPG data
Can you observe the same relations as with scatter plots?
Is it more or less easy?
16.24Vis_2002
Parallel Coordinates
Here we highlight high MPG and not 4 cylinders
16.25Vis_2002
Clustering as a Solution
Success has been achieved through clustering of observations
Hierarchical parallel co-ordinates
– Cluster by similarity
– Display using translucency and proximity-based colour
16.26Vis_2002
Hierarchical Parallel Co-ordinates
16.27Vis_2002
Reduction of Dimensionality of Variable Space
Reduce number of variables, preserve information
Principal Component Analysis
– Transform to new co-ordinate system
– Hard to interpret
Hierarchical reduction of variable space
– Cluster variables where distance between observations is typically small
– Choose representative for each cluster
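As a rough sketch of the principal component idea, the code below finds the first principal component of a table by power iteration on the covariance matrix and projects the observations onto it. It uses the standard library only; the function name and data are illustrative assumptions, and it assumes the variables are not all constant and that the starting vector is not orthogonal to the dominant direction.

```python
import math

def first_principal_component(rows, iterations=200):
    n, m = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(m)]
    centred = [[row[j] - means[j] for j in range(m)] for row in rows]
    # Sample covariance matrix of the variables (m x m).
    cov = [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(m)]
           for a in range(m)]
    # Power iteration converges to the eigenvector of the largest eigenvalue.
    v = [1.0] * m
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project each centred observation onto the component.
    scores = [sum(r[j] * v[j] for j in range(m)) for r in centred]
    return v, scores

# Made-up data in which the two variables rise together.
rows = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.8]]
component, scores = first_principal_component(rows)
```

The component is a weighted mix of the original variables, which illustrates the "hard to interpret" point: the new axis no longer corresponds to any single measured quantity.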
16.28Vis_2002
Glyph Techniques – Star Plots
Star plots
– Each observation represented as a ‘star’
– Each spike represents a variable
– Length of spike indicates the value
Example image: crime in Detroit
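The geometry of one star can be sketched as follows: the spikes are spaced evenly around a circle, and each spike's length is the variable's value scaled to its range. The function name, the scaling to [0, 1] and the sample values are assumptions made for the example.

```python
import math

def star_vertices(values, lows, highs, radius=1.0):
    # One spike per variable, evenly spaced around the circle.
    m = len(values)
    vertices = []
    for i, value in enumerate(values):
        span = highs[i] - lows[i]
        # Spike length is the value scaled to [0, radius].
        length = radius * (value - lows[i]) / span if span else 0.0
        angle = 2.0 * math.pi * i / m
        vertices.append((length * math.cos(angle), length * math.sin(angle)))
    return vertices

# One observation of four variables, each ranging over [0, 10].
star = star_vertices([10.0, 5.0, 0.0, 5.0], [0.0] * 4, [10.0] * 4)
```

Drawing a line from the origin to each vertex gives the spikes; joining consecutive vertices gives the outline form of the glyph.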
16.29Vis_2002
Chernoff Faces
Chernoff suggested use of faces to encode a variety of variables
– Can map to size, shape, colour of facial features
– Human brain rapidly recognises faces
16.30Vis_2002
Chernoff Faces
Here are some of the facial features you can use
http://www.bradandkathy.com/software/faces.html#chernoff
16.31Vis_2002
Chernoff Faces
Demonstration applet at:
– http://www.hesketh.com/schampeo/projects/Faces/
16.32Vis_2002
Chernoff’s Face
.. And here is Chernoff’s face
http://www.fas.harvard.edu/~stats/Chernoff/Hcindex.htm
16.33Vis_2002
Daisy Charts
[Figure: daisy chart with the values Dry, Wet, Showery; Saturday, Sunday; Leeds, Sahara, Amazon placed around a circle]
– Variables and their values are placed around a circle
– Lines connect the values for one observation
– This item is { wet, Saturday, Amazon }
http://www.daisy.co.uk
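A rough sketch of the layout just described: every value of every variable gets a slot on the circle, and an observation is drawn as lines between its values. This is a guess at the general idea, not the Daisy product's actual algorithm; the variable names weather, day and place are assumptions, since the slide shows only the values.

```python
import math

def daisy_layout(variables):
    # variables: list of (name, [values]); every value gets a slot on the circle.
    slots = [(name, value) for name, values in variables for value in values]
    step = 2.0 * math.pi / len(slots)
    return {slot: (math.cos(i * step), math.sin(i * step))
            for i, slot in enumerate(slots)}

def observation_lines(layout, observation):
    # observation: list of (name, value); connect consecutive values with chords.
    points = [layout[item] for item in observation]
    return list(zip(points, points[1:]))

layout = daisy_layout([
    ("weather", ["Dry", "Wet", "Showery"]),
    ("day", ["Saturday", "Sunday"]),
    ("place", ["Leeds", "Sahara", "Amazon"]),
])
item = [("weather", "Wet"), ("day", "Saturday"), ("place", "Amazon")]
chords = observation_lines(layout, item)
```

With many observations, values that frequently occur together show up as dense bundles of chords between the same pair of slots.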
16.34Vis_2002
Daisy Charts - Underground Problems
16.35Vis_2002
Scientific Visualization – Information Visualization
Scientific Visualization
– Focus is on visualizing an entity measured in a multi-dimensional space
– Underlying field is recreated from the sampled data
– Relationship between variables is well understood
Information Visualization
– Focus is on visualizing a set of observations that are multi-variate
– There is no underlying field – it is the data itself we want to visualize
– The relationship between variables is not well understood
16.36Vis_2002
Further Reading
Information Visualization
– Robert Spence
– Published 2000 by Addison Wesley
See also resources section of the module web site