Visualising Multi-Dimensional Data @ Fifth Elephant 2015

Post on 14-Aug-2015

TRANSCRIPT

Amit Kapoor (@amitkaps)

Visualising Multi-Dimensional Data

[Figure: axes x, y, z, w]

Flatland: A Romance of Many Dimensions

by Edwin A. Abbott (1884)

A square is but a line in 2d

The Square

The disappearing circle

The Sphere visits the 2d flatland

The Sphere rising out of 2d space

The Sphere on the point of vanishing

The square is a cube in 3d!

The Square sees the world in a new way!

Show, not Tell

70% of the sensory receptors are in the eyes

50% of the brain used for visual processing

100 ms to get a sense of the visual scene

Visual Wired Brain

Symbolic Abstraction

Visual Abstraction

Phenomena

Source: Bret Victor

“Visualisation is the transformation of the symbolic into geometric”

Small Data, Large Data, Big Data, Wide Data

Visualise Small Data

Acquire Data

Area      Sales (Rs.)
North     5
East      25
West      15
South     20
Central   10

Parse Variables

x (C) = Area, y (Q) = Sales

x   y
1   5
2   25
3   15
4   20
5   10

Encode Shape & Select Scales

x - position, y - bar, scale - 200 x 200

[Scaled x y table; values shown: 20, 40, 100, 140, 180]

Render with Coordinates

cartesian
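The acquire, parse, encode, scale and render steps can be sketched in a few lines of Python. This is only a minimal sketch, not the speaker's code: the function name `encode_bars`, the evenly spaced bands for the categorical x variable, and the linear y scale starting at zero are all assumptions.

```python
# Acquire: the Area / Sales table from the slide.
data = [("North", 5), ("East", 25), ("West", 15), ("South", 20), ("Central", 10)]


def encode_bars(rows, width=200, height=200):
    """Map (category, quantity) rows to bars on a width x height canvas.

    Assumptions (not from the slides): each category gets an evenly spaced
    band, and the y scale is linear from 0 to the largest value.
    """
    # Parse: x is categorical (C) -> position index, y is quantitative (Q).
    xs = list(range(1, len(rows) + 1))
    ys = [value for _, value in rows]

    # Select scales: band width for x, linear scale for y.
    band = width / len(rows)
    y_max = max(ys)

    # Encode: x -> horizontal position, y -> bar height.
    return [
        {"x_center": (i - 0.5) * band, "bar_height": y * height / y_max}
        for i, y in zip(xs, ys)
    ]


bars = encode_bars(data)

# Render: in Cartesian coordinates each bar rises from the x axis.
for (area, _), bar in zip(data, bars):
    print(f"{area:8s} x={bar['x_center']:6.1f} height={bar['bar_height']:6.1f}")
```

Under these assumptions the bar centres fall at 20, 60, 100, 140, 180 and the heights at 40, 200, 120, 160, 80, which appears consistent with the scaled values that survive on the slide.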

Create Visualisations

Points, Line, Bar, Bar - Stacked, Bar - Stagger

Coordinates System

Coordinates: Cartesian (axes: x, y)

Dot Plot, Line Chart, Column Chart, Waterfall, Stacked Column

Coordinates: Cartesian - Flip (axes: y, x)

Dot Plot, Line Chart, Bar Chart, Cascade, Stacked Bar

Polar Coordinate - X (x = θ, y = r)

Marked Radar, Line Radar, CoxComb, Polar Waterfall, Bullseye

Polar Coordinate - Y (x = r, y = θ)

Target, Line Track, Wind Rose, Polar Cascade, Pie Chart
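The two polar mappings (x = θ, y = r and x = r, y = θ) differ only in which variable drives the angle. The sketch below illustrates this; the function name `polar_point`, the linear scaling, and the angle convention (starting at 3 o'clock, counter-clockwise) are assumptions made for illustration, and charting libraries often start at 12 o'clock instead.

```python
import math


def polar_point(x, y, x_max, y_max, mode="theta_x"):
    """Place a data point (x, y) in polar coordinates; return screen (px, py).

    mode="theta_x": x = theta (angle), y = r (radius).
    mode="theta_y": x = r (radius), y = theta (angle).
    Assumption: the angle variable is scaled linearly over one full turn
    and the radius variable linearly over [0, 1].
    """
    if mode == "theta_x":
        theta = 2 * math.pi * x / x_max
        r = y / y_max
    elif mode == "theta_y":
        theta = 2 * math.pi * y / y_max
        r = x / x_max
    else:
        raise ValueError(f"unknown mode: {mode}")
    return r * math.cos(theta), r * math.sin(theta)


# The same data point lands in different places under the two mappings.
print(polar_point(1, 25, x_max=5, y_max=25, mode="theta_x"))
print(polar_point(1, 25, x_max=5, y_max=25, mode="theta_y"))
```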

Data Viz Process (Small Data)

Acquire Data

Parse Variables

Encode Shape

Select Scales

Render Coordinates

gadfly

bokeh

ggplot2

matplotlib

graphics

Small Data, Large Data, Big Data, Wide Data

Visualise Large Data

Pincode Map

Scatter plot: play with alpha to show density
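Why alpha reveals density can be shown with a small calculation. Assuming standard "over" compositing of identical marks (an assumption about the renderer, not something stated on the slide), n overlapping marks with opacity alpha combine to an opacity of 1 - (1 - alpha)^n, so dense regions darken while isolated points stay faint. The function name `stacked_opacity` is hypothetical.

```python
def stacked_opacity(alpha, n):
    """Combined opacity of n overlapping marks, each drawn with opacity alpha,
    under standard 'over' compositing: 1 - (1 - alpha) ** n."""
    return 1 - (1 - alpha) ** n


# With a low alpha, opacity keeps growing as more points pile up,
# so the plot can still distinguish 5 overlaps from 50.
for n in (1, 5, 10, 50):
    print(n, round(stacked_opacity(0.1, n), 3))
```

A lower alpha spreads the usable range over more overlaps, at the cost of making single points hard to see.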

But what if I want to show the geographic nature of pincode?

Pincode + Map

Exploration of large data is iterative!

Refine Data (Filter, Transform)

Data Viz Process (Large Data)

Acquire Data

Encode Shape

Select Scales

Render Coordinates

Parse Variables

Refine Data

Small Data, Large Data, Big Data, Wide Data

Visualise Big Data

Comparable to the number of pixels on my MacBook Air

Data

Data, Sample

Sampling can be effective (with overweighting unusual values)

Requires multiple plots or careful tuning of parameters
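One way to overweight unusual values when sampling is sketched below. The slide does not say how "unusual" is defined; here it is assumed to mean "far from the median", with weights scaled by the median absolute deviation. The function names `unusualness_weights` and `weighted_sample` are hypothetical.

```python
import random
import statistics


def unusualness_weights(values):
    """Weight each value by its distance from the median.

    Assumption (not from the slides): weight = 1 + |v - median| / MAD,
    so typical values get a weight near 1 and outliers a much larger one.
    """
    med = statistics.median(values)
    deviations = [abs(v - med) for v in values]
    spread = statistics.median(deviations) or 1.0  # avoid dividing by zero
    return [1 + d / spread for d in deviations]


def weighted_sample(values, k, seed=0):
    """Draw up to k values without replacement, favouring unusual values."""
    rng = random.Random(seed)
    remaining = list(values)
    weights = unusualness_weights(values)
    sample = []
    for _ in range(min(k, len(remaining))):
        i = rng.choices(range(len(remaining)), weights=weights)[0]
        sample.append(remaining.pop(i))
        weights.pop(i)
    return sample


values = [10, 11, 12, 13, 500]
print(unusualness_weights(values))
print(weighted_sample(values, k=3))
```

The weighting scheme is one of the tuning choices the slide warns about: a different definition of "unusual" gives a different picture.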

Data, Sample, Model

Models are great as they scale nicely. But visualisation is required, as “I don’t know what I don’t know.”

Data, Sample, Model, Binning

Binning can solve a lot of these challenges

“Bin-summarise-smooth: A framework for visualising large data” - Hadley Wickham (2013)

“imMens: Real-time Visual Querying of Big Data” - Liu, Jiang, Heer (2013)
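The binning idea can be sketched as follows: replace individual points with counts on a regular grid, so the amount of data to draw depends on the number of cells rather than the number of rows. This is a minimal sketch of the bin and summarise steps only (count as the summary, no smoothing); the function name `bin_2d` and the fixed-width bins anchored at zero are assumptions.

```python
from collections import Counter


def bin_2d(points, x_width, y_width):
    """Bin (x, y) points onto a regular grid and count the points per cell.

    Returns {(x_bin_start, y_bin_start): count}.
    Assumption: fixed-width bins anchored at 0.
    """
    counts = Counter()
    for x, y in points:
        cell = (x // x_width * x_width, y // y_width * y_width)
        counts[cell] += 1
    return dict(counts)


# Hypothetical points, for illustration only.
points = [(0.5, 1.5), (0.7, 1.2), (2.1, 0.3)]
print(bin_2d(points, x_width=1, y_width=1))
```

Each cell can then be drawn as one coloured tile, which is a two-dimensional histogram.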

Tools Matter

Defaults Matter

“We are calling 2015 the year of the histogram” - Amanda Cox

“Visualising big data is the process of creating generalized histograms”

Data Viz Process (Big Data)

Acquire Data

Encode Shape

Select Scales

Render Coordinates

Parse Variables

Filter Data

Aggregate Data

Small Data, Large Data, Big Data, Wide Data

Multi Dimensional Viz

Standard 2d/3d: Scatterplot, SPLOM, Trellis / Facets, Multiple View

Glyph Approach: Star plots, Stick Figure, Chernoff Faces, Color Icons

Geometric Transforms: Parallel Coord, Table lens, Star Coords, Tours

Pixel Based Approach: Space Filling, Pixel Bar Chart, Spiral Technique

Stacking Approach: Treemaps, Dimensional Stacking, Hierarchical Axis

[Figure: the same Multi Dimensional Viz techniques arranged along two axes, Need for Interaction and Ease of Interpretation]

Diamonds dataset

Price of diamonds is related to the 4C’s

[Figure: diamond measurements x, y, z, depth, table width]

Chart Options

                                 Points         Bars         Lines        Areas
1d Quantitative                  Strip Plot     Histogram    Freq Poly    Density Plots
1d Categorical                   Dot Plot       Bar Chart    Avoid        Avoid
2d Quantitative + Categorical    Strip Plot     Box Plot     Freq Poly    Density Plots
2d Categorical + Categorical     Avoid          Bar Chart    Avoid        Mosaic Plot
2d Quantitative + Quantitative   Scatter Plot   Table Lens   Slopegraph   Avoid

2d Scatter Plot

2d Scatter Plot - Interaction: Annotation

2d Scatter Plot - log transformation

2d Scatter Plot - Select or Filter

Area of Interest: Carat > 1, Price > 10,000
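The log transformation and the area-of-interest filter can be sketched together. The rows below are hypothetical values made up for illustration, not records from the diamonds dataset, and the function names `log_transform` and `area_of_interest` are assumptions; only the thresholds (carat > 1, price > 10,000) come from the slide.

```python
import math

# Hypothetical rows for illustration only; NOT taken from the diamonds dataset.
rows = [
    {"carat": 0.3, "price": 500},
    {"carat": 1.2, "price": 12000},
    {"carat": 1.5, "price": 9000},
    {"carat": 2.0, "price": 15000},
]


def log_transform(rows):
    """Add log10 versions of carat and price (both must be positive)."""
    return [
        dict(r, log_carat=math.log10(r["carat"]), log_price=math.log10(r["price"]))
        for r in rows
    ]


def area_of_interest(rows, min_carat=1, min_price=10_000):
    """Keep only rows with carat > min_carat and price > min_price."""
    return [r for r in rows if r["carat"] > min_carat and r["price"] > min_price]


print(area_of_interest(rows))
print(log_transform(rows)[0])
```

Plotting `log_carat` against `log_price` spreads out the crowded low end of both variables, while the filter narrows the view to the region of interest before plotting.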

2d Scatter Plot - Interaction: Pan & Zoom

[Figure: axes x, y, z]

Use aesthetic for 3d

Size, Color, Shape

3d Scatter Plot - Size for Quantitative Dim

3d Scatter Plot - Color for Categorical Dim

3d Scatter Plot - Shapes don’t scale well

3d Scatter Plot - depth perspective not good

3d Scatter Plot - Interaction: Rotation

[Figure: axes x, y, z, w, v, u]

4d Bubble Plot - Color and Size

5d Bubble Plot - Color, Size and Time

The Joy of Stats - Hans Rosling

Trellis / Facets - Create Small Multiples

Trellis / Facet Grid - Create Small Multiples
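Small multiples come down to splitting the data by one or two categorical variables and drawing the same chart for each subset. The sketch below shows only the splitting step; the function names `facet` and `facet_grid` and the sample rows are hypothetical, made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows for illustration only; NOT taken from the diamonds dataset.
rows = [
    {"cut": "Ideal", "color": "D", "carat": 0.5, "price": 1500},
    {"cut": "Ideal", "color": "E", "carat": 0.7, "price": 2500},
    {"cut": "Premium", "color": "D", "carat": 1.0, "price": 5000},
    {"cut": "Premium", "color": "D", "carat": 1.1, "price": 5600},
]


def facet(rows, by):
    """Split rows into one panel per distinct value of the `by` column."""
    panels = defaultdict(list)
    for row in rows:
        panels[row[by]].append(row)
    return dict(panels)


def facet_grid(rows, row_var, col_var):
    """Split rows into a grid of panels keyed by (row value, column value)."""
    panels = defaultdict(list)
    for row in rows:
        panels[(row[row_var], row[col_var])].append(row)
    return dict(panels)


for key, panel in facet_grid(rows, "cut", "color").items():
    print(key, len(panel))
```

Every panel should share the same scales, so that position means the same thing across the multiples.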

SPLOM - Scatterplot Matrix

Price, Carat, Table, Depth
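A scatterplot matrix draws one panel for every pairing of the chosen variables. The sketch below enumerates those panels for the four variables named on the slide; the function name `splom_panels` is hypothetical, and what goes on the diagonal (here just flagged) is a design choice the slide does not specify.

```python
from itertools import product


def splom_panels(variables):
    """List every cell of a scatterplot matrix as (row_var, col_var, is_diagonal)."""
    return [
        (row_var, col_var, row_var == col_var)
        for row_var, col_var in product(variables, repeat=2)
    ]


panels = splom_panels(["Price", "Carat", "Table", "Depth"])
off_diagonal = [p for p in panels if not p[2]]
print(len(panels), "panels,", len(off_diagonal), "off-diagonal")
```

With k variables the matrix has k x k panels, so it grows quickly; each off-diagonal pair appears twice, once on each side of the diagonal.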

Multiple View - Create Many Small Charts

Multiple View - Interaction: Brushing & Linking

[Figure: axes x, y, z, w, v, u]

Icon based Approach

Star, Stick, Chernoff

Star Plot - Matrix Layout

Axes: color, clarity, depth, cut, table

Star Plot - Plot on X-Y location

Axes: color, clarity, depth, cut, table
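A star plot glyph places one spoke per variable around a centre and joins the spoke ends into a polygon. The sketch below computes the polygon vertices; the function name `star_glyph` is hypothetical, and it assumes the values are already scaled to [0, 1] (categorical variables such as cut or color would first need a numeric encoding, which is not shown).

```python
import math


def star_glyph(values, cx=0.0, cy=0.0, radius=1.0):
    """Return the polygon vertices of a star plot glyph.

    `values` are assumed to be already scaled to [0, 1], one per variable.
    Spoke i points at angle 2*pi*i/k and has length value * radius.
    (cx, cy) is the glyph centre: a grid cell for a matrix layout,
    or a data-driven position when plotting on an X-Y location.
    """
    k = len(values)
    return [
        (
            cx + v * radius * math.cos(2 * math.pi * i / k),
            cy + v * radius * math.sin(2 * math.pi * i / k),
        )
        for i, v in enumerate(values)
    ]


# Five spokes, as in the slide (color, clarity, depth, cut, table);
# the values themselves are made up for illustration.
print(star_glyph([0.8, 0.4, 0.6, 1.0, 0.2], cx=10, cy=20, radius=3))
```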

Subplots - Binned Plot Distribution

Orthogonal Parallel

Parallel Coord - Interaction: Sorting

Parallel Coord - Interaction: Selection
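Parallel coordinates place the axes side by side and draw each row as a polyline across them. A minimal sketch follows, assuming every dimension is numeric and min-max scaled to [0, 1] independently; the function names `parallel_coords` and `select_range` and the sample rows are hypothetical. Sorting here is read as changing the order of the axes, and selection as keeping the rows that fall inside a range on one axis.

```python
# Hypothetical rows for illustration only; NOT taken from the diamonds dataset.
rows = [
    {"carat": 0.5, "depth": 60.0, "price": 1000},
    {"carat": 1.0, "depth": 62.0, "price": 5000},
    {"carat": 1.5, "depth": 61.0, "price": 9000},
]


def parallel_coords(rows, dims):
    """Turn each row into a polyline [(axis_index, height), ...].

    Assumption: each dimension is min-max scaled to [0, 1] on its own axis.
    """
    lo = {d: min(r[d] for r in rows) for d in dims}
    hi = {d: max(r[d] for r in rows) for d in dims}
    lines = []
    for r in rows:
        line = []
        for i, d in enumerate(dims):
            span = hi[d] - lo[d]
            height = (r[d] - lo[d]) / span if span else 0.5
            line.append((i, height))
        lines.append(line)
    return lines


def select_range(rows, dim, low, high):
    """Selection: keep rows whose value on one axis lies within [low, high]."""
    return [r for r in rows if low <= r[dim] <= high]


# Sorting: changing the order of `dims` changes which axes are adjacent.
print(parallel_coords(rows, ["carat", "depth", "price"]))
print(parallel_coords(select_range(rows, "price", 4000, 10000), ["price", "carat"]))
```

Axis order matters because patterns are easiest to read between neighbouring axes.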

Table Plot - Interaction: Bin & Sort

Table Plot - Interaction: Zoom & Filter

Stacked - Interaction: Brushing

Mosaic Plot - cut, color and clarity

Other example - Treemaps

Data Viz Process (Wide Data)

Acquire Data

Encode Shape

Select Scales

Render Algorithm

Parse Variables

Filter Data

Aggregate Data

Make Views

Add Interactivity

1. Encode wisely
2. Use space and multiples
3. Add interactivity
4. Reduce problem space

Code for these Slides: https://github.com/amitkaps/multidim

R libraries

❖ ggplot2
❖ GGally
❖ ggsubplot
❖ scales
❖ iplots/Mondrian
❖ ggvis
❖ tourr
❖ rgl
❖ scatterplot3d
❖ dplyr
❖ tabplot
❖ grid
❖ gridExtra

“The greatest value of a picture is when it forces us to notice what we never expected to see” - John Tukey

Amit Kapoor (@amitkaps)

amitkaps.com, narrativeviz.com

Data

Visual

Story
