Making sense of data visually: A modern look at data visualization

Post on 13-Jul-2015

Page 1: Making sense of data visually: A modern look at data visualization

Making sense of data visually:

A modern look at data visualization

VLADIMIR MILEV

NEW VENTURE SOFTWARE

Page 2

Author Bio

Vladimir Milev

MCPD Enterprise

Speaker (Devreach, NTK Slovenia and others)

DV Evangelist

Founder at New Venture Software

@vmilev

www.linkedin.com/in/vladimirmilev/

Page 3

http://www.newventuresoftware.com/

Page 4

Agenda

1. Big data and information overload

2. What problems DataViz solves

3. DataViz fundamental theory

4. Basic visualizations

5. Advanced visualizations

Page 5

Information Overload

Twitter: 500 million tweets per day

Facebook: 55 million status updates per day

Facebook: 900 million interactions per day (comments, likes etc.)

Reddit:

Page 6

Proliferation of smart devices

We are already living in a world dominated by smart devices

What is the meaning of this?

◦ More connected, data is more accessible

◦ Less space for tables and text

◦ Must use visual communication

Page 7

Making Sense of Data

Increasing amount of data available

Increasing number of data consumer devices

Obtaining data no longer a problem

We have an Information Overload issue

Quick data analysis is the new problem

But how quick?

Page 8

A Picture is worth 1000 words

“With about 1,000,000 ganglion cells, the human retina would transmit data at roughly the rate of an Ethernet connection, or 10 million bits per second.”

-Vijay Balasubramanian, PhD, Professor of Physics at U Penn

Page 9

OK – That’s a lot of bandwidth

BUT ARE WE USING IT EFFICIENTLY?

Page 10

Efficiency

Best readers usually read up to about 300 words per minute.

Average word length is 5.1 letters

300 * 5.1 = 1530 characters per minute

Or 1530 / 60 = 25.5 characters per second

1 character is usually stored as 8 bits

Rounding up to 26 characters: 26 * 8 = 208 bits per second

Reading bandwidth is ~0.025 KiB/s

Or 0.00208% efficiency compared with the retina's 10 million bits per second
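The arithmetic on this slide can be reproduced with a short script. This is a minimal sketch: the figures come from the slides (300 words per minute, 5.1 letters per word, 8 bits per character, 10 million bits per second for the retina), while the constant and function names are illustrative assumptions. Like the slide, it rounds 25.5 characters per second up to 26.

```python
import math

# Figures quoted on the slides; the names are illustrative assumptions.
WORDS_PER_MINUTE = 300
LETTERS_PER_WORD = 5.1
BITS_PER_CHARACTER = 8
RETINA_BITS_PER_SECOND = 10_000_000


def reading_bits_per_second(wpm=WORDS_PER_MINUTE, letters=LETTERS_PER_WORD):
    """Estimate reading bandwidth in bits per second."""
    chars_per_second = wpm * letters / 60
    # The slide rounds 25.5 characters per second up to 26.
    return math.ceil(chars_per_second) * BITS_PER_CHARACTER


bits = reading_bits_per_second()
kib_per_second = bits / 8 / 1024
efficiency_percent = bits / RETINA_BITS_PER_SECOND * 100

print(f"{bits} bits/s, {kib_per_second:.3f} KiB/s, "
      f"{efficiency_percent:.5f}% of the retinal bandwidth")
```

Changing the reading speed or word length shows how little the conclusion depends on the exact inputs: the result stays several orders of magnitude below the retinal figure.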

Page 11

So reading clearly isn’t the way to go…

BUT WHAT IS THE SOLUTION?

Page 12

Using statistics

For most of the 20th century

Using arithmetic mean (average), standard deviation

Variance, correlations, regressions

Turns out this is not good enough

Page 13

Anscombe’s Quartet

     I            II            III           IV
  x      y      x      y      x      y      x      y
 10   8.04     10   9.14     10   7.46      8   6.58
  8   6.95      8   8.14      8   6.77      8   5.76
 13   7.58     13   8.74     13  12.74      8   7.71
  9   8.81      9   8.77      9   7.11      8   8.84
 11   8.33     11   9.26     11   7.81      8   8.47
 14   9.96     14   8.10     14   8.84      8   7.04
  6   7.24      6   6.13      6   6.08      8   5.25
  4   4.26      4   3.10      4   5.39     19  12.50
 12  10.84     12   9.13     12   8.15      8   5.56
  7   4.82      7   7.26      7   6.42      8   7.91
  5   5.68      5   4.74      5   5.73      8   6.89

• Statistical properties are nearly identical:

• Mean of x (9.0) and mean of y (7.5) are the same in all four sets

• Nearly the same variances, correlations and regressions

• As far as statistics is concerned these sets are almost the same
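The claim can be checked directly from the numbers on the slide. The following is a minimal sketch using only the Python standard library; the names QUARTET and summarize are illustrative assumptions, and the correlation and regression line are computed from their textbook definitions rather than with a statistics package.

```python
from statistics import mean, variance

# Anscombe's quartet, copied from the slide. Sets I to III share the same x values.
X_123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X_123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return the summary statistics the slide refers to for one data set."""
    mean_x, mean_y = mean(xs), mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # least-squares regression line y = intercept + slope * x
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "var_x": variance(xs),  # sample variance
        "var_y": variance(ys),
        "corr": sxy / (sxx * syy) ** 0.5,  # Pearson correlation
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


summaries = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}
for name, stats in summaries.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

Rounded to two decimals, the four printed rows are almost indistinguishable, which is exactly why the summary numbers alone cannot tell the four sets apart.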

Page 14

Anscombe’s Quartet

Page 15

So DataViz is very powerful

But why does it work so well?

Page 16

Gestalt Psychology

Seeing with the brain

The mind understands external stimuli as a whole rather than as the sum of their parts

We tend to order our experience in a manner that is regular, orderly, symmetric, and simple

Key principles of gestalt: reification, multistability, invariance

Gestalt laws of grouping: proximity, similarity, closure, symmetry

Page 17

Gestalt Principles - Reification

Our minds tend to construct/generate information

Page 18

Gestalt Principles - Multistability

The tendency of our mind to jump back and forth between ambiguous alternative interpretations

Examples: Spinning Girl, Rubin Vase

Page 19

Gestalt Principles - Invariance

The tendency to perceive simple geometric objects independent of rotation, translation, and scale

Also elastic deformations, different lighting, and different component features

Page 20

Gestalt Laws of Grouping - Similarity

We group objects based on visual similarity

Page 21

Gestalt Laws of Grouping - Proximity

We group items based on spatial proximity

Page 22

Gestalt Laws of Grouping - Closure

We perceive objects such as shapes, letters, pictures, etc., as being whole when they are not complete

Page 23

Application in Data Visualization

Introducing the visual variables

Fundamental properties of objects which can encode information into a picture

Fundamental visual variables:

◦ Position

◦ Size

◦ Color

◦ Shape

◦ Orientation

Basis for all Data Visualization!
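One way to make the idea concrete is to describe a chart as a list of mappings from data fields to visual variables. The sketch below is only an illustration: the class names, the field names and the example chart are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from enum import Enum


class VisualVariable(Enum):
    """The five fundamental visual variables listed on the slide."""
    POSITION = "position"
    SIZE = "size"
    COLOR = "color"
    SHAPE = "shape"
    ORIENTATION = "orientation"


@dataclass(frozen=True)
class Encoding:
    """Maps one data field onto one visual variable."""
    field: str
    variable: VisualVariable


def describe(encodings):
    """Return a readable summary of which field drives which variable."""
    return [f"{e.field} -> {e.variable.value}" for e in encodings]


# Hypothetical chart: a bar graph of population by age group and gender.
bar_graph = [
    Encoding("age_group", VisualVariable.POSITION),
    Encoding("population", VisualVariable.SIZE),
    Encoding("gender", VisualVariable.COLOR),
]

# Variables the chart leaves free for additional information.
unused = sorted(v.value for v in set(VisualVariable) - {e.variable for e in bar_graph})

print(describe(bar_graph))
print(unused)
```

Writing the encodings down this way makes it easy to see which variables a chart already uses and which are still available.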

Page 24

Basic/Common Visualizations

Bar graphs

Line graphs

Area charts

Pie charts

Page 25

Bar Graphs

• Using color correctly to encode gender

• Using position (ordering) to create an orderly scale

• Using size to encode the values

• Using orientation to differentiate gender again

Page 26

Bar Graphs continued

• Labels are used

• Color is neutral and does not encode information

• Again, we have top-down ordering (position)

• And again size encodes the relative numeric value

Page 27

Bars and Normal Distribution

Minimum passing grade

• Distribution of test scores for Polish “Matura” exam

• Normal Distribution is expected

• Red line shows normal distribution

• 30 is the minimum passing grade

• Detecting behavioral changes

• What happened?

Page 28

Line Graphs

Confirming what we already know – paper media is declining rapidly.

• Shape encodes the value

• Color is not significant

• Design goal is to show a trend/change

Page 29

Area Graphs

Effect of school year on Team Fortress 2 players

School starts

• Similar to line graph

• Design goal for area charts is to emphasize the value/quantity, not so much the trend

• You can see both

• Color has no meaning

Page 30

Area Graphs continued

• This time color carries a meaning (legend)

• The graph is also good for displaying the ratio between data series over time
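A stacked area graph draws each series on top of the previous one, so the height of each band shows that series' share at every point in time. The sketch below shows the stacking step only; the function name and the sample numbers are hypothetical and not taken from the chart on the slide.

```python
def stack_series(series):
    """Return, per series, the (bottom, top) band it occupies at each time step.

    `series` maps a series name to a list of values, one per time step.
    Series are stacked in dictionary order.
    """
    names = list(series)
    steps = len(series[names[0]])
    bands = {name: [] for name in names}
    for t in range(steps):
        bottom = 0
        for name in names:
            top = bottom + series[name][t]
            bands[name].append((bottom, top))
            bottom = top  # the next series starts where this one ends
    return bands


# Hypothetical data: two series over two time steps.
sample = {"A": [1, 2], "B": [3, 2]}
bands = stack_series(sample)
print(bands)
```

The top of the last band is the total at each time step, which is why this chart type shows both the individual quantities and their sum.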

Page 31

Pie Charts

Page 32

Pie Charts

Golden Rules for Pie Charts

• Show the ratio of one piece to the whole

• Order the values

• Fewer than 6 pieces

• Avoid legends

• Sum up to 100%
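Three of these rules can be checked mechanically before a chart is drawn. The function below is a rough sketch: its name, its parameters and the sample data are hypothetical, and it assumes "order the values" means largest piece first. The rules about legends and about showing a part-to-whole ratio are design decisions that data alone cannot verify.

```python
def check_pie_chart(slices, max_slices=5, tolerance=0.01):
    """Return a list of rule violations for a pie chart.

    `slices` is a list of (label, percentage) pairs in drawing order.
    """
    problems = []
    values = [value for _, value in slices]
    if len(values) > max_slices:
        problems.append(f"too many pieces: {len(values)} (rule: fewer than 6)")
    if abs(sum(values) - 100) > tolerance:
        problems.append(f"pieces sum to {sum(values):g}%, not 100%")
    # Assumption: ordered means descending by value.
    if values != sorted(values, reverse=True):
        problems.append("pieces are not ordered by value")
    return problems


# Hypothetical examples.
good = [("A", 50), ("B", 30), ("C", 20)]
bad = [("A", 10), ("B", 40), ("C", 30)]

print(check_pie_chart(good))
print(check_pie_chart(bad))
```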

Page 33

Abusing Pie Charts

Don’t break the rules!

Page 34

Maps

Plot millions of journal entries from 18th and 19th century ship logs, and you reveal a picture of ocean trade you've never seen before

• Visualization of routes

• Color saturation indicates heavily used routes

Page 35

Maps are good with animations too

• Concentration of NO2 from 2005 to 2011

• Using both color and position to encode concentration

• Using continuous color scale

• Adding another dimension - time
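A continuous color scale can be built by interpolating between two colors. This is a minimal sketch of linear interpolation in RGB; the function name and the default colors (white to dark red) are assumptions, not the scale used in the animation on the slide.

```python
def lerp_color(value, vmin, vmax, low=(255, 255, 255), high=(128, 0, 0)):
    """Map a value in [vmin, vmax] to an RGB triple between `low` and `high`."""
    if vmax == vmin:
        raise ValueError("vmin and vmax must differ")
    t = (value - vmin) / (vmax - vmin)
    t = max(0.0, min(1.0, t))  # clamp out-of-range values to the ends of the scale
    return tuple(round(a + (b - a) * t) for a, b in zip(low, high))


# Hypothetical concentrations on a scale from 0 to 10.
print(lerp_color(0, 0, 10))
print(lerp_color(10, 0, 10))
print(lerp_color(25, 0, 10))  # clamped to the top of the scale
```

Plain RGB interpolation is the simplest option; perceptually uniform color spaces usually give scales that are easier to read, at the cost of more code.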

Page 36

Choropleth Maps

Displaying the most popular name for a newborn in each state

• Using discrete palette to encode information
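With categorical data such as names, each distinct value gets its own color from a fixed palette. The sketch below shows one way to do the assignment; the function name, the palette and the region/name pairs are made up for illustration and are not the data behind the map.

```python
def assign_palette(categories, palette):
    """Give each distinct category its own color, in first-seen order."""
    mapping = {}
    for category in categories:
        if category not in mapping:
            if len(mapping) >= len(palette):
                raise ValueError("more categories than palette colors")
            mapping[category] = palette[len(mapping)]
    return mapping


# Hypothetical palette and data.
PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c"]
top_name_by_region = {"Region A": "Emma", "Region B": "Liam", "Region C": "Emma"}

name_colors = assign_palette(top_name_by_region.values(), PALETTE)
region_colors = {region: name_colors[name] for region, name in top_name_by_region.items()}
print(region_colors)
```

Regions that share a category share a color, which is what makes clusters visible on the map.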

Page 37

Heat Maps

• Excellent for plotting recurring values

• Color saturation/brightness encodes the values

• Position also encodes information

• Easy to spot concentrations and find patterns
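The two encodings named above can be sketched in a few lines: position comes from the grid cell, and the count in each cell is mapped to a shade. In this illustration the shade characters stand in for color saturation, and the function name and the sample events are hypothetical.

```python
from collections import Counter

SHADES = " .:*#"  # light to dark; stands in for color saturation


def heat_map(events, rows, cols):
    """Count (row, col) events and render each cell as a shade character."""
    counts = Counter(events)
    peak = max(counts.values(), default=0)
    lines = []
    for r in range(rows):
        line = ""
        for c in range(cols):
            n = counts.get((r, c), 0)
            # Scale the count to an index into SHADES (0 = empty, last = peak).
            level = 0 if peak == 0 else round(n / peak * (len(SHADES) - 1))
            line += SHADES[level]
        lines.append(line)
    return lines


# Hypothetical events, each recorded as a (row, col) cell.
events = [(0, 0), (0, 0), (0, 0), (0, 0), (1, 1), (1, 1), (0, 1)]
grid = heat_map(events, rows=2, cols=2)
print("\n".join(grid))
```

For recurring values, the rows and columns would typically be time buckets such as day of week and hour of day.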

Page 38

Heat Maps in medicine/genetics

Page 39

Tree Maps

• Excellent for representing hierarchical data

• Color carries a meaning

• Size carries a meaning as well

• Position is irrelevant

• Suitable for annotations
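The size encoding can be illustrated with the simplest tree map layout, often called slice-and-dice: each node splits its rectangle among its children in proportion to their size, alternating the split direction at every level. This is a sketch under assumptions; the node format, the function names and the sample tree are hypothetical, and real tree maps often use layouts that produce squarer rectangles.

```python
def weight(node):
    """Total size of a node: its own size for a leaf, else the sum of its children."""
    children = node.get("children")
    if children:
        return sum(weight(child) for child in children)
    return node["size"]


def treemap(node, x, y, w, h, depth=0):
    """Yield (label, x, y, w, h) for every leaf, slice-and-dice style."""
    children = node.get("children")
    if not children:
        yield (node["label"], x, y, w, h)
        return
    total = weight(node)
    offset = 0.0  # fraction of the rectangle already handed out
    for child in children:
        share = weight(child) / total
        if depth % 2 == 0:  # even depth: place children side by side
            yield from treemap(child, x + offset * w, y, w * share, h, depth + 1)
        else:               # odd depth: stack children top to bottom
            yield from treemap(child, x, y + offset * h, w, h * share, depth + 1)
        offset += share


# Hypothetical hierarchy with leaf sizes 6, 3 and 1.
TREE = {"label": "root", "children": [
    {"label": "A", "size": 6},
    {"label": "B", "children": [
        {"label": "B1", "size": 3},
        {"label": "B2", "size": 1},
    ]},
]}

leaves = list(treemap(TREE, 0, 0, 100, 50))
for leaf in leaves:
    print(leaf)
```

Each leaf's area is proportional to its size, while where it lands inside the parent depends only on the order of the children, which matches the point that position carries no meaning.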

Page 40

Parallel Coordinates Plot

• Interactive visualization

• Good at displaying relationships between different dimensions of data

• Position encodes dimension

• Color encodes scale
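Because every dimension gets its own vertical axis, the values are usually rescaled so that all axes share the same height. The sketch below shows that preparation step only; the function name and the sample records are hypothetical.

```python
def normalize_columns(rows, dimensions):
    """Rescale each dimension to [0, 1] so every axis has the same height.

    Returns one list per row; entry i is the height at which that row's
    polyline crosses the i-th axis.
    """
    lows = {d: min(row[d] for row in rows) for d in dimensions}
    highs = {d: max(row[d] for row in rows) for d in dimensions}
    result = []
    for row in rows:
        result.append([
            # A constant dimension has no range; put it in the middle of the axis.
            0.5 if highs[d] == lows[d] else (row[d] - lows[d]) / (highs[d] - lows[d])
            for d in dimensions
        ])
    return result


# Hypothetical records with two dimensions.
cars = [
    {"mpg": 20, "hp": 200},
    {"mpg": 30, "hp": 100},
    {"mpg": 40, "hp": 150},
]
lines = normalize_columns(cars, ["mpg", "hp"])
print(lines)
```

Lines that cross between two neighbouring axes hint at an inverse relationship between those dimensions, while roughly parallel lines hint at a direct one.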

Page 41

Parallel Coordinates Plot – in action

Selecting a subset of a dimension to display the relationships with the other dimensions
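This interaction is commonly called brushing. At its core it is a filter on one dimension, after which only the matching lines are drawn across all axes. A minimal sketch, with a hypothetical function name and made-up records:

```python
def brush(rows, dimension, low, high):
    """Keep only the rows whose value on `dimension` lies in [low, high]."""
    return [row for row in rows if low <= row[dimension] <= high]


# Hypothetical records.
cars = [
    {"mpg": 20, "hp": 200},
    {"mpg": 30, "hp": 100},
    {"mpg": 40, "hp": 150},
]

# Select the fuel-efficient cars and see where they fall on the other axis.
selected = brush(cars, "mpg", 30, 40)
print([car["hp"] for car in selected])
```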

Page 42

Chord Diagram

• Similar to Parallel Coordinates plot

• Color and Position used to encode data

• Design is different

• Filtering of dimensions is not a design goal

• Focuses on selecting a whole dimension

Page 43

Some resources

http://www.reddit.com/r/dataisbeautiful/

http://blog.visual.ly/

http://flowingdata.com/

http://eagereyes.org/

http://www.perceptualedge.com/blog/

Page 44

Thank You!