COMMUNICATING DATA USING GRAPHICS MIS2502 Data Analytics

Upload: earl-ellis

Post on 04-Jan-2016


TRANSCRIPT

Page 1: COMMUNICATING DATA USING GRAPHICS MIS2502 Data Analytics

COMMUNICATING DATA USING GRAPHICS

MIS2502 Data Analytics

Page 2

What makes a good chart?

Minard’s map of Napoleon’s campaign into Russia, 1869

Reprinted in Tufte (2009), p. 41

Page 3

What makes a good chart?

http://www.popvssoda.com/countystats/total-county.html

Page 4

What makes a good chart?

Zhang et al. (2010), “A case study of micro-blogging in the enterprise: use, value, and related issues,” Proceedings of the 28th International Conference on Human Factors in Computing Systems.

This is from an academic conference paper.

What are the problems with this chart?

Page 5

Some basic principles (adapted from Tufte 2009)

Tufte’s fundamental principle: Above all else, show the data

Page 6

Principle 1: The chart should tell a story

Page 7

Examples?

http://www.evl.uic.edu/aej/491/week03.html

http://flowingdata.com/2009/11/26/fox-news-makes-the-best-pie-chart-ever/

Page 8

http://www.nejm.org/doi/pdf/10.1056/NEJMon1211064

Page 9

http://www.ngoilgas.com/news/oil-spill-latest-the-cost-of-clumsiness/

Page 10

Principle 2: The chart should have graphical integrity

• Basically, it should not “lie” or mislead the reader.

Page 11

Tufte’s “Lie Factor”

• Lie Factor = Graphical (Drawn) Difference / Actual Difference

Should be ~ 1

< 1 = understated effect

> 1 = exaggerated effect
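The lie factor on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming each "difference" is measured as a relative change (the way Tufte sizes an effect); the function names and the numbers are hypothetical and not taken from the slides.

```python
def relative_change(first: float, second: float) -> float:
    """Size of an effect, measured as the relative change from first to second."""
    return abs(second - first) / abs(first)


def lie_factor(drawn_first: float, drawn_second: float,
               data_first: float, data_second: float) -> float:
    """Effect shown in the graphic divided by the effect present in the data."""
    drawn_effect = relative_change(drawn_first, drawn_second)
    data_effect = relative_change(data_first, data_second)
    return drawn_effect / data_effect


# Hypothetical example: the data grows from 100 to 150 (a 50% increase),
# but the bars are drawn 20 mm and 60 mm tall (a 200% increase).
factor = lie_factor(20, 60, 100, 150)
print(round(factor, 2))  # well above 1, so the graphic exaggerates the effect
```

A value near 1 means the drawing is faithful to the data; the further it moves above or below 1, the more the graphic exaggerates or understates the change.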

Page 12

Examples of the “lie factor”

Reprinted from Tufte (2009), p. 57 & p. 62

Page 13

A more recent, basic example

http://20bits.com/articles/politics-and-tuftes-lie-factor/

The original graphic from Real Clear Politics, 2008.

(Look at the y-axis)

The adjusted graphic.

Page 14

Other tips to avoid “lying”

[Figure: “Hypothetical Industries, Inc.”, Revenue and Adjusted Revenue by Year, 2003 to 2010; y-axis from 80 to 140]

[Figure: “Hypothetical City Crime”, thefts per 100000 citizens, 2009 to 2010; y-axis from 350 to 410]

vs.

[Figure: “Hypothetical City Crime”, thefts per 100000 citizens, 2003 to 2010; y-axis from 25 to 425]
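A rough sketch of why the narrow y-axis on the 2009 to 2010 crime chart misleads. It assumes the two years have 400 and 370 thefts per 100000 citizens (values that appear as data labels on a later slide; mapping them to these years is an assumption) and that the reader judges size by the plotted height above the bottom of the axis. The function name is hypothetical.

```python
def drawn_change(first: float, second: float, axis_min: float) -> float:
    """Relative change in plotted height when the y-axis starts at axis_min."""
    first_height = first - axis_min
    second_height = second - axis_min
    return (second_height - first_height) / first_height


# Assumed values for 2009 and 2010 (thefts per 100000 citizens).
thefts_2009, thefts_2010 = 400, 370

actual = (thefts_2010 - thefts_2009) / thefts_2009       # change in the data
narrow = drawn_change(thefts_2009, thefts_2010, 350)     # y-axis from 350 to 410
wide = drawn_change(thefts_2009, thefts_2010, 25)        # y-axis from 25 to 425

print(f"actual change:      {actual:.1%}")
print(f"drawn, narrow axis: {narrow:.1%}")
print(f"drawn, wide axis:   {wide:.1%}")
print(f"lie factor, narrow: {narrow / actual:.2f}")  # roughly 8
print(f"lie factor, wide:   {wide / actual:.2f}")    # close to 1
```

Under these assumptions the same one-year drop looks several times larger on the narrow axis than it is in the data, while the wider axis keeps the lie factor near 1.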

Page 15

Avoid an Implied Comparison of Incomparable Values

Page 16

Other tips to avoid “lying” or misleading

Page 17

Principle 3: The chart should minimize graphical complexity

Generally, the simpler the better…

Page 18

When a table is better than a chart

• For a few data points, a table can do just as well…

[Figure: “Total Sales by Salesperson” chart; value axis from $0.00 to $250,000.00]

Total Sales by Salesperson

Salesperson    Total Sales
Peacock        $225,763.68
Leverling      $201,196.27
Davolio        $182,500.09
Fuller         $162,503.78
Callahan       $123,032.67
King           $116,962.99
Dodsworth       $75,048.04
Suyama          $72,527.63
Buchanan        $68,792.25

The table carries more information in less space and is more precise.
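As a small illustration, the nine data points from this slide can be printed as an aligned plain-text table with nothing but the Python standard library. The helper name `format_table` is hypothetical; the figures are the ones in the table above.

```python
sales = [
    ("Peacock", 225763.68),
    ("Leverling", 201196.27),
    ("Davolio", 182500.09),
    ("Fuller", 162503.78),
    ("Callahan", 123032.67),
    ("King", 116962.99),
    ("Dodsworth", 75048.04),
    ("Suyama", 72527.63),
    ("Buchanan", 68792.25),
]


def format_table(rows):
    """Return the rows as aligned text lines, largest total first."""
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    name_width = max([len("Salesperson")] + [len(name) for name, _ in ordered])
    lines = [f"{'Salesperson':<{name_width}}  {'Total Sales':>12}"]
    for name, amount in ordered:
        money = f"${amount:,.2f}"  # exact value to the cent
        lines.append(f"{name:<{name_width}}  {money:>12}")
    return lines


for line in format_table(sales):
    print(line)
```

Every value stays exact to the cent, which a reader could only estimate from bar heights.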

Page 19

The Ultimate Table: The Box Score

• Large amount of information in a very small space

• So why does this work?

• Depends on the reader’s knowledge of the data

Page 20

The Business Box Score?

• Applying the same concept to our salesforce example.

• How does this help? How could it hurt?

Sales Performance – March 2011

Salesperson   TS    WD    BD   NC   DOR
Peacock       225   3     40   20   28
Leverling     201   2     45   18   27
Davolio       182   5     38   22   28
Fuller        162   2     22   16   20
Callahan      123   1     15   14   15
King          116   0.5   20   12   18
Dodsworth      75   0.3   12   10   20
Suyama         72   0      8   10    8
Buchanan       68   0      8    8   12

Key:
TS – total sales
WD – worst day
BD – best day
NC – number of customers
DOR – days on the road

Page 21

Data Ink

Data-ink ratio should be ~ 1

< 1 = more non-data related ink in graphic

= 1 implies all ink devoted to data

Tufte’s principle: Erase ink whenever possible
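The ratio behind these bullets is Tufte's data-ink ratio: the ink that presents data divided by the total ink in the graphic. A minimal sketch, assuming "ink" can be approximated by some count such as drawn pixels; the function name and the measurements below are hypothetical.

```python
def data_ink_ratio(data_ink: float, total_ink: float) -> float:
    """Share of a graphic's ink that is devoted to the data itself."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must be between 0 and total_ink")
    return data_ink / total_ink


# Hypothetical pixel counts for the same chart, before and after cleanup.
# The data marks use 1200 pixels in both versions.
with_grid_and_shading = data_ink_ratio(1200, 4800)
after_erasing_extras = data_ink_ratio(1200, 1600)

print(with_grid_and_shading)  # 0.25
print(after_erasing_extras)   # 0.75
```

Erasing gridlines, borders, and fills leaves the data ink unchanged while shrinking the total, which is what moves the ratio toward 1.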

Page 22

Being conscious of data ink

[Figure: three versions of the “Hypothetical City Crime” chart (thefts per 100000 citizens, 2003 to 2010), arranged from “Lower data-ink ratio (worse)” to “Higher data-ink ratio (better)”. The first two use a y-axis from 25 to 425; the third drops the y-axis and labels the values directly: 200, 270, 320, 330, 370, 350, 400, 370.]

Page 23

What makes a good chart?

[Figure: two versions of the “2011 Total Sales” chart; x-axis: Order Date; y-axis: Sum of Extended Price, from 0 to 160000]

Sometimes it’s really a matter of preference.

These both minimize non-data ink.

Why isn’t a table better here?

Page 24

3-D Charts

[Figure: 3-D version of the “Total Sales by Salesperson” chart; value axis from $0.00 to $250,000.00]

Evaluate this from a data-ink perspective.

How does it affect the clarity of the chart?

Page 25

Chartjunk: Data Ink “gone wild”

Page 26

Example: Moiré effects (Tufte 2009)

[Figure: “Hypothetical City Crime”, thefts per 100000 citizens, 2003 to 2010; y-axis from 25 to 425]

[Figure: “Total Sales by Salesperson”; value axis from $0.00 to $250,000.00]

Page 27

Example: The Grid

[Figure: “Hypothetical City Crime”, thefts per 100000 citizens, 2003 to 2010; y-axis from 25 to 425]

Why are these examples of chartjunk?

What could you do to remedy it?

Page 28

Data Ink Working Against Us

Evaluate this chart in terms of Data Ink.

Are there better visualizations?

Page 29

Data Ink Working For Us

Evaluate this chart in terms of Data Ink.

Imagine this as a bar chart.

As a table!!

Page 30

Stacked Bar Charts are Often Trouble

• Original chart from the BBC website

• Why is this so difficult to read?

• What would be a better way to visualize it?

http://j-walkblog.com/index.php?/weblog/posts/bad_charts/

Page 31

Avoid Multi-Axis Graphs