Steve Figard, Ph.D., Abbott Laboratories, Abbott Park, IL
Just Because You Can, Doesn’t Mean You Should
The Elements of Graphing Data Well
Agenda/Objectives of Presentation
• Introduction
• Terminology
• The Ten Commandments of Good Graphics
• A Word about PowerPoint
• The “Best” & The “Worst”
• Concluding Statement
Introduction
The Problem: a plethora of options/features/capabilities
Confusion of what can be done with what ought to be done
• Attributable to ignorance… “Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning.” - Richard Cook, science fiction author, The Wizardry Compiled
• Due to lack of training: not usually covered in university classes
• Introduction• Terminology• 10 Commandments• PowerPoint• Best/Worst• Conclusion
Two goals of good graphics:
• Clarity in revealing the story in the data
• Ease of visualizing the plotted data
“When a graph is made, quantitative and categorical information is encoded by a display method. Then the information is visually decoded. This visual perception is a vital link. No matter how clever the choice of the information, and no matter how technologically impressive the encoding, a visualization fails if the decoding fails. Some display methods lead to efficient, accurate decoding, and others lead to inefficient, inaccurate decoding.” - William Cleveland
The “Grand Unification Philosophy” of good graphics: minimize the mental gymnastics that the viewer must go through to understand the graph.
Terminology
[Figure: annotated graphs defining the terms used here; two of the labels are glossed as “aka, y axis” and “aka, x axis”]
The Ten Commandments…of good graphics
Folded, spindled, and mutilated from:
1. Thou shalt pay very close attention to thy axes, for therein lieth great opportunity to succeed or to fail.
The units of measure employed: alternate labels
How do these relate to log2?
Now these numbers we understand!
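The idea of alternate labels can be sketched in a few lines. This is a minimal illustration, not the method used to build the slide's figure: it assumes the axis was plotted in log2 units and simply pairs each tick position with the value in original units. The function name `alternate_labels` and the tick positions are hypothetical.

```python
def alternate_labels(log2_ticks):
    """Pair each tick position on a log2 axis with its value in original units."""
    return [(tick, 2 ** tick) for tick in log2_ticks]


# Hypothetical tick positions on a log2 axis.
ticks = [0, 1, 2, 3, 4, 5]
for position, original in alternate_labels(ticks):
    print(f"log2 = {position}  ->  original units = {original}")
```

Showing the second column as a companion set of tick labels lets the viewer read the axis without doing powers of two in their head.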
The units of measure employed: what to plot
The raw data: interesting but not as informative…
…as the actual difference between the lines.
The units of measure employed: plotting differences
The optical illusion: The lines are only one unit apart across the entire range.
Why? Because the eye is good at perceiving perpendicular distances between two curves, but not the difference in height.
Lesson to be learned: plotting the metric of interest may be more informative than the raw data.
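A small numeric sketch of the illusion, under stated assumptions: the two curves here are hypothetical (an exponential and the same exponential shifted up by one unit), and the "apparent gap" is approximated as the distance from a point on the upper curve to the tangent of the lower curve at the same x, measured in data units. The slide's actual curves are not given, so this only illustrates the principle.

```python
import math


def lower(x):
    """Hypothetical lower curve."""
    return 2.0 ** x


def upper(x):
    """Hypothetical upper curve: always exactly one unit higher."""
    return lower(x) + 1.0


def slope_lower(x):
    """Derivative of the lower curve."""
    return math.log(2.0) * 2.0 ** x


def apparent_gap(x):
    """Distance from the upper curve's point at x to the lower curve's tangent at x."""
    return (upper(x) - lower(x)) / math.sqrt(1.0 + slope_lower(x) ** 2)


xs = [0, 1, 2, 3, 4, 5]
differences = [upper(x) - lower(x) for x in xs]   # the metric of interest
gaps = [apparent_gap(x) for x in xs]              # roughly what the eye judges

for x, diff, gap in zip(xs, differences, gaps):
    print(f"x={x}  height difference={diff:.2f}  apparent gap={gap:.2f}")
```

The height difference stays constant while the apparent gap shrinks as the curves steepen, which is why plotting `differences` directly tells the story more plainly.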
The range of those units of measure
Choose your range so that the data rectangle fills up as much of the scale-line rectangle as possible
Do not insist that the zero always be included on a scale showing magnitude, but…
“Get your facts first, and then you can distort them as much as you please. (Facts are stubborn, but statistics are more pliable).” - Mark Twain
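The range guidance above can be made concrete with a rough sketch. The helper names (`axis_limits`, `fill_fraction`), the 5% padding, and the data values are all assumptions chosen for illustration; the point is only to compare how much of the axis the data occupies with and without a forced zero.

```python
def axis_limits(values, pad_fraction=0.05):
    """Return (low, high) axis limits that hug the data with a small margin."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Degenerate case: all values equal, so fall back to a nominal span.
        span = abs(hi) or 1.0
    pad = span * pad_fraction
    return lo - pad, hi + pad


def fill_fraction(values, limits):
    """Fraction of the axis length that the data range occupies."""
    lo, hi = limits
    return (max(values) - min(values)) / (hi - lo)


data = [102.0, 104.5, 103.2, 107.8, 110.0]   # hypothetical measurements
tight = axis_limits(data)
with_zero = (0.0, tight[1])                  # same upper limit, zero forced in

print(f"tight limits {tight}: data fills {fill_fraction(data, tight):.0%}")
print(f"zero included {with_zero}: data fills {fill_fraction(data, with_zero):.0%}")
```

The trade-off is the one the quote warns about: a tight range reveals structure, but it also magnifies small differences, so the axis labels must make the range obvious.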
The number of tick marks shown
• Too many = clutter
• Too few = “guesstimation” difficulties
• 3-10 usually sufficient
• Beware abuse of time-scale tick marks by changing the interval shown…
one year interval five year interval
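One way to stay within the 3-10 tick guideline while keeping the interval constant is sketched below. This is a common "nice numbers" heuristic, not something taken from the presentation or from JMP: the step is restricted to 1, 2, or 5 times a power of ten, and the function name `nice_ticks` and its `max_ticks` parameter are hypothetical.

```python
import math


def nice_ticks(lo, hi, max_ticks=10):
    """Return evenly spaced ticks in [lo, hi] using a step of 1, 2 or 5 x 10^k."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    # Smallest step that would still give at most max_ticks ticks.
    raw_step = (hi - lo) / (max_ticks - 1)
    magnitude = 10 ** math.floor(math.log10(raw_step))
    # Round the step up to the next "nice" multiple.
    for multiple in (1, 2, 5, 10):
        step = multiple * magnitude
        if step >= raw_step:
            break
    first = math.ceil(lo / step)
    last = math.floor(hi / step)
    return [k * step for k in range(first, last + 1)]


print(nice_ticks(0, 100))
print(nice_ticks(1970, 2010))
```

Because every tick is a multiple of one fixed step, the interval cannot silently change partway along a time scale.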
The presence or absence of breaks in the axis
• Only when necessary…try log scale first
• If used, do a full scale break
• If used, do NOT connect numerical values across the break!
Pay close attention to ranges, especially when breaks are clearly present; they will impact the interpretation of the data and may alter the message conveyed…
The size or length of the axis on the page
Sometimes the default square or rectangle may hide important features in the data
JMP is particularly good at allowing “on the fly” adjustment of axes
2. Thou shalt use color to categorize, not accessorize.
Only two uses of color transmit useful information to the viewer:
Encoding a categorical variable
Encoding a quantitative variable: contour plots
The choice of color for a contour plot must achieve two goals:
• effortless perception of the order of the values (i.e., we do not want to be constantly referring to a key)
• clearly perceived boundaries between adjacent levels
3. Thou shalt choose symbols that can be easily distinguished from one another.
Concerned with plots in which the data overlaps so that discerning the different datasets being plotted becomes critical
“Texture” based on micropatterns inherent in the symbol: boundaries
JMP provides several sets of markers that should be evaluated with this commandment in mind
4. Thou shalt not employ “chartjunk.”
Category 1: unintentional optical art and the moiré effect
Anyone recognize Excel bar chart options here?!
The moiré effect describes the phenomenon when the graphic design interacts with the physiological tremor of the eye to generate the distracting appearance of vibration and movement.
Can you say “garish”?!
garish adj. 1 too bright or gaudy; showy; glaring 2 gaudily or showily dressed, decorated, written, etc. (Webster’s New World College Dictionary, 4th Ed.)
Category 2: the dreaded grid (especially compared to symbol size)
Category 3: the self-promoting graphical duck
“When a graphic is taken over by decorative forms or computer debris, when the data measures and structures become Design Elements, when the overall design purveys Graphical Style rather than quantitative information, then that graphic may be called a duck in honor of the duck-form store, ‘Big Duck.’” - Edward Tufte
Based on an architectural observation that is valid for graphics: “It is all right to decorate construction but never construct decoration.”
Fortunately, this is really hard to do in JMP…you have to work at it. Our Worst case selection will further clarify this rule of thumb.
The charts widely used in mass media and business publications, to wit, the pie chart, divided bar charts, and area charts, will, in most cases, violate this commandment when their use is attempted in science and technology.
5. Thou shalt show variation in thy data, not in thy design.
Avoid confounding design variation with data variation
Five different vertical scales are used to show the price and two different horizontal scales to show the passage of time without one indication of these changes (not even a scale break)!
(FYI: This qualifies as a “self-promoting graphical duck.”)
6. Thou shalt maximize the data-ink ratio in thy graphs.
The data-ink ratio = the amount of “ink” used to depict the actual data divided by the total “ink” used to print the graphic
The “Precision Marching Band of 63 Mosquitoes”: data-ink ratio < 0.6
Some data that doesn’t fit the pattern, an important observation, is actually obscured by the Marching Band…
The “Precision Marching Band of 63 Mosquitoes”: remove the elements below
The “Precision Marching Band of 63 Mosquitoes”: add a few labels and rotate y axis labels and numbers for easier reading
Data-ink ratio up to ~ 0.9
All data clearly visible
7. Thou shalt maximize thy data density and the size of thy data matrix (within reason).
The human eye has the ability to detect large amounts of information in small spaces: take advantage of this phenomenon
Low data density:
• data matrix contains only four entries: the names (2) and the numbers (2) for the two bars on the right
• bar on the left is the sum of the other two
• original graph covers 26.5 square inches
• dividing 4 by 26.5 = data density of 0.15 numbers per square inch
NOT GOOD!
Good data density: map of France
This map of France was originally 27 square inches (close to that of previous slide). It shows the location and boundaries of 30,000 French communes. To recreate the data of the map would require somewhere in the neighborhood of 240,000 numbers: 30,000 latitudes, 30,000 longitudes, and an average of six numbers describing the shape of each commune. The data density thus works out to be nearly 9,000 numbers per square inch.
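The two data-density figures quoted in this section can be checked with a few lines of arithmetic. The numbers come straight from the slides; only the function name `data_density` is an assumption made for this sketch.

```python
def data_density(numbers_in_data_matrix, area_square_inches):
    """Data density = entries in the data matrix / area of the graphic."""
    return numbers_in_data_matrix / area_square_inches


# Bar chart: four entries on 26.5 square inches.
bar_chart = data_density(4, 26.5)

# Map of France: 30,000 communes, each needing a latitude, a longitude,
# and an average of six numbers for its shape, on 27 square inches.
communes = 30_000
map_numbers = communes * (1 + 1 + 6)
france_map = data_density(map_numbers, 27)

print(f"bar chart: {bar_chart:.2f} numbers per square inch")
print(f"map of France: {france_map:,.0f} numbers per square inch")
```

The two graphics occupy nearly the same area, yet the map carries tens of thousands of times more information per square inch.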
Of course, “within reason” applies…
8. Thou shalt draw the viewer’s eye to the data, not to other design elements.
• Use visually prominent graphical elements to show the data
• Don’t clutter the interior of the scale-line rectangle with legends, labels, and lines
• Tick marks should generally face outward
• Use reference lines only when an important value must be seen across the entire graph, and then use a color, weight and style of line that does not overpower the data symbols
• If data labels are used inside the scale-line rectangle, don’t allow them to interfere with the data or to clutter the graph
• Don’t put notes and keys inside the scale-line rectangle; notes should go in a caption or the accompanying text
• When datasets are superimposed, choose color, symbol, line weights and styles, and other such graphical elements so that the datasets can be readily visually distinguished
9. Thou shalt do and redo thy graphs to determine which one telleth the story best.
Experiment: this process is complex and multivariate
Not only efficiency, but complexity, structure, density, and even beauty have a role to play in the generation of the final product.
JMP, particularly the Graph Builder, is uniquely strong in this ability to “play” with visualization options.
Simplest: height by weight…
“Playing” with Graph Builder
Wrap by sex:
Adding overlay by age:
Eh, don’t like that? Hit undo button twice…
Redo by reversing the process; use: wrap by age, overlay by sex…
And if you still don’t like it or don’t clearly see the story in the data, UNDO…REDO!
10. Thou shalt not create “unfriendly” but “friendly” data graphics.
Remember your audience
Regarding typography: a quote of Tufte quoting someone else:
The concept that “the simpler the form of a letter the simpler its reading” was an obsession of beginning constructivism. It became something like a dogma, and is still followed by “modernistic” typographers…. Ophthalmology has disclosed that the more the letters are differentiated from each other, the easier is the reading. Without going into comparisons and details, it should be realized that words consisting of only capital letters present the most difficult reading – because of their equal height, equal volume, and with most, their equal width. When comparing serif letters with sans-serif, the latter provide an uneasy reading. The fashionable preference for sans-serif in text shows neither historical nor practical competence.
A Word about PowerPoint
The charges:
• Cognitive style. Presenter-focused, not content or audience focused.
• Low resolution. Little info per slide, so more slides are needed. Data graphics are weak: average of 12 numbers per graphic.
• Bullets. Bullet lists can show only 3 logical flows: sequence; priority; or membership. Multivariate models with feedback and simultaneity can’t be listed. This encourages lazy thinking and generic ideas, and ignores critical relationships and assumptions.
Guilty as charged?
Yes and no:
Some validity to these charges, BUT…they all seem to ignore the fact that PowerPoint, or any other presentation software, is just a tool.
To blame the tool for its misuse is to kill the messenger for his message.
What is needed is not condemnation of the tool but proper instruction in the use of that tool.
Competitors for Best & Worst
The Best: Minard’s data map + time-series
Plots 6 (!) variables:
1. size of army
2/3. location on 2D surface
4. direction of movement
5. temperature
6. dates during retreat (time)
Invasion starts with 422,000 men at Polish-Russian border
A sacked and deserted Moscow reached with only 100,000 men
Retreat in dead of winter depicted on lower darker band and linked to temp scale and dates on bottom
…defies “the pen of the historian by its brutal eloquence.”
Only 10,000 made it home!
The Worst:
Only five pieces of data (not variables) in this “graphically preposterous” work of art.
Not one but two axis breaks!
3D effect = chartjunk
Since numbers all sum to 100%, plotting both is redundant.
Colors signify nothing
…“delighted connoisseurs of the graphically preposterous.”
Conclusion (in Tufte’s words)
“Design is choice. The theory of the visual display of quantitative information consists of principles that generate design options and that guide choices among options. The principles should not be applied rigidly or in a peevish spirit; they are not logically or mathematically certain; and it is better to violate any principle than to place graceless or inelegant marks on paper....
“What is to be sought in designs for the display of information is the clear portrayal of complexity. Not the complication of the simple; rather the task of the designer is to give visual access to the subtle and the difficult – that is, the revelation of the complex.”
Handy References
Cleveland, William S. 1994. The Elements of Graphing Data. Revised Edition. Summit, New Jersey: Hobart Press.
Tufte, Edward R. 2001. The Visual Display of Quantitative Information. 2nd Edition. Cheshire, Connecticut: Graphics Press.
Huff, Darrell. 1954. How to Lie with Statistics. New York, New York: W. W. Norton & Company.
Zumel, Nina. 2009. Good Graphs: Graphical Perception and Data Visualization. http://www.win-vector.com/, accessed 4 June 2010.
Pirrello, Chuck. 2010. Effective Visualization Techniques for Data Discovery and Analysis. Cary, North Carolina: SAS Institute, Inc.
Few, Stephen. 2009. Now You See It: Simple Visualization Techniques for Quantitative Analysis. Oakland, California: Analytics Press.
Cleveland, William S. 1993. Visualizing Data. Summit, New Jersey: Hobart Press.
Tufte, Edward R. 1990. Envisioning Information. Cheshire, Connecticut: Graphics Press.
Annesley, Thomas M. 2010. Put Your Best Figure Forward: Line Graphs and Scattergrams. Clinical Chemistry 56 (8): 1229-1233.
Bessler, LeRoy. 2004. Communication-Effective Use of Color for Web Pages, Graphs, Tables, Maps, Text, and Print. SUGI 29, Montreal, Canada. http://www2.sas.com/proceedings/sugi29/176-29.pdf, accessed 2 July 2010.