Effective Displays of Data Need More Attention in Statistics Education
Thomas E. Bradstreet, Ph.D., Experimental Medicine Statistics, Merck Research Labs
Michael Nessly, M.S., Clinical Biostatistics, Merck Research Labs
Thomas H. Short, Ph.D., Mathematics Department, Indiana University of PA
JSM 2006
2
Outline
Motivation
Philosophy, strategy, approach
  Information continuum
  Perception, design, and construction
  Educational objectives
  Hands-on activities, interactive discussion, real data
Course content
  Parallel coverage for graphs (70%) and tables (30%)
  Interactive discussions
  Workshops and examples
Student evaluations
Questions and comments
5
Importance of Graphs and Tables In General
Data analysis
  Graphs reveal structure and patterns.
  Tables organize and document findings.
Communicate results from experiments and surveys
  Oral presentations
  Written reports and refereed publications
Target audiences
  Unfamiliar with details of the data
  Less skilled quantitatively
  More statistically naïve
  Internal or external to an organization
6
Importance of Graphs and Tables: Internal Communications – Industry
Document past activities; summarize ongoing efforts; support decisions on future initiatives
Example: Pharmaceutical research (animal and human)
  Science is communicated through a series of oral presentations and written reports.
  Critical review by different scientific disciplines and levels of management
  Competition for resources
Education and training
  Interdisciplinary communication
7
Importance of Graphs and Tables: External Activities – In General
Professional Meetings: presentations
Refereed journals: publications
Competitions: best written paper, best oral presentation, best data analysis
8
Importance of Graphs and Tables: External Activities – Industry and Academia
Industry
  Product research, development, and marketing
  Productivity and fiscal health
  Recruiting efforts
Academia
  Grant writing
  Seminars and colloquia
  Consulting and contract work
  Student internships
  Job interviews
Academic – Industrial collaborations
9
Importance of Graphs and Tables: Academic Preparation
Teaching: statistics, many service disciplines
Course work: data analysis, modeling, simulation
Research (Ph.D., M.S., B.S. honors): oral and written presentation
Qualifying exams: RSS Examination Board concerns and guidance
11
Information Continuum
Exploration    Understanding    Communication
Discovery      Inference        Clarity
Insight        Decisions        Efficiency
Data Analysis Presentation
12
Effective Communication = Perception ∩ Design ∩ Construction
[Venn diagram: Effective Communication is the region where Perception, Design, and Construction overlap.]
13
Educational Objectives
1. Provide exposure to the principles of perception, design, and construction.
2. Be able to construct, revise, critique, and interpret graphic and tabular displays.
3. Take a more informed leadership role in effective communication and strategic decision making.
4. Build upon the intellectual tools and resources provided by the course.
14
Pedagogical Strategy
Workshop and example driven
Interactive discussions
Real examples and data
  Merck studies
  Scientific literature
  Mass media
16
Course Content: Introduction
Importance of graphs and tables
Graphs vs. tables vs. text
Context, common sense
“Grables”
Motivating examples
17
“Grables” (More Graph Than Table)
Three Bioequivalence Trials
[Figure: AUC ratio (Test/Standard) for Trial 1, Trial 2, and Trial 3, plotted on an axis with labeled values 0.50, 0.80, 1.00, 1.25, 1.80, 2.32, and 2.70. The plotted ratios are annotated in parentheses and range from 0.91 to 1.27.]
18
“Grables” (More Table Than Graph)
Individual (Ο) and Mean ( ) Percent Changes From Baseline
[Figure: Hour 24 change (%) for each of eight treatment groups: Placebo (Panel A), Placebo (Panel B), and 0.2, 0.5, 1.0, 2.0, 5.0, and 10.0 mg Fasted. Each row of the graph is accompanied by tabulated N, Mean, SD, Min, and Max.]
19
Course Content: Design and Construction
Anatomy
  Graphs: Flatland; small multiples; multifunctioning graphical elements; specific components, …
  Tables: Vertical and horizontal alignment, specific components, …
Guidelines
  Graphs: Effective graphs; erase non-data-ink and redundant data-ink; data density; small multiples, …
  Tables: Create a logical visual pattern; rounding numbers, …
Workshops 1 and 2
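The "rounding numbers" guideline for tables can be sketched in a few lines. This is only an illustration: the choice of two significant digits, the helper name `round_sig`, and the sample values are assumptions, not taken from the slides.

```python
import math


def round_sig(x: float, digits: int = 2) -> float:
    """Round x to the given number of significant digits."""
    if x == 0:
        return 0.0
    # Position of the leading digit, e.g. 0 for 1-9, 3 for 1000-9999.
    exponent = math.floor(math.log10(abs(x)))
    return round(x, digits - 1 - exponent)


# Hypothetical table entries before and after rounding.
values = [0.3712, 12.48, 1234.5]
rounded = [round_sig(v) for v in values]
print(rounded)
```

Rounded entries are easier to scan and compare down a column; the unrounded values can stay in an archival table.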
20
Course Content: Perception, Clarity, and Communication
Graphs: Lie factor, visual area vs. numeric measure, proportionality, aspect ratio, mental subtraction, chartjunk, scales, scale breaks, zero, plotting symbols, reference lines, color, Cleveland’s ordered perceptual tasks, …
Tables: Illustrative, archival/storage, presentation, text, matrix, …
Workshop 3
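The "lie factor" in the list above is usually defined (following Tufte) as the size of the effect shown in the graphic divided by the size of the effect in the data. The sketch below assumes that definition; the function names and all numbers are hypothetical.

```python
def relative_change(first: float, second: float) -> float:
    """Size of an effect, expressed as a fraction of the starting value."""
    return abs(second - first) / abs(first)


def lie_factor(data_first, data_second, drawn_first, drawn_second) -> float:
    """Effect shown in the graphic divided by the effect in the data."""
    return relative_change(drawn_first, drawn_second) / relative_change(
        data_first, data_second
    )


# Hypothetical case: the data rise from 100 to 150 (a 50% increase),
# but the bars are drawn 2 cm and 6 cm tall (a 200% increase).
factor = lie_factor(100, 150, 2.0, 6.0)
print(factor)
```

A value near 1 means the graphic represents the data proportionally; values well above or below 1 indicate exaggeration or understatement.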
22
Visual Area vs. Numerical Measure: Dot Chart vs. Pie Chart
[Figure: the same nine amounts (20, 25, 40, 50, 70, 90, 150, 180, 220), labeled A through I, shown once as a dot chart (Amounts by Labels) and once as a pie chart.]
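As a rough sketch of why the two displays read differently, the code below computes how each of the nine amounts on this slide would be encoded: as a slice angle in a pie chart and as a position along a common scale in a dot chart. The function names are illustrative, and scaling the dot axis from 0 to the largest amount is an assumption.

```python
# Amounts read from the slide; the assignment of labels A-I is not used here.
amounts = [20, 25, 40, 50, 70, 90, 150, 180, 220]
total = sum(amounts)


def pie_angle(value: float) -> float:
    """Slice angle in degrees: the reader must judge angle or area."""
    return 360.0 * value / total


def dot_position(value: float) -> float:
    """Position along a common scale from 0 to the largest amount."""
    return value / max(amounts)


for v in amounts:
    print(f"{v:>4}  angle={pie_angle(v):6.2f} deg  position={dot_position(v):.3f}")
```

In the dot chart every value is read against the same axis, which is the kind of judgment Cleveland's ordering of perceptual tasks ranks as most accurate.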
23
Proportionality: Data, Lines, Curves
Square Scatter Plot
[Figure: NET AUC (pg•hr/mL × 10³), Drug D versus Placebo, with both axes running from 80 to 280 on a square plotting region.]
24
Proportionality: Data, Lines, Curves
Does physical slope = algebraic slope?
[Figure: the same Drug D versus Placebo scatter plot, both axes from 80 to 280, drawn once in landscape and once in portrait orientation.]
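The question on this slide can be made concrete with a small calculation. The sketch below assumes a line of algebraic slope 1 on axes that both run from 80 to 280, as in the figure; the plot dimensions and the function name are hypothetical.

```python
def physical_slope(algebraic_slope, x_span, y_span, width, height):
    """Slope of the line as drawn on the page (rise over run in physical units)."""
    x_scale = width / x_span    # physical units per data unit, horizontally
    y_scale = height / y_span   # physical units per data unit, vertically
    return algebraic_slope * y_scale / x_scale


span = 280 - 80  # both axes cover the same data range

square = physical_slope(1.0, span, span, width=5.0, height=5.0)
landscape = physical_slope(1.0, span, span, width=6.0, height=4.0)
portrait = physical_slope(1.0, span, span, width=4.0, height=6.0)
print(square, landscape, portrait)
```

Only the square plot draws the line of equality at 45 degrees; the landscape and portrait versions flatten or steepen it even though the data are unchanged.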
26
Avoid Mental Subtraction
Mean Change in Supine Blood Pressure Following MK and Placebo (left) and Difference Between MK and Placebo Mean Change (right) at Each Time Point
[Figure: systolic and diastolic supine B.P. (mmHg) for MK (rizatriptan) and placebo (Pbo) at 0, 2, 4, 8, and 12 hours postdosing. Left panels: mean change from baseline. Right panels: difference in mean changes. Baseline systolic: MK = 116.6, Pbo = 116.8 (MK − Pbo = −0.2). Baseline diastolic: MK = 76.1, Pbo = 77.2 (MK − Pbo = −1.1).]
28
Error Bars (No)
[Figure: plasma nicorandil concentration (ng/ml, 0 to 600) versus time (0 to 10 hrs) for the 10 mg, 20 mg, 40 mg, and 60 mg doses.]
29
Workshops
Workshop 1: Constructing Graphs
  An “Unusual Episode” and “Favorite Datasets”
Workshop 2: Constructing Tables
  Age discrimination data and “Favorite Datasets”
Workshop 3: Revised Graphs and Tables
  Existing graphs and tables, internal to Merck, published in literature, and others
30
Workshop Approach and Benefits
Mix of categorical and continuous examples
Groups of six prepare and present results
Variability of approaches interests class and staff
Variability of backgrounds and responsibilities leads to discussion and division of labor
Workshops break up lecture presentations
35
Student Course Evaluations
Of those completing the course evaluation form …
26/27 (96.3%): Said the course will assist them in preparing effective displays of data.
28/29 (96.6%): Strongly agree/agree the workshops were interesting, helpful, and fun.
28/29 (96.6%): Rated course as either excellent (62.1%) or good (34.5%).
27/27 (100%): Would recommend this course to others in their discipline with similar job responsibilities.
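The percentages above can be checked with a short calculation. The counts 26/27, 28/29, and 27/27 come from the slide; the split of the 28 favorable ratings into 18 "excellent" and 10 "good" is an assumption inferred from the reported 62.1% and 34.5%, and `pct` is an illustrative helper.

```python
def pct(numerator: int, denominator: int, digits: int = 1) -> float:
    """Percentage rounded to the given number of decimal places."""
    return round(100.0 * numerator / denominator, digits)


print(pct(26, 27))  # course will assist in preparing displays
print(pct(28, 29))  # workshops interesting, helpful, and fun
print(pct(27, 27))  # would recommend the course

# Assumed split of the 28 favorable ratings (not stated on the slide).
excellent, good = 18, 10
print(pct(excellent, 29), pct(good, 29))
```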
36
Student Course Evaluations
Some students’ comments:
“I was able to add my comments where necessary in areas I knew about. In areas I was not knowledgeable, I watched and learned.”
“The workshops were really valuable in reinforcing or giving meaning to the things we learned about and to consider when trying to communicate well with graphical displays of data.”
“A great course filling a core need within Merck. I will be discussing with RA management how attendance will improve effective external communications and enhance regulatory interactions.”
37
Student Course Evaluations
“I agree that it (the course) needs to be made mandatory for really many disciplines, Clin. Pharm., Clin. Research, Drug Metabolism, Safety Assessment, BARDS, Reg. Affairs, Epidemiology, WCDMO.”
“I never got this or saw it offered before in school or at Merck.”
“Relevant to the job, relevant to push Merck back to the top of big pharma.”
“Everyone can walk away with something they didn’t know before.”
38
Summary
Both students and working professionals need to be better equipped in creating and interpreting graphical and tabular displays of data.
Participants in the course see the immediate value of their course experience, and they are motivated to continue to improve their skill sets.
Our activity-based approach was well received.
In our scientific and regulatory environment, effective displays of data are essential to communication, decision making, and competition. This applies to other environments as well.
39
References
Anscombe, F.J. (1973). “Graphs In Statistical Analysis”, The American Statistician, 27: 17–21.
Becker, R.A. and Keller-McNulty, S. (1996). “Presentation Myths”, The American Statistician, 50: 112–115.
Block, J.R., and H.E. Yuker (1992). Can You Believe Your Eyes?, New York: Brunner/Mazel Publishers.
Bradstreet, T.E. (1999). “Graphical Excellence – The Importance of Sound Principles and Practices for Effective Communication”, Bulletin of the International Statistical Institute, Book 2, 52nd Session, Helsinki, Finland, August 10–18, 1999, 271–274.
Chambers, J.M., W.S. Cleveland, B. Kleiner, and P.A. Tukey (1983). Graphical Methods for Data Analysis, Belmont, CA: Wadsworth International Group and Boston, MA: Duxbury Press.
Cleveland, W.S. (1984). “Graphical Methods for Data Presentation: Full Scale Breaks, Dot Charts, and Multibased Logging”, The American Statistician, 38(4): 270–280.
Cleveland, W.S. (1985, 1994). The Elements of Graphing Data, Monterey, CA: Wadsworth Advanced Books and Software and Summit, NJ: Hobart Press.
40
References
Cleveland, W.S. (ed.) (1988). The Collected Works of John W. Tukey, Volume V Graphics: 1965–1985, Pacific Grove, CA: Wadsworth and Brooks/Cole Advanced Books and Software.
Cleveland, W.S. (1993). Visualizing Data, Summit, NJ: Hobart Press.
Cleveland, W.S. and M.E. McGill (1988). Dynamic Graphics for Statistics, Belmont, CA: Wadsworth and Brooks/Cole Advanced Books and Software.
Cleveland, W.S. and R. McGill (1984). “Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods”, Journal of the American Statistical Association, 79(387): 531–554.
Dalal, S.R., E.B. Fowlkes, and B. Hoadley (1989). “Risk Analysis of the Space Shuttle: Pre-Challenger Prediction of Failure”, Journal of the American Statistical Association, 84(408): 945–957.
Dalal, S. and B. Hoadley (1991). “Comment”, Journal of the American Statistical Association, 86(416): 921–922.
David, H. (1998). “Pictures, Please!”, RSS News, 26(1): 7.
Ehrenberg, A.S.C. (1977). “Rudiments of Numeracy”, Journal of the Royal Statistical Society, Series A, 140(3): 277–297.
Farquhar, A.B. and H. Farquhar (1891). Economic and Industrial Delusions, New York: G.P. Putnam’s Sons.
41
References
Gould, A.L., H. Kaplan, P.A. Lachenbruch, and K. Monti (1999). “Guidelines for Preparing Effective Presentations”, http://www.enar.org/presentationguidelines.htm.
Kosslyn, S.M. (1994). Elements of Graph Design, New York: W.H. Freeman and Company.
Lavine, M. (1991). “Problems in Extrapolation Illustrated With Space Shuttle O-Ring Data”, Journal of the American Statistical Association, 86(416): 919–921.
Oliver, F. (1998). “How to Present Information in Graphics and Diagrams”, Notes on Behalf of the Examinations Board, Royal Statistical Society.
Pikounis, B., T.E. Bradstreet, and S.P. Millard (2001). “Graphical Insight and Data Analysis for the 2,2,2 Crossover Design”, Chapter 7 in S.P. Millard and A. Krause (eds.), Applied Statistics in the Pharmaceutical Industry with Case Studies Using S-PLUS, New York: Springer-Verlag.
Robbins, N.B. (2005). Creating More Effective Graphics, Hoboken: John Wiley & Sons.
Short, T.H. and T.E. Bradstreet (1997, 2001). http://www.math.iup.edu/~tshort/bradstreet/
42
References
Sprent, P. (1998). “Conference Presentations Need Improving”, RSS News, 26(4): 12–13.
Tufte, E.R. (1983). The Visual Display of Quantitative Information, Cheshire, CT: Graphics Press.
Tufte, E.R. (1990). Envisioning Information, Cheshire, CT: Graphics Press.
Tufte, E.R. (1997). Visual Explanations, Cheshire, CT: Graphics Press.
Tufte, E.R. (2001). The Visual Display of Quantitative Information, Second Edition, Cheshire, CT: Graphics Press.
Tufte, E.R. (2003). The Cognitive Style of PowerPoint, Cheshire, CT: Graphics Press.
Tukey, J.W. (1977). Exploratory Data Analysis, Reading, MA: Addison-Wesley Publishing Company.
Tukey, J.W. (1990). “Data-Based Graphics: Visual Display in the Decades to Come”, Statistical Science, 5: 327–339.
Tukey, J.W. (1993). “Graphic Comparisons of Several Linked Aspects: Alternatives and Suggested Principles” (with Discussions and Rejoinder), Journal of Computational and Graphical Statistics, 2: 1–49.
Wainer, H. (1984). “How to Display Data Badly”, The American Statistician, 38: 137–147.
43
References
Wainer, H. (1997). Visual Revelations, New York: Copernicus, Springer-Verlag.
Wainer, H. (2005). Graphic Discovery, Princeton: Princeton University Press.
Wilkinson, L. (2005). The Grammar of Graphics, Second Edition, New York: Springer.
44
Acknowledgements
Cindy White (Sr. Statistician Assistant)
Bert Gunter (Biometrics Research)
Larry Gould (Investigative Research)
Vanessa Radcliff (Administrative Assistant)
48
Use Common Sense
[Figure: two time series for 1960 to 1990 shown side by side. Left: Number of Ph.D. Degrees Awarded on College Campuses (thousands, 2 to 10). Right: Sales of SuperCaff Soda on College Campuses ($ billions, 1 to 5).]
49
Motivating Example
What does this graph tell you?
[Figure: Serum Alkaline Phosphatase (axis 0 to 80, labeled in steps of 5) by Dose (Placebo, 5 mg, 10 mg, 100 mg), with a legend for Day 1, Day 2, Day 3, and Total, annotated "p-value = .03".]
51
Effective Graphs
Serve a defined purpose: Exploration, understanding, communication.
Show the data.
Tell the truth.
Encourage comparison of different pieces of data.
Reveal a large amount of quantitative information in a small region.
Reveal the data at several levels of detail; effectiveness increases with the complexity of the data.
52
Effective Graphs
Are only as complex as required by the task that they are designed to perform; they avoid pomposity
Provide impact: Communicate with clarity, precision, and efficiency.
Are a visual metaphor for the data
Are closely integrated with statistical and verbal descriptions of the data
53
Erase Non-Data-Ink
[Figure: AUC (nMol*hr) for Drug A versus Drug B, both axes from 45 to 205, with points plotted as the symbols 1 and 2. The same scatter plot is shown twice: a version marked NO and a cleaner version marked YES.]
55
Visual Area vs. Numerical Measure: Stacked Bar Charts
Clinic by Age Categories: What is going on?
[Figure: stacked bar chart of Count (0 to 100) for Clinic 1 through Clinic 4, stacked by age category (21 to 30, 31 to 40, 41 to 50, 51 to 60, and 61 to 72 yrs.). Note on slide: legend reordered.]
56
Visual Area vs. Numerical Measure: Dot Chart
Age Categories by Clinic: What is going on?
[Figure: dot chart of Count (0 to 50) by age category (21 to 30, 31 to 40, 41 to 50, 51 to 60, and 61 to 72 yrs.), grouped by clinic (Clinic 1 through Clinic 4).]
57
Visual Area vs. Numerical Measure: Dot Chart
Age Categories by Clinic: What is going on?
[Figure: the same dot chart of Count (0 to 50) grouped by clinic (Clinic 1 through Clinic 4), with the age categories listed in the reverse order, from 61 to 72 yrs. down to 21 to 30 yrs.]
58
Proportionality: Data, Lines, Curves
Does physical slope = algebraic slope?
[Figure: three versions of the same plot, with x values from 105 to 115 and y values from 210 to 230: unbroken axes starting at 0, broken axes, and a non-zero origin at (105, 210).]
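One way to see what is at stake in the choice of origin is to compute how much of each axis the data actually occupy. The sketch below uses the axis values from the figure (x from 105 to 115); the function name is illustrative, and treating the data as spanning the full 105 to 115 range is an assumption.

```python
def axis_fraction_used(data_min, data_max, axis_min, axis_max):
    """Fraction of the axis length that the data range occupies."""
    return (data_max - data_min) / (axis_max - axis_min)


# Unbroken axis that starts at zero: the data are squeezed into one end.
with_zero = axis_fraction_used(105, 115, 0, 115)

# Non-zero origin at 105: the data fill the whole axis.
non_zero = axis_fraction_used(105, 115, 105, 115)

print(f"{with_zero:.1%} of the axis with a zero origin")
print(f"{non_zero:.1%} of the axis with a non-zero origin")
```

A broken axis recovers the resolution of the non-zero origin, but the drawn slope across the break no longer corresponds to the algebraic slope.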
59
Proportionality: Data, Lines, Curves
Does physical slope = algebraic slope?
[Figure: the same three plots (unbroken axes, broken axes, non-zero origin at (105, 210)) with the panels rearranged.]
60
Avoid Mental Subtraction
Mean Response Over Time
[Figure: four panels, Outcome 1 through Outcome 4, each showing mean response (0 to 200) for Drug A and Drug B over 0 to 200 days.]
61
Avoid Mental Subtraction
Difference (Drug A – Drug B) in Mean Responses
[Figure: difference in mean responses (axis from −10 to 40) over 0 to 200 days.]
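The idea behind this pair of slides is to plot the quantity of interest directly instead of asking the reader to subtract two curves by eye. A minimal sketch follows, with hypothetical means for Drug A and Drug B; none of these numbers come from the slides.

```python
def differences(series_a, series_b):
    """Pointwise difference A - B for two series observed at the same times."""
    if len(series_a) != len(series_b):
        raise ValueError("series must have the same length")
    return [a - b for a, b in zip(series_a, series_b)]


# Hypothetical mean responses at each visit.
days = [0, 50, 100, 150, 200]
mean_a = [100.0, 120.0, 135.0, 150.0, 160.0]
mean_b = [100.0, 115.0, 120.0, 125.0, 125.0]

diff = differences(mean_a, mean_b)
for day, d in zip(days, diff):
    print(f"day {day:>3}: A - B = {d:+.1f}")
```

Plotting `diff` against `days` shows the widening gap at a glance, which is hard to judge from two rising curves.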
62
Abusive Tick Marks and Labels
[Figure: mean % reflux time by treatment, on an axis with tick marks and labels at every integer from 0 to 13.]
Treatment        Mean Response
Placebo          11.3
40 mg h.s.       7.4
20 mg b.i.d.     5.9
40 mg b.i.d.     2.5
63
Abusive Tick Marks and Labels (All that is really needed is …)
[Figure: Mean % Reflux Time, with axis labels only at 0 and at the four means: 2.5 (40 mg b.i.d.), 5.9 (20 mg b.i.d.), 7.4 (40 mg h.s.), and 11.3 (Pbo).]
64
Small Multiples (Do Not Do This)
Mean Change From Baseline in Supine Diastolic Blood Pressure
[Figure: four panels (Day 1, Hour 6; Day 1, Hour 24; Day 5, Hour 6; Day 5, Hour 24), each plotting mm Hg against Dose (mg): P, 50, 100, 150, AC. The vertical scales differ from panel to panel, with outermost tick labels of 16, 12, 20, and 15.]
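One plausible reading of "Do Not Do This" is that the four panels use different vertical scales, so they cannot be compared directly. Under that assumption, a common scale can be derived as sketched below; the panel limits are the tick extremes from the figure, and the function name is illustrative.

```python
# Axis limits (mm Hg) read from the four panels of the slide.
panel_limits = {
    "Day 1, Hour 6": (0, 16),
    "Day 1, Hour 24": (0, 12),
    "Day 5, Hour 6": (0, 20),
    "Day 5, Hour 24": (0, 15),
}


def common_limits(limits):
    """Smallest single axis range that covers every panel."""
    lows = [low for low, _ in limits.values()]
    highs = [high for _, high in limits.values()]
    return min(lows), max(highs)


shared = common_limits(panel_limits)
print(shared)
```

Drawing every panel with the `shared` range lets equal heights mean equal changes across days and hours.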