Data Display: How to Effectively Communicate Your Findings
Mary Purugganan, Ph.D.
http://www.owlnet.rice.edu/~cainproj/
Leadership & Professional Development Workshop
March 23, 2007
The population of the earth
[Figure annotated with the values 0.0004%, 0.05%, and 0.7%]
Deevey, E. S., Jr. Scientific American (1960) 194–204.
Why improve your data presentation?
• To draw accurate conclusions
• To demonstrate professionalism
• To increase your credibility
• To better analyze, synthesize, and understand your data
To see hidden relationships
To appreciate limitations, gaps
To formulate new questions
Today’s plan
• Examine function and design
Tables
Scatter plots and line graphs
Bar charts, histograms, frequency polygons
Photographs, micrographs
Diagrams
Video clips
• Recognize differences in contexts
Written documents
Visual presentations (posters, oral presentations)
• Discuss ethical issues in data display
• Revisit your own work
Tables: Function
Organize complicated data
Show specific results
Known (units) variable/unknown (units)
Tables: Design
Legend
• Place above table contents
• Must contain table number and title
• May contain a caption as well
Avoid rules (gridlines) in small tables
Use rules cautiously in large tables
• Choose narrow and/or gray lines
• Consider blocks of light color instead of rules
Example: Small table
Day, R.A. (1998) How to Write and Publish a Scientific Paper. Phoenix: Oryx Press
Decked heading
Example: Rules in large table
Rules should be narrow, faint, and unobtrusive
J. Donnell, Georgia Tech; http://www.me.vt.edu/writing/handbook
Example: Color bars in large table
Color bars aid readers who may have to, for example, look up and compare values often
J. Donnell, Georgia Tech; http://www.me.vt.edu/writing/handbook
Bivariate graphs
• X/Y axis: independent variable (what you control or choose to observe) vs. dependent variable
• Examples: Scatter plots / line graphs
Bar graphs / histograms
Scatterplots and line graphs
• Function
Plot two variables; x and y represent actual, continuous space
Good for showing trends / relationships
• Design
Avoid legends (keys) off to side in box
• Label lines (best for projected work), or
• Place key in caption or within graph (written documents)
Scatterplot with key in graph
Sanchez et al. (2004) Chem Eng J. 104:1-6
Line graph with key in legend
Day, R.A. (1998) How to Write and Publish a Scientific Paper. Phoenix: Oryx Press
Appropriate for written work, not projection
Revise: Distribution of Extensions based on Wi
[Figure: x-axis: Fractional Extension (0 to 1); y-axis: Frequency (0 to 0.4); series: 0<Wi<5 and 10<Wi<15]
Exercise: How would you revise?
Balanya et al. Science (2006) 313:1773.
Packed graphs: use with caution
Chmiola et al. Science (2006) 313:1760.
Ways to represent data sets
Valiela (2001) Doing Science: Design, Analysis, and Communication of Scientific Research. New York: Oxford University Press.
Ways to represent data sets
Valiela (2001) Doing Science: Design, Analysis, and Communication of Scientific Research. New York: Oxford University Press.
[Figure labels: median; upper/lower quartiles; min; max]
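The labels above (min, lower quartile, median, upper quartile, max) can be computed directly from a data set. The following is a minimal sketch using only the Python standard library; the function name `five_number_summary`, the sample values, and the choice of the "inclusive" quartile method are illustrative assumptions, not something taken from the slides or from Valiela.

```python
import statistics


def five_number_summary(values):
    """Return (min, lower quartile, median, upper quartile, max)."""
    data = sorted(values)
    # With n=4, statistics.quantiles returns the three quartile cut points.
    # method="inclusive" treats the data as the whole population (an assumption).
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, q2, q3, data[-1]


# Hypothetical measurements, for illustration only.
heights = [12, 15, 11, 19, 14, 22, 13, 17, 16]
summary = five_number_summary(heights)
print(summary)
```

Different quartile conventions give slightly different values for small samples, so it is worth stating which one was used when reporting the summary.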
Bar Graphs
Allow comparisons in values when the independent variable is a classification or category
[Figure axis labels: dependent variable; classification or category]
Choose the right graph
If your variables are categorical (distinct, with no intermediates), do not plot them with a line graph; a connecting line implies intermediate values that do not exist
Nonpoint Source News-Notes 43:5 (1995)
Histograms
• Function
Plot frequency vs. intervals of values
Good for seeing shape of the distribution
Good for screening of outliers or checking normality
Not good for seeing exact values (data are grouped into categories)
• Design
Bars should touch one another (unlike bar graphs): the lower limit of one interval is also the upper limit of the previous interval
Use only with continuous data
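The rule that the lower limit of one interval is the upper limit of the previous one can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function name `histogram_counts`, the half-open interval convention, and the sample values are all hypothetical and not from the slides.

```python
def histogram_counts(values, lower, width, n_bins):
    """Count values in contiguous intervals that share their edges.

    Each interval is half-open, [edge, edge + width), so a value that sits
    exactly on a shared edge is counted once, in the upper interval.
    The last interval also includes its upper edge.
    Values outside the overall range are ignored.
    """
    edges = [lower + i * width for i in range(n_bins + 1)]
    counts = [0] * n_bins
    for v in values:
        if v == edges[-1]:
            counts[-1] += 1
            continue
        index = int((v - lower) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return edges, counts


# Hypothetical continuous measurements, for illustration only.
pressures = [0.2, 0.9, 1.0, 1.4, 2.0, 2.7, 2.9, 3.0, 3.5, 4.0]
edges, counts = histogram_counts(pressures, lower=0.0, width=1.0, n_bins=4)
print(edges)
print(counts)
```

Because the intervals share edges and leave no gaps, the bars drawn from these counts touch one another, which is what distinguishes a histogram from a bar graph of categories.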
Example: Histograms
Fig. 5. Frequency histograms of ΔP2/μ values using different step distances. At a step distance of 10 μ (a) the percent histogram is symmetric, i.e. positive and negative values have similar frequencies. At larger step distances the histograms become broader (50 μ) and then disintegrate (500 μ). Class size: 1 torr.
Baumgartl et al. (2002) Comparative Biochemistry and Physiology 132:75-85.
S. Hofferberth et al. Radiofrequency-dressed-state potentials for neutral atoms. Nature Physics 2, pp. 710–716 (2006)
a, For the coherent splitting, a BEC is produced in the single well, which is then deformed to a double well. We observe a narrow phase distribution for many repetitions of an interference experiment between these two matter waves, showing that there is a deterministic phase evolution during the splitting. b, To produce two independent BECs, the double well is formed while the atomic sample is thermal. Condensation is then achieved by evaporative cooling in the dressed-state potential. The observed relative phase between the two BECs is completely random, as expected for two independent matter waves.
Exercise: how would you revise these histograms?
Schuck, P.J. et al. (2005) Chemical Physics. 318:7-11.
Fig. 2. (a) Histogram of total detected TPF photons from single-molecule time traces and an exponential fit to the distribution, yielding an e^-1 value of 6024 ± 730 photons. A histogram of single-molecule TPF lifetimes of DCDHF-6 in PMMA is shown in (b). The lifetime distribution is fit to a Gaussian; fit parameters are given in the text.
Frequency Polygons
• Function
Constructed from frequency tables
Visually appealing way of showing counts / frequency
Better than histogram for two sets of data because the graph appears less cluttered
• Design
Use a point (instead of histogram bar) and connect the points with straight lines
May shade area underneath the line
http://www.olemiss.edu/courses/psy214/Lectures/Lecture2/lex_2.htm
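As a rough sketch of the construction described above, the points of a frequency polygon can be derived from a frequency table by placing one point per interval. Placing each point at the interval midpoint is an assumption of this sketch (a common convention, not stated on the slide); the function name and both data sets are hypothetical.

```python
def frequency_polygon_points(edges, counts):
    """Return one (x, y) point per interval, with x at the interval midpoint."""
    if len(edges) != len(counts) + 1:
        raise ValueError("need exactly one more edge than counts")
    midpoints = [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]
    return list(zip(midpoints, counts))


# Two hypothetical frequency tables that share the same intervals.
edges = [0, 10, 20, 30, 40]
group_a = [2, 5, 8, 3]
group_b = [1, 7, 4, 6]

points_a = frequency_polygon_points(edges, group_a)
points_b = frequency_polygon_points(edges, group_b)
print(points_a)
print(points_b)
```

Connecting each list of points with straight lines gives two polygons on the same axes, which is the two-data-set case where the slide suggests a polygon is less cluttered than overlaid histograms.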
Three-variable graphs
• Perspective graphs
• Contour plots
• See Doing Science: Design, Analysis, and Communication of Scientific Research (Valiela, 2001)
http://www.itl.nist.gov/div898/handbook/eda/section3/contour.htm
Kazhdan, D. et al. (1995) Physics of Fluids 7:2679-2685
No chartjunk!
Graphical simplicity: keep “data-ink” to “non-data-ink” ratio high
[Figure: Rate of seedling growth at three different temperatures. x-axis: Days of growth (0, 8, 16, 24); y-axis: Mean seedling height (mm), 0 to 45; series: 20 °C, 25 °C, 30 °C. Annotation: Too much non-data ink]
[Figure: same axes, series labeled 30 °C, 25 °C, 20 °C. Annotation: Emphasis on data]
No chartjunk!
• Gridlines
Rarely necessary
Better when thin, gray
[Figure: example chart with y-axis 0 to 10 and Series1 to Series4]
• Fill patterns
Avoid moiré effects / vibrations
Gray shading is preferable to hatching
• Avoid 3-dimensional bars
Photographs
• Function
Good for documenting physical observations
Usually qualitative but supported by quantitative data
• Design
Place title and caption below photograph(s)
Crop and arrange several photographs to facilitate understanding
Insert scale bars when necessary
Shahbazian et al., Neuron (2002)
C.R. Twidale (2004) Earth Sci Rev 67:159-218
Micrographs
Lambert et al. (2004) Virology 330:158-67
Fig. 2. GFP.S co-localizes with wild-type S at the ER. Shown is the intracellular distribution of GFP.S expressed either alone (squares a–c) or together with SHA (squares d–i) in COS-7 cells. Cells were fixed, permeabilized, and examined by fluorescence microscopy. (a, d, and g) GFP fluorescence (green); (b and e) immunostaining with a mouse antibody to PDI followed by AlexaFluor 494-conjugated goat anti-mouse IgG (red); (h) immunostaining with a mouse anti-HA antibody followed by AlexaFluor 494-conjugated goat anti-mouse IgG (red) to visualize SHA. Squares c, f, and i are the corresponding merged images so that overlapping red and green signals appear yellow.
Ali et al. (1998) Thin Solid Films 323:105-109
Fig. 3. STM micrographs of Ag (100). (a) 0.1 × 0.1 area. (b) Edge-enhanced image of (a), (c) 500 Å × 500 Å and (d) 100 Å × 100 Å areas, respectively.
Diagrams & drawings
• Function
Show parts and relationships
Focus audience attention on what is essential
• Design
Use color to show relationships and draw eye
Avoid unintentional changes in proportion and scale
Leuptow, R.M. (June 2004) NASA Tech Briefs.
Video clips
• Function
Show processes in real time
Supplement online journal articles
May be qualitative but supported by quantitative data
• Design
No conventions yet observed / published
Video clips
Shahbazian et al. (2002) Neuron 35:243-54.
Supplemental movie S2 online at:
http://www.neuron.org/cgi/content/full/35/2/243/DC1/
Design data display for your context
Written documents
Theses
Manuscripts
Reports
Visual presentations
Seminars/ oral presentation
Posters
Conventions for written documents
• Number and title (caption) each graphic
Table 1. Xxxxxxx…
Figure 3. Xxxxxxx…
• Identify graphics correctly
Tables are “tables”
Everything else (graph, illustration, photo, etc.) is a “figure”
Conventions for written documents
• Refer to graphics in the text
“Table 5 shows…”
“… as shown in Figure 1.”
“… (Table 2).”
• Incorporate graphics correctly
Place graphics close to text reference
Caption correctly: above tables, below figures
Tips for written documents
• Design graphics for black-and-white printers and photocopies
• Figure and table captions can be long and informative (follow individual discipline and journal conventions)
• Remember audience when designing
Journals: learn as much as possible about audience to identify needs, areas of expertise
Thesis: design for “outside” committee member
Tips for visual presentations
Uniqueness of posters and oral presentations
• User is not a reader
Is not able to assimilate great detail
May not have time to process confusing data
• Oral communication accompanies what is printed / projected
• “Free” and “guaranteed” color
Use color purposefully
Avoid overuse of decorative color
Avoid too much color (e.g., background fill)
Avoid layering two colors of similar intensity (e.g., red on blue)
Be sensitive to red/green color blindness
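One way to check whether two layered colors are too similar in intensity is to compute a contrast ratio before building the slide. The slides do not name a metric; the sketch below swaps in the relative-luminance and contrast-ratio formulas from the WCAG 2.x accessibility guidelines as an assumed proxy, and the function names are hypothetical.

```python
def relative_luminance(rgb):
    """Relative luminance of an 8-bit sRGB color, per the WCAG 2.x formula."""
    def linearize(channel):
        c = channel / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """WCAG contrast ratio, from 1 (identical luminance) to 21 (black on white)."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


red, blue = (255, 0, 0), (0, 0, 255)
black, white = (0, 0, 0), (255, 255, 255)

red_on_blue = contrast_ratio(red, blue)
black_on_white = contrast_ratio(black, white)
print(round(red_on_blue, 2), round(black_on_white, 2))
```

Under this metric, red on blue scores far lower than black on white, which is consistent with the advice above. A luminance check like this does not by itself address red/green color blindness, which concerns hue rather than intensity.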
Replace titles and captions with message headings
Visual explanations
• Tag image with explanations
• Interpret (don’t just show) data (esp. on posters!)
Exercise: How would you revise for PPT?
Farchioni et al. Eur. Phys. J. C (2006) 47:461.
Ethics in data display
Putting data in the best light vs. trying to deceive through display
Data can be
• Distorted (perceived visual effect different from numerical representation)
• Misrepresented (particularly visual data)
• Cooked (selecting from among observations)
– Mendel?
• Trimmed (ignoring extreme values in a data set)
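To make the effect of trimming concrete, the sketch below compares a mean computed from all observations with a mean computed after the extreme values are dropped. The data and the helper name `trimmed` are hypothetical, chosen only to show how much a single extreme value can move a summary statistic.

```python
import statistics


def trimmed(values, n_each_end):
    """Return the sorted values with n_each_end items removed from each end."""
    data = sorted(values)
    return data[n_each_end:len(data) - n_each_end]


# Hypothetical observations containing one extreme value.
observations = [4.8, 5.1, 5.0, 4.9, 5.2, 9.6]

full_mean = statistics.mean(observations)
trimmed_mean = statistics.mean(trimmed(observations, 1))
print(round(full_mean, 2), round(trimmed_mean, 2))
```

The two means differ noticeably, so a reader who sees only the trimmed figure, with no mention that values were dropped, gets a different picture of the data than the observations support.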
Distortion
Readers do not compare areas in circles correctly
(larger circle does not appear to have the increased area it actually does)
Number of people on Drug A
Number of people on Drug B
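A related pitfall arises on the drawing side: if the radius of each circle is made proportional to the value, the area grows with the square of the value and exaggerates the difference. The sketch below, with hypothetical counts and a hypothetical helper `radius_for_area`, sizes the circles so that area rather than radius is proportional to the value. It does not remove the perceptual problem noted above; it only keeps the drawing itself numerically honest.

```python
import math


def radius_for_area(value, area_per_unit=1.0):
    """Radius of a circle whose AREA is proportional to value."""
    return math.sqrt(value * area_per_unit / math.pi)


# Hypothetical counts: Drug B has four times as many people as Drug A.
drug_a, drug_b = 100, 400

# Radius proportional to value: the area ratio becomes the value ratio squared.
naive_area_ratio = (drug_b / drug_a) ** 2

# Area proportional to value: the radius ratio is the square root of the value ratio.
r_a, r_b = radius_for_area(drug_a), radius_for_area(drug_b)
radius_ratio = r_b / r_a
area_ratio = (math.pi * r_b ** 2) / (math.pi * r_a ** 2)

print(naive_area_ratio, round(radius_ratio, 3), round(area_ratio, 3))
```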
Distortion
3-dimensional graphs may fool the eye
[Figure: 3-dimensional graph of Series1 for categories A, B, and C; y-axis 0 to 90]
Cleveland’s experiments (1985)
Accuracy in perceiving graphical cues, listed from most accurate perception to least accurate perception:
Position along axis
Length
Angle / slope
Area
Volume
Color / shade
How to avoid distortion
• Show enough data
• Be aware of potential sources of distortion
Scale of graph (limits; log)
Placement of origin
Shape (length of axes)
Omission of data range in a continuum (implied continuum)
Linear and logarithmic scales
Schulze and Mealy (2001) American Scientist 89: 209.
Taking a log spreads out small values and compresses larger ones!
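The statement above can be checked numerically. In this minimal sketch (the sample values are arbitrary and chosen for illustration), values spanning four orders of magnitude are compared on a linear axis and on a base-10 logarithmic axis.

```python
import math

# Arbitrary values spanning four orders of magnitude.
values = [1, 10, 100, 1000, 10000]

# Position of each value on a base-10 logarithmic axis.
log_positions = [math.log10(v) for v in values]

# Spacing between neighboring values on each kind of axis.
linear_gaps = [b - a for a, b in zip(values, values[1:])]
log_gaps = [b - a for a, b in zip(log_positions, log_positions[1:])]

print(linear_gaps)
print([round(g, 6) for g in log_gaps])
```

On the linear axis the gap from 1 to 10 is a thousand times smaller than the gap from 1000 to 10000; on the log axis both gaps are the same size. Small values gain room and large values lose it, which is why the choice of scale should be stated clearly on the graph.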
Ethics in display of visual data
Photographic data: particularly vulnerable to trimming
field of view selection
cropping
software (e.g., Photoshop) manipulation of contrast, brightness, etc.
• Editorial in Nature (Feb 23, 2006)
“In Nature’s view, beautification is a form of misrepresentation”
Concise guide to image handling in Guides for Authors (Nature family of journals)
http://www.nature.com/nature/authors/infosheets.html
Accessed 10/12/06
Summary
• Consider function when choosing visual
• Follow design conventions
• Adapt visual for context (written vs. visual)
• Design for audience
• Question your data selection and representation; avoid cooking, trimming, and distortion
Resources
• Burnett, Rebecca (2001) Technical Communication. Fort Worth: Harcourt College Publishers.
• Cleveland, W.S. (1985) The Elements of Graphing Data. Wadsworth.
• Technical Writing: Resources for Teaching (esp. Illustration section written by J. Donnell, Georgia Tech). Accessed 11/18/04. http://www.me.vt.edu/writing/handbook/
• Goodstein, David. Conduct and Misconduct in Science. Accessed 11/19/04. http://www.physics.ohio-state.edu/~wilkins/onepage/conduct.html/
• Klotz, Irving M. (1992) Cooking and trimming by scientific giants. FASEB J 6:2271-73.
• Not picture-perfect: Nature’s new guidelines for digital images encourage openness about the way data are manipulated. Editorial. (2006) Nature 439:891-92.
• Tufte, Edward R. (1983) The Visual Display of Quantitative Information. Cheshire, CT: Graphics Press.
• Valiela, Ivan (2001) Doing Science: Design, Analysis, and Communication of Scientific Research. New York: Oxford University Press.
SAMPLES
[Figure: Fig. 1: Loading plot for the first three PCs vs. the assay index]
[Figure: Cytocompatibility: Direct contact assay. x-axis: % HEA/% AAm (20/0, 18/5, 10/13, 15/15); y-axis: fraction of LIVE cells]
[Figure: Ave. Peak Force vs. Pulling Velocity for Various Spring Constants. x-axis: Pulling Velocity (nm/s), ticks at 1, 10, 100, 1000, 10000, 100000; y-axis: Ave. Peak Force (pN), 100 to 300; series: Fernandez, k = 0.017 N/m, k = 0.068 N/m, k = 0.071 N/m, each with a "Log." fit line]