Data Visualization Analysis
of a Highway Loss Data Institute
report
By Drew Knoblauch
CDS 301
December 12, 2014
Objective
• Evaluate an existing vehicle information report and make critical assessments of the data visualizations.
• Examined factors may include (but not be limited to) scope of data, clarity of message, level of detail, audience, font, color, white space, consistency, annotation, titles and labels.
• Provide alternative visualizations as necessary, with well-reasoned explanations for changes.
Description of Data
The Highway Loss Data Institute vehicle information bulletin examined in this report:
• Highway Loss Data Institute. 2012. Spring 2011 Tornado Losses. Loss Bulletin Vol. 29, No. 6. Arlington, VA.
The complete bulletin is contained in Appendix B.
The raw data is contained in Appendix C.
Visualization Analysis
The Highway Loss Data Institute bulletin subject to analysis herein discusses U.S. automobile insurance losses due to tornadoes. The time periods evaluated are Spring 2007-10 and Spring 2011.
Procedure
Each data visualization from the bulletin will be examined through the following procedure:
• Description: The data visualization will be shown and described.
• Analysis: A review will be conducted discussing multiple factors.
• Modification: Potential changes will be discussed, if necessary.
• Modified visualization: Changes will be shown, if necessary and possible.
Description 1
Map 1 shows March-June weather-related insurance losses for 2007-10 and 2011. These losses are in dollars and shown by state.
Analysis 1
Initial Impression
• Map 1 is the first visualization in this report and as such needs to give an introductory, overall sense of the data to follow. This map does that. Additionally, clear differences both between states and over time can be seen.
Scope
• The scope is appropriate. The entire United States is shown but the visualization clearly highlights that certain smaller sections of the country will benefit from further examination.
Clarity
• The clarity could be improved. Several of the larger of the 50 paired bar charts appear to extend into other charts, while some of the very small bar charts are so small that it is unclear they are actually charts. Also, while the large bars clearly demonstrate large losses, extending the bars across state boundaries creates some visual noise.
Analysis 1, cont’d
Level of Detail
• The level of detail may be too fine. The difference in losses over time for the entire United States is shown. A relative value to compare each state might have been simpler to show here but the legend indicates the bar charts are the actual loss values. While the audience for this HLDI report is sophisticated, it might be too much to ask them to “do the math” for a state while also comparing it to another state’s “math”.
Color
• The choice of a lighter color as the background was good. But while the gold and silver bar chart colors contrast adequately with each other, they almost seem to clash with the underlying yellow.
Title
• The text refers to the losses as being automobile claims, but adding “automobile” to the map title might be valuable.
Modification 1
Clarity
• Instead of bar charts in each state, convert the difference between time frame 1 and time frame 2 into a single value per state. This gives 50 values. These may be percentage changes, or actual dollars if that level of detail is determined to be necessary. Then, depending on how the values deviate, assign a color legend that shows greater differences with greater intensity of that color.
Legend/Title
• While the legend was not mentioned in the analysis, the first modification requires a second modification, one to the legend. By simplifying the visualization, a value can be added to the legend or title indicating the national value. Omitting the national value from the original visualization was appropriate, though, given its complexity.
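The clarity modification above can be sketched in a few lines of Python. This is a minimal illustration, not the method used to produce the modified map: the state names, the dollar figures, the function names and the five-class binning are all assumptions made for the example, and real values would come from Table 1.

```python
def percent_change(before, after):
    """Percent change from the 2007-10 value to the 2011 value."""
    if before == 0:
        raise ValueError("percent change is undefined when the baseline is zero")
    return (after - before) / before * 100.0


def intensity_class(value, max_abs, n_classes=5):
    """Map a change to a color-intensity class, 0 (palest) to n_classes - 1."""
    if max_abs == 0:
        return 0
    share = abs(value) / max_abs
    return min(int(share * n_classes), n_classes - 1)


# Hypothetical losses in dollars, as (2007-10 value, 2011 value) pairs.
losses = {
    "State A": (100.0, 300.0),
    "State B": (1.0, 3.0),
    "State C": (40.0, 50.0),
}

# One value per state, computed both ways the modification allows.
pct = {s: percent_change(b, a) for s, (b, a) in losses.items()}
dollars = {s: a - b for s, (b, a) in losses.items()}

max_pct = max(abs(v) for v in pct.values())
max_dollars = max(abs(v) for v in dollars.values())

pct_class = {s: intensity_class(v, max_pct) for s, v in pct.items()}
dollar_class = {s: intensity_class(v, max_dollars) for s, v in dollars.items()}
```

With these made-up figures, State A and State B fall in the same intensity class by percentage change but in opposite classes by dollar change, which is why the choice between percentages and actual dollars matters for the legend.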
Modified Visualization 1
Map 1: Change in March-June weather-related insurance losses from 2007-10 to 2011
National = 46
Description 2
Table 1 shows March-June weather-related losses for 2007-10 and 2011. These losses are shown as claim frequency, claim severity and overall losses by state.
(Only the first section of Table 1 is shown for space purposes. All states are included in the full table.)
Analysis 2
Initial Impression
• Table 1 is the underlying data for the initial map, as well as some additional detail. A table is a smart choice for the complex data represented.
Clarity
• The clarity is fair. With 500 data elements (5 categories, 2 time frames, 50 states), determining which states have extreme values in any given category is not simple, but is viable.
White Space
• While this table is not overly cluttered, one option would have been to follow the Wall Street Journal (WSJ) Guide to Information Graphics rule regarding repetition of units in a table: “It is only necessary to display the units, such as a dollar sign or percentage sign, once with the first entry.” The elimination of dollar signs might create a better white space balance.
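The WSJ rule quoted above can be sketched as a small formatting helper. This is only an illustration under stated assumptions: the function name `format_money_column` and the sample values are hypothetical, and the values are not taken from Table 1.

```python
def format_money_column(values):
    """Format a column so that only the first entry carries the dollar sign."""
    if not values:
        return []
    numbers = [f"{v:,.0f}" for v in values]
    width = max(len(n) for n in numbers) + 1  # leave room for the leading "$"
    lines = []
    for i, n in enumerate(numbers):
        text = "$" + n if i == 0 else n
        lines.append(text.rjust(width))  # right-align so digits line up
    return lines


# Hypothetical dollar amounts for illustration only.
column = format_money_column([3127, 2984, 10450])
print("\n".join(column))
```

Right-aligning every entry to a common width keeps the digits in line whether or not the entry carries the unit, so the only visual change is the removed symbols.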
Modification 2
White Space
• Dollar signs were eliminated according to the WSJ recommendation to see if a better white space balance was created. That might be the case if there were more dollar signs, but the change in white space balance is not significant here. (In this instance the modified visualization is shown to demonstrate that the modification is not necessarily a substantial improvement.)
Title (column header)
• The first column in the table is “Exposure”. This is a term of art within the automotive insurance industry. However, exposure is not defined in the bulletin. While the audience for this report may be familiar with the term, a footnote was added to both provide the definition and clarify the unit of measure.
Modified Visualization 2
1 Exposure is the length of time a vehicle is insured under a given coverage type and is measured in insured vehicle years.
Description 3
Figure 1 shows March-June weather-related insurance losses from 2004-11. These losses are shown as claim frequency, claim severity and overall losses.
Analysis 3
Initial Impression
• Figure 1 effectively demonstrates that 2011 March-June weather losses were significantly greater than in the previous 7 years.
Clarity
• The clarity is solid. The message is clear: the major component of the difference was not that each claim was larger, but that the number of claims was more than twice the average.
Color
• The color choices work. There is sufficient contrast and no distraction.
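The relationship between the three measures in Figure 1 can be illustrated with a short Python sketch. It assumes the common insurance convention that overall losses per insured vehicle year equal claim frequency times claim severity; the bulletin text reviewed here does not spell this out, and every figure below is hypothetical rather than taken from the report.

```python
def loss_measures(claims, payments, exposure_years):
    """Return (frequency per 100 insured vehicle years, severity, overall loss)."""
    frequency = claims / exposure_years * 100.0  # claims per 100 vehicle years
    severity = payments / claims                 # average payment per claim
    overall = payments / exposure_years          # payment per vehicle year
    return frequency, severity, overall


# Hypothetical figures for a typical year and for a severe year.
base_freq, base_sev, base_loss = loss_measures(
    claims=2_000, payments=6_000_000, exposure_years=100_000)
peak_freq, peak_sev, peak_loss = loss_measures(
    claims=5_000, payments=16_500_000, exposure_years=100_000)

freq_ratio = peak_freq / base_freq   # how much more often claims occurred
sev_ratio = peak_sev / base_sev      # how much larger each claim was
loss_ratio = peak_loss / base_loss   # combined effect on overall losses
```

In this made-up case the frequency ratio is 2.5 and the severity ratio is 1.1, so nearly all of the 2.75-fold rise in overall losses comes from the number of claims, which is the kind of message the figure is credited with conveying.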
Modification 3
Modifications to this visualization are not necessary.
Description 4
Map 2 shows the magnitude of 2011 tornadoes in the United States, by county.
Analysis 4
Initial Impression
• Map 2 shows the presence and severity of tornadoes in the U.S. While initially it seemed that this visualization was simply intended to show causation for the losses shown in Map 1, it is actually also a general presentation of information that will lead to a specific presentation of information.
Scope
• As with Map 1, the scope is appropriate. The entire country is shown but the visualization clearly points towards the smaller sections of the country that will be the subject of further examination.
Clarity
• The message of this map is simple and clear: “here is where the tornadoes were, generally.”
Modification 4
Modifications to this visualization are not necessary.
Description 5
Map 3 shows the magnitude of April 25-28, 2011 tornadoes by county in selected states.
Analysis 5
Initial Impression
• Map 3 is a detail of Map 2 and shows the presence and severity of tornadoes in a specific section of the country.
Scope
• As with prior maps, the scope is appropriate. The promise of a smaller section of the country to be the subject of further examination is delivered upon.
Clarity
• The message of this map is simple and clear: “here is where the tornadoes were, specifically.”
Modification 5
Modifications to this visualization are not necessary.
Description 6
Table 2 shows March-June weather-related losses for 2007-10 and 2011 by selected states and counties. These losses are shown as claim frequency, claim severity and overall losses. Major events are also noted.
Analysis 6
Initial Impression
• Table 2 takes a detailed look at the data from Table 1. An additional column is added for Major Events, and that is clearly why these counties were selected. The information structure, as with Table 1, suits a table best.
Table Structure
• The need for Table 2 is questionable, or at the very least, the structure is questionable. Providing the county level data where Major Events occurred is valuable and as previously stated, the loss data is complex enough to necessitate a table. However, since essentially all rows of the table have a major event, eliminating the Major Events column and simply adding “where Major Events occurred” to the title streamlines the visualization.
• Additionally the Major Event information could likely be added to Table 1, possibly by bolding the rows of states where Major Events occurred.
Modifications 6a and 6b
Text
• While not necessary for inclusion in a visualization, the text should include the definition (for the purposes of this bulletin) of “major weather event” as a magnitude 4 or 5 tornado.
Table Structure
• For Table 2, the Major Events column was eliminated and “with major weather event” was added to the title. (6a)
• For Table 1, rows of states with major weather events were bolded and noted. (6b)
Modified Visualization 6a
Modified Visualization 6b
*Bold rows indicate states with major weather events.
Description 7
Map 4 shows, by county, weather-related insurance losses for March-June 2011 and the magnitude of April 25-28 tornadoes.
Analysis 7
Initial Impression
• Map 4 shows a natural progression towards detailed information. However, why these particular counties were chosen is not immediately clear. The assumption is that these are the “worst” losses.
Scope
• The scope initially appears appropriate. A subsection of the United States is shown, delivering on the promise of earlier visualizations. However, there seem to be two sets of data shown together that do not show a pattern. This can lead to issues with clarity of message.
Analysis 7, cont’d
Color
• The choice of color scale from blue to red is effective in demonstrating the magnitude of the tornadoes. However, a shade of blue that might appear to indicate a level of magnitude is used to represent the insurance losses. This presents issues of clarity as well as creating a more monochromatic palette than necessary.
Clarity
• The clarity could be improved. While the large bars clearly demonstrate large losses, extending the bars across county boundaries creates some visual noise. While the intense colors clearly indicate large magnitude tornadoes, assessing this information simultaneously with the loss information is difficult, particularly as no pattern seems obvious.
Modification 7
Clarity
• Instead of showing both magnitude and loss information in the same visualization, separate them into two maps. To show any correlation between magnitude and losses, juxtapose the maps side by side or top to bottom. If there is a clear, similar pattern in both, it should be visible.
• Instead of bars in each county, assign a color legend that shows greater values with greater intensity of that color.
Color
• By separating the maps, the similarity of the blues is less disconcerting. However, it would probably be best to assign a more contrasting color, such as green or yellow, to the bars in the loss value map.
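A visual comparison of two juxtaposed maps can be complemented by a simple numeric check. The sketch below is one possible approach and not something the bulletin or this analysis performed: it computes a Spearman rank correlation between tornado magnitude and losses, using hypothetical county values and hypothetical function names.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; undefined (division by zero) if either list is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the two rank lists."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical counties: maximum tornado magnitude and loss in dollars.
magnitude = [1, 2, 3, 4, 5]
loss = [20.0, 15.0, 90.0, 60.0, 300.0]
rho = spearman(magnitude, loss)
```

A value of rho near 1 would suggest that counties with stronger tornadoes also tend to have larger losses, while a value near 0 would be consistent with the observation that no pattern seems obvious.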
Modified Visualization 7
No modified visualization is provided as the software used to generate the county level detail on this map was inaccessible at the time of this report.
Appendix A
Author feedback
The author of the report was contacted and asked to offer insight into the visualization decisions from the initial report as well as comment on any suggested modifications.
Comments were provided regarding Analysis 1 and Analysis 7.
Appendix A cont’d
Analysis 1 Comments
Using two bars to show the 2007-10 and 2011 weather losses provides more information to the reader than just a thematic map of the percent or absolute differences. The reader can compare 2007-10 average losses, 2011 losses, the percent differences, and the absolute differences. An increase from $100 to $300 is much different than an increase from $1 to $3. The bars for states with very low losses provide little information on the difference between 2007-10 and 2011, but still inform the reader of the low weather losses in these states. For that reason they were not deleted from the map. The colors of the bars were altered from my original colors of orange and blue, but maintain their contrast without being distracting.
Appendix A cont’d
Analysis 7 Comments
The report first examines the entire prime tornado season and then several of the large tornado outbreaks. April 25-28 was one of these large outbreaks. A broader map of the affected counties along with their maximum tornado magnitude was shown first, followed by the map discussed here. This map contains a closer view of the most affected counties and their losses. By placing the loss bars on top of the tornado magnitude information, the reader does not need to cross reference between two maps to examine how the two measures compare.
Appendix B
Original Report
Appendix C
Raw Data