Climate Science Investigations – BEACON Data Visualization Guide
1 | P a g e B. Burress – Chabot Space & Science Center
November 2011
Climate Science Investigations
Data Visualization Guide for the Berkeley Atmospheric
CO2 Observation Network (BEACON)
A Classroom Guide developed by
Chabot Space & Science Center
For the University of California Berkeley BEACON Project
November 2011
Contents
Introduction
  Purpose
  Sensors
  Visualization
  Greenhouse Gas Overview
  What can be done with the BEACON sensor data?
  Sources of Carbon Dioxide in the San Francisco Bay Area
Using Excel To Crunch Numbers and Make Graphs
  Importing to Excel
  Examining the Data
  Selecting What You Want
  Mins, Maxes, Averages
  If There Are Unrealistic Numbers In Your Data
  A Simple Graph
  Two-Data Graph
  X Versus Y
  Excel Tips and Tricks Summary
Different Ways to Graph Data Sets
Using Google Earth to locate BEACON sensors and identify potential CO2 sources and sinks
  Introduction
  Pinpointing a Sensor Node
    Exercise: Pin a Sensor Node to a Map
  BEACON Sensor Node Site Profile
Visualizing Data With Tableau Software
  About Tableau
    Tableau for Teaching
    Tableau Public
  Basics
    Installing
    Starting Up
    Preparing Data for Import
    Creating a New Workbook
    Importing Data
    Dimensions and Measures
    Columns and Rows
    Marks
    Measures
    Georeferencing and Map Plotting
    Time
    General Graphs
    Show Me
    Filters
  Recipes
    Recipe 1: Carbon Fingerpainting
    Recipe 2: Blowing Bubbles
    Recipe 3: Coloring Within the Lines
Introduction
Purpose
“BEACON” (BErkeley Atmospheric Carbon dioxide Observation Network) is a
project of the University of California Berkeley to develop and deploy an
array of miniature sensors that continually monitor and wirelessly transmit
greenhouse gas (GHG) concentrations to a central data server.
With a network of closely spaced sensor nodes measuring GHG
concentrations at frequent intervals, GHG levels over a small geographic
region can be monitored continually and in great detail.
Sensors
The sensors and their supporting electronics and wireless transmitters are
packaged in protective containers that can be set up practically anywhere
with available electrical power and wireless Internet access. These sensor
“nodes” are deployed in a grid, covering a selected geographic area at a
desired spacing.
Each sensor node reports the date and time, the atmospheric relative
humidity and temperature, and GHG (primarily CO2) concentration as often as
once each second.
Visualization
Scientific measurements generally
come down to large batches of numbers, and searching for meaningful
relationships between what the numbers represent is often made easier
through visualizations: color-coded text, graphs, and maps, just to mention
some traditional methods—but the ways to visualize changing numbers and
relationships are only limited by the imagination. Shapes, colors, motion,
even sound are also potential modes of conveying numerical information.
Greenhouse Gas Overview
Heat-trapping gases in Earth’s atmosphere keep our planet warmer than it
would be without them. They are also called “greenhouse gases” (GHGs) for
how they collect and contain solar energy as heat, like inside a greenhouse.
Without these gases (carbon dioxide, methane, ozone, water vapor, and
others), Earth’s surface would be on average about 50 degrees Fahrenheit
colder than it is, and our planet would be a frozen ball. We can thank the
presence of these heat-trapping gases for maintaining a livable environment.
Human industrial activity over the last century and a half has added large
amounts of GHGs to the atmosphere —mainly carbon dioxide from the
burning of fossil fuels like coal, oil, and natural gas. This has increased the
atmosphere’s ability to trap and store heat. As a result, the global average
temperature has been steadily rising. It’s a bit like having too many
blankets on your bed: the added insulation traps more heat and can make
you uncomfortably hot.
This change in global climate has had wide ranging effects around the
planet, including rising sea levels as glaciers and land-based ice sheets melt,
shifts in local climates that force plant and animal species to migrate or
become extinct, the spread of insect-borne disease into new areas, increases
in severe weather events such as droughts, hurricanes, flooding, and
tornadoes, and a change in the chemistry of the oceans that has serious
impacts on sea life.
Human response to controlling our own GHG emissions has been slow,
inconsistent, and fraught with challenges. Coming up with “clean”
alternatives to meet our transportation, energy generation, manufacturing,
and agricultural needs is not simple. While scientists and engineers tackle
the problems of achieving cleaner, “greener” ways to supply our society with
the energy we need, the status quo of over a century of fossil fuel burning
has continued as the cheapest means.
How can we know if our efforts achieve the desired results: a slowing or
reduction in average GHG levels?
We can sample the atmosphere at different locations around the Earth, as
we have been for decades. Because air circulation constantly mixes
atmospheric gases, sampling the air in a few places around the Earth can tell
us how the overall, or average, gas concentrations change—but this doesn’t
tell us anything about the actual sources of gas emissions, just the global
result of their output.
To understand the levels of emission from specific sources, and how changes
in their output actually match up with efforts to reduce emissions or
government mandates to do so, GHG concentrations must be monitored with
adequate resolution, both in space and time, to detect local changes.
What can be done with the BEACON sensor data?
The BEACON GHG sensor nodes were created with a very specific
investigation in mind: to continually monitor atmospheric carbon dioxide
concentrations at specific locations across a small geographic region.
Combined with measurements of related atmospheric conditions, geography,
surrounding urban environment, time of day, time of week, and season of
the year, the landscape of raw numbers produced by the sensor grid
becomes a field of exploration to investigate. As with any exploration, start
by asking questions….
How do CO2 levels vary from place to place?
How do CO2 levels vary at different times of the day, week, or year,
and why?
What are the sources of CO2 within the sensor grid?
What factors may cause variations in CO2 levels, from place to place or
time to time?
Are there correlations in CO2 levels with factors like weather, time of
day, traffic, weekday versus weekend?
Sources of Carbon Dioxide in the San Francisco Bay Area
Sources of human-generated CO2 in the San Francisco Bay Area include
cars, factories, oil refineries, and possibly landfills. Another potential source
that is indirectly related to human activity is fire: forest fires, structure
fires, even residential chimney output.
What about CO2 “sinks,” meaning places and processes that remove CO2 from the
atmosphere? How might sinks affect the CO2 levels measured by the
BEACON sensor nodes? Forests are CO2 sinks; during photosynthesis, CO2 is
converted into oxygen and plant sugars, and so is effectively removed from
the atmosphere. How large an effect on local CO2 levels might forests have?
How would the effectiveness of a forest CO2 sink change with time of day
and time of year? How might we search for signs of the effect in the BEACON
sensor node data?
Are the ocean or bay CO2 sinks?
What else might affect local CO2 concentrations? Spare the Air days? Wind
patterns? Other weather conditions, like temperature and humidity?
All of these—the sources, the sinks, the patterns of air flow—are what you
will be hunting for as you explore the BEACON sensor grid data. I wonder
what we’ll find….
Using Excel To Crunch Numbers and Make Graphs
Raw numerical data may be viewed, browsed, and “crunched” in a number
of ways. We are going to handle the basic number management and
crunching using Microsoft Excel. Familiarity with using Excel or a similar
spreadsheet program is an advantage, but this section of the guide offers a
basic introduction and how-to manual.
Importing to Excel
First, download a set of data from the data access website. For the purposes
of this practice session, select and download exactly one day’s worth of
data—say, from midnight on one day all the way around to midnight again.
Import the selected data to Excel by selecting Open and browsing for the
raw data file you have downloaded. Excel should load your columns of data
into individual columns in the spreadsheet.
Examining the Data
Take a look at what you have imported into Excel. Become familiar with the
format of the data, the time range covered, the time interval, and what the
actual measurements look like, at a glance.
Here is a sample portion of the data you might be looking at:
Time             Relative      Temperature  Raw-CO2  Calibrated-CO2  Slope-CO2  Final-CO2
                 Humidity (%)  (°C)         (ppm)    (ppm)           (ppm)      (ppm)
5/20/2011 17:30  57.82         18.96        402      596.0074933     488.52     488.52
5/20/2011 17:32  56.3          19.29        405      599.5199625     492.1      492.1
5/20/2011 17:34  55.49         19.43        400      593.6658471     486.31     486.31
5/20/2011 17:36  55.8          19.43        406      600.6907856     493.4      493.4
5/20/2011 17:38  55.02         19.84        401      594.8366702     487.62     487.62
5/20/2011 17:40  53.93         20.18        407      601.8616087     494.71     494.71
5/20/2011 17:42  53            20.47        406      600.6907856     493.6      493.6
5/20/2011 17:44  52.06         20.76        407      601.8616087     494.84     494.84
5/20/2011 17:46  51.11         20.97        402      596.0074933     489.05     489.05
5/20/2011 17:48  50.72         21.23        396      588.9825547     482.09     482.09
5/20/2011 17:50  50.49         21.18        397      590.1533778     483.33     483.33
5/20/2011 17:52  49.94         21.4         403      597.1783164     490.42     490.42
5/20/2011 17:54  49.81         21.48        406      600.6907856     494        494
5/20/2011 17:56  49.51         21.56        400      593.6658471     487.04     487.04
Your file will contain more than a dozen or so rows. Depending on the
frequency of the data and the range you select, of course, there may be
thousands of rows.
There may also be data parameters that you are not interested in, such as
the CO2 columns labeled “raw,” “calibrated,” and “slope.” In this example,
these are the raw sensor measurements and intermediate numbers
calculated in a process of data calibration that leads to the “Final CO2” value,
which is the data we are interested in. (These intermediate numbers are a
bit like the steps you write down in a mathematical problem where your
teacher asks you to show your work as well as the final result.)
Ask yourself if you can detect any relationships or dependencies between the
data parameters. When one parameter (say, relative humidity) goes up or
down, is there a corresponding behavior in another parameter—say,
temperature, or time, or CO2 concentration?
Selecting What You Want
While you can certainly work with the huge table of raw data that you
acquire, you will find it handy and sometimes necessary to copy out only the
data values and ranges that you are interested in and paste them into a
blank spreadsheet, leaving the raw data file untouched and unchanged, and
also eliminating columns or rows that you will never use (and so are just in
your way!).
In Excel, you can select specific ranges (columns or rows) by holding down
the CONTROL key and dragging the mouse across the cells that you want to
select. If you keep the CONTROL key pressed, you can select additional cells
elsewhere in the table for copying.
For example, let’s say I only want the first 5 rows of data in the table above,
and only the DATE/TIME, TEMPERATURE, and FINAL CO2.
First select the first five cells in each of those columns by holding down
the CONTROL key and dragging with the mouse, one column at a time:
Time             Relative      Temperature  Raw-CO2  Calibrated-CO2  Slope-CO2  Final-CO2
                 Humidity (%)  (°C)         (ppm)    (ppm)           (ppm)      (ppm)
5/20/2011 17:30  57.82         18.96        402      596.0074933     488.52     488.52
5/20/2011 17:32  56.3          19.29        405      599.5199625     492.1      492.1
5/20/2011 17:34  55.49         19.43        400      593.6658471     486.31     486.31
5/20/2011 17:36  55.8          19.43        406      600.6907856     493.4      493.4
5/20/2011 17:38  55.02         19.84        401      594.8366702     487.62     487.62
5/20/2011 17:40  53.93         20.18        407      601.8616087     494.71     494.71
5/20/2011 17:42  53            20.47        406      600.6907856     493.6      493.6
5/20/2011 17:44  52.06         20.76        407      601.8616087     494.84     494.84
5/20/2011 17:46  51.11         20.97        402      596.0074933     489.05     489.05
5/20/2011 17:48  50.72         21.23        396      588.9825547     482.09     482.09
5/20/2011 17:50  50.49         21.18        397      590.1533778     483.33     483.33
5/20/2011 17:52  49.94         21.4         403      597.1783164     490.42     490.42
5/20/2011 17:54  49.81         21.48        406      600.6907856     494        494
5/20/2011 17:56  49.51         21.56        400      593.6658471     487.04     487.04
Notice that I selected the label header cell at the top of each column as well
as the first 5 cells; you don’t have to do this, but it’s a good idea to keep
them for later reference.
Next, COPY the cells.
Open a new, blank spreadsheet, click in a cell where you want to place
the data, and PASTE.
This is what you should get:
Time             Temperature  Final-CO2
                 (°C)         (ppm)
5/20/2011 17:30  18.96        488.52
5/20/2011 17:32  19.29        492.1
5/20/2011 17:34  19.43        486.31
5/20/2011 17:36  19.43        493.4
5/20/2011 17:38  19.84        487.62
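If you would rather do this kind of selection in code, the same copy-out step can be sketched in Python. This is only an illustration, not part of the BEACON tools. It assumes the table has already been loaded into a list of rows, and the names header, rows, wanted, and subset are made up for this example. The values are the first six rows of the sample table.

```python
# Column labels and the first six data rows of the sample table.
header = ["Time", "Relative Humidity (%)", "Temperature (°C)", "Raw-CO2 (ppm)",
          "Calibrated-CO2 (ppm)", "Slope-CO2 (ppm)", "Final-CO2 (ppm)"]
rows = [
    ["5/20/2011 17:30", 57.82, 18.96, 402, 596.0074933, 488.52, 488.52],
    ["5/20/2011 17:32", 56.3, 19.29, 405, 599.5199625, 492.1, 492.1],
    ["5/20/2011 17:34", 55.49, 19.43, 400, 593.6658471, 486.31, 486.31],
    ["5/20/2011 17:36", 55.8, 19.43, 406, 600.6907856, 493.4, 493.4],
    ["5/20/2011 17:38", 55.02, 19.84, 401, 594.8366702, 487.62, 487.62],
    ["5/20/2011 17:40", 53.93, 20.18, 407, 601.8616087, 494.71, 494.71],
]

# The columns to keep, named by their header labels.
wanted = ["Time", "Temperature (°C)", "Final-CO2 (ppm)"]
positions = [header.index(name) for name in wanted]

# Copy only the wanted columns from the first five rows.
# The original `rows` list is left untouched, like the raw data file.
subset = [[row[i] for i in positions] for row in rows[:5]]

for line in [wanted] + subset:
    print(line)
```

As in the spreadsheet version, the raw table stays unchanged and the smaller copy is what you work with from here on.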
Mins, Maxes, Averages
Let’s get some practice with Excel mathematical functions.
What are the minimum, maximum, and average relative humidity on the day
you have selected?
To get the minimum in the range, use the formula “MIN()”. In an empty
cell (preferably at the bottom of the column of numbers you are working
on), type:
=MIN(cell range)
Note: always start a function with an “=” sign to tell Excel you’re typing a
mathematical function for it to compute.
“Cell range” is typed as the first and last cells in the range separated by a
colon. For example, if the numbers we want to specify start in column B,
row 2 (cell B2) and end at cell B500, then the MIN() function should be
typed like this:
=MIN(B2:B500)
Once you have typed in the formula and hit ENTER, the formula will be
replaced with the number it has calculated.
Next, compute the maximum value in the range, and the average of the
range. The formulae are (assuming the same range of numbers):
=MAX(B2:B500)
=AVERAGE(B2:B500)
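If you want to double-check the spreadsheet results outside of Excel, the same three statistics can be sketched in Python. This is an illustration only, not part of the BEACON tools. The list reuses the relative humidity values from the first five rows of the sample table, and the variable names are made up for this example.

```python
from statistics import mean

# Relative humidity (%) from the first five rows of the sample table.
relative_humidity = [57.82, 56.3, 55.49, 55.8, 55.02]

rh_min = min(relative_humidity)    # plays the role of =MIN(range)
rh_max = max(relative_humidity)    # plays the role of =MAX(range)
rh_avg = mean(relative_humidity)   # plays the role of =AVERAGE(range)

print("min:", rh_min, "max:", rh_max, "average:", round(rh_avg, 3))
```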
Now that you really know what you’re doing, go ahead and calculate the
minimum, maximum, and average for temperature and CO2 concentration.
Summarize all your findings in this table.
Measurement             Minimum   Maximum   Average
Relative Humidity (%)
Temperature (°C)
CO2 (ppm)
Any surprises? Anything look strange? Do all of the numbers look realistic? I
ask this because you may have detected a number that is not realistic. For
example, you might have found a minimum of something like “-999” in the
CO2 concentration data, and of course a negative number for this is
meaningless.
If There Are Unrealistic Numbers In Your Data
If you did find a non-realistic number using the Excel functions, first scan
through the actual data to find where the offending number is found.
You have just been introduced to one of the ways that scientists, or
computer programmers, signify a data point for which no data exists. It
may be that the CO2 sensor wasn’t working at the time the data point was
sampled, and so there is no measurement for CO2, even though the other
sensors (humidity, temperature) were functioning and reporting data.
You might ask, why not just enter a “0” in cases where no data was taken?
The answer to that is that 0 isn’t a good choice as a “stand-in” data point (or
space filler) because 0 may be a realistic value for that data parameter.
After all, it’s possible to measure 0 parts per million for CO2 concentration,
however unlikely. Likewise, it’s possible to measure 0% relative humidity
and 0 degrees Celsius temperature. Entering a zero to fill in the spot for
missing data could be misleading; how would you know it’s a “no data” filler
and not just an unusual data point?
A wildly unrealistic number, like -999, is used because it is clearly
unrealistic—not only for CO2 concentration, but for relative humidity and
temperature. -999 is not a possible real value for any of those parameters.
Now that you may have detected that there are non-real numbers in your
data that are simply fillers, how do we deal with them in terms of calculating
the average value? Even though -999 is physically unrealistic as a CO2
concentration, it’s still a real number, and when you calculate the average of
the range, Excel includes it in the average. How do you calculate the
average of a range while at the same time telling Excel to ignore the
unrealistic values?
Answer: Use the AVERAGEIF function.
This function averages the numbers in the given range if they meet a
specified condition. In this case, to ignore the unrealistic value of -999 and
average only the real data values, the function might average the numbers
that are greater than -999, assuming that only -999 has been chosen to
stand as the no-data filler.
To be safe, if possible we should set the qualifier at the spot separating the
realistic from the non-realistic values. In the case of CO2 concentration,
anything greater than or equal to 0 is realistic, and anything less than 0 is
not.
The function, AVERAGEIF(RANGE,"QUALIFIER"), becomes:
=AVERAGEIF(B2:B500,">=0")
Note that the range and qualifier are separated by a comma, and the
qualifier is in double-quotes. The operator in the qualifier, >=, is how you type
“greater than or equal to” in Excel. So, this formula calculates the average
of all the numbers in the range B2 through B500 that are greater than or
equal to 0.
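To see why the filler value matters, here is a minimal Python sketch of the same idea. It is not how Excel works internally. The helper name average_if is made up for this example, and the short list of readings is hypothetical: three Final-CO2 values from the sample table plus one -999 inserted by hand as a no-data filler.

```python
def average_if(values, keep):
    """Average only the values for which keep(value) is true."""
    kept = [v for v in values if keep(v)]
    if not kept:
        raise ValueError("no values satisfy the condition")
    return sum(kept) / len(kept)

# Hypothetical CO2 readings (ppm); the -999 stands in for a missing reading.
co2 = [488.52, 492.1, -999, 486.31]

# A plain average includes the filler and is dragged far below any real reading.
plain_average = sum(co2) / len(co2)

# Keeping only values >= 0 mirrors the ">=0" qualifier used with AVERAGEIF.
filtered_average = average_if(co2, lambda v: v >= 0)

print(round(plain_average, 2), round(filtered_average, 2))
```

The plain average lands well under 200 ppm, which no real reading in the list comes close to, while the filtered average stays among the real values.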
Question: What qualifiers would you choose to compute an AVERAGEIF for
relative humidity and temperature (Celsius), assuming that -999 again is the
no-data filler value?
A Simple Graph
Looking at and crunching numbers is a lot of fun, but if you’re a visually-
oriented person like me, those numbers start to look much friendlier when
they are graphed. Graphing numbers is the first step to visualizing
relationships between data points and parameters. Let’s get started….
A quick way to make a graph with your data is simply to select the data with
the mouse, choose Insert > Chart, and finally choose the type of chart you
want. You might choose a bar graph, a line graph, a scatter graph, or
another, depending on how you want to visualize the data. I recommend
that you start with a scatter graph and see where that takes you.
Here’s the less quick, but more selective way to make a simple graph. For
this example, assume that you have data in two columns of your spreadsheet:
the date/time in column A and the temperature in column B, as shown here:
     A                  B
1    Date/Time          Temperature(°C)
2    5/20/2011 17:30    16.74
3    5/20/2011 17:32    16.46
4    5/20/2011 17:34    16.54
5    5/20/2011 17:36    16.56
6    5/20/2011 17:38    16.63
7    5/20/2011 17:40    16.5
8    5/20/2011 17:42    16.31
What to do:
Choose Insert, Chart, Scatter Graph; a blank chart should appear.
Right-click somewhere in the blank chart, and in the menu that
appears, choose Select Data.
A dialog box will appear.
For starters, enter the range of the data you want to graph in the
“Chart data range” box. If you don’t know the formula that Excel
expects here, then simply click on the button to the right of that field.
A box titled “Select data source” will appear, and at this point all you
need to do is click and drag the mouse to select the data you want to
graph, in this case temperature from cell B2 to B10 in the example
table above. After selecting the cells, click the button to the right of
the Select Data Source field again. Notice that as you drag across the
data, Excel fills in the data range formula for you.
You should now have a quick and simple graph of the selected data. But
we want to graph the temperature versus date/time, so we need to define
the X-axis series:
Return to the Select Data Source dialog box; you should now see the
data series you entered in the left-hand list (under Legend Entries). It
will probably have a default name, like Series1.
Select that series, then click Edit. The “Edit Series” dialog box will
appear.
The Series Y Values formula should already be filled in, since you
already defined the data series.
Here, you can give the data series a name, if you like, something
more interesting than Series1. Type a name for the series into the
Series Name field, something like “Temperature.”
Now, define the X-axis values. Click the data range selection button to
the right of the empty Series X Values field, then click and drag the
mouse to select the appropriate cells—in our example, this would be
Date/Time, or cells A2 through A10. Click the selection button again.
Now your graph should be in good shape.
If you’re having trouble reading the date and time because those strings
are overlapping and blurred together, you can change the format of the
X-axis labels by right clicking on the X-axis, choosing Format Axis, and
selecting Alignment. In the dialog box you can play with label directions
and custom angles—play around with that until you have what you like.
Two-Data Graph
To compare two different data series you could simply produce two separate
one-data-series graphs and compare them side by side. But Excel also lets you
graph two different series of data with common X-values on the same graph,
each with its own Y-axis.
Here’s how to set that up in Excel. In this example, we’ll graph both the
temperature and relative humidity measured over the same date/time
range.
     A                  B                  C
1    Date/Time          Temperature(°C)    Relative Humidity(%)
2    5/20/2011 17:30    16.74              66.58
3    5/20/2011 17:32    16.46              66.16
4    5/20/2011 17:34    16.54              67.06
5    5/20/2011 17:36    16.56              66.43
6    5/20/2011 17:38    16.63              66.43
7    5/20/2011 17:40    16.5               66.64
8    5/20/2011 17:42    16.31              67.71
9    5/20/2011 17:44    16.21              70
10   5/20/2011 17:46    16.24              68.61
What to do:
Start by making the simple one-data graph of temperature versus
date/time as you did in the previous example.
Once that is done, right-click somewhere in the graph and choose
Select Data. The Select Data Source box will appear, and you should
see the Temperature data series that you created before.
Choose Add. The Edit Series dialog will appear; repeat the procedure
you did before to define the name, X-values, and Y-values for the data
series that you are adding to the graph. In this example, you will
select cells C2 through C10, relative humidity, for the Y-axis values,
and again select A2 through A10 for the X-axis values.
When you’re done, click OK.
You will see the data series that you just added on the list under Legend
Entries. Click OK.
Now, you will find the second data series plotted on the same graph as
the first. But both series are plotted on the same Y-axis scale, with the
temperatures in Celsius and the relative humidity in percent, placing the
temperature far down near the bottom of the scale and the humidity up
high.
To put the second data series on its own axis:
Right-click on the graphed line for the second data series (relative
humidity), then select Format Data Series from the menu that pops
up.
Under Series Options, select Secondary Axis, then Close.
Now the graph should be a bit more readable, with separate Y-axis scales
for each data series on the left and right sides of the graph.
If you want to adjust the Y-axis ranges for either series:
Right-click on that Y-axis, select Format Axis, and under Axis Options,
select “Fixed” for the Minimum and Maximum values, and type in what
values you want them to have.
Finally, under Chart Tools, Layout, you can select and change the
Chart Title and Axis Titles, giving those appropriate names.
When you’re done, you might end up with something like this:
[Figure: “Temperature and Relative Humidity” graph. X-axis: Date and Time; Y-axes: Temperature (Celsius) and Relative Humidity (%).]
How much time does the graph cover? Do you see any relationships between
the behavior of temperature and humidity?
X Versus Y
So far you’ve been plotting data series (temperature, relative humidity)
versus date/time to see how those data values change with time. But you
can, of course, plot any data series against any other to see relationships
between the two, if any. Let’s practice doing this by plotting relative
humidity (in the Y-axis) versus temperature (in the X-axis).
First, see if you can do this yourself! If you need help, follow the same
instructions for the One-Data graph that you did before, but choose relative
humidity for the Y-axis and temperature for the X-axis.
After creating appropriate chart and axis labels, you might end up with
something like this:
[Figure: “Relative Humidity versus Temperature” graph. X-axis: Temperature (Celsius); Y-axis: Relative Humidity.]
What does this graph tell us, if anything? Is there a relationship between the
behavior of relative humidity and temperature in this data? There might be—
at least, it appears that higher humidity occurs at lower temperatures.
There may not be enough data in this sample to suspect a relationship—so
maybe what we need is to plot more data, or from several different times,
and see how the picture takes shape….
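One way to put a number on “is there a relationship?” is a correlation coefficient, which this guide does not otherwise use. The Python sketch below computes the Pearson correlation for the nine temperature and relative humidity pairs from the Two-Data Graph example table. The function name pearson_r is made up for this example, and nine points are far too few to draw a firm conclusion.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)

# Temperature (°C) and relative humidity (%) from the example table.
temperature = [16.74, 16.46, 16.54, 16.56, 16.63, 16.5, 16.31, 16.21, 16.24]
humidity = [66.58, 66.16, 67.06, 66.43, 66.43, 66.64, 67.71, 70, 68.61]

r = pearson_r(temperature, humidity)
print("correlation:", round(r, 2))
```

A value near -1 means one quantity tends to fall as the other rises, near +1 means they rise together, and near 0 means no straight-line relationship. For this small sample the result comes out negative (roughly -0.8), which matches the impression that higher humidity occurs at lower temperatures.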
Excel Tips and Tricks Summary
To calculate the average of a range of numbers in Excel, choose an
empty cell and type “=AVERAGE(range)”. For example, to calculate
the average of the numbers in column B, rows 5 through 100, type
“=AVERAGE(b5:b100)”.
“MAX()” and “MIN()” are two useful functions, to calculate the
maximum and minimum of a range of numbers. For example,
“=MAX(b5:b100)”.
To jump to the top or bottom of a column of numbers, hold down the
CONTROL key and press the up or down arrow.
To quickly select a large column of numbers, click in the topmost cell
of the number range and press “SHIFT-CONTROL-DOWN ARROW”.
To select two (or more) columns of numbers (such as to make a
graph), select the first column normally by clicking and dragging from
top to bottom, then while keeping the first column selected, hold down
the CONTROL key and select the second column by click and drag.
To create a graph, after having selected one or more ranges of
numbers, select INSERT, then choose the type of chart you want to
make.
Different Ways to Graph Data Sets
Just to shake things up a bit, let’s make a graph in a different style than the
conventional X-Y scatter graph.
Here’s the assignment: Graph one day (24 hours) of CO2 concentration data
for two or more sensor sites using a radial graph.
Here’s how to do it:
Prepare a data table for each sensor site you are graphing, containing
columns for date/time and final CO2 data, including headers for each
column. For the purposes of making a single graph of the data from
multiple data collection sites, it will be easier to copy each table of
data to the same blank spreadsheet (for example, side by side or one
below the other).
For each data table, copy and paste one row of data for each hour of
the day, so that you have 24 rows of data in each table. Here are a
few of the top rows for one such table:
Berkeley Botanical Garden
Time               Final-CO2 (ppm)
5/20/2011 17:30    483.1882
5/20/2011 18:30    478.1499
5/20/2011 19:30    483.1882
…                  …
For this example I have chosen to create data tables from 5 different sensor
node sites, so at this point I have five tables, each containing 24 rows (not
including the column headers). Also, to keep the example simple, I merely
copied one data point from each hour of the day—17:30, 18:30, 19:30,
etc.—and ignored the data from the rest of each hour. One could also take
the average of the data for each hour and use that instead…
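The hourly-average alternative mentioned above could be sketched in Python as follows. The function name hourly_averages and the sample readings are hypothetical, and the timestamp format is assumed to match the "5/20/2011 17:30" style shown in the table.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

def hourly_averages(readings):
    """Average CO2 readings per clock hour.

    `readings` is a list of (timestamp_text, co2_ppm) pairs, with timestamps
    written like "5/20/2011 17:30" (the format assumed here).
    """
    by_hour = defaultdict(list)
    for stamp, co2 in readings:
        when = datetime.strptime(stamp, "%m/%d/%Y %H:%M")
        # Drop the minutes so every reading in the same hour shares one key.
        by_hour[when.replace(minute=0)].append(co2)
    return {hour: mean(values) for hour, values in sorted(by_hour.items())}

# Hypothetical readings for illustration only.
sample = [
    ("5/20/2011 17:30", 483.0),
    ("5/20/2011 17:32", 479.0),
    ("5/20/2011 18:30", 478.0),
]
for hour, avg in hourly_averages(sample).items():
    print(hour.strftime("%m/%d/%Y %H:00"), round(avg, 2))
```

Each hour then contributes one averaged value to the table instead of one hand-picked reading.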
To create the radial graph (or “radar” graph), begin the same way you
created the one-data X-Y scatter graph: select the date/time and final
CO2 columns from the table of data for the first sensor node site.
Insert a Chart of type “radar”, choosing the style that shows both
plotted lines and data points.
This should create a radial graph: a wheel with 24 “spokes”. See the final
graph below. The date/time “axis” is the perimeter of the circle, starting
from the top position and going around the circle clockwise, like the time on
an analog clock dial. In fact, since we have selected one data point per hour
for 24 hours, the radial graph should read like a 24-hour clock dial!
The CO2 data is plotted around the date/time circle, the “vertical” scale
being the radial spokes of the wheel, the center of the wheel being one limit
of the range and the perimeter of the circle being the other limit.
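To make the geometry of the wheel concrete, here is a small Python sketch of where one data point would land. It assumes a simple linear scale from the center to the rim; the function name spoke_position and the numbers are hypothetical, and Excel's own drawing may differ in detail.

```python
import math

def spoke_position(index, value, low, high, spokes=24):
    """Return (x, y) for one data point on a radar-style graph.

    Spoke 0 points straight up and later spokes proceed clockwise.
    `low` maps to the center of the wheel and `high` to the rim (radius 1).
    """
    angle = math.pi / 2 - 2 * math.pi * index / spokes  # clockwise from top
    radius = (value - low) / (high - low)
    return radius * math.cos(angle), radius * math.sin(angle)

# Hypothetical scale limits and reading, for illustration only.
x, y = spoke_position(index=6, value=460.0, low=400.0, high=520.0)
print(round(x, 3), round(y, 3))
```

The seventh data point (index 6) falls on the spoke pointing to the right, a quarter turn clockwise from the top, just as 6 o'clock would on a 24-hour dial.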
To add the series of data from a second table, follow the same steps
as when you added a second series of data to the X-Y scatter graph:
right-click on the graph and choose SELECT DATA. The dialog box that
appears will not only let you add another data series, but you can edit
the one you already created to give it a name—the name of the sensor
node site.
Once you have added the second series, that one too will be plotted on
your graph, around the circle.
Add each additional data series in the same manner.
Here’s the graph I created from data from 5 different sensor node sites:
[Figure: CO2 Concentration. Radial graph of hourly values from 5/20/11 17:30 through 5/21/11 16:30, with the radial scale marked from 400 to 520. Series: Berkeley Botanical, Downtown Berkeley, MSRI-SSL, CoryHall, VLSB.]
Where does the data time series begin (where on the graph, and at
what date/time)? Where and when does it end?
What is the range of the radial scale, the scale that the CO2 is plotted
to?
What does this graph tell you about the CO2 concentrations at different
times of the day at different sites?
The same data can, of course, be graphed in an X-Y scatter graph. That
would look something like this:
[Figure: X-Y scatter graph of the same data, with date/time from 5/20/11 14:24 to 5/21/11 19:12 on the horizontal axis and CO2 concentration from 400 to 520 on the vertical axis. Series: Botanical Garden, Downtown Berkeley, MSRI-SSL, Cory Hall, VLSB.]
Why choose the radial graph instead of the scatter graph? Does one of them
make it easier to see certain relationships in the data than the other?
Maybe, maybe not; that may depend on what relationship you’re looking
for. Or, it may come down to a personal preference on the style of data
visualization.
I will mention two reasons I might choose the radial graph for this particular
set of data.
1. The circular form is suggestive of a repeating cycle, as opposed to the
ongoing linear time shown by the X-Y scatter graph. If the period of
time you are graphing is a natural cycle, like a day or a year, then the
plotted data ends up in the same part of the cycle that it began.
2. Another thing that this radial graph does is to emphasize the larger
values of CO2 concentration: the trace for the Downtown Berkeley set,
which has the highest values of the bunch, is the biggest, while that
for the VLSB site, with the lowest values, is smallest.
There are other types of graphs you might have fun trying out on your data.
Some won’t make sense to use, but others may allow you to show aspects of
the data that are not easily revealed by “conventional” graphs. Try some!
Using Google Earth to locate BEACON sensors and identify potential CO2
sources and sinks
Introduction
The data produced by the greenhouse gas sensor network is “geo-referenced”,
meaning it comes along with the geographic coordinates
where the sensor is located. As with any geo-referenced data, it can be
represented, in one form or another, on a map.
For pinpointing sensor node locations and exploring the surrounding
environment, my tool of choice is Google Earth. If you haven’t used Google
Earth, don’t worry; it’s fairly easy to use. It’s also free to download at
www.google.com/earth.
I won’t go through how to use Google Earth in great detail—you can learn a
lot just by playing around with it. Here are a few things for you to try that
will move you on your way to becoming a Google Earth Guru:
Double-click on the Earth globe—anywhere you like. What happens?
Enter a place you want to find in the Search pane on the left. You can
enter a street address, the name of a famous place, or even the
latitude and longitude coordinates of a place. Then click the search
button (magnifying glass). You will get a list of places Google Earth
thinks might be what you’re looking for. Double click the best choice.
The Layers pane contains sets of geo-referenced data that can be
turned on and turned off as needed; spend some time exploring what
is here, and what happens when you turn them on. Some data layers
are “real-time”, like Weather and Traffic.
If you want to save a location you have found, right-click on it and
select Save To My Places. That place will be moved into the Places
pane under My Places. As you build a library of favorite places, you
can organize them by creating folders to put them in, by category or
whatever criterion you desire.
Play with the navigation controls at the upper right corner of the
screen. With them you can rotate, move, and zoom your view. The
only way to learn how to use them is to just jump in and start using
them….
Pinpointing a Sensor Node
Now let’s do something practical using Google Earth. Let us say you have
downloaded the CO2 data from one of the BEACON sensor nodes and want to
place it on the map, and find out what’s going on in its vicinity.
Exercise: Pin a Sensor Node to a Map
Given that the geographic coordinates of this sensor are 37.832133° N
latitude and 122.255525° W longitude, let’s go hunting.
In the Search pane, under Fly To, enter the coordinates in the search
box: 37.832133 N 122.255525 W and click the search button.
Double-click on the found location.
Since you entered an unambiguous geographic coordinate, that’s all that
should show up in the list of found places (whereas if you enter
something ambiguous, like “Wal-Mart”, you’ll get a long list of all
suspected Wal-Mart stores).
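Coordinates written with N and W letters, as in the search above, are often stored in spreadsheets as signed numbers instead, with west longitudes negative. A minimal Python sketch of that conversion follows; the function name to_signed_degrees is hypothetical and the input is assumed to be in exactly the "value letter value letter" layout used in this exercise.

```python
def to_signed_degrees(text):
    """Convert text like "37.832133 N 122.255525 W" to (latitude, longitude).

    South latitudes and west longitudes come back negative.
    """
    lat_value, lat_dir, lon_value, lon_dir = text.split()
    latitude = float(lat_value) * (-1 if lat_dir.upper() == "S" else 1)
    longitude = float(lon_value) * (-1 if lon_dir.upper() == "W" else 1)
    return latitude, longitude

print(to_signed_degrees("37.832133 N 122.255525 W"))
```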
What happened? Where are we?
If you zoom in far enough—and assuming you entered the coordinates
correctly—you should find yourself looking at a sports field somewhere in
Oakland. Look around—what else can you identify? Buildings? Tennis
courts? Swimming pools? Parking lots? Streets?
To help identify where we are, let’s turn on a data layer that might be able
to tell us:
In the Layers pane, check the box for Places. Wait a moment while
Google Earth finds things and adds them to the map.
You should see some icons appear. You can hover the mouse over an
icon to view its label, or click on the icon to call up an information
bubble. See if you can identify where this sports field is located.
Now that you know where the sensor node is located, let’s stick a thumbtack
in the map so that we can find it easily the next time we need to:
Click on the “thumbtack” button found in the top menu icon bar. This
brings up a thumbtack and sticks it on the map, and also a properties
window that sets that thumbtack’s properties: location, name, size,
appearance, and others.
While the properties window is open, a box around the thumbtack will
flash; this means the thumbtack is being edited. You can drag the
thumbtack to any spot you desire while you are editing it—so go ahead
and drag it where you want it. Or, you can just enter the coordinates
in the Latitude and Longitude boxes in the properties window. Either
method is fine—but if you already know the sensor node’s exact
geographic coordinates, it’s probably easier just to type them into the
properties window.
Type a name for your thumbtack in the Name field—for this practice,
call it something like “BEACON Sensor Node X”.
If you like, you can change the icon from the yellow thumbtack to
something else. To do this, click on the button to the right of the Name
field and select the icon you want to use.
When you’re done editing the thumbtack, click OK.
Now your thumbtack marker is permanent (until you delete it or edit its
properties). Notice that the marker appears automatically under My Places
in the Places pane. From here, you can move it into another folder you
create, if you like, such as a folder called “BEACON Sensor Nodes” or
something like that.
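If you have many sensor nodes to mark, one alternative to placing thumbtacks by hand is to write a small KML file, the placemark format Google Earth can open. The Python sketch below builds a minimal one; the function name placemark_kml is hypothetical, and the result is a bare-bones file rather than anything Google Earth itself would export.

```python
import xml.etree.ElementTree as ET

def placemark_kml(name, latitude, longitude):
    """Build a minimal KML document holding a single placemark."""
    kml = ET.Element("kml", xmlns="http://www.opengis.net/kml/2.2")
    document = ET.SubElement(kml, "Document")
    placemark = ET.SubElement(document, "Placemark")
    ET.SubElement(placemark, "name").text = name
    point = ET.SubElement(placemark, "Point")
    # KML lists longitude first, then latitude, then altitude.
    ET.SubElement(point, "coordinates").text = f"{longitude},{latitude},0"
    return ET.tostring(kml, encoding="unicode")

text = placemark_kml("BEACON Sensor Node X", 37.832133, -122.255525)
print(text)
```

Saving the printed text to a file ending in .kml and opening it in Google Earth should drop a marker at the sensor node's coordinates.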
Explore the node’s surroundings a bit. Zoom in and zoom out, drag the map
around, tilt, rotate—whatever; just explore!
If there’s something you want to investigate up close, you might be able to
use Street View. Most city streets in the United States can be viewed with
Street View. Let’s try it.
If you’re looking at the view above BEACON Sensor Node X as shown
in the picture above, let’s say we want to see what the surroundings
look like from street level, like the street in the lower right corner.
(Which street is that?) All you have to do is click and drag the little
“person” icon that’s part of the navigation controls in the upper right
corner and drop it onto the spot you want to view from. If there is
a Street View image available, you’ll be dropped right into a
surrounding panoramic picture taken from that spot. Try it!
(Hint: While you’re dragging the little person icon around the map,
blue lines will appear over all streets where Street View imagery exists.)
BEACON Sensor Node Site Profile
Google Earth is a great tool for exploring the area around each BEACON
sensor node, in search of possible GHG sources and sinks. Not only can you
see photographic imagery that reveals streets, residential areas, tree
coverage, industrial areas, and nearby bodies of water, the data layers can
provide specific information about features in the area.
Another piece of information Google Earth can provide us regarding a sensor
node site is the altitude of the location. In the main Google Earth image
view, look to the bottom to find the coordinates (latitude and longitude) and
the altitude of the location the mouse cursor is pointing at.
You will explore the regions surrounding sensor node sites in more detail in
one of the investigation exercises.
Visualizing Data With Tableau Software
About Tableau
Tableau data visualization software is a commercial product. It is quite easy
to use to manipulate data and create compelling visualizations. The Tableau
home website is at www.tableausoftware.com. You may take a look at
Tableau by downloading a free 15-day trial version, or using the free on-line
Tableau Public version.
Tableau for Teaching
Though Tableau is commercial software, and requires a purchased license for
most uses, there is a program called Tableau for Teaching in which a limited
term free license may be obtained for use in public secondary schools,
colleges, and universities. For more information on eligibility, terms of use,
and how to qualify for a free TfT classroom license, go to
www.tableausoftware.com/academic.
Advantages and Limitations
The TfT license provides the full-function Tableau desktop software for a
teacher and classroom of students for a single teaching term (quarter or
semester). Its use is limited to in-class teaching instruction and student
project work.
Tableau Public
Tableau Public is a web-based version of the Tableau visualization software.
To get started, go to www.tableausoftware.com/products/public.
Advantages and Limitations
Tableau Public is free to everyone, may be used for an indefinite period, and
has most of the features of the desktop version. Workbooks are stored on a
Tableau web server and not your computer.
The amount of data that may be imported into a single Tableau Public
workbook is limited to 100,000 rows, and the total amount of workbooks
that may be stored on the web server is limited to 50 megabytes. This
should be ample space for most student projects.
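Before importing a large file, it can be handy to count its rows against the 100,000-row limit quoted above. Here is one way to sketch that check in Python; the function name count_data_rows is hypothetical, and the in-memory sample stands in for a real downloaded file.

```python
import csv
import io

ROW_LIMIT = 100_000  # the per-workbook row limit quoted above

def count_data_rows(handle):
    """Count rows after the header line in an open CSV file."""
    reader = csv.reader(handle)
    next(reader, None)  # skip the header row
    return sum(1 for _ in reader)

# In-memory stand-in for a real file opened with open(path, newline="").
sample = io.StringIO("Time,Final-CO2 (ppm)\n5/20/2011 17:30,483.1882\n")
rows = count_data_rows(sample)
print(rows, "rows; within limit:", rows <= ROW_LIMIT)
```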
Basics
Installing
All installation directions can be found on the Tableau website at the links
given earlier, whether you choose the Desktop or Public versions. This
training guide will focus on the free Tableau Public web-based version,
though the desktop version functions very similarly.
Starting Up
Below is a screenshot of the Tableau Public startup page—what you see
when you click on the icon created when you installed Tableau Public. This
is where you create new workbooks or open workbooks you have already
made. The graphical icon labeled “TestBook” in this screenshot is an
example of an existing workbook that I created on my account. Below the
Open Data and workbooks icons section are helpful links to training,
templates, examples and tutorials.
This training will lead you through some of the basic actions and functions
you need to import, work with, and produce visualizations of data. We will
not attempt to cover every menu command and feature Tableau contains;
Tableau’s Help guide is very useful for answering questions we will not
cover.
Preparing Data for Import
You may import data from different sources, such as spreadsheets and
databases. The Desktop version of Tableau can handle many data sources,
while the Public web-based version can only import from “flat” data sources,
like simple spreadsheets and formatted text files. We’ll be working with flat
data files: downloaded csv (“comma-separated values”) files or Excel
spreadsheets that you have put together.
After downloading the raw data, assemble the selected data you want to
work with into a single Excel spreadsheet. The easiest way to do this is to
copy and paste from the source file to a blank spreadsheet. Make sure that
your data contains columns for latitude and longitude, and that you copy
the coordinates for each data collection site into the proper files before
assembling your master file. The master file can contain data from some or
all of the data source sites. When you are done, the final file should contain
continuous columns of data from all desired sites.
The table below is a sample compiled data spreadsheet, with five data points
from each of two different collection sites. Your table, depending on the
scope of your project, may contain thousands of rows from dozens of
collection sites!
Name               Time             Relative Humidity (%)  Temperature (°C)  Final-CO2 (ppm)  Latitude  Longitude
Botanical Gardens  5/20/2011 17:30  66.58                  16.74             483.1882306      37.87515  -122.23861
Botanical Gardens  5/20/2011 17:32  66.16                  16.46             479.1575977      37.87515  -122.23861
Botanical Gardens  5/20/2011 17:34  67.06                  16.54             479.1575977      37.87515  -122.23861
Botanical Gardens  5/20/2011 17:36  66.43                  16.56             479.1575977      37.87515  -122.23861
Botanical Gardens  5/20/2011 17:38  66.43                  16.63             481.1729141      37.87515  -122.23861
Cory Hall          5/20/2011 17:30  54.99                  19.48             463.9747733      37.87523  -122.25755
Cory Hall          5/20/2011 17:32  54.58                  19.66             458.7626394      37.87523  -122.25755
Cory Hall          5/20/2011 17:34  54.37                  19.64             459.8050662      37.87523  -122.25755
Cory Hall          5/20/2011 17:36  54.24                  19.62             458.7626394      37.87523  -122.25755
Cory Hall          5/20/2011 17:38  54.21                  19.58             462.9323465      37.87523  -122.25755
…                  …                …                      …                 …                …         …
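Copying and pasting works fine for a few sites. For many sites, the same assembly could be scripted. The Python sketch below tags each site's rows with its name and coordinates and writes one combined csv; the helper name add_site_columns is hypothetical, and only two columns of one row per site are shown to keep it short.

```python
import csv
import io

def add_site_columns(rows, name, latitude, longitude):
    """Return copies of `rows` with the site name and coordinates attached."""
    return [
        {"Name": name, **row, "Latitude": latitude, "Longitude": longitude}
        for row in rows
    ]

# One row per site, taken from the sample table; real files would be read
# with csv.DictReader instead of typed in.
garden = [{"Time": "5/20/2011 17:30", "Final-CO2 (ppm)": 483.1882306}]
cory = [{"Time": "5/20/2011 17:30", "Final-CO2 (ppm)": 463.9747733}]

master = (
    add_site_columns(garden, "Botanical Gardens", 37.87515, -122.23861)
    + add_site_columns(cory, "Cory Hall", 37.87523, -122.25755)
)

# Write to an in-memory buffer here; a real script would open a .csv file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(master[0]))
writer.writeheader()
writer.writerows(master)
print(buffer.getvalue())
```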
Creating a New Workbook
To create a new workbook from the main start page (shown earlier), either
click on the Open Data button, or select “File > New” from the menu bar.
You may also create a new workbook, or open an existing one, from the File
menu of any workbook.
Below is a screenshot of a Tableau workbook. This is what it looks like
“empty,” before you import any data or create any visualizations. Refer to
this picture when going through the overview below.
Importing Data
Once your Excel spreadsheet of assembled data is finished and saved to
your computer, you can import it into your empty workbook by choosing:
Data > Connect To Data > Microsoft Excel > OK
An Excel Workbook Connection dialog window will appear.
Step 1: Browse for your Excel data file and Open it
Step 2: Select Single Table, then the Sheet number of your data within
the Excel file
Step 3: Choose Yes or No depending on whether the first row of data
in your spreadsheet contains the data field header names
Click OK
Your data should proceed to load. When it is
done, you will see items appear in the
Dimensions and Measures windows on the left
side of your workbook.
Dimensions and Measures
When the data is loaded into your workbook,
Tableau assigns the data fields as either
Dimensions or Measures. In the language of
conventional data graphing, Dimensions are
independent variables and Measures are
dependent variables.
So, Measures are the values that are functions
of one or more Dimensions.
Tableau usually assigns fields containing time,
date, and text as Dimensions and fields
containing numeric values as Measures.
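The rule of thumb above can be imitated with a few lines of Python. This is a simplified stand-in for the behavior just described, not Tableau's actual logic: the function name guess_role is hypothetical, and it simply calls a column a Measure when every value parses as a number.

```python
def guess_role(values):
    """Guess "Measure" for all-numeric columns, otherwise "Dimension".

    A simplified stand-in for the behavior described above, not Tableau's
    actual logic.
    """
    def is_number(text):
        try:
            float(text)
            return True
        except ValueError:
            return False

    return "Measure" if all(is_number(v) for v in values) else "Dimension"

columns = {
    "Name": ["Botanical Gardens", "Cory Hall"],
    "Time": ["5/20/2011 17:30", "5/20/2011 17:32"],
    "Temperature (°C)": ["16.74", "19.48"],
    "Latitude": ["37.87515", "37.87523"],
}
for header, values in columns.items():
    print(header, "->", guess_role(values))
```

Note that Latitude comes out as a Measure under this rule, because it is just a column of numbers.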
You can reassign data fields as you need to.
For example, in the data I have loaded, the
latitude and longitude fields have been assigned
to Measures, but I would rather treat the
location coordinates as independent variables,
and reassign them as Dimensions.
To do this, I can either right-click on each of
them and select Convert to Dimension, or I can
simply click and drag them from the Measures
window to the Dimensions window. When I’m
done, both Latitude and Longitude fields will
appear in the Dimensions window.
Columns and Rows
Now, take a look at the elements of the workbook labeled Columns and
Rows, and the table-like graphic with the “Drop field here” labels. These are
where you place your data when you want to create a graph.
Data placed in Columns will be assigned to the horizontal axis of the graph,
and data placed in Rows is applied to the vertical axis. Don’t confuse the
Columns and Rows in the Tableau graphing area with the columns and rows
of data in your original spreadsheet.
To assign any data field from the Dimensions and Measures windows to the
visualization, you can drag and drop the field into the desired location:
either the bars labeled Columns and Rows above, or directly into one of the
“Drop field here” spaces in the graphic.
Also, you can let Tableau help you decide by simply double clicking on a data
field; Tableau will then place that field in the visualization where it thinks it
should go. This is not always where you might want it, but since moving
data around is as easy as dragging and dropping, it’s simple to make
changes and experiment with many different arrangements. You can learn a
lot simply by throwing data fields around the page! Try it….
Exercise: Quick Map Plot
I previously assigned my Latitude and Longitude fields to the Dimensions
window, making them independent variables.
Double-click on the Latitude Dimension
Double-click on the Longitude Dimension
See the screenshot below to see what happened when I did it and see if you
got the same result.
My results: The Longitude Dimension was placed in the Columns field, the
Latitude in the Rows field, and a map with five blue dots appeared, along
with a menu for Map Options.
Tableau recognized the nature of the Latitude and Longitude fields and
automatically assigned them to the axes to make map-sense: with
longitude along the horizontal axis and latitude in the vertical. It also
automatically produced a geographical plot of the locations.
See what happens if you drag the Latitude to Columns and the Longitude to
Rows (or, click the Swap button along the top menu bar). The map goes
away and is replaced by a more conventional looking graph. So, the
geographical display only appears automatically if the coordinate fields are in
the appropriate axes.
Marks
The Marks window controls the nature of the plot symbols in the graph (in
our present example, the map).
You can control the color and the size of the plot points by clicking in the
Color field and dragging the bar below the Size field, respectively. You can
change the transparency or the color of the border or even add a halo to the
plot symbols using the dropdown menu next to Color. And you can set the
type of graph (line, point, bar, pie, etc.) using the dropdown at the top of
the Marks window.
But where Tableau really becomes interesting is when you manipulate plot
color and size and labels by assigning other data fields to those qualities.
Let’s do an exercise to see what I mean.
Exercise: Data-Defined Plot Symbols
In the Measures window—the dependent variables that I loaded from my
master spreadsheet—there are Final CO2 concentration, Relative Humidity,
and Temperature. To express these values on the geographical map that
was created in the previous exercise, we can assign one or more of them not
to one of the plot axes, but to a plot symbol quality—like color or size, or
even as a text label.
So, let’s make some assignments. I’ve decided that I want temperature to
be represented by the color of the circles, and the size of the circles to
represent the relative humidity.
Drag and drop the Relative Humidity field into the Size bubble of the
Marks window
Drag and drop the Temperature field into the Color bubble
What happened? Tune into the next section, Measures, to find out.
Measures
In our last episode, we dragged Relative Humidity and Temperature into the
Size and Color bubbles in the Marks window. This is what happened:
While we don’t see much, if any, difference in the sizes of the plot symbols,
the colors do show some variation: different shades of green (or gray, if you
are reading a black and white print).
Also, two new windows showed up under the Marks window: one labeled
“SUM(Relative Humidity),” with a stack of partial circles and associated
numbers, and the other “SUM(Temperature)” with a numbered color scale.
I’ll mention here that you can change the range of sizes and colors by
clicking the dropdown menu at the top of each of these new windows, but
first I’d like to focus on a matter of grave importance.
If you point the mouse at one of the plot symbols on the map and hover
there, an information window appears that reveals all of the data parameters
of that point—latitude, longitude, relative humidity, and temperature. Take
a look at the values of the last two—the dependent variables.
Knowing what the value of any single measurement of either of those
parameters should be, do the reported values for these two make sense?
The key to the puzzle is in the “Measure” of the data being viewed (not to
be confused with the term Measure used earlier to mean dependent variable).
Since the plot point for each geographic location contains all the data points
for each parameter you loaded, we have to define how those ranges of data
points are handled before being plotted.
Tableau’s default Measure for a range of multiple values is to Sum them—
add them all up. That’s why those data fields, now in the Color and Size
bubbles, show the function
SUM(parameter).
You can choose a different Measure (data handling function) by clicking
the dropdown menu at the right end of each of the MEASURE(parameter)
icons. Click the dropdown arrow, then click Measure.
You will get a menu of the available functions: Sum, Average, Median,
Minimum, Maximum, Deviations, Counts, etc. When you select one,
that function will be applied to the data.
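To see why a summed value looks wrong for a single site, the difference between the two functions can be worked through in Python, using the five Botanical Gardens rows from the sample table earlier in this guide. The variable names are just for illustration.

```python
from statistics import mean

# Relative humidity (%) and temperature (°C) for the five Botanical Gardens
# rows in the sample table earlier in this guide.
humidity = [66.58, 66.16, 67.06, 66.43, 66.43]
temperature = [16.74, 16.46, 16.54, 16.56, 16.63]

for label, values in (("Relative Humidity", humidity), ("Temperature", temperature)):
    total = sum(values)     # what a SUM of the readings gives
    average = mean(values)  # what an AVERAGE of the readings gives
    print(f"{label}: SUM={total:.2f} AVG={average:.2f}")
```

With only five readings the summed humidity is already well above 100%, which no single measurement could be, while the average stays in the range of the individual readings.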
Exercise: Change a Data Handling Function
As an exercise, let’s change the function for both Temperature and Relative
Humidity to something other than SUM. For now, I’m going to choose
AVERAGE.
Click the dropdown menu icon at the right side of the SUM(Relative
Humidity) data icon
Click Measure on the menu that appears
Click Average from the next menu
Do the same thing for the SUM(Temperature) data icon
Now, back to the appearance of the plot we got. First, I’ll mention that in
the Map Options menu to the right of the workbook, I chose to select Streets
and Highways as well as Place Names, so these items have been added to
my base map—in case you were wondering where that information came
from. There will be more discussion of Map Options later.
Exercise: Customize Data Point Appearance
Let’s customize the color scale of the temperature data a bit:
Click the dropdown at the top of the Temperature color scale
window and select Edit Colors. Here’s what I get:
I can change the color of the existing palette by clicking the green square
and choosing a different color. I can also choose a different color palette by
clicking the dropdown menu under Palette.
Click the Palette dropdown menu
Choose the color palette labeled “Red-Blue Diverging”
You may notice that in the color scale preview, Tableau has assigned the red
end to the lower temperature values and the blue end to the higher
temperatures. This isn’t as intuitive as I’d like, so:
Check the box labeled “Reversed”
Clicking OK, what do we get? If you’re following along, you will see the
change—but I’m not going to waste space with another screenshot until
we’ve dealt with the plot symbol sizes, right now:
Click the dropdown menu at the top of the Relative Humidity symbol
sizes window and click Edit Sizes. Here’s what we get:
First, you can select how the sizes of the plot symbols vary through the top
dropdown: automatically, by range, or from zero. The option that gives the
greatest control over the range of sizes is “by range.”
Select “By range”
Now, the “Mark size range” slider has two controls, one to set the smallest
symbol size and the other the largest.
Drag the Smallest and Largest size sliders around until you are satisfied
with the size range
Click Apply when you want to see the change
What I settled on is shown in the picture to the right.
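One way to picture what a smallest-to-largest size range does is as a straight-line mapping from data values to symbol sizes. The Python sketch below assumes exactly that; Tableau's actual scaling is not documented in this guide, so the function name mark_size, the linear rule, and the numbers are all assumptions for illustration.

```python
def mark_size(value, low, high, smallest, largest):
    """Linearly map `value` in [low, high] to a size in [smallest, largest]."""
    if high == low:
        return smallest  # every value is the same, so use one size
    fraction = (value - low) / (high - low)
    return smallest + fraction * (largest - smallest)

# Hypothetical humidity range and symbol sizes, for illustration only.
print(mark_size(60.0, low=50.0, high=70.0, smallest=4.0, largest=20.0))
```

Pulling the two slider controls farther apart corresponds to widening the gap between smallest and largest, which makes differences between sites easier to see.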
Notice first that the change I made to the Measure data handling
function is shown: AVG(parameter). So, what the map is showing
for these two dependent variables is the average value of the data, and
of course, without further manipulation, these would be the averages of
all the data points in the entire time series.
So now, in a glance, we see the locations of the five sites I have chosen, the
average temperature measured at each site (color), and the average relative
humidity (size). If the data is accurate, what this tells us is the sites
farthest to the east (in the Berkeley Hills) had the highest average humidity
and lowest average temperatures, and the downtown Berkeley and UCB
main campus sites had higher average temperature and lower average
humidity.
Not bad, for starters. We can do a lot more. Keep in mind that the
techniques covered in this example can be applied to all other forms of
graphs.
Exercise: Add a Third Dependent Variable to the Map
As a final exercise for this section, let’s add one more thing to the mix.
Drag and drop the AVG(Relative Humidity) data field from Size to
Label in the Marks window
Drag the Final-CO2 data from the Measures window to the Size bubble
in Marks and change its data handling function from the default SUM
to AVERAGE
Now the meaning of the plot symbol sizes has changed; no longer do they
represent different levels of average humidity, but CO2 concentration.
Humidity, instead, is expressed through numerical labels. We can now look
for geographic dependencies on all three measured dependent variables.
Exercise: Text Labels
The “Label” Mark bubble is also useful for displaying text labels. As a quick
extension of the previous exercise, remove the AVG(Relative Humidity) from
the Label bubble (either by right-clicking it and selecting “Remove” or
dragging it back to the Measures window) and drag in the “Name” item from
the Dimensions window and drop it into the Label bubble. My result:
Georeferencing and Map Plotting
You’re already well into plotting georeferenced data on maps; the preceding
exercises dealt with many of the basics.
The Map Options menu lets you change the color scheme of the base map
with three options: normal, gray, and dark.
You can also add to the base map any of the checkbox items in the list.
Mostly, these items—borders, labels, and street maps—help with the visual
location of data sites, but don’t show the stunning detail of Google Earth.
Used in concert, however, Google Earth and Tableau can do a lot.
Another useful addition to your base map can be found in the Data Layer
dropdown menu, where you can choose from a number of
demographic data maps.
Exercise: Add a Demographic Data Layer to Your Map
Choose Population from the Data Layer dropdown under Map Options
In the expanded window that appears, choose “Block Group” from the
“By” menu
You may also change the color palette through the “Using” menu
Population density is now shown by block groupings.
Time
Let’s throw in another independent variable, or Dimension: Time.
Time data in our uploaded data set contains date and time, down to hours
and minutes.
Like the other data fields, we can drag and drop Time anywhere we like,
even though some arrangements will not make sense. Tableau dares you to
be adventurous.
Exercise: Add Time to the Visualization
With the data still being shown on the map as in the last exercise:
Drag the Time field from the Dimensions window and drop it into Rows
What happens?
What should have happened: a column labeled Year of Time appeared
alongside the map, with the year of the data set inside the column. Not very
useful yet? Let’s keep going….
Hover your mouse over the header “Year of Time”; you should see a
“+” symbol appear
Click the “+”
What happened?
A second column, labeled “Quarter of Time,” should have appeared,
displaying the quarter of the year (Q1, Q2, Q3, or Q4) in which the
data was taken. The Quarter may not be very useful to our
visualization, but let’s keep going….
Hover the mouse over Quarter of Time and click its “+”; a “Month
of Time” column appears
You can keep expanding the
time by clicking the “+” symbols; Month expands into Day, Day into Hour,
Hour into Minute, as long as those time divisions are present in the data set.
You may also collapse the breakout by clicking on the “-” symbol of an
expanded time division bubble.
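The drill-down described above can be pictured as splitting one timestamp into nested parts. The short Python sketch below is only an illustration of that idea, not a description of how Tableau works internally; the function name and the sample timestamp are hypothetical.

```python
from datetime import datetime


def time_parts(ts):
    """Break a timestamp into the time divisions the drill-down steps through."""
    return {
        "year": ts.year,
        "quarter": (ts.month - 1) // 3 + 1,  # months 1-3 -> Q1, 4-6 -> Q2, and so on
        "month": ts.month,
        "day": ts.day,
        "hour": ts.hour,
        "minute": ts.minute,
    }


# Hypothetical timestamp, chosen only for illustration.
parts = time_parts(datetime(2011, 10, 20, 17, 30))
print(parts)
```

Each click on a “+” corresponds to exposing the next, finer entry in this breakdown.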
When you reach a point where the data in your set is spread across more
than one time division, the data will be divided into multiple rows, each row
representing the smallest time division.
If you find that you don’t want to display every one of the available time
divisions on your graph, there’s an easy way to show only the ones you’re
interested in. Here’s how:
First, collapse the time bubbles back to years by clicking the “-”
symbol in the YEAR(Time) bubble
Right-click on the YEAR(Time) bubble and select from the pop-up
menu which time division you want displayed. The choices are Year,
Quarter, Month, Day, and More; More contains a submenu with Hour,
Minute, and Second. For this exercise, select Day.
You will end up with a single time bubble, DAY(Time), and a single
time column in your plot: Day of Time. You may change this single
column to a different division in the same way (right-click it), or
you may expand to HOUR by clicking “+” as before.
See the picture for an example
result. This portion of the
example shows the mapped data
for hours 17:00 and 18:00 on the
20th day of the month.
Recall that our data Measures were
set to AVERAGE of temperature, humidity, and CO2 levels, so what each map
plot in this example shows us is the average of these for each hour.
If you expand further, to MINUTE(Time), each row will show a map plot of
the average data values for each minute.
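As a rough sketch of what "the average for each hour" means, the Python below groups readings by (day, hour) and averages each group. The readings are invented for illustration, and the function name is hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical readings: (timestamp, CO2 in ppm). Invented values, not BEACON data.
readings = [
    (datetime(2011, 10, 20, 17, 5), 450.0),
    (datetime(2011, 10, 20, 17, 35), 470.0),
    (datetime(2011, 10, 20, 18, 10), 440.0),
]


def hourly_average(readings):
    """Average the values that share the same (day of month, hour)."""
    buckets = defaultdict(list)
    for ts, value in readings:
        buckets[(ts.day, ts.hour)].append(value)
    return {key: mean(values) for key, values in buckets.items()}


hourly = hourly_average(readings)
for (day, hour), avg in sorted(hourly.items()):
    print(f"day {day}, hour {hour}: average CO2 = {avg:.1f} ppm")
```

Grouping by (day, hour, minute) instead would give the per-minute averages described for MINUTE(Time).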
Use the scrollbar to the right to scroll through all the map plots in the data
range.
General Graphs
And now for something completely different.
We’ve had some practice with plotting georeferenced data on maps. If you
recall, we got started on map plotting when we double-clicked the
Latitude and Longitude Dimensions and saw Tableau place them in the Rows
and Columns of our plot and automatically treat those Dimensions as
geographic locations, creating a map.
But if you also recall, when we deliberately dragged the Latitude to Columns
and Longitude to Rows, Tableau did not create a geographic map, but a
simple graph. That, in general, is what happens when you drag data into
your plot—the automatic creation of a geographical map was a special case.
If you start with a blank workbook and start clicking on data values, you can
see where Tableau decides to put them, or
you can drag the data where you like:
Columns, Rows, Label, Color, and Size.
Tableau will [attempt to] plot your data as
you choose.
Show Me
You can change the style of your
visualization and how the data is
represented manually, as we’ve been
doing. You can also change an existing
visualization to another style quickly by
using the Show Me button.
Show Me contains a library of pre-defined
data visualizations. When you click Show
Me, a menu will appear showing the library, some of which may be grayed
out. Tableau lets you choose visualizations it thinks will make sense based
on the data you’ve already placed in your current visualization.
Here’s the fun part: try some! See what you get. Some may not work out
perfectly, but you may stumble upon one that really pops—and maybe
shows you relationships or patterns in your data that you didn’t see before.
Hovering over one of the library icons brings up a quick description of the
visualization type.
Exercise: A Quick Plot Style Change Using Show Me
Choose the “Line (Discrete)” icon (first column, third row of the library)
This is what I got:
Notice how Tableau arranged the data: Date/Time went into Columns, all
three measured parameters went into Rows, and Latitude and Longitude
were thrown into a box called Level of Detail.
Each measured parameter is graphed versus time in its own row, and there
is a data curve for each of the five locations in my data set. Tableau also
placed the hours scale at the top and the minutes scale on the bottom.
The Level of Detail window allows you to separate the data in a series by
another data parameter, here each Latitude-Longitude pair. This has
produced a separate curve for each data site location.
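One way to think about Level of Detail is as a grouping key: rows that share the same Latitude-Longitude pair are collected into one curve. The Python sketch below illustrates that idea with invented coordinates and values; the function name is hypothetical and this is not a description of Tableau's internals.

```python
from collections import defaultdict

# Hypothetical rows: (latitude, longitude, minutes since start, CO2 in ppm).
# Coordinates and values are invented for illustration.
rows = [
    (37.87, -122.26, 10, 452.0),
    (37.88, -122.24, 0, 441.0),
    (37.87, -122.26, 0, 450.0),
    (37.88, -122.24, 10, 443.0),
]


def split_into_series(rows):
    """Group rows by (latitude, longitude) so each site becomes its own curve."""
    series = defaultdict(list)
    for lat, lon, minutes, value in rows:
        series[(lat, lon)].append((minutes, value))
    for points in series.values():
        points.sort()  # order each curve by time
    return dict(series)


curves = split_into_series(rows)
for site, points in curves.items():
    print(site, points)
```

With five sites in the data set, the same grouping would yield five curves.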
Continuing to play around with the data:
Drag the Longitude field from Level of Detail and drop it into Color in the
Marks window
What did this do? Each data curve still represents one of the five data
collection sites, but they are now color coded by Longitude.
There are many other Show Me’s to try out, and many nuances of settings and
functions in Tableau that you might learn given enough time. But for now,
before we use what we’ve learned to proceed with some investigations,
there’s one last tool you should be familiar with: Filters.
Filters
You can place a Filter on any data parameter and specify exactly how you
want that data to be filtered—what values or ranges of values in a data
series to include in the visualization.
Exercise: Apply a Data Filter
First, let’s set up our earlier visualization showing the data plotted on the
map. You can start that from scratch if you like, or rearrange the
visualization from the last exercise to get there:
Drag Longitude to Columns
Drag Latitude to Rows
Drag Time to Rows (make sure to collapse time into a single bubble
before dragging—or find out what happens when you drag DAY(Time)
and leave HOUR(Time) behind)
Drag AVG(Relative Humidity) to Label
Drag AVG(Temperature) to Color
Drag AVG(CO2) to Size
You should now be back at the earlier map
we visualized in the exercise from the Time
section, showing plotted maps for every hour
of the day that you can scroll through from
top to bottom.
Now, let’s say that we’re only interested in
seeing the data taken at a certain hour
of the day or with values within a certain
range.
Let’s apply a filter to HOUR(Time) so that we
only see plots for the hour of 12 (noon in the
24 hour time format used in our data):
Right-click on the HOUR(Time) bubble
Select Filter from the pop-up menu
In the General tab of the window that pops up, click “None”
to uncheck all the hours
Check the box
next to 12
Click OK
You should now be
looking at maps
showing your data for
only the hour of 12 for
each day in the data
set, as in the picture
below.
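In plain terms, the hour filter keeps only the rows whose time falls in the chosen hour and discards the rest. A minimal Python sketch of that selection follows; the readings are invented and the function name is hypothetical.

```python
from datetime import datetime

# Hypothetical readings: (timestamp, CO2 in ppm). Invented values, not BEACON data.
readings = [
    (datetime(2011, 10, 20, 11, 55), 455.0),
    (datetime(2011, 10, 20, 12, 5), 462.0),
    (datetime(2011, 10, 21, 12, 40), 458.0),
    (datetime(2011, 10, 21, 13, 0), 451.0),
]


def keep_hour(readings, hour):
    """Keep only the readings taken during the given hour (24-hour clock)."""
    return [(ts, value) for ts, value in readings if ts.hour == hour]


noon = keep_hour(readings, 12)
for ts, value in noon:
    print(ts.isoformat(), value)
```

Readings from hour 12 survive on every day in the set, which is why one map per day remains after the filter.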
Take this one step
further by adding a
Filter for CO2. Let’s only plot CO2 data with values greater than 460 ppm:
Right-click on AVG(CO2)
Select Filter
From the window that pops up, click “At Least” and enter “460” in the
text field at the lower (left) end of the slide scale (or drag the slider to
460)
Click OK
I don’t know what you got, but on my map, some of the data points
vanished! Those, presumably, are the ones whose average values were
less than 460 ppm.
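Because the filter was placed on AVG(CO2), it appears to act on the averaged value behind each plotted point rather than on individual readings. The Python sketch below illustrates that distinction under that assumption, using invented readings and a hypothetical function name: a site can contain a reading above 460 ppm and still disappear if its average falls below the threshold.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical readings: (site name, CO2 in ppm). Invented values, not BEACON data.
readings = [
    ("Site A", 455.0),
    ("Site A", 475.0),
    ("Site B", 440.0),
    ("Site B", 470.0),
]


def sites_with_avg_at_least(readings, threshold):
    """Average CO2 per site first, then keep the sites whose average meets the threshold."""
    by_site = defaultdict(list)
    for site, co2 in readings:
        by_site[site].append(co2)
    averages = {site: mean(values) for site, values in by_site.items()}
    return {site: avg for site, avg in averages.items() if avg >= threshold}


kept = sites_with_avg_at_least(readings, 460.0)
print(kept)
```

Here the second site is dropped even though one of its readings is 470 ppm, because its average is 455 ppm.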
Recipes
Just to give you some ideas, here are a few quick visualization “recipes” I
cooked up. In case you’re reading a black and white print, be warned, these
look much better in color....
Recipe 1: Carbon Fingerpainting
Drag Time to Columns
Drag AVG(CO2) to Rows
Drag AVG(Temperature) to Color
Drag AVG(Relative Humidity) to Size
Edit Color scale with blue for cooler and red for warmer temperature
Edit Size scale to allow ease of seeing differences across the humidity
range
Edit Time axis limits to show only one full day of data
Look for trends in CO2 levels and how they correlate with weather conditions
and time of day.
Keep in mind that the plotted data is for all data collection sites, so trends
and other behaviors may be regional effects, or arise from changes at
specific sites whose locations are not visualized, and so contribute
anonymously.
Recipe 2: Blowing Bubbles
Drag Time to Columns
Right-click Time and select DAY(Time)
Drag AVG(Relative Humidity) to Rows
Drag AVG(Temperature) to Color
Edit Colors with blue showing coolest and red showing warmest
temperatures
Drag AVG(CO2) to Size
Edit Size for easy viewing of CO2 differences
Keep in mind that the data plotted is from all collection sites, averaged over
an entire day.
Recipe 3: Coloring Within the Lines
Drag Time to Columns
Drag AVG(CO2) to Rows
Drag AVG(Temperature) to Color
Edit Color scale with blue as cooler and red as warmer temperature
Edit vertical scale (right click on it and Edit) and set scale range to
400-500 ppm
Look for a correlation between CO2 levels (data curve) and temperature
(color of data curve).
There is an obvious temperature correlation with time of day, as might be
expected, shown by the repeating color patterns: cold at night, warm during the day.
The measured CO2 levels also have a daily repeating pattern, as well as a
longer term trend.
The question may be asked, are the changing CO2 levels dependent on
temperature, or maybe on one or more other factors with a daily cycle?
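One way to put a number on a question like this is a correlation coefficient. The Python sketch below computes the Pearson coefficient for paired temperature and CO2 values; the numbers are invented for illustration and the function name is hypothetical. Keep in mind that a strong correlation would not by itself show that temperature drives CO2, since both could follow some other factor with a daily cycle.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return covariance / (spread_x * spread_y)


# Hypothetical paired values: temperature (deg C) and CO2 (ppm). Invented data.
temperatures = [12.0, 14.0, 18.0, 20.0]
co2_levels = [470.0, 465.0, 450.0, 445.0]

r = pearson(temperatures, co2_levels)
print(f"Pearson r = {r:.3f}")
```

A value near +1 or -1 means the two quantities rise and fall together (or in opposition); a value near 0 means there is little linear relationship in the sample.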