www.pnas.org/cgi/doi/10.1073/pnas.1816541116

Supplementary Information Appendix: Supplementary Methods

1 Data Sources and Processing

1.1 Weather Data

Data on daily maximum temperature and total daily precipitation for the period 1981-2016 are from the PRISM data set, gridded at 0.25 degree resolution (1). Gridded data are aggregated to the county level (spatial weighting) and then averaged over weeks to give county-by-week observations. We focus on maximum temperature rather than average or minimum temperatures since maximum temperatures occur during the day and are therefore likely more relevant for behavior on social media. We also include data on percent cloud cover and relative humidity, also averaged to the county-by-week level, from the NCEP Reanalysis II (2).

Temperature projections under RCP 8.5 (used in Figure 4) are 40 realizations of the Community Earth System Model (CESM1) Large Ensemble Project (3). Annual temperatures are aggregated to the national level, weighting by 2015 population density from CIESIN (4).
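The two-step aggregation described above (spatial weighting of grid cells to counties, then weekly averaging) can be sketched as follows. The arrays, weights, and 14-day window are invented for illustration; this is not the authors' PRISM processing code.

```python
import numpy as np

# Hypothetical example: aggregate daily gridded Tmax for one county, then to weeks.
# 'cell_tmax' holds daily maxima for 3 grid cells overlapping the county (14 days);
# 'cell_weight' is the fraction of county area in each cell (spatial weighting).
rng = np.random.default_rng(0)
cell_tmax = 20 + 5 * rng.standard_normal((3, 14))   # (cells, days), degrees C
cell_weight = np.array([0.5, 0.3, 0.2])             # must sum to 1

# Area-weighted average across grid cells gives a daily county series ...
county_daily = cell_weight @ cell_tmax              # shape (14,)

# ... which is then averaged over 7-day blocks to give county-by-week values.
county_weekly = county_daily.reshape(2, 7).mean(axis=1)
```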

1.2 Twitter Data

All tweets between March 2014 and the end of November 2016 geolocated within the continental United States were downloaded from the Twitter API (geolocated tweets exclude retweets). Tweets with device locations within the continental US were identified using a bounding box filter. The total sample is 2.18 billion tweets, coming from 12.8 million unique users. The number of geolocated tweets gradually increases over this time period, with the exception of a sharp drop in late 2014, likely associated with a change in Twitter's opt-in policy for geolocating tweets (SI Appendix, Fig. S1). Geolocated tweets are less than 1% of total tweets, and therefore likely represent a non-random sample of both tweets and users (5). Limited information about the demographics of users opting in to geolocation, compared with the Twitter population as a whole, means that any effect of this sampling on our estimates is unclear.

Tweets can be geolocated either by specific coordinates or by a place description (e.g. a coffee shop, city, county, or state). Tweets geolocated only by state would introduce measurement error into our estimates. A review of a random sample of 26 million tweets, however, showed 92% of tweets were geolocated using either exact coordinates (45.6%) or place descriptions more precise than the county level (46.4%). Based on the fraction of tweets with coordinates at state centroids, we further estimate that the fraction of the remaining 8% of tweets geolocated by state is small (less than 1%), meaning geolocation measurement error is unlikely to substantially affect our results.

Tweets discussing weather were identified using a simple bag-of-words approach: if a tweet contained one of a list of words (given below), it was classified as a 'weather tweet'. A total of 60.1 million weather tweets were identified, representing 2.8% of the sample. We tested our classification using a manual classification of 6,000 tweets. We are particularly concerned with classification accuracy that varies systematically with the variation used to identify parameters in the regression analysis (i.e. the residual variation in temperature after regressing on all control variables and fixed effects, shown graphically in SI Appendix, Fig. S2). If classification accuracy is systematically different for tweets about unusually hot temperatures compared to unusually cold temperatures, then this could bias our coefficient estimates. In contrast, classification errors that are uniform across the sample, conditional on regression fixed effects and controls, will add noise but not bias to our estimation.

Therefore, we used a stratified sampling scheme to identify tweets for validation. We first identify county-weeks associated with unusually hot or cold temperatures, conditional on all fixed effects and controls, by identifying county-weeks in the top and bottom 2.5% of the residual distribution from a regression of county-week temperature on baseline temperature, all control variables, and fixed effects. The county-weeks in the tails of this residual distribution are those with the largest influence on the estimation of the effect of temperatures and temperature anomalies on posting about the weather (Supplementary Methods, Section 2). Therefore, contrasting the classification accuracy for tweets from county-weeks in the hot and cold tails of this distribution allows us to identify any systematic errors that would bias estimation of the effect we are interested in. This is therefore the focus of our validation exercise.
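The tail-selection step above can be sketched as follows. The residuals here are synthetic stand-ins for the regression residuals described in the text; this is not the authors' estimation code.

```python
import numpy as np

# Illustrative sketch: select county-weeks in the top and bottom 2.5% of the
# temperature-residual distribution for classifier validation.
rng = np.random.default_rng(1)
residuals = rng.standard_normal(100_000)  # stand-in for residual temperature
                                          # after removing controls/fixed effects

lo, hi = np.quantile(residuals, [0.025, 0.975])
cold_tail = np.flatnonzero(residuals <= lo)   # unusually cold county-weeks
hot_tail = np.flatnonzero(residuals >= hi)    # unusually hot county-weeks
```

Tweets for manual validation are then drawn from the county-weeks indexed by `cold_tail` and `hot_tail`.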

Using this sample of high-leverage county-weeks, we randomly selected 3,000 tweets each from the set of hot and cold county-weeks, evenly divided between those we classified as weather tweets and those we classified as not about weather. This sample was classified manually into weather / not-weather tweets using workers on Amazon Mechanical Turk. Each worker classified 150 unique tweets and each tweet was classified by 3 different workers (for a total of 18,000 classifications).

SI Appendix, Table S1 shows the results of this validation. Although the fraction of false positives in our automated classification is high (~46%), there is no evidence for systematic differences in classification accuracy for hot vs. cold temperatures. This suggests that classification errors should not strongly bias our results, except in that they introduce measurement error and so may bias results towards zero, meaning we would be reporting under-estimates of the true effect. The false negative fraction is negligible (<0.5%) and is the same at both tails of the distribution.
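The accuracy figures quoted above can be reproduced directly from the validation counts reported in SI Appendix, Table S1:

```python
# Classification accuracy implied by the validation counts in SI Appendix, Table S1
# (automated vs. manual labels for the hot- and cold-anomaly samples).
def rates(tp, fp, fn, tn):
    false_pos = fp / (tp + fp)  # share of automated 'weather' tweets not about weather
    false_neg = fn / (fn + tn)  # share of automated 'not weather' tweets about weather
    return false_pos, false_neg

hot_fp, hot_fn = rates(tp=806, fp=694, fn=6, tn=1494)
cold_fp, cold_fn = rates(tp=811, fp=689, fn=5, tn=1495)

print(round(hot_fp, 3), round(cold_fp, 3))  # ~0.46 in both tails
print(round(hot_fn, 3), round(cold_fn, 3))  # <0.005 in both tails
```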

1.2.1 Sentiment Analysis

We use sentiment analysis techniques to measure the emotional content of tweets. We focus on two sentiment analysis tools in particular: VADER (Valence Aware Dictionary and sEntiment Reasoner) and LIWC (Linguistic Inquiry and Word Count). Both of these tools are lexicon-based, meaning that they consist of a dictionary with words scored as either positive, negative, or neutral. Sentiment analysis is conducted by averaging the scores of the words in the tweet that correspond to the lexicon of each tool. VADER is specifically designed to analyze text from social media (6), while LIWC provides a more general dictionary built from the evaluations of human judges (7). VADER additionally includes a rule-based component that modifies the estimated sentiment of words based on their relationships to other words in the phrase, which improves precision (6).
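The lexicon-averaging idea can be sketched with a toy example. The mini-lexicons and scores below are invented for illustration; the real VADER and LIWC dictionaries are far larger and scored differently.

```python
# Toy illustration of lexicon-based sentiment scoring (hypothetical mini-lexicons).
POSITIVE = {"great": 1.0, "love": 1.0, "happy": 0.8}
NEGATIVE = {"awful": 1.0, "hate": 1.0, "sad": 0.7}

def sentiment(tweet):
    """Average positive and negative word scores over all words in the tweet."""
    words = tweet.lower().split()
    pos = sum(POSITIVE.get(w, 0.0) for w in words) / len(words)
    neg = sum(NEGATIVE.get(w, 0.0) for w in words) / len(words)
    return pos, neg, pos - neg  # composite score = positive minus negative

pos, neg, composite = sentiment("I love this great day")
```

The last element returned is the composite score used in the paper: the positive score minus the negative score.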

Prior to analyzing the sentiment of the tweets, we exclude any tweets which include any of the following weather-related words, to avoid picking up statements directly referencing changes in the weather:

blizzard, breeze, chilly, clear, clouds, cloudy, cold, damp, dew, downpour, drizzle, drought, dry, flurry, fog, freezing, frigid, frostbite, frosty, gail, gust, hail, heat, hot, humid, hurricane, icy, lightning, misty, moist, monsoon, muddy, overcast, pouring, precipitation, rain, rainbow, showers, sleet, snowflakes, soggy, sprinkle, sunny, thunder, thunderstorm, typhoon, weather, wet, wind, windstorm, windy

VADER and LIWC both provide separate measures of positive and negative sentiment. We obtain the average positive and negative scores for each measure, and then construct a composite score for each tool: the difference between the positive and negative scores. Sentiment analysis is performed at the core-based statistical area (CBSA) level, rather than the county level, matching previous work on the relationship between temperature and sentiment (8). There is some evidence that CBSA sentiment is more consistent and less noisy than county-level aggregation of sentiment, possibly because some metro areas are split over multiple counties. SI Appendix, Fig. S10 reproduces Figure 3, but using sentiment data aggregated to the county level. Findings for the cold sample are robust to the level of aggregation, but the decline in sentiment at hot temperatures in the hot sample is not reproduced with this different aggregation scheme.

List of words used to identify discussion of weather:

arid, aridity, autumnal, balmy, barometric, blizzard, blizzards, blustering, blustery, breeze, breezes, breezy, celsius, chill, chilled, chillier, chilliest, chilly, cloud, cloudburst, cloudbursts, cloudier, cloudiest, clouds, cloudy, cold, colder, coldest, cooled, cooling, cools, cumulonimbus, cumulus, cyclone, cyclones, damp, damper, dampest, deluge, dew, dews, dewy, downdraft, downdrafts, downpour, downpours, drier, driest, drizzle, drizzled, drizzles, drizzly, drought, droughts, dry, dryline, fahrenheit, flood, flooded, flooding, floods, flurries, flurry, fog, fogbow, fogbows, fogged, fogging, foggy, fogs, forecast, forecasted, forecasting, forecasts, freeze, freezes, freezing, frigid, frost, frostier, frostiest, frosts, frosty, froze, frozen, gale, gales, galoshes, gust, gusting, gusts, gusty, haboob, haboobs, hail, hailed, hailing, hails, haze, hazes, hazy, heat, heated, heating, heats, hoarfrost, hot, hotter, hottest, humid, humidity, hurricane, hurricanes, icy, inclement, landspout, landspouts, lightning, lightnings, macroburst, macrobursts, meteorologic, meteorologist, meteorologists, meteorology, microburst, microbursts, microclimate, microclimates, millibar, millibars, mist, misted, mists, misty, moist, moisture, monsoon, monsoons, mugginess, muggy, nor'easter, nor'easters, noreaster, noreasters, overcast, parched, parching, precipitation, rain, rainboots, rainbow, rainbows, raincoat, raincoats, rained, rainfall, rainier, rainiest, raining, rains, rainy, sandstorm, sandstorms, scorcher, scorching, shower, showering, showers, sleet, slicker, slickers, slush, smog, smoggier, smoggiest, smoggy, snow, snowed, snowier, snowiest, snowing, snowmageddon, snowpocalypse, snows, snowy, sprinkle, sprinkling, squall, squalls, squally, storm, stormed, stormier, stormiest, storming, storms, stormy, stratocumulus, stratus, subtropical, summery, sun, sunnier, sunniest, sunny, temperate, temperature, tempest, thaw, thawed, thawing, thaws, thermometer, thunder, thundering, thunderstorm, thunderstorms, tornadic, tornado, tornadoes, tropical, troposphere, tsunami, turbulent, twister, twisters, typhoon, typhoons, umbrella, umbrellas, vane, warm, warmed, warms, weather, wet, wetter, wettest, wind, windchill, windchills, windier, windiest, windspeed, windy, wintery, wintry
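The bag-of-words rule described in Section 1.2 can be sketched as below. Only a handful of entries from the identification list above are shown, and the tokenizer is a simplification; this is not the authors' classification code.

```python
import re

# Sketch of the bag-of-words rule: a tweet is a 'weather tweet' if it contains
# any word from the identification list (a few entries shown for brevity).
WEATHER_WORDS = {"blizzard", "chilly", "forecast", "heat", "rain", "snow", "windy"}

def is_weather_tweet(text):
    tokens = re.findall(r"[a-z']+", text.lower())  # simple word tokenizer
    return any(tok in WEATHER_WORDS for tok in tokens)
```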

2 Regression Analysis

The general regression specification used in this paper is as follows:

log(Wcwmys) = f(Tcwmys, T̅cwms) + Precipcwmys + Humidcwmys + Cloudcwmys + log(Userscwmys) + δms + θy + ϑc + εcwmys

The dependent variable is the log of the number of weather tweets in county c, in week w, in month m, in year y, in state s. (For clarity, the month and year subscripts are omitted in subsequent equations.) Using logs requires us to drop any county-by-week observations that have no weather tweets. In total this is 55,279 county-weeks, or 12.9% of the initial sample. The remaining sample size is 373,625 county-weeks.
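The construction of the log dependent variable, and the dropping of zero county-weeks it forces, can be sketched on a toy panel (the data frame below is hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical county-week panel: taking logs requires dropping county-weeks
# with zero weather tweets before estimation.
panel = pd.DataFrame({
    "county": ["A", "A", "B", "B"],
    "week":   [1, 2, 1, 2],
    "weather_tweets": [12, 0, 30, 7],
})

estimation = panel[panel["weather_tweets"] > 0].copy()
estimation["log_w"] = np.log(estimation["weather_tweets"])
```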

The number of weather tweets is modeled as a function of maximum temperature (Tcwmys) and an average of temperature in previous years in that county at that time of year (T̅cwms). The exact functional form used varies and is described below. Additional control variables are the average daily precipitation (Precipcwmys), the average relative humidity (Humidcwmys), and the average percent cloud cover (Cloudcwmys). We control for the large differences in the number of Twitter users across counties using the log of the number of users in each county and week (Userscwmys). A slightly different set of weather controls is available for the sentiment regressions. Controls in these regressions, following previous work (8), are linear and quadratic controls for average precipitation and dew point temperature.

In all regressions, a set of fixed effects controls for unobserved variation: state by month-of-year fixed effects (δms) control for any state-specific intra-annual seasonal differences; year fixed effects (θy) control for average differences across years in the sample (2014, 2015, and 2016) related to, for instance, Twitter penetration; and county fixed effects (ϑc) control for all unobserved, time-invariant differences between counties. SI Appendix, Fig. S2 shows graphically how these fixed effects determine the residual variation in temperature used to identify our model coefficients. Residuals (εcwmys) are clustered at the state level, allowing for both spatial correlation between counties in the same state and for correlation within a state over time. Fixed effects and treatment of standard errors are common across all regressions presented in this paper.
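The way fixed effects strip away level and seasonal variation (cf. SI Appendix, Fig. S2) can be sketched by sequential demeaning on synthetic data; this is an illustration of the within-transformation, not the authors' estimation code.

```python
import numpy as np
import pandas as pd

# Demean temperature by county, then by state-by-month-of-year cell, leaving
# residual (unseasonal) temperature variation. Data here are synthetic.
rng = np.random.default_rng(2)
df = pd.DataFrame({
    "county": rng.choice(["A", "B", "C"], 1000),
    "state_month": rng.choice([f"IL-{m}" for m in range(1, 13)], 1000),
    "tmax": rng.normal(15, 10, 1000),
})

df["t1"] = df["tmax"] - df.groupby("county")["tmax"].transform("mean")
df["resid"] = df["t1"] - df.groupby("state_month")["t1"].transform("mean")
```

After the second demeaning, residuals are mean zero within every state-month cell, which is the variation that identifies the temperature coefficients.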

2.1 Dynamic Non-Linear Model

A finite dynamic lag model is used to estimate the timescale on which perceptions of weather events adjust (9, 10). For this specification we focus on anomalies relative to the reference period (i.e. we define the temperature anomaly as Acwy = Tcwy − Bcw) and allow the response to vary flexibly as a function of the magnitude of the current temperature anomaly and the history of previous anomalies experienced in that county at that time of year. A concern here is that this model estimates a common response to a particular temperature anomaly, but SI Appendix, Fig. S3 shows this response is not common across the whole temperature distribution (for example, cold anomalies are more remarkable at cold temperatures but less remarkable at hot temperatures). This heterogeneity arises because the response curves are not symmetric about the reference temperatures. We therefore split the sample into regions in which the same temperature anomaly might reasonably be expected to have a similar effect on social media posts. At cold temperatures, we estimate a response for the coldest quarter of the sample, based on temperatures in the reference period (i.e. places colder than 13.6°C). At hot temperatures, we limit the sample to both hot and humid locations, based on the fact that the same temperature might feel very different physiologically depending on the relative humidity. Our hot and humid sample is locations with reference temperatures in the top quarter of the sample (above 28.3°C) and relative humidity above 80%. These thresholds correspond to county-weeks with a Heat Index of approximately 92°F (33°C), the level at which NOAA recommends "extreme caution". At these physiologically uncomfortable conditions, temperature anomalies are likely to elicit similar social media responses across the subsample (11).
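The Heat Index figure cited above can be checked against the NWS Rothfusz regression, a standard approximation for heat index from temperature and relative humidity. The coefficients below come from NOAA's published formula, not from the paper.

```python
# NWS Rothfusz heat-index regression (inputs: degrees F, % relative humidity).
# Valid for warm, humid conditions like the thresholds discussed in the text.
def heat_index_f(t_f, rh):
    return (-42.379 + 2.04901523 * t_f + 10.14333127 * rh
            - 0.22475541 * t_f * rh - 6.83783e-3 * t_f**2
            - 5.481717e-2 * rh**2 + 1.22874e-3 * t_f**2 * rh
            + 8.5282e-4 * t_f * rh**2 - 1.99e-6 * t_f**2 * rh**2)

t_f = 28.3 * 9 / 5 + 32       # the 28.3 C reference-temperature threshold, in F
hi = heat_index_f(t_f, 80)    # at the 80% relative-humidity threshold
```

Evaluated at the paper's two thresholds, this gives a heat index close to the ~92°F cited in the text.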

Because the response to particular temperature anomalies differs depending on whether the temperature is cool or warm (i.e. the response is not symmetric about the reference for all temperatures; see SI Appendix, Fig. S3a), we split the data in order to estimate separate responses for the coolest and warmest quarters of the sample. The warmest sample is further subset into county-weeks with high humidity (relative humidity > 80%).
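The sample split can be sketched with boolean masks on a synthetic panel (the data below are invented; only the quartile and humidity thresholds come from the text):

```python
import numpy as np
import pandas as pd

# Sketch of the sample split: coldest quarter by reference temperature, and a
# hot-and-humid subset (top quarter of reference temperature, RH above 80%).
rng = np.random.default_rng(3)
df = pd.DataFrame({
    "ref_temp": rng.uniform(-5, 35, 10_000),       # reference-period temps (C)
    "rel_humidity": rng.uniform(30, 100, 10_000),  # weekly relative humidity (%)
})

q25, q75 = df["ref_temp"].quantile([0.25, 0.75])
cold = df[df["ref_temp"] <= q25]                   # coldest quarter
hot_humid = df[(df["ref_temp"] >= q75) & (df["rel_humidity"] > 80)]
```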

The dynamic non-linear model estimates an interaction between two smooth functions, one of the magnitude of the anomaly and one of the lagged history of exposure to anomalies:

log(Wcwy) = Σk f(Ac,w,y−k) · g(k)

where Ac,w,y−k is the temperature anomaly experienced in county c in week-of-year w, k years ago. Values of k range between 0 (i.e. current temperature) and 15 (i.e. temperature 15 years ago). Functions f() and g() are smooth, continuous functions and their interaction allows the effect of a particular temperature anomaly to vary non-linearly and to vary as a function of how long ago it was experienced. Our preferred specification uses a cubic polynomial for f() and a cubic spline with three internal knots for g() (knots at 0, 0.9, 2.3, 5.9, and 15). The former uses 3 degrees of freedom and the latter 5, so the interaction surface estimated in the regression uses 15 degrees of freedom. The decay of the effect of temperature anomalies that we identify in Figure 2, with opposite effects for warm and cold anomalies, is robust to these choices for the cold sample (SI Appendix, Fig. S5). The model is estimated including all controls and fixed effects described above, and standard errors are clustered at the state level. Given the estimated surface, the lag coefficients and standard errors are predicted for integer temperature anomalies in the central 90% of the temperature anomaly distribution (−6 to +6 for the cold sample and −4 to +3 for the hot sample) and for lags between 0 and 15. Lag coefficients for +/− 3 degrees are shown in SI Appendix, Fig. S4. These temperature anomalies are chosen for the plot because they are inside the support of the anomaly distribution for both samples (−3 degrees is the 20th percentile of the cold sample temperature anomalies and +3 degrees is the 95th percentile of the hot sample). Figures 2b and 2d show the cumulative sum of these lag coefficients at a specific temperature anomaly, representing the contemporaneous effect of a temperature anomaly on weather tweets in places with different histories of exposure to that anomaly.
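Building the interaction surface amounts to taking column-wise products of a polynomial basis in the anomaly A and a spline basis in the lag k. The sketch below simplifies the spline to a truncated-power basis with a single internal knot (the paper uses three internal knots and a different knot placement), but the design-matrix construction is the same.

```python
import numpy as np

# Sketch of the f(A) * g(k) interaction surface as a regression design matrix.
def poly_basis(a):                         # f(): A, A^2, A^3  (3 df)
    return np.column_stack([a, a**2, a**3])

def spline_basis(k, knot=5.0):             # g(): cubic truncated-power basis (5 df)
    return np.column_stack([np.ones_like(k), k, k**2, k**3,
                            np.clip(k - knot, 0, None)**3])

anomalies = np.array([-3.0, 0.0, 3.0])     # example anomaly values (degrees C)
lags = np.arange(0, 16, dtype=float)       # k = 0 ... 15 years

A = np.repeat(anomalies, len(lags))        # every (anomaly, lag) pair
K = np.tile(lags, len(anomalies))

# Column-wise products of the two bases give the 3 x 5 = 15 interaction regressors.
F, G = poly_basis(A), spline_basis(K)
X = np.einsum("ni,nj->nij", F, G).reshape(len(A), -1)
```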

For analysis of how sentiment changes in response to continued exposure to temperature anomalies, we estimate a similar dynamic, non-linear model using the sentiment data at the core-based statistical area (CBSA) by week level. The dependent variable is the average composite sentiment score (positive sentiment score minus negative sentiment score) in that CBSA-week, calculated using two different sentiment algorithms. The sample is similarly divided into cold and hot-and-humid subsets. Control variables in the regression are average rainfall, the square of average rainfall, and the dew-point temperature. Fixed effects and treatment of standard errors are the same as in the weather speech regressions. The functional forms for the f() and g() functions describing the instantaneous and lagged effects of temperature anomalies are as described above.

Description of Interactions Model (SI Appendix, Fig. S3; SI Appendix, Table S2):

Our first set of models allows the effect of temperature to differ as a function of reference and recent temperatures, using a set of interaction terms in the estimating equation.

We allow the effect of temperatures to vary non-linearly with the reference (1981-1990) climatology, which accounts for the fact that people's response to weather might be mediated by the kinds of conditions that might be expected in that location at that time of year:

log(Wcwy) = β1·Tcwy + β2·T²cwy + β3·Bcw + β4·B²cw + β5·Tcwy·Bcw + β6·T²cwy·Bcw + β7·Tcwy·B²cw + β8·T²cwy·B²cw

where Bcw is the average temperature of county c in week-of-year w in the reference period and other variables are as defined above. This specification fully interacts both the linear and squared terms of the actual temperature (Tcwy) and the reference temperature (Bcw), allowing both the location of the quadratic minimum (or maximum) and its steepness to vary non-linearly with reference temperature. Results are shown in SI Appendix, Fig. S3a and SI Appendix, Table S2, column 2.

To test the effect of decadal climate trends in mediating the effect of observed temperatures, we add the difference between the reference (1981-1990) and recent (average of the previous 5 years) periods as an explanatory variable to the quadratic model. This specification uses the exogenous variation shown in Figure 1 to test whether counties that have had recent experience of unusually hot or cold temperatures (relative to the reference period) respond to weather differently than counties that have not. Our specification is:

log(Wcwy) = β1·Tcwy + β2·T²cwy + β3·Bcw + β4·B²cw + β5·Tcwy·Bcw + β6·T²cwy·Bcw + β7·Tcwy·B²cw + β8·T²cwy·B²cw + β9·(Rcwy − Bcw) + β10·(Rcwy − Bcw)·Bcw + β11·(Rcwy − Bcw)·Tcwy

where (Rcwy − Bcw) is the difference in county-week temperature between the recent and reference periods.

Results of this model are shown in SI Appendix, Fig. S3b and SI Appendix, Table S2, column 3. All estimated effects are statistically significant in the expected direction. Recent experience of warming (i.e. positive (Rcwy − Bcw)) increases the number of weather tweets at cold temperatures (positive β9) but decreases it at hot temperatures (negative β11), i.e. cold temperatures have become more remarkable and hot temperatures less remarkable, with that effect mediated in the expected direction by reference temperatures (positive β10).
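The sign pattern described above can be verified with a small worked example. Differentiating the change-model specification with respect to (Rcwy − Bcw) gives a marginal effect of recent warming of β9 + β10·B + β11·T; the coefficient values below are the change-model estimates reported in SI Appendix, Table S2, and the example temperatures are illustrative.

```python
# Marginal effect of recent warming (Recent - Reference) on log weather tweets:
# d log(W) / d Change = beta9 + beta10 * B + beta11 * T,
# using the change-model coefficients from SI Appendix, Table S2.
beta9, beta10, beta11 = 1.917e-2, 1.147e-3, -1.670e-3

def warming_effect(ref_temp, tmax):
    return beta9 + beta10 * ref_temp + beta11 * tmax

cold = warming_effect(ref_temp=0.0, tmax=0.0)    # an illustrative cold county-week
hot = warming_effect(ref_temp=28.0, tmax=32.0)   # an illustrative hot county-week
```

Consistent with the text, recent warming raises the number of weather tweets at cold temperatures (`cold` is positive) and lowers it at hot temperatures (`hot` is negative).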

Relationship Between Temperature Anomalies and Belief in Climate Change

Data on county-level fractions of the population answering "yes" or "no" to the question "Do you think that global warming is happening?" were obtained from the Yale Program on Climate Change Communication (12, 13). These opinion data pool survey data collected between 2008 and 2018 and fit a multi-level regression model explaining variation in the responses as a function of demographic, geographic, and political variables. Regression results are combined with county-level data to project opinion at the county level (13).

We model these opinion data as a function of mean temperature anomalies over the period of the surveys (2008-2017), calculated either relative to the empirically-estimated shifting baseline or to the fixed reference-period baseline (1981-1990). Standard errors are clustered at the state level to allow within-state autocorrelation. Results are robust to the inclusion of state fixed effects, which control for all average differences between states. Results are shown in SI Appendix, Table S3. Since the two measures of temperature anomalies are strongly correlated (ρ = 0.70), identification in models that include both measures comes from places that have recently experienced different climate trends compared to the average since the reference period (i.e. places that have either recently warmed faster than average, or places that recently cooled but overall have warmed since the 1980s).

Supplementary Information Appendix: Figures and Tables

Figure S1: Number of tweets per week geolocated in the continental United States.

Hot Anomalies                          Manual Classification
                                       Weather      Not Weather
Automated       Weather                    806              694
Classification  Not Weather                  6             1494

Cold Anomalies                         Manual Classification
                                       Weather      Not Weather
Automated       Weather                    811              689
Classification  Not Weather                  5             1495

Table S1: Results of the manual validation of 6,000 tweet classifications. Tweets were randomly selected from county-weeks with unusually hot and cold temperatures after controlling for all regression controls as well as reference temperature. Each tweet was classified by three different people and the modal classification was used in the validation. Additional information is given in the Supplementary Methods above.

Figure S2: Graphical depiction of the residual variation in temperature used in the regression model, for Cook County, IL. Raw temperature values are shown in grey. County fixed effects remove the mean for each county over the period of Twitter data, centering the temperatures around zero (green line). State by month-of-year fixed effects remove the seasonality for the state. This residual variation (purple line), interacted with average temperatures in the reference and recent time periods, is used to identify model coefficients.

Figure S3: Effect of reference temperatures and recent changes on the number of weather tweets. a) Effect of average weekly daytime temperature on tweeting for three different reference temperatures corresponding to the 25th, 50th, and 75th percentiles of the sample. Curves are shown for +/− 8 degrees from reference, which includes >97.5% of the observed weekly average temperature anomalies in our sample. Because the dependent variable is logged, movement along the y-axis can be interpreted as the percent change in the number of weather posts. The histogram shows the distribution of reference temperatures in the sample. b) Percent change in the number of weather tweets in response to contemporaneous temperatures (x-axis), shown for a location that has warmed from 22°C to 25°C (+3°C) between the reference (1981-1990) and recent (last 5 years) time periods. Dashed lines show the 95% confidence interval. Regression coefficients are given in SI Appendix, Table S2.

Page 14: Supplementary Information Appendix: Supplementary Methods · 2019. 2. 22. · 3 1 weather tweets and those we classified as not about weather. This sample was classified 2 manually

14

Naïve Model Informed Model Change Model

Tmax -2.952e-2*** (4.000e-3)

-1.800e-2*** (2.692e-3)

-1.808e-2*** (2.676e-3)

Tmax2 6.941e-4*** (7.991e-5)

7.702e-4** (2.431e-4)

8.663e-4*** (2.506e-4)

Reference -4.707e-3 (4.054e-3)

4.424e-3 (4.930e-3)

Reference2 1.108e-3** (3.989e-4)

1.064e-3* (4.172e-4)

Tmax * Reference -3.496e-3*** (5.268e-4)

-3.839e-3*** (5.415e-04)

Tmax2 * Reference 7.912e-5*** (1.073e-5)

8.124e-5*** (1.054e-5)

Tmax * Reference2 5.065e-5* (2.138e-5)

5.360e-5* (2.112e-5)

Tmax2 * Reference2 -1.887e-6*** (4.011e-7)

-1.915e-6*** (3.974-07)

Change (Recent – Reference)

1.917e-2*** (3.967e-3)

Tmax * Change -1.670e-3*** (4.590e-4)

Reference * Change 1.147e-3* (4.977e-4)

Adjusted R2 (projected

model): 0.291 0.294 0.294

F-Test of Informed vs Naïve Model:

19.23*** (4, 48 dof)

F-Test of Change vs Informed Model:

26.32*** (3, 48 dof)

Table S2: Regression results comparing a naïve model excluding the effect of reference temperatures, an informed model that allows the response to temperature to differ depending on reference temperatures for that county at that time of year, and a change model that adds the change in temperature between the reference and recent periods as an explanatory variable. The dependent variable is the logged number of weather tweets. All specifications include controls for mean precipitation, relative humidity, percent cloud cover, and the (logged) number of Twitter users, as well as county, state by month-of-year, and year fixed effects. Standard errors are clustered at the state level. Significance codes: *p<0.05; **p<0.01; ***p<0.001.
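The informed-model surface in Table S2 implies a marginal effect of contemporaneous temperature that depends on the county's reference temperature. A minimal sketch (Python) of that derivative, using the point estimates copied from the Informed Model column; the evaluation point (a location warmed to 25°C from a 22°C reference, matching the example in the figure caption above) is illustrative only:

```python
# Informed-model point estimates from Table S2 (standard errors omitted).
B = {
    "T":    -1.800e-2,  # Tmax
    "T2":    7.702e-4,  # Tmax^2
    "TR":   -3.496e-3,  # Tmax * Reference
    "T2R":   7.912e-5,  # Tmax^2 * Reference
    "TR2":   5.065e-5,  # Tmax * Reference^2
    "T2R2": -1.887e-6,  # Tmax^2 * Reference^2
}

def marginal_effect(t, r, b=B):
    """d(log weather tweets)/dTmax at temperature t and reference r.

    The Reference and Reference^2 main-effect terms drop out of the
    derivative with respect to Tmax.
    """
    return (b["T"] + 2 * b["T2"] * t
            + b["TR"] * r + 2 * b["T2R"] * t * r
            + b["TR2"] * r ** 2 + 2 * b["T2R2"] * t * r ** 2)

# Example location: warmed from a 22 C reference to 25 C today.
print(round(marginal_effect(25.0, 22.0), 5))
```

At this point the marginal effect is small and positive, consistent with the flattened response at recent temperatures shown in the figure.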


Figure S4: Lagged effects of temperature anomalies estimated using the finite dynamic lag model. Lagged effects are shown for anomalies whose contemporaneous effects are significantly different from zero (Figure 2a and 2c): a) the effect of +3 degree anomalies in the cold sample, b) the effect of -3 degree anomalies in the cold sample, and c) the effect of +3 degree anomalies in the hot sample. Figures 2b and 2d show the cumulative sums of the coefficients shown in b and c, respectively. The identified learning periods are those where the estimated lag coefficients are statistically significant and in the opposite direction from the contemporaneous effect.
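The selection rule in the caption — a lag belongs to the learning period when its coefficient is statistically significant and opposite in sign to the contemporaneous effect — can be sketched as a small helper. The coefficient and standard-error values below are made up for illustration; they are not the estimates behind Figure S4:

```python
def learning_lags(contemp_effect, lag_coefs, lag_ses, z=1.96):
    """Return 1-based lag indices whose coefficients are significant at
    roughly the 5% level and opposite in sign to the contemporaneous effect."""
    out = []
    for k, (b, se) in enumerate(zip(lag_coefs, lag_ses), start=1):
        significant = abs(b) > z * se
        opposite = (b > 0) != (contemp_effect > 0)
        if significant and opposite:
            out.append(k)
    return out

# Illustrative numbers only: a negative contemporaneous effect followed by
# positive lag coefficients, of which the first two are significant.
print(learning_lags(-0.02, [0.010, 0.006, 0.001], [0.002, 0.002, 0.002]))  # → [1, 2]
```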


Figure S5: Robustness to alternative functional-form specifications. Solid lines and error bars are the same as those shown in Figures 2c and 2d. Dashed lines show alternative specifications of the dynamic lag model: a quadratic and a quartic effect of temperature anomalies (instead of a cubic), and changing the number of knots in the spline modeling lagged effects to 2 and 4 (rather than 3).

Figure S6: As Figures 2c and 2d but using all county-weeks in the top quarter of baseline temperatures, not restricting to only those with high humidity.


Figure S7: Weights in the learning model based on coefficient values during the learning period, defined using the estimated lag coefficients shown in Figure S4. Error bars show the 95% confidence interval calculated using the delta method.
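The delta-method intervals referenced in the caption propagate the covariance of estimated coefficients through a nonlinear function of them (here, the learning-model weights). A generic sketch using a numerical gradient; the ratio function and covariance matrix below are placeholders, not the fitted model:

```python
import math

def delta_method_se(g, beta, cov, eps=1e-6):
    """Approximate standard error of g(beta) via the delta method:
    var(g) ≈ grad(g)' Σ grad(g), with the gradient taken numerically."""
    n = len(beta)
    grad = []
    for i in range(n):
        bp = list(beta); bp[i] += eps
        bm = list(beta); bm[i] -= eps
        grad.append((g(bp) - g(bm)) / (2 * eps))
    var = sum(grad[i] * cov[i][j] * grad[j]
              for i in range(n) for j in range(n))
    return math.sqrt(var)

# Placeholder example: a weight defined as a ratio of two coefficients.
beta = [0.6, 0.4]
cov = [[0.010, 0.002],
       [0.002, 0.008]]
w = lambda b: b[0] / (b[0] + b[1])
se = delta_method_se(w, beta, cov)
print(round(w(beta), 3), round(se, 4))
```

A 95% interval then follows as w ± 1.96 × se, which is how the error bars in the figure would be constructed.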


Figure S8: County-level variation in the fraction of the population answering yes (left) and no (right) to the question “Do you think that global warming is happening?” Note that percentages do not sum to 100 because “Don’t know” is also an option. Data are from the Yale Program on Climate Change Communication (12, 13).


Figure S9: County-level variation in the mean temperature anomaly 2008-2017, calculated relative to the empirically estimated shifting baseline (left) or the fixed 1981-1990 reference baseline (right). These two anomalies are used to explain the pattern of variation in the fraction of the population that does or does not agree that climate change is happening (Figure S8, Table S3).
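The two anomaly definitions compared here differ only in the baseline each temperature is measured against. A schematic version (Python); uniform weights over a hypothetical prior-year window stand in for the empirically estimated learning weights, so the numbers are illustrative, not the paper's estimates:

```python
def fixed_anomaly(temps_by_year, year, ref_years=range(1981, 1991)):
    """Anomaly relative to the fixed 1981-1990 reference-period mean."""
    ref = [temps_by_year[y] for y in ref_years]
    return temps_by_year[year] - sum(ref) / len(ref)

def shifting_anomaly(temps_by_year, year, window=range(2, 9)):
    """Anomaly relative to a moving baseline of recently experienced years.

    Uniform weights over years t-2 .. t-8 stand in for the estimated
    learning weights (Figure S7).
    """
    base = [temps_by_year[year - k] for k in window]
    return temps_by_year[year] - sum(base) / len(base)

# Toy series: steady warming of 0.05 C per year starting from 15 C in 1981.
temps = {y: 15.0 + 0.05 * (y - 1981) for y in range(1981, 2018)}
print(fixed_anomaly(temps, 2017), shifting_anomaly(temps, 2017))
```

Under steady warming the fixed-baseline anomaly keeps growing while the shifting-baseline anomaly stays small, which is the contrast the two maps in the figure illustrate.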


Table S3: Coefficients from regression models relating county-level stated belief (upper panel) or disbelief (lower panel) in global warming to temperature anomalies over the 2008-2017 period, calculated using either the shifting baseline estimated in this paper or the fixed reference period (1981-1990).


Figure S10: As Figure 3 in the main text, but estimated using county-level aggregation of sentiment rather than CBSA-level aggregation.

References

1. PRISM Climate Group (2018) Northwest Alliance for Computational Science and Engineering (Oregon State University, Corvallis, OR).

2. Kanamitsu M, et al. (2002) NCEP-DOE AMIP-II Reanalysis (R2). Bull Am Meteorol Soc 83(11):1631–1643.

3. Kay JE, et al. (2015) The Community Earth System Model (CESM) Large Ensemble Project: A Community Resource for Studying Climate Change in the Presence of Internal Climate Variability. Bull Am Meteorol Soc 96:1333–1349.

4. Center for International Earth Science Information Network (CIESIN), Columbia University, and CIAT. Gridded Population of the World, Version 3: Population Density Grid (Palisades, NY). Available at: http://sedac.ciesin.columbia.edu/gpw.

5. Graham M, Hale SA, Gaffney D (2014) Where in the World Are You? Geolocation and Language Identification in Twitter. Prof Geogr 66(4):568–578.

6. Hutto CJ, Gilbert EE (2014) VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text. Eighth International Conference on Weblogs and Social Media (Ann Arbor, MI).

7. Tausczik YR, Pennebaker JW (2010) The Psychological Meaning of Words: LIWC and Computerized Text Analysis Methods. J Lang Soc Psychol 29(1):24–54.

8. Baylis P, et al. (2018) Weather impacts expressed sentiment. PLoS One 13(4):e0195750.

9. Gasparrini A (2011) Distributed Lag Linear and Non-Linear Models in R: The Package dlnm. J Stat Softw 43(8):1–20.

10. Almon S (1965) The Distributed Lag Between Capital Appropriations and Expenditures. Econometrica 33:178–196.


11. NOAA (2018) Heat Index. Available at: www.weather.gov/safety/heat-index [Accessed November 16, 2018].

12. Marlon J, Howe PD, Mildenberger M, Leiserowitz AA, Xianran W (2018) Yale Climate Opinion Maps 2018. Available at: http://climatecommunication.yale.edu/visualizations-data/ycom-us-2018/ [Accessed November 19, 2018].

13. Howe PD, Mildenberger M, Marlon J, Leiserowitz AA (2015) Geographic variation in climate change public opinion in the U.S. Nat Clim Chang 5:596–603.