

DISASTER ANALYTICS: DISASTER PREPAREDNESS AND MANAGEMENT THROUGH ONLINE SOCIAL MEDIA

Final Report

SOUTHEASTERN TRANSPORTATION CENTER

Samiul Hasan, Ph.D.

Kamol Roy

DECEMBER 2018

US DEPARTMENT OF TRANSPORTATION GRANT DTRT13-G-UTC34


DISCLAIMER

The contents of this report reflect the views of the authors, who are responsible for the facts and the accuracy of the information presented herein. This document is disseminated under the sponsorship of the Department of Transportation, University Transportation Centers Program, in the interest of information exchange. The U.S. Government assumes no liability for the contents or use thereof.


Report No. Government Accession No. Recipient’s Catalog No.

Title and Subtitle DISASTER ANALYTICS: DISASTER PREPAREDNESS AND MANAGEMENT THROUGH ONLINE SOCIAL MEDIA

Report Date December 2018

Source Organization Code Budget

Author(s)

Hasan, Dr. Samiul; Roy, Kamol

Source Organization Report No. leave blank

Performing Organization Name and Address Work Unit No. (TRAIS)

Southeastern Transportation Center

UT Center for Transportation Research

309 Conference Center Building

Knoxville TN 37996-4133

Contract or Grant No. DTRT13-G-UTC34

Sponsoring Agency Name and Address

US Department of Transportation

Office of the Secretary of Transportation–Research

1200 New Jersey Avenue, SE

Washington, DC 20590

Type of Report and Period Covered Final Report: Jan 2017 – Dec 2018

Sponsoring Agency Code USDOT/OST-R/STC

Supplementary Notes:

Abstract

Due to their ubiquitous use, social media platforms offer a unique opportunity to observe, model, and predict human behavior during a disaster. Moreover, social media usage on GPS-enabled smartphones allows us to collect human movement data, which can help us understand mobility during a disaster. This study explores the opportunity of using social media data to understand and manage disasters more dynamically. In this project, we (i) assess the effectiveness of social media communication during disasters and identify the contributing factors leading to effective crisis communication strategies; (ii) develop a method to infer individual evacuation behaviors (e.g., evacuation decision, timing, destination) from social media data; (iii) develop a method to detect extreme events from geo-located movement data and to measure mobility resilience and the loss of resilience due to those events; and (iv) develop an evacuation demand prediction model using social media and traffic data. Our findings show that (i) besides a higher number of followers, a predictable activity pattern, a high bot score for organizational users, and a low bot score for personal users can contribute to gaining attention; (ii) a hidden Markov model built from geo-tagged posts and text data can capture the dynamics of hurricane evacuation decisions; (iii) mobility resilience and the loss of resilience due to a disaster can be measured from geo-tagged social media posts; and (iv) a deep learning model trained on social media and traffic data can predict evacuation traffic demand well up to 24 hours ahead.


Key Words Crisis communication, evacuation, social media, Twitter, hurricane Sandy, disaster management, Hurricane Irma, Florida, human mobility, resilience, geo-location data, machine learning.

Distribution Statement

Unrestricted; Document is available to the public through the National Technical Information Service; Springfield, VA.

Security Classif. (of this report)

Unclassified

Security Classif. (of this page)

Unclassified

No. of Pages 64

Price

Form DOT F 1700.7 (8-72) Reproduction of completed page authorized


Table of Contents

EXECUTIVE SUMMARY ................................................................................................................................. 7

1 CHAPTER 1: UNDERSTANDING CRISIS COMMUNICATION .................................................... 8

1.1 Problem Description ..................................................................................................................... 8

1.2 Data .............................................................................................................................................. 8

1.3 Methods ........................................................................................................................................ 9

1.3.1 Activity, Attention, and Efficiency Metrics in Twitter ......................................................... 9

1.3.2 Extraction of User Features .................................................................................................. 9

1.3.3 Contributing Features to Efficiency .................................................................................... 10

1.4 Results ........................................................................................................................................ 10

1.4.1 Distributions of User Features ............................................................................................ 10

1.4.2 Correlations between Activity and Attention ..................................................................... 12

1.4.3 User Efficiency Analysis .................................................................................................... 12

1.4.4 User Attributes Contributing to Efficiency ......................................................................... 17

1.5 Conclusions ................................................................................................................................ 22

2 CHAPTER 2: MODELING HURRICANE EVACUATION DYNAMICS ......................................................... 23

2.1 Problem Description ................................................................................................................... 23

2.2 Data Preprocessing and Description ........................................................................................... 23

2.3 Preparing Evacuation Data ......................................................................................................... 24

2.4 Data Exploration ......................................................................................................................... 24

2.5 Methodology .............................................................................................................................. 26

2.6 Model Development ................................................................................................................... 28

2.7 Inferring Evacuation Context from a Tweet ............................................................................... 28

2.8 Model Estimation ....................................................................................................................... 29

2.9 Results ........................................................................................................................................ 30

2.9.1 HMM .................................................................................................................................. 30

2.9.2 IO-HMM ............................................................................................................................ 32

2.10 Conclusions ................................................................................................................................ 34

3 CHAPTER 3: QUANTIFYING HUMAN MOBILITY RESILIENCE ............................................................... 35

3.1 Problem Description ................................................................................................................... 35

3.2 Data and Methodology ............................................................................................................... 35

3.2.1 Extracting Location Time Series of a User ......................................................................... 37


3.2.2 Displacement Metric .......................................................................................................... 37

3.2.3 Extraction of Typical and Actual Displacements Time Series ........................................... 37

3.2.4 Extreme Event Detection .................................................................................................... 38

3.2.5 Resilience Calculation ........................................................................................................ 38

3.3 Results ........................................................................................................................................ 39

3.4 Conclusion .................................................................................................................................. 43

4 CHAPTER 4: PREDICTING EVACUATION TRAFFIC DEMAND ..................................................... 44

4.1 Problem Description ................................................................................................................... 44

4.2 Study Area and Data Description ............................................................................................... 45

4.3 Data Preparation ......................................................................................................................... 46

4.4 Methodology .............................................................................................................................. 49

4.5 Model Development ................................................................................................................... 50

4.6 Results ........................................................................................................................................ 52

4.7 Conclusions ................................................................................................................................ 59

5 CONCLUSIONS AND RECOMMENDATIONS ............................................................................. 60

REFERENCES ............................................................................................................................................... 61

APPENDIX: LIST OF PUBLICATIONS ............................................................................................................. 64


EXECUTIVE SUMMARY

Natural disasters of increasing severity are occurring more frequently nowadays, with recent examples including hurricanes Irma, Matthew, Harvey, Michael, and Florence. Each disaster causes significant economic losses, fatalities, and infrastructure damage. Disaster management is critical to minimizing the impacts of a disaster. It involves many interconnected processes that new technologies can make more efficient and faster. For example, social media platforms such as Facebook, Twitter, and Instagram provide valuable information and insights about the unfolding situation during a disaster. Besides the availability of user-generated content, social media usage on GPS-enabled smartphones allows us to study human mobility and location/activity choices. The objective of this project is to develop data analytics approaches for social media (e.g., Twitter) data to improve disaster management strategies. This project contains four inter-connected investigations focusing on several aspects of disaster management:

• Chapter 1 focuses on crisis communication during a disaster. About 52.5 million tweets related to Hurricane Sandy are analyzed to assess the effectiveness of social media communication during disasters and identify the contributing factors leading to effective crisis communication strategies. Our results indicate that during a disaster event, only a few social media users become highly effective in gaining attention. In addition, effectiveness does not depend on the frequency of tweeting activity alone; it also depends on the number of followers and friends, user category, bot score, and activity patterns.

• In Chapter 2, we present a method to infer individual evacuation behaviors (e.g., evacuation decision, timing, destination) from social media data. We develop an input-output hidden Markov model (IO-HMM) to infer dynamic evacuation decisions from user tweets during a hurricane. We found that the average evacuation distance is around 950 kilometers, and that users whose home locations are under a voluntary or mandatory evacuation order are likely to travel farther during evacuation. We also found that users from locations under a mandatory evacuation order return to their homes earlier than users from locations under a voluntary or no evacuation order. The proposed IO-HMM method can be useful for inferring evacuation behavior in real time from social media data.

• In Chapter 3, to quantify the impacts of an extreme event on human movements, we introduce the concept of mobility resilience, defined as the ability of a mobility infrastructure system to manage shocks and return to a steady state in response to an extreme event. We present a method to detect extreme events from geo-located movement data and to measure mobility resilience and the loss of resilience due to those events. Applying this method to geo-located social media data, we quantify resilience for multiple types of disasters that occurred around the world. Quantifying mobility resilience may help us assess the higher-order socio-economic impacts of extreme events and guide policies towards developing resilient infrastructures.

• In Chapter 4, we use traffic and Twitter data collected during hurricanes Matthew and Irma to predict evacuation traffic demand over a longer forecasting horizon (≥ 1 hour). We develop a Long Short-Term Memory Neural Network (LSTM-NN) and train it using different combinations of input features and forecast horizons. We compare our prediction results against a baseline prediction and existing machine learning models. Results from our study show that the proposed LSTM-NN model can predict evacuation traffic demand reasonably well up to 24 hours ahead.

This report summarizes the major findings on the application of social media data to improve disaster management by making crisis communication more efficient, quantifying mobility resilience to make better future policies, and predicting evacuation decision choices and traffic demand in real time.


1 CHAPTER 1: UNDERSTANDING CRISIS COMMUNICATION

1.1 Problem Description

Increased use of social media in disasters requires a better understanding of how effectively information spreads to an affected community. Finding the factors of information spreading is crucial for understanding the dynamics of social media systems, and a better understanding of the underlying factors will provide insights into effective crisis communication strategies. Users who are more efficient in spreading information can play an important role during a crisis, since their activities can draw a significant amount of attention from other users to relevant topics/content. Understanding the interplay among user activities, network properties, and the attention received will help identify the contributing factors to successful crisis communication in emergency situations [1], [2]. Thus, understanding the influence of social media users has significant implications in a disaster management context.

Although the influence of social media users has been studied in many different contexts, the efficiency of information/awareness spreading in a disaster context still needs to be investigated. Previous studies have focused on the popularity of content instead of analyzing the effects of user behaviors on how other users respond to them [3]. Moreover, to the best of our knowledge, few studies have considered user categories and activity patterns while measuring the efficiency of information spreading in the context of disaster management. In social media dynamics, information diffusion creates sudden bursts of connections (e.g., friends or followers) by creating new edges or deleting existing ones [4]. Similarly, during a disaster, information diffusion about situational awareness drives significant changes in the underlying social media connections of friends and followers. Such bursts in new followers may happen due to common interest (textual similarity) [4] or attention [3] to users or information sources. In this study, using Twitter data, we analyze the efficiency of social media users in information spreading in the context of Hurricane Sandy. We investigate user activity against the attention gained in the pre-disaster, during-disaster (warning and response phase), and post-disaster periods. The efficiency of a user is defined as the ratio between the attention gained and the number of tweets posted within a period. Twitter data from before, during, and after Hurricane Sandy have been analyzed to understand the factors contributing to the overall efficiency of a user in crisis communication. A model is also proposed to classify efficient users based on their attributes. This method has the potential to identify effective social media users during disasters for rapid communication.

1.2 Data

Hurricane Sandy, a late-season post-tropical cyclone, was the deadliest and most destructive hurricane of the 2012 Atlantic hurricane season. Sandy originated from a tropical wave that, on October 20, was assessed as having a high potential to become a tropical cyclone within 48 hours [5]. The hurricane was first classified and officially named Sandy on October 22 [6]. After leaving a trail of damage over Jamaica, Cuba, and the Bahamas, Sandy made landfall in the United States at 23:30 UTC on 29 October 2012 near Brigantine, New Jersey. Sandy was responsible for 147 direct fatalities and damage in excess of $50 billion, including 650,000 destroyed or damaged buildings [5]. Sandy received extensive coverage in both traditional and social media.

The dataset was collected from the publicly accessible DRYAD repository (doi:10.5061/dryad.15fv2) [6]. Kryvasheyeu et al. [6] collected the dataset through Topsy Labs, an analytics company specializing in Twitter data. The collected data contain tweets posted between October 15, 2012 and November 12, 2012, covering the period from before the formation of the hurricane to after its landfall in the United States. The dataset contains user id, timestamp, tweet text, tweet id, user followers count, user friends count, sentiment scores, and locations. In total, there were 52,493,130 tweets from 13,745,659 unique Twitter users.

1.3 Methods

1.3.1 Activity, Attention, and Efficiency Metrics in Twitter

We define attention as the number of new followers received and activity as the number of tweets posted or new followees added. For this study, tweet frequency is selected as the activity metric since it is the most frequent activity among all types of users, whereas followee addition is very low or zero for some organizational and personal users. A well-connected user (high initial follower count in our study) has a large audience, so a tweet posted by such a user can reach many users. But the existing followers may not be the targeted users during an emergency and thus may not create a sudden burst in new connections (friends, followers, etc.), even though information diffusion has been shown to create such bursts [4]. For that reason, we use new follower gain as attention and existing followers as one of the explanatory factors. To measure the performance of users in gaining attention, we use a metric called efficiency. As shown in Equation (1.1), the efficiency \eta_u(t_1, t_2) of a user u for the time frame t_1 to t_2 is defined as the ratio between the total attention received and the total activity performed within that time frame.

\eta_u(t_1, t_2) = \frac{\sum_{k=t_1}^{t_2} att_k(u)}{\sum_{k=t_1}^{t_2} act_k(u)}   (1.1)

where att_k(u) and act_k(u) represent the attention gained and the activities posted by user u in time period k, respectively.

Although equations similar to (1.1) are commonly used in fields like physics and economics, Vaca et al. [3] introduced this metric in a social media setting. Unlike in most fields, where efficiency is bounded above by 1, here it can take any value. Higher efficiency values indicate better engagement and higher influence in social media communication.
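As a concrete illustration, Equation (1.1) can be computed directly from per-period counts. The function and counts below are a minimal sketch with hypothetical data, not values from the Sandy dataset:

```python
def efficiency(attention, activity, t1, t2):
    """Efficiency per Eq. (1.1): total attention gained divided by total
    activity performed over the inclusive window [t1, t2]."""
    att = sum(attention[k] for k in range(t1, t2 + 1))
    act = sum(activity[k] for k in range(t1, t2 + 1))
    return att / act if act else None  # undefined when the user posted nothing

# Hypothetical user: daily new-follower counts and daily tweet counts.
new_followers = [2, 10, 4, 0, 1]
tweets = [1, 5, 2, 1, 1]
print(efficiency(new_followers, tweets, 0, 4))  # 17 followers / 10 tweets = 1.7
```

Note that, consistent with the text, the value is not bounded by 1: a user gaining many followers from few tweets scores well above it.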

1.3.2 Extraction of User Features

From the raw data, for each unique user, the activity frequency, initial follower count, initial followee count, total followers received, total followees added, and efficiency metrics were computed for a selected time interval. To measure the regularity of a user's activity, the approximate entropy of activities was estimated. Approximate entropy is a statistical parameter that quantifies the predictability or regularity of time series data. A repetitive pattern of fluctuation in a time series makes it more predictable than a time series without such patterns. Approximate entropy measures the likelihood that similar patterns of observations will not be found in subsequent observations. Thus, a higher value of approximate entropy implies less regularity, and a smaller value indicates strong regularity [7]. Approximate entropy has been used in many fields, such as medical data [8], finance [9], psychology [10], and complex system analysis [11]. Daily activity frequencies of a user over the whole analysis period were used as input. Approximate entropy was chosen because it has low computational demand, is applicable to small samples (fewer than 50 points), and can be applied in real time. The detailed procedure for computing approximate entropy can be found in [8]. Furthermore, since a significant number of accounts are operated autonomously (bots) [12], we collected bot scores using the truthy botornot-python API to evaluate whether a user account is controlled by a human or a machine [13].
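The regularity computation can be sketched as follows. This is a plain implementation of Pincus' approximate entropy, not the exact code used in the study, and the tolerance convention (r = 0.2 × standard deviation) is an assumption:

```python
import math

def approx_entropy(series, m=2, r=None):
    """Approximate entropy ApEn(m, r): lower values indicate a more regular,
    predictable activity pattern; higher values indicate irregularity."""
    n = len(series)
    if r is None:
        mean = sum(series) / n
        r = 0.2 * math.sqrt(sum((x - mean) ** 2 for x in series) / n)

    def phi(m):
        # Average log-fraction of length-m templates within tolerance r of each template
        templates = [series[i:i + m] for i in range(n - m + 1)]
        logs = []
        for t in templates:
            c = sum(1 for u in templates
                    if max(abs(a - b) for a, b in zip(t, u)) <= r)
            logs.append(math.log(c / len(templates)))
        return sum(logs) / len(templates)

    return phi(m) - phi(m + 1)

regular = [1, 2] * 6                              # strictly periodic daily activity
irregular = [1, 2, 2, 1, 2, 1, 1, 2, 2, 1, 2, 2]  # erratic daily activity
print(approx_entropy(regular) < approx_entropy(irregular))  # True
```

The strictly alternating series scores near zero (highly predictable), while the erratic series scores noticeably higher, matching the interpretation in the text.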


1.3.3 Contributing Features to Efficiency

To find the linear relationships between the extracted variables and the outcome variable, univariate and multivariate linear regressions were fitted. The general forms of these models are shown in Equations (1.2)–(1.4).

Y = \theta_0 + \theta_1 X + \varepsilon   (1.2)
Y = \theta_0 + \theta_1 X + \theta_2 X^2 + \theta_3 X^3 + \varepsilon   (1.3)
Y = \theta_0 + \theta_1 X_1 + \theta_2 X_2 + \theta_3 X_3 + \cdots + \theta_k X_k + \varepsilon   (1.4)

Here Y is efficiency, treated as the dependent variable; X, X_1, X_2, etc. are the independent variables affecting efficiency; \theta_0 is a constant term; and \theta_1, \theta_2, \theta_3 are the coefficients of the corresponding variables. While a univariate linear regression describes the relationship of each independent variable with the dependent variable, a multiple linear regression model reveals the combined effect of the explanatory variables. The independent variables in the best models are considered the most influential in determining efficiency. The best model is selected based on the adjusted R-squared value.
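A closed-form fit of the univariate model (1.2), together with the adjusted R-squared used for model selection, can be sketched as follows. The data points are illustrative, not drawn from the study:

```python
def ols_univariate(x, y):
    """Least-squares fit of Y = theta0 + theta1*X + error (Eq. 1.2),
    returning the coefficients and the adjusted R-squared."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    theta1 = sxy / sxx
    theta0 = my - theta1 * mx
    ss_res = sum((yi - theta0 - theta1 * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)  # one predictor: k = 1
    return theta0, theta1, adj_r2

# Illustrative toy data: efficiency (y) against a single user feature (x).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.9, 2.1, 2.9, 4.2, 5.0]
theta0, theta1, adj_r2 = ols_univariate(x, y)
print(round(theta1, 2), round(adj_r2, 3))
```

The adjusted R-squared penalizes extra predictors, which is why it, rather than raw R-squared, is the sensible criterion when comparing models (1.2)–(1.4) with different numbers of terms.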

Users are categorized based on their aggregate efficiency over the whole period. Besides understanding the effect of the predictor variables on continuous changes in efficiency, we estimate an ordered logit model to understand the effect of the extracted features in predicting a user's efficiency category. An ordered logit (proportional odds) model was chosen because the outcome variable is ordered from low to high efficiency. The model outputs the probability (or odds) of an outcome falling in each efficiency category. The basic equation [14]–[16] for interpreting this model is given in Equation (1.5).

\log\left(\frac{p_i}{1 - p_i}\right) = a_i + b_1 x_1 + b_2 x_2 + b_3 x_3 + \cdots + b_k x_k   (1.5)

where p_i is the probability of an outcome \le i, and a_i is the intercept for outcome \le i.

b_1, b_2, etc. are the coefficients, and x_1, x_2, x_3 are the explanatory independent variables. The best model is selected based on its AIC value.
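Given fitted intercepts and coefficients, Equation (1.5) converts directly into category probabilities. The sketch below uses made-up parameter values (two features, three efficiency categories), not estimates from the study:

```python
import math

def ordered_logit_probs(x, cutpoints, betas):
    """Category probabilities from the proportional-odds model of Eq. (1.5):
    log(p_i / (1 - p_i)) = a_i + b1*x1 + ..., where p_i = P(outcome <= i)."""
    z = sum(b * xi for b, xi in zip(betas, x))
    cum = [1.0 / (1.0 + math.exp(-(a + z))) for a in cutpoints]  # P(outcome <= i)
    cum.append(1.0)  # P(outcome <= highest category) = 1
    return [cum[0]] + [cum[i] - cum[i - 1] for i in range(1, len(cum))]

# Hypothetical parameters: intercepts a_1 < a_2 give three ordered categories
# (low / medium / high efficiency).
probs = ordered_logit_probs(x=[0.5, -1.0], cutpoints=[-0.5, 1.5], betas=[1.2, 0.8])
print([round(p, 3) for p in probs])  # three probabilities summing to 1
```

The shared coefficients across categories are exactly the proportional-odds assumption: only the intercepts a_i differ between cumulative splits.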

1.4 Results

1.4.1 Distributions of User Features

This section describes the features collected for each user. Figure 1.1 shows the distributions of activity, followers, followees, bot score, and activity entropy found in the data. Both axes are plotted in log scale for (a), (b), and (c). The Y axis shows the complementary cumulative distribution function (CCDF), which represents the probability that a value exceeds the corresponding value on the X axis. Both axes are plotted in linear scale for (d) and (e).


Figure 1.1: (a) Activity, (b) Follower, (c) Followee Distributions (d) Bot Score and (e) Activity Entropy.

Here the follower and followee counts are those observed when a user first appeared in the data set. The activity distribution is based on the total number of tweets observed during the whole period in our dataset. The bot score and activity entropy distributions are plotted using values from 646,563 users. A bot score represents the likelihood of an account being a bot: the higher the value, the higher the likelihood, and extreme values (close to 0 or 1) indicate higher confidence in the classification. In our study, we used 0.5 as the threshold to separate bot-like behavior.
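The CCDF curves in Figure 1.1(a)–(c) can be reproduced from raw counts in a few lines; this is a generic sketch with hypothetical follower counts:

```python
def ccdf(values):
    """Empirical CCDF: for each distinct value x, the fraction of
    observations strictly greater than x."""
    n = len(values)
    return [(x, sum(1 for v in values if v > x) / n) for x in sorted(set(values))]

# Hypothetical follower counts for a handful of users.
followers = [1, 1, 2, 3, 5, 5, 8, 13, 100]
for x, p in ccdf(followers):
    print(x, p)
```

Plotted on log-log axes, the heavy tail of such a distribution appears approximately linear, which is what motivates the power-law-family fits discussed next.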


The empirical distributions of activity, initial followers, and followees were best fit by a truncated power law among the candidate distributions, as shown in Figure 1.1(a), (b), and (c). For bot score and activity entropy (Figure 1.1(d) and (e)), the empirical distributions are best fit by a log-normal distribution. Log-likelihood ratio tests were used to compare the goodness of fit of the fitted power-law, log-normal, and truncated power-law distributions.
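The model comparison can be illustrated with a simplified log-likelihood ratio between two candidate distributions fitted by maximum likelihood. The sketch below compares a log-normal against an exponential fit on a made-up heavy-tailed sample; the study itself compared power-law, log-normal, and truncated power-law fits, and a full treatment would also test the ratio's significance:

```python
import math

def loglik_exponential(data):
    """Log-likelihood of data under an exponential fit (MLE rate = 1/mean)."""
    lam = len(data) / sum(data)
    return sum(math.log(lam) - lam * x for x in data)

def loglik_lognormal(data):
    """Log-likelihood of data under a log-normal fit (MLE of log-moments)."""
    logs = [math.log(x) for x in data]
    mu = sum(logs) / len(logs)
    var = sum((l - mu) ** 2 for l in logs) / len(logs)
    return sum(-math.log(x) - 0.5 * math.log(2 * math.pi * var)
               - (math.log(x) - mu) ** 2 / (2 * var) for x in data)

# Illustrative heavy-tailed sample: a positive ratio means the log-normal
# explains the data better than the exponential.
data = [1, 1, 2, 2, 3, 4, 6, 10, 25, 120]
ratio = loglik_lognormal(data) - loglik_exponential(data)
print(ratio > 0)  # True
```

In practice this comparison is typically done with a dedicated library such as the `powerlaw` Python package, which also supports truncated power laws and significance testing of the ratio.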

1.4.2 Correlations between Activity and Attention

Activity frequency and followees added both play roles in gaining attention. Attention gains may also vary over time and context. Figure 1.2 shows attention gains for different ranges of activity in the pre-, during-, and post-disaster periods. Although the durations of the pre-disaster (9 days), during-disaster (10 days), and post-disaster (10 days) periods are almost the same, the number of followers received is highest during the disaster period (compare the maximum values of the z scales in Figure 1.2).

Figure 1.2: Correlations between Activity and Attention in the (a) Pre-Disaster Period (Oct 14, 2012 to Oct 22, 2012), (b) During-Disaster Period (Oct 23, 2012 to Nov 1, 2012), and (c) Post-Disaster Period (Nov 2, 2012 to Nov 11, 2012). The X-axis and Y-axis represent the range of the number of tweets and the number of followees, respectively. The Z-axis shows, as color intensity, the average number of followers received by the users falling in the corresponding x and y bin. The Z-axis values are shown in log scale.

In general, we do not observe any particular trend between activity frequency and attention gained in the pre- and post-disaster periods. During the disaster, by contrast, higher activity tends to help gain higher attention. Very high activity frequencies (activity > 700) in the pre- and post-disaster periods are not necessarily associated with a high number of followers received. During the pre- and post-disaster periods, users with activity frequency less than 700 received the highest number of followers (black rectangles in Figure 1.2), whereas during the disaster, the highest number of followers was gained by a user with activity frequency greater than 700. Adding fewer than 100 followees has no impact on gaining attention, but users adding more than 100 followees received higher attention during all three phases. Another observation is that the highest numbers of followers received in the three phases occurred for users with more than 100 activities. To study this in more depth, we examined users' daily and aggregate efficiency based on different features and categories at different phases.

1.4.3 User Efficiency Analysis

Daily efficiency of a user is calculated by dividing the total follower gain by the total number of tweets on that day. Similarly, aggregate efficiency is calculated by dividing the total daily follower gain by the total tweets in that period. Users are categorized based on their active days. Only users who had at least one activity in each of the three periods (pre-, during-, and post-disaster) are included in these categories. Figure 1.3 shows the daily efficiency distribution for users categorized by their active days. A significant number of users have daily efficiency values equal to or less than zero. As the number of active days increases, the probability of a user having efficiency less than or equal to zero decreases; this probability is highest for users who were active fewer than 8 days. Although the probability of having daily efficiency greater than 10 is low across all user categories, it increases as the number of active days increases.

Figure 1.3: Daily Efficiency Distribution for the Users Categorized by Active Days. The X-axis shows daily efficiency and the Y-axis shows the cumulative probability, i.e., the probability that daily efficiency is less than or equal to the corresponding value on the X-axis. The X-axis is plotted in log scale.

We further investigate, for each user, how daily efficiency varies over time. To find any trend in the data, we categorize users by their overall efficiency values (i.e., the ratio of the sums of daily attention and daily activity measured over the whole observation period). We calculate average daily efficiency by averaging the daily efficiencies of the users in a particular category. For hurricane related tweets, the average daily efficiency of the first two categories (overall efficiency less than or equal to zero) changes little (see the inset plot in Figure 1.4). The daily efficiency of the highly efficient users, however, provides an interesting insight: these users had higher average daily efficiency on the hurricane declaration and landfall days (Figure 1.4). The spikes on those days indicate that some users received higher attention for their activities. But this trend also shows a significant number of spikes well before the formation of Sandy and long after its landfall, indicating that some users might be gaining attention from tweets unrelated to Sandy. To confirm, we analyze only Sandy related tweets (those containing 'sandy' in the tweet text) and find that efficiency peaked just after the declaration day (October 23, 2012) and decayed quickly, with a spike on the landfall date (see Figure 1.5). This indicates that users were highly effective in spreading awareness about Sandy on the day after its declaration. We do not observe any efficiency curve before the declaration because the term 'sandy' was not present before then.
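The Sandy-specific filtering step (keeping only tweets that contain 'sandy' in their text) can be sketched as follows; the tweet records here are hypothetical, not from the actual dataset.

```python
def sandy_tweets(tweets):
    """Keep only tweets whose text contains 'sandy' (case-insensitive),
    mirroring the filtering step used for the Sandy specific analysis."""
    return [t for t in tweets if "sandy" in t["text"].lower()]

# Hypothetical tweet records
tweets = [
    {"user": "a", "day": "2012-10-23", "text": "Hurricane Sandy declared"},
    {"user": "b", "day": "2012-10-23", "text": "nice weather today"},
    {"user": "a", "day": "2012-10-29", "text": "Sandy makes landfall, stay safe"},
]
filtered = sandy_tweets(tweets)
```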


Figure 1.4: Average Daily Efficiency of the users categorized by efficiency (for hurricane related tweets).

Figure 1.5: Average Daily Efficiency of the users categorized by efficiency values (for Sandy specific tweets).

Users are categorized by their overall efficiency values. To determine what type of users became highly effective at spreading awareness during the hurricane declaration and landfall, we extract the top 5 efficient users for both hurricane related tweets and Sandy specific tweets. Figure 1.6 shows the daily efficiency values for the top 5 efficient users on the hurricane Sandy declaration and landfall days and for the top 5 efficient users over the whole data coverage period. We find that the majority of them are either political users or have no significant association with hurricane updates (see Table 1.1). However, the analysis of Sandy specific tweets reveals significant spikes close to the landfall day (see Figure 1.7). These highly effective users are either storm update centers or weather reporters with close associations with hurricane updates (Table 1.2). This highlights the importance of an appropriate filtering step when identifying highly effective users specific to a disaster: tweets collected for a general disaster context may contain ambiguous words (e.g., power, weather, recovery) that overlap with other widely discussed contexts.

Figure 1.6: Top 5 Efficient Users during Hurricane Declaration, Landfall and Overall (for hurricane related tweets).


Figure 1.7: Top 5 Efficient Users during Hurricane Declaration, Landfall and Overall (for Sandy specific tweets)


Table 1.1: Top 5 efficient users with user type and bot score during hurricane declaration and hurricane landfall. This analysis is based on the unfiltered (hurricane related tweets) data.

| Hurricane Declaration | | | Hurricane Landfall | | |
| Screen Name | User Type | Bot Score | Screen Name | User Type | Bot Score |
| @M3VOY | Radio Operator | 0.62 | @carrolltrust | Organization | 0.43 |
| @TonioMilano | Personal | 0.48 | @hiphopencounter | Organization | 0.76 |
| @megynkelly | Anchor at NBC News | 0.36 | @migcfc | Personal | 0.06 |
| @BloodRedPatriot | Organization | 0.73 | @trio | Organization | 0.55 |

Table 1.2: Top 5 efficient users with user type and bot score during hurricane declaration and landfall. This analysis is based on the tweets containing the word 'sandy'.

| Hurricane Declaration | | | Hurricane Landfall | | |
| Screen Name | User Type | Bot Score | Screen Name | User Type | Bot Score |
| Not Found | NA | NA | @NHC_Atlantic | Organization | 0.43 |
| @breakingstorm | Organization | 0.66 | @cnnbrk | Organization | 0.51 |
| @NHC_Atlantic | Organization | 0.43 | @Jimcantore | Broadcast Meteorologist | 0.37 |
| @Jimcantore | Broadcast Meteorologist | 0.37 | @BreakingNews | Organization | 0.60 |
| @kkstormcenter | Weather Reporter | 0.71 | @breakingstorm | Organization | 0.66 |

Note: "Not Found" indicates that the user tweeted during Sandy but the screen name could no longer be found when searched at the time of our analysis.

1.4.4 User Attributes Contributing to Efficiency

To identify the factors contributing to an effective spreading of awareness, it is important to know the relationship between each feature and the efficiency metric. As in the previous section, we analyze hurricane related and Sandy specific tweets separately to understand the factors behind gaining attention in crisis communication. For hurricane related tweets, Figure 1.8 shows the relationship between efficiency and each of the variables followee add, initial follower, and initial followee, considering two types of users: bot (bot score >= 0.5) and non-bot (bot score < 0.5). In addition, the relationship between efficiency and active days has been modeled considering user categories based on activity entropy. The results show a good correlation (R² > 0.5) between efficiency and initial follower count for non-bot users. From Figure 1.8, we find that efficiency is positively associated with all the variables. In all cases, R² values are lower for bot users than for non-bot users, reflecting that the efficiency of bots cannot be well predicted by a single variable. Similar associations hold for Sandy specific tweets (see Figure 1.9).
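The per-group fit quality discussed above (R² for bot versus non-bot users) can be illustrated with a minimal sketch. The user records below are hypothetical, and R² here is that of a simple linear fit (the squared Pearson correlation), not the study's estimates.

```python
def r_squared(x, y):
    """R^2 of a simple linear fit: the squared Pearson correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

# Hypothetical (initial_followers, efficiency, bot_score) records
users = [(100, 1.0, 0.2), (200, 2.1, 0.1), (300, 2.9, 0.3),
         (150, 0.2, 0.8), (250, 2.5, 0.9), (400, 1.0, 0.7)]
non_bot = [(f, e) for f, e, s in users if s < 0.5]   # bot score < 0.5
bot = [(f, e) for f, e, s in users if s >= 0.5]      # bot score >= 0.5
r2_non_bot = r_squared([f for f, _ in non_bot], [e for _, e in non_bot])
r2_bot = r_squared([f for f, _ in bot], [e for _, e in bot])
```

In this toy data the non-bot group is nearly linear in initial followers while the bot group is not, mirroring the pattern reported for Figure 1.8.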


Figure 1.8: Correlations between Efficiency and Different User Attributes (for hurricane related tweets)


Figure 1.9: Relationship between Efficiency and Different User Attributes (for Sandy specific tweets)

To determine the combined effects of the explanatory variables, we estimate a multivariate linear regression model. Table 1.3 presents the results of the model estimated on hurricane related tweets. All variables are significant at the 90 percent confidence level, as each t-statistic exceeds 1.65 in magnitude. Efficiency increases with the initial number of followers, the bot score of bot users, and the initial number of followees of non-bot users, while it decreases with total activity, the initial number of followees, and the bot score of non-bot users. A negative coefficient for activity entropy implies that entropy is negatively correlated with efficiency: users with a predictable activity pattern (lower entropy values) have higher efficiency values. The model estimated on Sandy specific tweets shows similar results (see Table 1.4), except that total activity is not statistically significant and bot score and activity entropy are positively correlated with efficiency.

From the regression analysis, we find how different user features influence a user's efficiency. However, when overall efficiency is the outcome, a minor change in efficiency does not convey significant information about a user's performance in gaining attention. We therefore categorize efficiency into five classes: negative (efficiency < 0), zero (efficiency = 0), low (0 < efficiency <= 5), moderate (5 < efficiency <= 10), and high (efficiency > 10), as shown in Figure 1.4. The outcome thus becomes an ordered categorical variable. In the hurricane related sample, about 6%, 56%, 32%, 3%, and 3% of the users fall within the negative, zero, low, moderate, and high efficiency categories, respectively. We have estimated an ordered logit model using the same 582,605 observations (see Table 1.3). All parameters shown in the results are statistically significant at the 99% confidence level.
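The five-class categorization used as the ordered-logit outcome maps directly to a small helper; the thresholds follow the text (negative, zero, low, moderate, high).

```python
def efficiency_class(eff):
    """Five ordered efficiency classes used as the ordered-logit outcome:
    negative (<0), zero (=0), low (0-5], moderate (5-10], high (>10)."""
    if eff < 0:
        return "negative"
    if eff == 0:
        return "zero"
    if eff <= 5:
        return "low"
    if eff <= 10:
        return "moderate"
    return "high"

classes = [efficiency_class(e) for e in (-0.5, 0, 3, 7.5, 12)]
```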

Table 1.3: Model Results for hurricane related tweets

| Explanatory variable | Linear: Estimate | Linear: t statistic | Logit: Estimate | Logit: Odds Ratio | Logit: 95% Wald CI |
| Intercept 5 | na | na | -5.0145 | | |
| Intercept 4 | na | na | -4.2164 | | |
| Intercept 3 | na | na | -1.1988 | | |
| Intercept 2 | na | na | 2.2357 | | |
| Constant | 0.758 | 1.712 | na | | |
| Total activity | -0.039 | -6.767 | -0.0110 | 0.989 | 0.989–0.989 |
| Followee add | 0.098 | 77.845 | 0.00926 | 1.009 | 1.009–1.009 |
| Initial follower | 0.002 | 698.21 | 1.000 | 1.000 | 1.000–1.000 |
| Initial followee | -0.004 | -53.3 | 1.000 | 1.000 | 1.000–1.000 |
| Active days | na | na | 0.2951 | 1.343 | 1.340–1.347 |
| Activity entropy | -36.754 | -15.493 | -1.9523 | 0.142 | 0.128–0.157 |
| Bot score | 11.622 | 9.793 | -0.3974 | 0.672 | 0.648–0.697 |
| Bot_Score*NonBotUser | -14.419 | -10.041 | na | | |
| Initial_followee*NonBotUser | 0.007 | 37.423 | na | | |
| Number of observations | 582,605 | | 582,605 | | |
| Adjusted R² | 0.47 | | AIC = 1,244,359.6 | | |

Note: na = not applicable; a variable included in one model but not the other was not found significant in the model that omits it.


Table 1.4: Model Results for Sandy specific tweets

| Explanatory variable | Linear: Estimate | Linear: t statistic | Logit: Estimate | Logit: Odds Ratio | Logit: 95% Wald CI |
| Intercept 5 | na | na | -6.6001 | | |
| Intercept 4 | na | na | -3.7514 | | |
| Intercept 3 | na | na | 4.6505 | | |
| Intercept 2 | na | na | 5.3600 | | |
| Constant | -0.073 | -0.453 | na | | |
| Total activity | -0.001 | -1.265 | na | na | na |
| Followee add | 0.003 | 5.411 | 0.00359 | 1.004 | 1.003–1.004 |
| Initial follower | 0.0001 | 231.54 | 0.000035 | 1.00 | 1.00–1.00 |
| Initial followee | -6.85E-5 | -6.219 | -2.71E-06 | 1.000 | 1.000–1.000 |
| Active days | na | na | -0.1312 | 0.877 | 0.852–0.903 |
| Activity entropy | 0.704 | 1.940 | 1.2244 | 3.402 | 1.726–6.708 |
| Bot score | 1.053 | 3.735 | 2.115 | 8.289 | 5.227–13.145 |
| Bot_Score*NonBotUser | 0.521 | 1.655 | na | | |
| Initial_followee*NonBotUser | -5.27E-5 | -1.794 | na | | |
| Number of observations | 15,792 | | 15,792 | | |
| Adjusted R² | 0.77 | | AIC = 7,513.390 | | |

Note: na = not applicable; a variable included in one model but not the other was not found significant in the model that omits it.

To interpret these results: for total activity, the negative parameter estimate indicates that, holding all other variables constant, a user who posts more tweets is more likely to fall in a lower efficiency category. Consistent with the regression results, users with a predictable tweeting pattern (i.e., smaller entropy value) are more likely to be in a higher efficiency category. We also find that users with more active days and more followees added are more likely to be in a higher efficiency category, whereas users with higher total activity and higher bot scores are less likely to be. An ordered logit model estimated on the Sandy specific tweets shows similar associations, except that total activity is not statistically significant and activity entropy is positively correlated with efficiency (see Table 1.4). A higher number of initial followers or initial followees does not shift users toward a higher or lower efficiency category for either hurricane related or Sandy specific tweets.
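The odds ratios in Table 1.3 are simply exponentiated ordered-logit coefficients, which can be checked directly:

```python
import math

# Ordered-logit parameter estimates from Table 1.3 (hurricane related tweets)
estimates = {"active_days": 0.2951, "activity_entropy": -1.9523, "bot_score": -0.3974}

# The odds ratio is exp(coefficient): values above 1 raise the odds of being in
# a higher efficiency category; values below 1 lower them.
odds_ratios = {name: round(math.exp(b), 3) for name, b in estimates.items()}
```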


1.5 Conclusions

In this study, we have analyzed Twitter posts related to hurricane Sandy to understand the effectiveness of social media based communication during disasters. Effective crisis communication can ensure faster information dissemination to vulnerable communities who need timely information about disaster preparedness, evacuation warnings, and recovery operations. To measure the effectiveness of a social media user in communicating information or awareness, we estimate the efficiency of gaining attention within a specific time period as the ratio of followers gained to tweet frequency within that period: new followers gained represent attention received, and tweet frequency represents activities made. Because our data contain tweets from both pre- and post-disaster periods, we can compare user efficiencies in gaining attention across these periods.

Analyzing daily efficiencies in gaining attention, we find that users had higher efficiency during critical periods of hurricane Sandy such as the declaration and landfall days. This indicates the potential of social media based crisis communication, since higher attention to related information may help provide situational awareness to vulnerable populations. It may be that during Sandy some users' efficiency became abnormally high because many users started following them for hurricane related updates. Such social media users could become major sources of information for spreading hurricane awareness during future hurricanes.

During a disaster, general social media users seek timely updates from other users. When sharing information, some users gain more attention than others, so it is critical to understand which user features influence the process of gaining attention. To identify the contributing features, we estimate a regression model and an ordered logit model with overall efficiency and efficiency category, respectively, as the dependent variable. We find that higher hurricane related activity is not necessarily associated with higher efficiency; however, users with predictable tweeting patterns attained higher efficiency values. We also find that a higher bot score (typically associated with an organizational account) results in lower efficiency. We observe some differences in the effects of a few user attributes on efficiency between the models estimated on general hurricane related tweets and on Sandy specific tweets. User efficiencies in gaining attention during a crisis event are directly related to the information spreading capacity of a system. A better understanding of these factors will provide insights on crisis communication at both organizational and individual levels, and will help emergency agencies use social media as a disaster communication tool.

Our findings therefore have significant importance for social media communication, especially disaster communication. To attain high efficiency in spreading disaster related information, organizational or personal accounts can plan their activity around the factors that maximize the chance of attaining higher attention from the targeted population on social media. Also, prior to a major disaster event, authorities can select efficient social media users for disseminating situational awareness information; a model based on user features could identify efficient users for this task.

However, our study has several limitations that can be addressed in the future. For example, bot scores were not collected during hurricane Sandy; we collected them at the time of our analysis, and they could have been different during the hurricane. We also assume that all the tweets analyzed here were related to hurricane Sandy. User efficiency on specific tweet topics (e.g., evacuation) should be analyzed in the future.


2 CHAPTER 2: MODELING HURRICANE EVACUATION DYNAMICS

2.1 Problem Description

Extreme weather events have become more common due to climate change and other related causes. Recently, hurricanes Harvey, Irma, and Maria affected millions of people across several states in the USA. These extreme events have caused significant physical and socio-economic losses [17]–[20]. A real-time, demand-responsive evacuation system is essential to save human lives and minimize losses [21]. Traditionally, post-hurricane surveys are used to understand the evacuation compliance behavior of a population [22]. Such surveys are costly, time consuming, and not effective for managing an evacuation as a hurricane unfolds in real time [23]. With the ubiquitous use of social media platforms (e.g., Twitter, Facebook), a massive volume of real-time data is available, and such data can provide insights into human behavior during extreme events [1], [24], [25].

Thus, large-scale social media data can be used to better understand evacuation behavior during hurricanes [26]. However, one of the main challenges of using social media data is reliably detecting evacuation decisions from such data. To date, methods for inferring evacuation choices have adopted clustering approaches [23], [26] that locate a user during the pre-evacuation and evacuation periods. Although these studies establish that location based social media data can be useful in evacuation studies, they do not answer what activities users participate in during a hurricane, or when and how.

In this study, we model the dynamics of hurricane evacuation from social media data. We develop an input-output hidden Markov model (IO-HMM) to infer evacuation behavior from such data. We gathered large-scale Twitter data during hurricane Irma and used the spatio-temporal and contextual sequences from these data to run the proposed model. Hurricane Irma, the largest storm ever recorded in the Atlantic Ocean, made landfall on the southern coastal areas of Florida. The storm generated a massive amount of social media posts nationwide, especially in Florida. This study makes the following contributions:

• We implement a process to gather hurricane evacuation data from raw geo-tagged Twitter data and validate the results by extensive manual checking. As traditional survey data are costly and often confined to a small geographic region, this type of data can be used to understand evacuation behavior during hurricanes alongside traditional approaches.

• We develop a Word2Vec model to extract contexts from the tweets collected during multiple hurricanes (Sandy, Matthew, Harvey, and Irma). The model has been trained on more than 100 million tweets containing about 882.54 million words. This model can contribute to future research in determining disaster contexts from Twitter data.

• We develop an input-output hidden Markov model from the sequences generated from user tweets. To the best of our knowledge, this is one of the first studies using social media data to model the dynamics of hurricane evacuation decisions. The model captures these dynamics by answering what, when, and how users participate in different activities during evacuation.

2.2 Data Preprocessing and Description

In this study, we use Twitter data from hurricane Irma to infer evacuation choices from social media posts. Using Twitter's streaming API, we collected around 1.81 million tweets made by 248,763 users between September 5, 2017 and September 14, 2017. To capture user activities during the pre-disaster period, we collected user specific historical data using Twitter's REST API, which allows collecting the most recent 3,200 tweets for any specific user. We collected such data for 19,000 users who were active for at least three days between the day the first evacuation order was issued and the landfall day. For our analysis, we considered only tweets with geo-location information. Geo-location is provided either as a point (latitude, longitude) or as a bounding box (an area defined by two longitudes and two latitudes). A point is an exact location, whereas a bounding box has a lower precision of where a tweet was posted; we use the center point of a bounding box as the latitude and longitude of that place. To convert all locations to regions, we use the geohash geocoding system with a precision of around 5 kilometers.
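A precision-5 geohash corresponds to a cell of roughly 5 km. As an illustration (not the authors' code), a minimal stdlib-only geohash encoder looks like this; the canonical test point (57.64911, 10.40744) encodes to "u4pruydqqvj".

```python
# Minimal geohash encoder: standard base32 interleaving of longitude/latitude bits.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_encode(lat, lon, precision=5):
    """Encode (lat, lon) into a geohash cell; precision 5 gives roughly 5 km cells."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    bits, even = [], True  # bits alternate, starting with longitude
    while len(bits) < precision * 5:
        if even:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits.append(1)
                lon_lo = mid
            else:
                bits.append(0)
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits.append(1)
                lat_lo = mid
            else:
                bits.append(0)
                lat_hi = mid
        even = not even
    chars = []
    for i in range(0, len(bits), 5):  # each base32 character encodes 5 bits
        value = 0
        for b in bits[i:i + 5]:
            value = (value << 1) | b
        chars.append(BASE32[value])
    return "".join(chars)

# A tweet geo-tagged near Miami maps to one ~5 km cell:
cell = geohash_encode(25.7617, -80.1918, precision=5)
```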

2.3 Preparing Evacuation Data

From each user's historical tweets, we extract the most visited places during office hours (9:00 AM to 6:00 PM) on weekdays and during night time (10:00 PM to 7:00 AM). For each user, we assign the most frequent office-hour place and night-time place as the office location and the home location, respectively. For some users, the office and home locations can be the same, because the user may not be a worker or may work within 5 km of home. As we are interested only in long-distance evacuation, we choose a threshold of 200 kilometers to identify evacuation. If, between the first evacuation order and the landfall day, a user did not tweet from home or office but tweeted from elsewhere with a displacement of at least 200 kilometers, we consider that the user evacuated and take the corresponding time as the evacuation time. After landfall, the return time is when an evacuated user is first seen within 20 kilometers of home or office. We manually checked the results of this approach: for each selected user, we plotted the tweet texts and times at the home location, office location, and locations during the hurricane on an interactive map, and checked whether the user posted about evacuation and whether the user made at least a 200 km displacement. We find that many users traveled to multiple locations instead of a fixed destination during evacuation. We manually checked 308 evacuated users and 150 non-evacuated users identified by the approach and compared the manual results with the automated ones; the two match in terms of evacuation timing and displacement traveled during evacuation. We use the resulting data as ground truth for the model.
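The home/office assignment described above can be sketched as follows. This is an illustrative sketch using the stated time windows; the timestamps and geohash cells below are hypothetical.

```python
from collections import Counter
from datetime import datetime

def infer_home_office(tweets):
    """Home = most frequent night-time (10 PM-7 AM) cell; office = most
    frequent weekday office-hour (9 AM-6 PM) cell."""
    night, office = Counter(), Counter()
    for ts, cell in tweets:  # (timestamp, geohash cell)
        if ts.hour >= 22 or ts.hour < 7:
            night[cell] += 1
        elif 9 <= ts.hour < 18 and ts.weekday() < 5:  # Mon-Fri office hours
            office[cell] += 1
    home = night.most_common(1)[0][0] if night else None
    work = office.most_common(1)[0][0] if office else None
    return home, work

# Hypothetical tweets for one user; cells are made-up precision-5 geohashes
tweets = [
    (datetime(2017, 9, 5, 23, 10), "dhwfz"),  # Tuesday night
    (datetime(2017, 9, 6, 6, 30), "dhwfz"),   # Wednesday early morning
    (datetime(2017, 9, 6, 11, 0), "dhwg0"),   # Wednesday office hours
    (datetime(2017, 9, 7, 14, 5), "dhwg0"),   # Thursday office hours
]
home, work = infer_home_office(tweets)
```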

2.4 Data Exploration

After locating users' homes and offices, we found that 3,921 users have their homes or workplaces within Florida. Among these Florida users, 308 evacuated 200 kilometers or more. Figure 2.1 shows the origins and destinations of the evacuated users, where the identified home location of an evacuee is the origin and the evacuation destination place is the destination. Residents of Florida evacuated to Georgia (Atlanta was one of the major destinations), Alabama, South Carolina, and North Carolina (Figure 2.1(b)). Some evacuees took flights to other countries such as Canada, Germany, France, and Austria for safer destinations (Figure 2.1(a)); these users are most likely tourists who came to Florida before Irma. These results are plausible according to news updates from different sources during hurricane Irma [27], [28]. The majority of the evacuees were from Miami, Tampa, West Palm Beach, and similar areas (Figure 2.1(b)) where mandatory evacuations were ordered.


FIGURE 2.1 Evacuation Origin and Destination: (a) including international flights; (b) within the USA


Figure 2.2 shows the evacuation time and return time distributions of the users who evacuated during Irma. Among evacuation times, September 9, 2017 (one day before Irma's landfall) has the highest frequency, followed by September 8, 2017 (two days before landfall). On the other hand, September 11, 2017 (one day after landfall) and September 12, 2017 (two days after landfall) are the most common return times of the evacuees. These distributions align with the actual evacuation and return times reported in concurrent news during hurricane Irma [29], [30].

2.5 Methodology

We use an Input-Output Hidden Markov Model (IO-HMM) to identify activity sequences during a hurricane and compare the results with a standard Hidden Markov Model (HMM). The model architecture is shown in Figure 2.3. The IO-HMM is similar to the HMM, but it maps input sequences to output sequences and applies the expectation-maximization (EM) algorithm in a supervised fashion.

A hidden Markov model is a statistical model for sequential data in which the system being modeled follows a Markov process with unobserved (i.e., hidden) states. Figure 2.3(a) shows a graphical representation of an HMM. The solid circles represent the observed information, and the transparent circles represent the hidden-state latent variables, in our case the activity types of a user. The hidden states (H_1, H_2, ..., H_T) are assumed to follow a Markov process, meaning that a state H_t depends only on the previous state H_{t-1}, i.e., H_t = f(H_{t-1}). For the observations (O_1, O_2, ..., O_T), an observation O_t depends only on its current state H_t, i.e., O_t = f(H_t).

Unlike the standard HMM, in the IO-HMM the hidden state H_t at time t depends on the previous state H_{t-1} and the input I_t at time t, i.e., H_t = f(H_{t-1}, I_t). The observation O_t at time t depends on both the hidden state H_t and the input I_t, i.e., O_t = f(H_t, I_t) (see Figure 2.3(b)).

Here, I_t ∈ R^d is the input vector at time t, O_t ∈ R^d is an output vector, and H_t ∈ {1, 2, ..., k} is a discrete state. Similar to the HMM, the IO-HMM has three sets of parameters (θ): initial probability

FIGURE 2.2 Distributions of Evacuation Time and Return Time During Hurricane Irma


parameters (α), transition model parameters (β), and emission model parameters (γ).

The likelihood of a data sequence given the model parameters (θ) is:

L(θ; O, I) = Pr(H_1 | I_1; α) · ∏_{t=2}^{T} Pr(H_t | H_{t-1}, I_t; β) · ∏_{t=1}^{T} Pr(O_t | H_t, I_t; γ)    (2.1)

The model parameters are learned by the expectation-maximization algorithm. For the initial and transition models, we use a multinomial logistic regression model. If there are k hidden states, the initial probability model is:

Pr(H_1 = i | I_1; α) = exp(α_i · I_1) / Σ_j exp(α_j · I_1)    (2.2)

Here, α is the coefficient matrix of the initial probability model, where α_i represents the coefficients for the initial state being state i.

Pr(H_t = j | H_{t-1} = i, I_t; β) = exp(β_ij · I_t) / Σ_l exp(β_il · I_t)    (2.3)

Here, β represents the transition model coefficients, with the j-th row of the i-th matrix (β_ij) being the coefficients for transitioning to state j given that the current state is i.

For the output model, we have used a linear model for the continuous outcome and logistic regression for the categorical outcome.

Pr(O_t | H_t = i, I_t; γ_i) = (1 / √(2π σ_i²)) · exp(−(O_t − γ_i · I_t)² / (2σ_i²))    (2.4)

FIGURE 2.3 Graphical Models Specifying Conditional Independence Properties: (a) Hidden Markov Model; (b) Input-Output Hidden Markov Model


Here, γ_i and σ_i denote the coefficient array and the standard deviation of the linear emission model when the hidden state is i.

Pr(O_t | H_t = i, I_t; γ) = exp(γ_i · I_t) / Σ_j exp(γ_j · I_t), if O_t is categorical    (2.5)

Similarly, γ_i denotes the model coefficients when the hidden state is i.

Detailed descriptions of HMMs and the associated solution algorithms can be found in ref [31]; the IO-HMM architecture and its formulation can be found in ref [32].
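As a concrete illustration of Eqs. (2.1)–(2.4), the sketch below evaluates the likelihood contribution of one known hidden-state path, with softmax initial/transition models and a Gaussian emission. The full IO-HMM likelihood sums this quantity over all possible paths (handled by EM and forward recursions); all coefficients here are toy scalars, not estimated values.

```python
import math

def softmax(scores):
    """Softmax normalization as in Eqs. (2.2)-(2.3)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gaussian_density(o, mean, sigma):
    """Gaussian emission density of Eq. (2.4)."""
    return math.exp(-((o - mean) ** 2) / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def path_likelihood(inputs, outputs, states, alpha, beta, gamma, sigma):
    """Product in Eq. (2.1) for ONE candidate hidden-state path.
    alpha[i], beta[i][j], gamma[i] are scalar coefficients on a scalar input."""
    k = len(alpha)
    # initial model, Eq. (2.2)
    lik = softmax([alpha[i] * inputs[0] for i in range(k)])[states[0]]
    # transition model, Eq. (2.3)
    for t in range(1, len(inputs)):
        lik *= softmax([beta[states[t - 1]][j] * inputs[t] for j in range(k)])[states[t]]
    # emission model, Eq. (2.4)
    for t, i in enumerate(states):
        lik *= gaussian_density(outputs[t], gamma[i] * inputs[t], sigma[i])
    return lik

# Toy two-state example (all values illustrative)
inputs = [1.0, 0.5, -0.2]   # e.g., scaled time from landfall
outputs = [0.1, 0.4, 0.0]   # e.g., scaled distance from home
states = [0, 1, 1]          # one candidate activity path
lik = path_likelihood(inputs, outputs, states,
                      alpha=[0.2, -0.1],
                      beta=[[0.1, 0.3], [0.0, 0.2]],
                      gamma=[0.0, 0.5],
                      sigma=[1.0, 1.0])
```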

2.6 Model Development

An IO-HMM expects the data as sequences of inputs and outputs for each user, so we must process the raw tweets into that specific form. Figure 2.4 shows the sequence generation process. For a user, {l_1, l_2, ..., l_T} represents the consecutive posted locations of the tweets, and {t_1, t_2, ..., t_T} represents the times of the tweets posted at those locations. For each location, we collect hurricane related information at the county level: whether the location had a mandatory evacuation order, whether it had a voluntary order, and whether the time is before or after landfall. This information is encoded as binary variables in the sequence. Other information associated with each location includes the distance from home, the time difference from landfall, and the similarity score of the posted tweet text to evacuation context words. For simplicity, not all information is shown in Figure 2.4. We calculate the similarity score by training a word2vec model using tweets from four hurricanes (Irma, Matthew, Harvey, and Sandy).

FIGURE 2.4 Schematic diagram of sequence generation. Here, l_t = location of the user at time t, text_t = text of the tweet posted at location l_t, and t_t = time of the tweet posted at location l_t.

2.7 Inferring Evacuation Context from a Tweet In general, the text of a tweet may reflect the underlying context (e.g., hurricane awareness, evacuation, information sharing/seeking, power outage etc.). We use a similarity score to quantify how similar a tweet is to an evacuation context (e.g. words like ‘evacuate’, ‘evacuating’, ‘sheltering’). We have used a vector space model called word2vec to learn the word vectors of a disaster related tweet. Vector Space Model [33] is a natural language processing tool to represent texts as a continuous vector where words that appear in the same contexts share semantic meaning [34],[35]. Once a model is trained, every word in the vocabulary will have a vector representation of a length equal to the vocabulary size (see supporting information). We train a word2vec model using a huge corpus of disaster related tweets, which we collected during multiple hurricanes (Sandy, Harvey, Matthew and Irma). Similar to any two vectors, cosine similarity between two


word vectors can be computed. Thus, we calculate the cosine similarity [36] between two word vectors Word_i and Word_j by the following equation:

Similarity = cos(θ) = (Word_i · Word_j) / (||Word_i|| ||Word_j||) = Σ_{m=1}^{k} A_m B_m / (√(Σ_{m=1}^{k} A_m²) √(Σ_{m=1}^{k} B_m²))    (2.6)

Here, Word_i, Word_j ∈ {vocabulary of the model}, A and B are their vector representations, and k is the dimension of the word vectors. In our study, to calculate the similarity of a word to the evacuation context, Word_i ∈ {'evacuation', 'evacuating', 'sheltering'} and Word_j ∈ {words in a tweet}. If a window size of n is selected, the score of a tweet is calculated by summing the top n similarity scores over the words present in the tweet. We use the top one score in this study to represent the evacuation context similarity of the tweets posted by a user. The similarity scores of the tweets posted at locations l1, l2, l3, ..., lT are denoted by {Score_text1, Score_text2, ..., Score_textT-1, Score_textT}.
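As a concrete illustration, Equation (2.6) and the top-n tweet score can be written in a few lines of plain Python. The dictionary-of-vectors interface is an assumption for illustration; in the actual pipeline the vectors would come from the trained word2vec model.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two word vectors (Equation 2.6)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def evacuation_score(tweet_words, context_words, vectors, top_n=1):
    """Sum of the top-n similarities between a tweet's words and the
    evacuation context words. `vectors` maps word -> vector (assumed
    interface; e.g., looked up from a trained word2vec model)."""
    sims = [cosine_similarity(vectors[w], vectors[c])
            for w in tweet_words if w in vectors
            for c in context_words if c in vectors]
    if not sims:
        return 0.0
    return sum(sorted(sims, reverse=True)[:top_n])
```

With `top_n=1`, as used in this study, the score of a tweet is simply its best single-word match to the evacuation context.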

2.8 Model Estimation
Given the sequences of all the required information, we need to specify the inputs and outputs. In an IO-HMM, both inputs and outputs are available during training, but after training the model must infer the outputs given the inputs. In general, the inputs are known before a transition to a new state/activity starts, while the outputs are not. We choose five input variables I_t that are likely to influence a user's next activity: the time difference from landfall in hours, negative before landfall and positive after (I1); a binary variable indicating pre- versus post-landfall (I2); an interaction variable equal to the time difference from landfall during the pre-landfall period only (I3); a binary variable indicating whether the user's home location is under a mandatory evacuation order (I4); and a binary variable indicating whether it is under a voluntary evacuation order (I5). For our IO-HMM, we choose two output variables: the current location's distance from home (O1) and the evacuation similarity (word2vec) score of the tweets posted at that location (O2).

To specify the input dependencies of the initial, transition and output models of the IO-HMM, we make several assumptions. We assume that transitions between states depend on the current state and the context variables I1 (time difference from landfall), I2 (post-landfall), I4 (home under mandatory evacuation order) and I5 (home under voluntary evacuation order). The outputs O1 (distance from home) and O2 (word2vec score) depend on the current state and the context variables I1, I3 (time difference from landfall during the pre-landfall period), I4 and I5. No input is chosen for the initial probability model, so its parameters are learned by the EM algorithm alone. Multinomial logistic regression is used for the initial and transition models; since both outputs O1 and O2 are continuous, linear regression is used for the output models.

In this IO-HMM setting, to unfold the dynamics of hurricane evacuation, we choose four types of latent activities as hidden states: home activity, office activity, evacuation, and others. Home and office activities represent the activities when a user stays at home and at the office, respectively; activities at any other location are defined as other activities. Between the first evacuation order and the landfall day, if a user has only other activities but no home/office activity, and the activity occurs at a location of 200


kilometers or more away from home, we label it as an evacuation activity. Since we know what the chosen activities should look like and we have some ground-truth information, we train the IO-HMM in a semi-supervised [37] way: we use the labeled latent states of 250 users to guide the EM learning in the specified direction. We implement the models in Python using the IOHMM package.
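The ground-truth labeling rule described above can be sketched as a small function. The tuple format and field order are illustrative assumptions, not the report's actual data structures.

```python
def label_evacuation(activities, first_order_time, landfall_time,
                     threshold_km=200.0):
    """Heuristic labeling used for semi-supervised training: between the
    first evacuation order and landfall, an 'other' activity 200 km or
    more from home is relabeled 'evacuation'. `activities` is a list of
    (time, activity_type, distance_from_home_km) tuples (assumed format)."""
    window = [a for a in activities
              if first_order_time <= a[0] <= landfall_time]
    # The rule applies only when the user shows no home/office activity
    # inside the order-to-landfall window.
    if any(a[1] in ('home', 'office') for a in window):
        return activities
    labeled = []
    for t, act, dist in activities:
        if (first_order_time <= t <= landfall_time and act == 'other'
                and dist >= threshold_km):
            labeled.append((t, 'evacuation', dist))
        else:
            labeled.append((t, act, dist))
    return labeled
```

The labeled sequences for a subset of users (250 in this study) can then be fed to the semi-supervised EM procedure alongside the unlabeled sequences.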

2.9 Results
In this section, we describe the results of our evacuation dynamics model. First, we apply a standard HMM to find the learned distributions of the selected outputs. Then we interpret the results from the IO-HMM.

2.9.1 HMM
With the same four latent states/activities, the HMM models the observed output variables as a mixture of Gaussians. Figure 2.5 shows the posterior distribution of home distance and word2vec score for each latent activity. The mean distances from home are 0, 202.79, 402.02 and 952.99 kilometers for home activity, office activity, other activity and evacuation, respectively. As expected, the model captures the highest average distance, 952.99 kilometers, for evacuation activities. The average distance for office activity is 202.79 kilometers with a dispersion of 324.71 kilometers. We chose other activities as a broad category for simplicity of the model; it can include grocery shopping before the hurricane, eating at restaurants, and short or long trips, so its average distance of 402.02 kilometers with the highest dispersion (1048.5 kilometers) seems reasonable. We are also interested in how users respond to evacuation in their tweet texts. The word2vec evacuation similarity scores are 0.52, 0.49, 0.46 and 0.52 for home activity, office activity, other activity and evacuation, respectively. Although the differences are small, we see higher scores during home activity and evacuation activity (0.52 each), meaning users tweeted about evacuation more while evacuating or at home. This is expected: users who evacuated are more likely to share posts about evacuation, and people also share evacuation-related updates from home during a hurricane evacuation.


FIGURE 2.5 Posterior Distributions of Output Variables (Distance from Home and Word2Vec Score) for Different Activity Types (a) Home Activity (b) Office Activity (c) Other Activity (d) Evacuation


2.9.2 IO-HMM
Although an HMM can learn the latent activities from the observed variables, it cannot incorporate contextual input variables when inferring the latent activities and the outputs. Table 2.1 shows the coefficients of the output models in the IO-HMM setting.

TABLE 2.1 Model Coefficients of Output Model for IO-HMM
(I1 = time difference from landfall in hours; I3 = time difference from landfall * pre-landfall period, in hours; I4 = home location under mandatory evacuation order; I5 = home location under voluntary evacuation order)

Output Variable      Latent Variable    Intercept       I1        I3         I4         I5
Distance from Home   Home Activity        0             0         0          0          0
                     Office Activity    520.793       -0.770     2.327    -240.382   -322.116
                     Other Activity     797.986       -0.478     6.206    -228.716   -406.714
                     Evacuation         876.667       -1.189     5.097       9.748    102.946
W2v Score            Home Activity        0.581       -0.0004    0.0005      0.0004    -0.013
                     Office Activity      0.529       -0.0004   -0.0003     -0.005     -0.047
                     Other Activity       0.509       -0.0004    0.0007      0.003     -0.070
                     Evacuation           0.546        0.001    -0.0008     -0.008     -0.092

For the output distance from home, the home-activity state has no interpretable coefficients because the distance from home is always zero when a user is at the home location. As expected, evacuation activity has the highest intercept for this output. Positive coefficients of both the mandatory (I4) and voluntary (I5) order variables indicate that, with all other variables held constant, users who evacuated from mandatory or voluntary evacuation zones travel farther than users from zones with no evacuation order. The positive coefficient of I3, together with the negative coefficient of I1, indicates that, unlike in the post-landfall period, during the pre-landfall period evacuated users are more likely to travel longer distances for evacuation as landfall approaches.

For the word2vec evacuation similarity score, home activity has the largest intercept, meaning that when all independent variables equal zero, which is equivalent to landfall day with no evacuation order, users are most likely to post about evacuation during home activity. For evacuation activity, the positive coefficient of I1 (0.001) and negative coefficient of I3 (-0.0008) indicate that, compared with the post-landfall period, evacuated users post less about evacuation as the pre-landfall period approaches landfall; in other words, an evacuated user posts more about evacuation after landfall, perhaps because, once evacuated, they share their concerns more in their tweets. The coefficients are opposite for home activity and other activity, which is reasonable because users post more about evacuation from home before landfall while planning their evacuation decision. For home and other activities, the mandatory evacuation order (I4) has a positive coefficient and the voluntary evacuation order (I5) a negative one. That means, all other variables constant, during home and other activities users talk more about evacuation than users from no-order zones if the user's home location is under


mandatory evacuation order, whereas users whose home location is under a voluntary evacuation order talk less about it. Surprisingly, both the mandatory and voluntary order zones have negative coefficients for evacuation activity, meaning that users who evacuated from mandatory or voluntary order zones post less about evacuation than users who evacuated from regions where no evacuation order was declared. A possible reason is that users who evacuate from no-order zones perceive the risk more seriously and thus share their concerns more in their posts.

TABLE 2.2 Coefficients of Transition Model for IO-HMM
(I2 = whether time is post-landfall; I1 = time difference from landfall in hours; I4 = home location under mandatory evacuation order; I5 = home location under voluntary evacuation order)

From Activity: To Activity    Intercept      I2        I1        I4        I5
Home: Home                      -0.498      0.053     0.002     0.668    -0.225
Home: Office                    -0.159     -0.416     0.005     0.019     0.227
Home: Other                      1.386      0.374     0        -0.243     0.021
Home: Evacuation                -0.730     -0.011    -0.008    -0.444    -0.023
Office: Home                     0.719     -0.870     0.008     0.091     0.275
Office: Office                  -0.032      1.662    -0.002    -0.103    -0.931
Office: Other                   -0.012     -0.255     0.003    -0.084     0.907
Office: Evacuation              -0.675     -0.537    -0.008     0.096    -0.251
Other: Home                      0.910     -0.392     0.006     0.090    -0.071
Other: Office                   -0.859     -0.609     0.006    -0.291     0.087
Other: Other                     0.623      1.601    -0.003     0.160     0.060
Other: Evacuation               -0.675     -0.600    -0.009     0.042    -0.076
Evacuation: Home                -1.146     -1.383     0.016     0.369    -0.337
Evacuation: Office              -0.844      0.072     0.002    -0.179    -0.057
Evacuation: Other               -1.638      3.236    -0.008    -0.151     0.111
Evacuation: Evacuation           3.628     -1.924    -0.010    -0.039     0.284

Table 2.2 shows the coefficients of the multinomial logistic regression (MNL) models that capture the transitions of the IO-HMM. Given the current state, there are four MNL models governing the transitions among the states. Here, a positive coefficient means that an increase in the associated variable increases the probability of the corresponding transition. In total, 80 coefficients capture the transition dynamics (see Table 2.2). We focus on interpreting the coefficients associated with evacuation. For instance, for the home: evacuation transition (see Table 2.2), the negative sign of I2 (post-landfall) indicates that if a user's current state is home activity, a post-landfall period decreases the likelihood of a transition to evacuation compared with a pre-landfall period, all other variables held constant. The same holds for the office: evacuation, other: evacuation and evacuation: evacuation transitions (see Table 2.2). These findings are reasonable because users do not decide to evacuate after landfall. The time difference from landfall (I1) has a negative coefficient for all four transitions to evacuation (home: evacuation, office: evacuation, other: evacuation and evacuation: evacuation). That


means that, all else constant, these transitions become less likely as the time difference from landfall increases, relative to the other transitions from the same current state. The context variable I4 (home location under mandatory evacuation order) has positive coefficients for office: evacuation and other: evacuation but negative coefficients for home: evacuation and evacuation: evacuation. This suggests that users do not transition directly from home to evacuation; rather, they first tweet from an office or other location and then from the evacuation location. The positive coefficients of I4 indicate that, all else constant, users from mandatory evacuation zones are more likely to make an evacuation activity than users from no-order zones, given the current state. The negative coefficient of I4 for evacuation: evacuation means that users who evacuated from a mandatory evacuation zone are less likely to remain in the evacuation state, perhaps because they are highly concerned about hurricane damage to their homes. Except for evacuation: evacuation, the context variable I5 has negative coefficients for transitions into evacuation; that is, all else constant, users from voluntary evacuation zones are less likely to participate in evacuation than users from zones with no order. The positive coefficient of I5 for evacuation: evacuation indicates that users who evacuated under a voluntary order are likely to continue in the evacuation state, possibly because, as we found on inspection, users from voluntary evacuation zones plan trips that visit more than one place during evacuation.
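To see how the Table 2.2 coefficients translate into probabilities, the MNL transition model for one current state can be evaluated with a softmax over linear utilities. This is a sketch of the standard multinomial-logit computation using the "Evacuation: *" rows of Table 2.2; the input ordering follows the table's columns, and the example scenario is an assumption for illustration.

```python
import math

# Coefficients of the four "Evacuation: *" rows of Table 2.2, ordered as
# (intercept, I2 post-landfall, I1 hours from landfall, I4 mandatory, I5 voluntary).
COEF_FROM_EVACUATION = {
    'home':       [-1.146, -1.383,  0.016,  0.369, -0.337],
    'office':     [-0.844,  0.072,  0.002, -0.179, -0.057],
    'other':      [-1.638,  3.236, -0.008, -0.151,  0.111],
    'evacuation': [ 3.628, -1.924, -0.010, -0.039,  0.284],
}

def transition_probs(coef_rows, inputs):
    """Softmax over the linear utilities of each candidate next state."""
    x = [1.0] + list(inputs)  # prepend 1 for the intercept
    utils = {s: sum(b * xi for b, xi in zip(betas, x))
             for s, betas in coef_rows.items()}
    m = max(utils.values())   # subtract max for numerical stability
    exps = {s: math.exp(u - m) for s, u in utils.items()}
    z = sum(exps.values())
    return {s: e / z for s, e in exps.items()}

# Illustrative scenario: 24 hours before landfall (I2=0, I1=-24),
# home under a mandatory order (I4=1, I5=0).
p = transition_probs(COEF_FROM_EVACUATION, [0, -24.0, 1, 0])
```

Under this scenario the dominant probability is for remaining in the evacuation state, consistent with the large evacuation: evacuation intercept in Table 2.2.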

2.10 Conclusions
In this study, we use Twitter data from hurricane Irma to develop a model for inferring the evacuation dynamics of individual users. We implemented a process to collect evacuation data from social media covering a large geographical area. Based on the tweets of users active during the evacuation period, we developed an input-output hidden Markov model to infer what type of activity an individual participates in, when they evacuate, and how they participate in these activities. The modeling approach provides both spatial and temporal insights into evacuation and other activities, interpreted through four activity types (home activity, office activity, other activity and evacuation).

This study provides important insights for disaster management because evacuation is one of the critical aspects of emergency risk management. Few studies infer evacuation dynamics from social media data, which can be gathered in real time and at large scale. Identifying the evacuated and non-evacuated populations during a hurricane can make hurricane preparation more effective and dynamic.

However, this study has some limitations: Twitter penetration differs across areas, and Twitter users are not evenly distributed across age groups, which this study does not account for. Other variants of the HMM could be applied to improve accuracy.


3 CHAPTER 3: QUANTIFYING HUMAN MOBILITY RESILIENCE
3.1 Problem Description
Resilience is a broad concept applied in many fields to measure the ability of a system to withstand adverse situations. Several concepts of resilience have been proposed previously. Hosseini et al. [38] reviewed methods of defining and quantifying resilience in various fields. Bruneau et al. [39] developed a framework for measuring resilience along four dimensions: i) robustness, reflecting the strength or ability of the system to reduce damage; ii) rapidity, representing the rate or speed of recovery; iii) resourcefulness, reflecting the ability to apply material and human resources by prioritizing goals when an event occurs; and iv) redundancy, representing the capacity to achieve goals by prioritizing objectives to restrain loss and future disruptions. They also proposed the following equation to measure resilience loss due to an earthquake:

RL = ∫_{t0}^{t1} [100 − Q(t)] dt    (3.1)

where RL denotes the resilience loss, Q(t) denotes a quality function at time t, t0 is the time the event occurs, and t1 is the recovery time. This formula forms the basis of the resilience triangle. Although the metric was originally proposed for earthquakes, it can be applied in many other contexts [38].

However, measuring these resilience metrics in a mobility context has been difficult due to the lack of appropriate data over long time periods. Geo-location data from social media can offer a solution to this problem. In this study, by analyzing user displacements from a pre-disaster period to a post-disaster one, we measure perturbation and recovery time for multiple types of disasters. To validate our results, we used one month of New York City taxi data recording taxi movements before, during, and after hurricane Sandy. Quantifying resilience loss and recovery time from disruptions caused by an extreme event can help in understanding the broader socio-economic impacts of disasters, and these resilience metrics can inform policies for building resilient cities and communities.

This study makes several contributions. First, it defines the concept of mobility resilience and develops methods to detect extreme events in mobility data and to compute, from movement data, the metrics needed to measure resilience and resilience loss. Second, it applies the proposed method to geo-located data collected from Twitter for multiple disasters. Thus, this study shows that geo-located social media data can be effectively used to measure human mobility resilience to extreme events.

3.2 Data and Methodology
To measure mobility resilience, we used geo-tagged tweets from several types of disasters (Table 3.1). The data sets were obtained from the Dryad digital repositories http://datadryad.org/resource/doi:10.5061/dryad.88354 [40], originally collected by Wang et al. [41], and https://datadryad.org//resource/doi:10.5061/dryad.15fv2, collected by Kryvasheyeu et al. [42].


Table 3.1: Data Description

Type            Disaster Name              Disaster Location           No. of Tweets   No. of Users
Hurricane       Sandy (all tweets)         USA                            52,493,130     13,745,659
                Sandy (geo-tagged tweets)  USA                            24,149,780      5,981,012
Earthquake      Bohol (Bohol)              Bohol, Philippines                114,606          7,942
                Iquique (Iquique)          Iquique, Chile                     15,297          1,470
                Napa (Napa)                Napa, USA                          38,019          1,850
Typhoon         Wipha (Tokyo)              Tokyo, Japan                      849,173         73,451
                Halong (Okinawa)           Okinawa, Japan                    166,325          5,124
                Kalmaegi (Calasiao)        Calasiao, Philippines              21,698          1,063
                Rammasun (Manila)          Manila, Philippines               408,760         27,753
Winter storm    Xaver (Norfolk)            Norfolk, Britain                  115,018          8,498
                Xaver (Hamburg)            Hamburg, Germany                   15,054          2,745
                Storm (Atlanta)            Atlanta, USA                      157,179         15,783
Thunder storm   Storm (Phoenix)            Phoenix, USA                      579,735         23,132
                Storm (Detroit)            Detroit, USA                      765,353         15,949
                Storm (Baltimore)          Baltimore, USA                    328,881         14,582
Wildfire        New South Wales (1)        New South Wales, Australia         64,371          9,246
                New South Wales (2)        New South Wales, Australia         34,157          4,147

To validate our approach of using social media data, we collected New York City taxi data covering the same period as the hurricane Sandy Twitter data. The data were obtained from a repository hosted by the New York City Taxi and Limousine Commission (http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml). Each observation in the data represents a trip, and there were 12,892,877 trips in total during the study period. The hurricane Sandy data contain tweets from several places, including the USA, Canada, Mexico and other countries. To measure resilience for a city or state in response to hurricane Sandy, we applied appropriate location filters. For example, a displacement can occur entirely within New York City or have only its origin or destination there. Since displacements are calculated over six-hour periods, when calculating resilience for New York City with a location filter applied, only displacements within the city are considered in each six-hour period; without a filter, both displacements within the city and those with an origin or destination in the city are considered. Apart from the hurricane Sandy data, the remaining data sets consist of city-specific tweets from cities subject to a disruptive event, so a location filter or constraint is not required for those cases.

In this study, we apply the concept of resilience to understand human mobility under a disaster. Following the basic definition of resilience, we define mobility resilience as the ability of a mobility infrastructure system responsible for the movement of a population to manage shocks and return to a steady state in response to an extreme event. These events include a hurricane, earthquake, terrorist attack, winter


storm, wildfire, flood, and others. We propose a simple method that uses normalized per-user displacement as a key indicator of human mobility. By comparing per-user displacements against typical displacements, the method can detect a disruptive event from movement data and calculate the maximum deviation from normal conditions as well as the recovery time. Finally, applying the concept of the resilience triangle, we estimate resilience and resilience loss for each detected event. The proposed method can take any kind of movement data as input, including coordinates from mobile phone call records, GPS observations, social media posts and many others. Here, we present our resilience analysis based on social media data from multiple types of disasters.

3.2.1 Extracting the Location Time Series of a User
First, the coordinates of a user are sorted in ascending order by timestamp. If there are not enough users for an hourly analysis, each day can be divided into four periods: 12 AM to 6 AM, 6 AM to 12 PM, 12 PM to 6 PM and 6 PM to 12 AM. From the sorted time series, the locations (i.e., latitude and longitude) of each user are extracted at six-hour intervals for each day.

P_{d,t}^u = {(x, y) | (x, y) is a (latitude, longitude) pair within a region}    (3.2)

where P_{d,t}^u denotes the set of locations of a user u in day d at period t; d ∈ (days in the dataset), t ∈ (periods in a day), u ∈ (users in the dataset).
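The six-hour grouping behind this definition reduces to a simple key function on timestamps. A minimal sketch; the function names are ours, not from the report:

```python
from datetime import datetime

def period_of_day(ts):
    """Map a timestamp to one of the four six-hour periods of a day:
    0: 12AM-6AM, 1: 6AM-12PM, 2: 12PM-6PM, 3: 6PM-12AM."""
    return ts.hour // 6

def period_key(ts):
    """(day, period) key used to group a user's sorted locations
    into the sets P_{d,t}^u."""
    return (ts.date(), period_of_day(ts))
```

Grouping a user's sorted coordinates by `period_key` yields the per-period location sets from which displacements are computed in the next subsection.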

3.2.2 Displacement Metric
From the set of locations of a user, the distances between consecutive points are calculated using the Haversine formula [43] shown in Equation (3.3). To calculate displacements, a user must have at least two locations within a six-hour interval; otherwise, the user is not considered in that interval.

C = 2r · sin⁻¹( √( sin²((ϕ2 − ϕ1)/2) + cos ϕ1 cos ϕ2 sin²((φ2 − φ1)/2) ) )    (3.3)

where r is the radius of the earth, ϕ is latitude and φ is longitude. The displacement between consecutive points is calculated for each user in every six-hour interval. The average displacement for an interval is obtained by dividing the sum of the displacements by the total number of users contributing to those displacements. Thus,

D_{d,t} = Σ C_{d,t} / Σ u_{d,t}    (3.4)

where D_{d,t} represents the average displacement at period t of day d, Σ C_{d,t} is the sum of the displacements at period t of day d, and Σ u_{d,t} is the total number of users contributing to these displacements within the period.
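Equations (3.3) and (3.4) can be computed directly. A minimal sketch in plain Python; the dictionary input format for `average_displacement` is an illustrative assumption:

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, Equation (3.3)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def average_displacement(user_locations):
    """Equation (3.4): sum of consecutive displacements in one six-hour
    period divided by the number of contributing users. `user_locations`
    maps user id -> time-ordered list of (lat, lon) pairs (assumed format)."""
    total, n_users = 0.0, 0
    for locs in user_locations.values():
        if len(locs) < 2:  # a user needs two points in the interval
            continue
        n_users += 1
        total += sum(haversine_km(*p, *q) for p, q in zip(locs, locs[1:]))
    return total / n_users if n_users else 0.0
```

Users with fewer than two points in the interval are skipped, matching the rule stated above Equation (3.3).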

3.2.3 Extraction of Typical and Actual Displacement Time Series
The mobility dataset used for a resilience analysis should cover the pre-disaster, disaster and post-disaster periods. Using the average displacement values in the pre-disaster period, we construct four sets of typical values for the four periods of a day. These typical values are calculated separately for weekdays and weekends.

D_typ.wkd^t = {D_{d,t} : d ∈ (pre-disaster weekdays)}    (3.5)
D_typ.wked^t = {D_{d,t} : d ∈ (pre-disaster weekend days)}    (3.6)

where D_typ.wkd^t represents the set of displacements at period t considering only weekdays in the pre-disaster period. Similarly, D_typ.wked^t represents the set of displacements at period t considering only


weekends in the pre-disaster period. For instance, with four periods per day and the first seven days selected as the pre-disaster period, each period has a set of five displacement values for weekdays and a set of two values for weekends. The mean and standard deviation of these sets are used to judge whether the actual displacement at the corresponding period of a day is typical. To capture this, we compute a standardized displacement, the Z score, for each actual displacement using the equation below:

Z_{d,t} = (D_{d,t} − mean of D_typ.wkd^t) / (standard deviation of D_typ.wkd^t)     if d ∈ (weekdays)
Z_{d,t} = (D_{d,t} − mean of D_typ.wked^t) / (standard deviation of D_typ.wked^t)   otherwise    (3.7)

where Z_{d,t} represents the Z score at day d and period t. If d is a weekday, typical displacements for weekdays are used for the comparison; if d is a weekend day, typical displacements for weekends are used.
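Equation (3.7) is a standard standardization against the matching day-type baseline. A minimal sketch:

```python
from statistics import mean, stdev

def z_score(d_actual, typical_weekday, typical_weekend, is_weekday):
    """Equation (3.7): standardize a period's displacement against the
    typical (pre-disaster) displacements of the matching day type."""
    typical = typical_weekday if is_weekday else typical_weekend
    return (d_actual - mean(typical)) / stdev(typical)
```

Here `typical_weekday` and `typical_weekend` are the sets D_typ.wkd^t and D_typ.wked^t for the period being scored.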

3.2.4 Extreme Event Detection
An extreme event can disrupt human mobility by either increasing or decreasing it. We use two parameters to detect an extreme event: a threshold z score α and a number of time intervals τ. The first parameter checks the amount of deviation from typical values; the second checks how long this deviation persists.

Event_{(d_i, t_p)}^{(d_j, t_q)} = { Z_{d,t} : Z_{d,t} ≤ α_l  and  Σ_{(d_i, t_p)}^{(d_j, t_q)} (d, t) ≥ τ }    (3.8)

or,

Event_{(d_i, t_p)}^{(d_j, t_q)} = { Z_{d,t} : Z_{d,t} ≥ α_u  and  Σ_{(d_i, t_p)}^{(d_j, t_q)} (d, t) ≥ τ }    (3.9)

Equations (3.8) and (3.9) represent event detection for decreased and increased mobility, respectively, where Event_{(d_i, t_p)}^{(d_j, t_q)} represents an extreme event from day d_i period t_p to day d_j period t_q; d_i, d_j ∈ (days in the dataset) and t_p, t_q ∈ (periods in a day); α_l and α_u represent the lower and upper thresholds of the Z score; and τ represents the threshold number of consecutive periods during which the Z score stays below (or above) the threshold. These parameters (α, τ) can be selected to identify shorter or longer extreme events depending on the type of disaster and the area affected by it.
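The detection rule in Equations (3.8) and (3.9) amounts to scanning the Z-score series for sufficiently long runs beyond a threshold. A minimal sketch, with list indices standing in for the (day, period) pairs:

```python
def detect_events(z_series, alpha, tau, decreased=True):
    """Equations (3.8)-(3.9): return (start, end) index pairs where the
    Z score stays beyond the threshold for at least tau consecutive
    periods (Z <= alpha_l for decreased mobility, Z >= alpha_u for
    increased mobility)."""
    beyond = [(z <= alpha) if decreased else (z >= alpha) for z in z_series]
    events, start = [], None
    for i, flag in enumerate(beyond):
        if flag and start is None:
            start = i                      # a run begins
        elif not flag and start is not None:
            if i - start >= tau:           # run long enough -> event
                events.append((start, i - 1))
            start = None
    if start is not None and len(beyond) - start >= tau:
        events.append((start, len(beyond) - 1))  # run reaches end of series
    return events
```

Short excursions below (or above) the threshold that last fewer than τ periods are ignored, which is what keeps ordinary day-to-day fluctuations from being flagged as events.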

3.2.5 Resilience Calculation
Once an extreme event has been detected, the maximum deviation and the recovery time can be easily calculated. Bruneau et al. [39] introduced an equation for calculating resilience loss, shown in Equation (3.1):

RL = ∫_{t0}^{t1} [100 − Q(t)] dt

where RL is the resilience loss, the area (see Figure 3.1(a)) between the horizontal line at 100 and the curve Q(t) from t0 to t1, the recovery period of the event.


Figure 3.1: Resilience and Resilience Loss Calculation. (a) Resilience Triangle (adapted from [39]), (b) Human Mobility Resilience (Decreased Movement), (c) Human Mobility Resilience (Increased Movement).

A schematic representation of this equation (see Figure 3.1a) is known as the resilience triangle; R and RL indicate resilience and resilience loss, respectively. From this triangle, the loss of resilience for any extreme event can be calculated as the area formed by the dashed lines and the vertical line (see Figure 3.1a). Inspired by the resilience triangle, we compute this area by dividing it into smaller trapezoids (see Figures 3.1b and 3.1c) whose height equals the time increment (six hours) used in the analysis. This is necessary because, unlike an idealized quality function, a real-world quality function describing human mobility drops from and returns to its typical values gradually; using small trapezoids minimizes the approximation error.

In our analysis, we define quality as the ratio of actual to typical displacements: if an actual displacement equals the typical displacement, the quality function is 100 percent (a ratio of 1). The sum of the areas of all the small trapezoids is the resilience loss (indicated by RL in Figures 3.1b and 3.1c). The residual area (indicated by R in Figure 3.1) represents the value of resilience during the recovery period. For increased mobility, the area considered in the resilience calculation is bounded by the maximum quality percentage/ratio (see Figure 3.1c).
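The trapezoid approximation described above can be sketched as follows, for the decreased-mobility case; for increased mobility the same sum would be taken against the maximum quality ratio rather than the baseline:

```python
def resilience_loss(quality_ratios, dt=6.0, baseline=1.0):
    """Trapezoidal approximation of Equation (3.1) over the recovery
    period. `quality_ratios` are actual/typical displacement ratios
    sampled every `dt` hours; the loss is the area between the
    baseline (ratio of 1) and Q(t)."""
    loss = 0.0
    for q0, q1 in zip(quality_ratios, quality_ratios[1:]):
        # area of one trapezoid of width dt between baseline and Q(t)
        loss += 0.5 * ((baseline - q0) + (baseline - q1)) * dt
    return loss
```

For example, a quality curve that dips from 1.0 to 0.5 and recovers over two six-hour steps yields a loss of 3.0 ratio-hours; in the report's scale (quality in percent) the same sum would be 100 times larger.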

3.3 Results
The approach for calculating resilience has been applied to the location-based social media datasets (see Table 3.2). During these events, we observe two types of response in the mobility function: it either drops significantly (decreased mobility) or rises significantly (increased mobility). To represent both types of events, two threshold z scores (α values) are used for event detection: for decreased mobility, a threshold z score at the 40th percentile (α_l = 40th percentile), and for increased mobility, one at the 90th percentile (α_u = 90th percentile). When no event was detected with these thresholds, α_l was relaxed to the 60th percentile. The threshold duration of an extreme event was set to 7 time periods (τ = 7, or 42 hours) when the z value is below α_l, and to 3 time periods (τ = 3, or 18 hours) when the z value is above α_u.

Figure 3.2 shows the major steps in calculating resilience for three types of disasters. Each subfigure has three panels: the first shows the actual and typical values; the second shows event detection by z score; and the third shows the resilience and resilience loss. The disasters are hurricane Sandy (Figure


3.2a), earthquake at Bohol (Figure 3.2b) and a thunder storm at Phoenix, Arizona (Figure 3.2c). Table 3.2 presents the results of resilience calculation for multiple types of disasters along with the threshold values used to detect the events. Events detected by 60 percentile thresholds are not comparable with the events detected by 40 percentile thresholds. The 40 percentile events are more severe than the 60 percentile events. Among 40 percentile events, the highest recovery time was found 144 hours for hurricane Sandy for the state of New York and the highest resilience loss was found 344.89 for earthquake Iquique. We have also calculated the ratio between resilience loss and resilience ÉÊË

ÊÌ. The highest ratio of resilience loss over

resilience has been found as 2.73 for the state of New York for hurricane Sandy. Among the 60 percentile events, the state of New Jersey during hurricane Sandy had the highest recovery time, resilience loss and resilience loss over resilience ratio. These metrics indicate the magnitude of impact of hurricane Sandy on the mobility systems of the sates of New York and New Jersey.

In addition to Twitter data, we have used taxi trip data to calculate the resilience metrics. Figure 3.2d shows the resilience and recovery time for taxi movements in New York City. For measuring resilience in the taxi data, the number of taxi trips has been used instead of the taxi trip distance: most taxi trips occurred between a few frequently visited places, so the average traveled distance per trip was almost the same on the disrupted days, although there were significantly fewer trips on those days. For taxi trips, the maximum deviation on the landfall day is found to be 0.052, which means only 5.2 percent of the typical trips occurred on the landfall day of hurricane Sandy; the recovery time is found to be 96 hours. A recent study [44] measuring transportation system resilience from taxi data, using pace as a quality indicator, found a recovery time of 132 hours for hurricane Sandy. From Table 3.2, we can see that the human mobility recovery time and resilience loss for New York City are 66 hours and 42.37, respectively. The taxi resilience and human mobility resilience results are not directly comparable because taxi is just one of the modes of human mobility.

During hurricane Sandy, among the states, the state of New York suffered the highest resilience loss, followed by the states of New Jersey and Pennsylvania. For hurricane Sandy, both recovery time and resilience loss are higher when a location constraint is not applied. Except for the hurricane Sandy data, the typhoon, winter storm, and rain storm data are location constrained; thus, the resilience losses for these events are lower compared to hurricane Sandy's unconstrained resilience loss. This finding is consistent with a previous study [45] showing that during these types of disasters, short trips are less affected compared to long trips. The events discussed above faced a significant decrease in mobility from the typical mobility function.

However, in an earthquake, instead of a decreasing mobility function, we observe a significant increase in human mobility, probably due to the long-distance migration of people forced by severe infrastructure damage. Figure 3.2b shows the resilience calculation for the earthquake that happened at Bohol, Philippines in 2013. The recovery time and resilience loss for this event are 54 hours and 162.31, respectively. Our method has detected one more event after around 3 days. This event may represent the increased mobility when displaced people returned to their places, as studies have found that natural disasters like earthquakes cause human migration. Table 3.2 shows the resilience and recovery time results for the other earthquakes. Among the earthquakes analyzed in this study, Iquique had the highest deviation and resilience loss, 38.167 and 344.89, respectively, and Napa had the lowest resilience loss and deviation.

Figure 3.2: Resilience and Resilience Losses for Multiple Disasters. Note: DPU = Displacements Per User (Kilometer), TF = Trip Frequency

A study [41] on the same data measuring human mobility patterns found that although human mobility during most of the typhoons, rainstorms, winter storms, and the Napa earthquake can be predicted by established patterns, mobility during the Bohol and Iquique earthquakes cannot be predicted. Instead of decreased mobility, a significant increase in mobility with large resilience losses during these events may explain this result.

Table 3.2: Comparison of Resilience, Resilience Loss and Recovery Time for Multiple Types of Events Occurring in Different Locations

Disaster Name | Location | Threshold Z Score | Location Filter | Start Time | Recovery Time (hr.) | Max Deviation | Resilience (R) | Resilience Loss (RL) | Ratio (RL/R)
Sandy | New York City | α_l = 60 | Y | 2012-10-26 00:00 | 132 | 0.540 | 106.730 | 19.260 | 0.181
Sandy | New York City | α_l = 40 | N | 2012-10-28 12:00 | 66 | 0.010 | 17.620 | 42.370 | 2.404
Sandy | New York State | α_l = 40 | Y | 2012-10-28 12:00 | 48 | 0.260 | 21.860 | 20.100 | 0.920
Sandy | New York State | α_l = 40 | N | 2012-10-28 06:00 | 144 | 0.087 | 36.400 | 101.400 | 2.730
Sandy | New Jersey State | α_l = 60 | Y | 2012-10-28 00:00 | 120 | 0.176 | 52.000 | 55.000 | 1.057
Sandy | New Jersey State | α_l = 60 | N | 2012-10-27 12:00 | 168 | 0.001 | 21.179 | 140.820 | 6.648
Sandy | New Jersey State | α_l = 60 | N | 2012-11-06 00:00 | 48 | 0.018 | 8.907 | 33.090 | 3.715
Sandy | Pennsylvania State | α_l = 60 | Y | 2012-10-28 00:00 | 120 | 0.180 | 58.930 | 49.060 | 0.833
Sandy | Pennsylvania State | α_l = 60 | N | 2012-10-26 06:00 | 144 | 0.003 | 12.600 | 125.390 | 9.949
Sandy | Pennsylvania State | α_l = 60 | N | 2012-11-02 06:00 | 72 | 0.015 | 13.970 | 52.026 | 3.720
Earthquake | Bohol, Philippines | α_u = 90 | NA | 2013-10-15 00:00 | 54 | 9.330 | 120.470 | 162.310 | 1.340
Earthquake | Bohol, Philippines | α_u = 90 | NA | 2013-10-19 18:00 | 24 | 11.035 | 64.956 | 115.680 | 1.780
Earthquake | Iquique, Chile | α_u = 90 | NA | 2014-04-02 18:00 | 48 | 38.167 | 519.058 | 344.890 | 0.664
Earthquake | Napa, USA | α_u = 90 | NA | 2014-08-23 18:00 | 18 | 6.416 | 27.490 | 37.503 | 1.360
Wild Fire | NSW1, Australia | α_u = 90 | NA | 2013-10-18 12:00 | 18 | 8.257 | 58.860 | 28.230 | 0.480
Wild Fire | NSW1, Australia | α_l = 40 | NA | 2013-10-19 06:00 | 48 | 0.188 | 22.440 | 19.550 | 0.870
Wild Fire | NSW2, Australia | α_l = 40, 60 | NA | NO RL | | | | |
Winter Storm | Xaver, Norfolk, Britain | α_l = 40 | NA | 2013-12-02 12:00 | 48 | 0.339 | 25.370 | 16.629 | 0.655
Winter Storm | Xaver, Hamburg, Germany | α_l = 40 | NA | 2013-12-04 18:00 | 48 | 0.035 | 24.817 | 17.182 | 0.690
Winter Storm | Xaver, Hamburg, Germany | α_u = 90 | NA | 2013-12-13 12:00 | 36 | 4.306 | 55.704 | 43.480 | 0.780
Winter Storm | Atlanta, USA | α_l = 40 | NA | 2014-01-28 12:00 | 54 | 0.261 | 20.450 | 27.545 | 1.346
Rain Storm | Phoenix, USA | α_l = 40 | NA | 2014-09-06 18:00 | 60 | 0.329 | 40.000 | 13.000 | 0.413
Rain Storm | Detroit, USA | α_l = 40 | NA | Not Enough Pre-Disaster Data | | | | |
Rain Storm | Baltimore, USA | α_l = 40, 60 | NA | NO RL | | | | |
Typhoon | Wipha, Tokyo, Japan | α_l = 40, 60 | NA | NO RL | | | | |
Typhoon | Halong, Okinawa, Japan | α_l = 40 | NA | 2014-07-29 06:00 | 96 | 0.616 | 74.000 | 10.000 | 0.135
Typhoon | Kalmaegi, Philippines | α_l = 40 | NA | 2014-09-08 12:00 | 96 | 0.005 | 42.568 | 42.000 | 0.990
Typhoon | Kalmaegi, Philippines | α_l = 40 | NA | 2014-09-23 12:00 | 54 | 0.003 | 24.188 | 23.811 | 0.980
Typhoon | Rammasun, Philippines | α_l = 40, 60 | NA | NO RL | | | | |

Note: NA = Not Applicable, Y = Yes, N = No

3.4 Conclusion

In this study, we present a method to compute resilience metrics using geo-location data from social media. The proposed method can detect an extreme event from human movements, measure the recovery time and the maximum deviation from a steady-state mobility indicator, and assess the values of resilience and resilience loss. Applying this method on multiple disaster datasets, we find that human movements within a geographic area (e.g., trips only within a city) are less affected compared to all the movements associated with the area (e.g., trips from, to, and within the city). Disasters such as hurricanes, typhoons, and winter storms decrease human mobility, and the amount of perturbation depends on the location and severity of the disaster. However, an earthquake increases human mobility, causing a significant resilience loss. This is probably because an earthquake is unpredictable, while for the other disasters people had warnings lasting over multiple days.


4 PREDICTING EVACUATION TRAFFIC DEMAND

4.1 Problem Description

In recent times, hurricanes Matthew, Harvey, and Irma disrupted the lives of millions of people across multiple states in the United States. During a hurricane, mandatory or voluntary evacuation orders are issued over a large region so that potentially impacted people can move to safer places. Under a hurricane evacuation, it is critical for emergency agencies to ensure smooth operations of interdependent infrastructure systems and emergency services. Efficient traffic operations can maximize the utilization of existing transportation infrastructure, reducing evacuation time and stress due to massive congestion. Accurately predicting evacuation traffic is critical to planning effective traffic operations strategies. But because of the complex dynamic nature of evacuation participation, predicting evacuation traffic demand long ahead of the actual evacuation is a very challenging task. Real-time information from various sources can significantly help us to reliably predict evacuation demand.

During the recent hurricanes (Matthew, Irma), massive evacuations took place across the entire Florida region, especially in its coastal counties. Millions of people were under mandatory evacuation orders, creating severe congestion on major evacuation routes, especially the interstate highways (I-75 and I-95). To alleviate congestion, emergency management agencies can adopt strategies such as opening the hard shoulder for traffic, contraflow operations, modified traffic control, route guidance, and staged evacuation. However, traffic prediction plays the most critical role in deciding the nature and extent of such congestion management strategies. Existing works on traffic prediction mainly focus on short-term (5 minutes to 1 hour) prediction, which is not adequate for managing hurricane evacuations that last several days. During hurricanes, traditionally adopted short-term features such as present and past traffic conditions are not enough to make traffic predictions. Social media messages and geotagged information about the actions taken by users can provide valuable signals for predicting evacuation traffic in the long term. The objective of this study is to investigate how real-time information from traffic and social media sensors can be used to better predict long-term evacuation traffic demand.

We propose a machine learning approach for making long-term evacuation traffic predictions. In particular, we propose a recurrent neural network (RNN) architecture with a long short-term memory (LSTM) model to predict traffic volume during hurricanes for different time horizons. We have used Twitter data from hurricanes Matthew and Irma and the corresponding traffic counts from the loop detectors on two major interstate highways (I-75 and I-95). We compare the results with a baseline forecast and other machine learning algorithms such as K-nearest neighbor regression (KNN regression), support vector regression (SVR), and gradient boosting regression (GBR). Experimental results show that during hurricane evacuation, LSTM-NN captures the traffic demand irregularities better than the other models. Finally, we propose a KNN regression model to predict the error from the LSTM-NN model. In this work, we answer four research questions as summarized below:

• During a hurricane evacuation, can we predict evacuation traffic for a longer time horizon utilizing real-time data from traffic sensors and social media? We collect traffic data and Twitter data during two major hurricane evacuation periods. We use these two data sources for predicting evacuation traffic demand over longer forecast horizons (≥ 1 hour).

• How far ahead can evacuation traffic demand be predicted using real-time data? For this, we apply the proposed models for different forecast horizons (1 hour to 30 hours) and measure the predictive performance of the models.

• How well does the predictive model perform when one of the data sources is not available? We apply the models to different combinations of the features (only traffic data, only social media data, combined) and compare the predictive performance of the models.


• How can we predict the uncertainties of the evacuation demand predictions? We implement a machine learning model to predict the possible errors in prediction and give the prediction with a 90% confidence interval.

4.2 Study Area and Data Description

In this study, for predicting evacuation traffic, we have used both traffic volume and Twitter data. We collect traffic volumes from two detectors: one on the I-75 and the other on the I-95 interstate highway. We collect only northbound volume data, as we are interested only in the evacuation traffic moving away from the affected regions. The detector on I-75 is located in the northbound direction at mile marker 330.2 (detector id 9828; see Figure 4.1). The detector on I-95 is located in the northbound direction at zone id 10077, district 5, Florida, at location I95-N US 92 (see Figure 4.1). These detectors are operated by the Florida DOT, and we have collected the data from www.ritis.org. The data include traffic volumes at 15-minute intervals.

FIGURE 4.1 Detector Locations at I-75 and I-95 and the Study Area



For social media data, we have used Twitter data from hurricanes Matthew and Irma. We purchased the hurricane Matthew data from Twitter. The data were gathered by Twitter using keywords such as hurricane, matthew, hurricanematthew, huracan, huracanmatthew, storm, evacuation, evacuations, and FEMA. The Matthew data contain 11.5 million tweets collected from September 25, 2016 to October 24, 2016. Searching similar keywords related to Irma, we collected hurricane Irma data using the Twitter streaming API for a selected bounding box covering Florida, Georgia, North Carolina, and South Carolina. For hurricane Irma, we collected around 1.8 million geotagged tweets from September 5, 2017 to September 14, 2017.

4.3 Data Preparation

Interstate highways I-75 and I-95 are the most heavily used routes during evacuations from Florida. We take the sum of the traffic volumes of I-75 and I-95 to represent the overall evacuation trips from the associated regions. Figure 4.2 shows the traffic volume generated during hurricanes Matthew and Irma. During hurricane Matthew, I-95 traffic was higher than I-75 traffic because Matthew was expected to hit the east coast. On the other hand, during hurricane Irma, traffic on I-95 was at first higher than on I-75, but later (after September 8, 2017) it was the opposite. The predicted path of hurricane Irma changed overnight on September 8, 2017: at first Irma was expected to hit the east coast, but it later changed its path and hit from the west coast.

FIGURE 4.2 Traffic Volume for 15 Minutes Intervals at Interstate Highways during Hurricane Evacuation (a) Hurricane Matthew (b) Hurricane Irma

At first, the hurricane Matthew Twitter data are filtered for geotagged tweets within the study region bounded by the coordinates (25.072, -82.963; 29.352, -79.232). Similarly, the hurricane Irma data are also filtered to the tweets coming from our study area. We also filter both datasets for evacuation-related tweets containing words such as 'evacuation', 'evac', 'sheltering', 'evacuating', 'evacuate', etc. We aggregate the tweets into 15-minute intervals to be consistent with the traffic volume data.


FIGURE 4.3 Tweet Frequency Distribution over Time (a) for hurricane Matthew (b) for hurricane Irma


Figure 4.3 shows the 15-minute tweet distributions for hurricanes Matthew and Irma under the different filters. In both cases, tweet frequency increases as time gets closer to landfall and then decreases after landfall. There are some missing data in both the traffic and tweet data. We recover the missing data because maintaining the data sequence is very important in time series forecasting. We process the missing values differently for the traffic and Twitter data. For the traffic data, as the missing data cover only very short periods, we linearly interpolate the missing values. The hurricane Irma Twitter data have missing values over a very long period that cannot be recovered by interpolation; thus, we collect historical data for the active users found in the streaming data during the hurricane evacuation. We standardize the data before fitting the model, so the exact values of the data do not influence the prediction as long as the trend of the data is the same.
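The interpolation and standardization steps can be sketched as follows; the function name and the four-point toy series are illustrative assumptions, not the report's pipeline.

```python
import numpy as np

def fill_and_standardize(series):
    """Linearly interpolate NaN gaps (as done for the short traffic
    outages) and then z-standardize the series."""
    x = np.asarray(series, dtype=float)
    idx = np.arange(len(x))
    missing = np.isnan(x)
    # fill each missing index from its nearest known neighbors
    x[missing] = np.interp(idx[missing], idx[~missing], x[~missing])
    return (x - x.mean()) / x.std()

raw = [100.0, np.nan, 140.0, 160.0]   # one missing 15-minute count
filled = fill_and_standardize(raw)
```

The gap is filled with 120 (midway between 100 and 140) before standardization, and the result has zero mean and unit standard deviation.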

FIGURE 4.4 Twitter Features (a) for hurricane Matthew (b) for hurricane Irma


Figure 4.4 shows the created features: tweet count, unique user count, and evacuation tweet count at 15-minute intervals for both hurricane Matthew (Figure 4.4a) and hurricane Irma (Figure 4.4b). We have also created the following features: time difference (in hours) from landfall, hour of the day, number of counties ordering mandatory evacuation, and number of counties ordering voluntary evacuation. Note that hurricane Matthew made its landfall on October 8, 2016 and hurricane Irma made landfall on September 10, 2017. In total, we get 716 observations: 263 from Matthew and 453 from Irma.

4.4 Methodology

To predict evacuation traffic during a hurricane, we have applied a long short-term memory neural network (LSTM-NN), which is a special type of recurrent neural network (RNN). In machine learning, recurrent neural networks are popular for learning sequential trends. They have been used to solve many problems such as speech recognition [46], language modeling [46], and image captioning [46]. Unlike a traditional neural network, a recurrent neural network has loops in it (Figure 4.5(a), left) which allow it to pass information to a successor.

FIGURE 4.5 Architecture of RNN (a) a single-neuron RNN unrolled through time, adopted from [47] (b) a standard LSTM cell


Figure 4.5(a) (right) shows a one-neuron RNN unrolled through time. This chain-like nature reveals its potential to learn sequences from both current inputs and previous relevant information. Although a standard RNN performs well in general time series forecasting, it performs poorly in learning long-term dependencies due to the vanishing/exploding gradient problem during backpropagation [48]–[50]. LSTM, introduced by Hochreiter and Schmidhuber, resolves this problem by remembering information for long periods of time [50]. Like an RNN, an LSTM also has the form of a chain of repeating modules of neural network. Unlike the RNN's simple module (e.g., a single tanh layer), an LSTM has four layers interacting in a very special way. Figure 4.5(b) shows an LSTM cell with its different components. Here σ represents the sigmoid function σ(x) = 1/(1 + exp(−x)) and tanh represents the hyperbolic tangent function tanh(x) = (exp(x) − exp(−x))/(exp(x) + exp(−x)).

The key difference between an LSTM and an RNN is the cell state (C_t), shown as a horizontal line in Figure 4.5(b). It runs through the entire chain with only some minor linear interactions; thus, it helps to keep track of long-term dependencies. It is also known as the long-term state. An LSTM allows adding or removing information to the cell state through structures called gates. Gates are composed of a sigmoid neural net layer and a pointwise multiplication operation (see Figure 4.5(b)). An LSTM has three such gates. The sigmoid layer outputs numbers between zero and one, where zero means let nothing through and one means let everything through. Like an RNN, an LSTM also uses the previous short-term state (h_{t-1}) and the current input (X_t) and feeds them into the layers discussed above. The first step is to decide what information to forget. For this, the forget gate (f_t) takes h_{t-1} and X_t and outputs a number between 0 and 1 for each element in the cell state. Mathematically, the operation is shown in equation (4.1).

f_t = σ(W_f · [h_{t-1}, x_t] + b_f) (4.1)

Here, W_f and b_f are the weight matrix and bias for the forget gate. The next step decides what new information to store in the cell state. It is performed in two parts: an input gate layer (i_t) decides which values to update through its sigmoid layer, and a tanh layer converts the values into a vector (g_t) of new candidate values through its activation function. These two operations are shown in equations (4.2) and (4.3).

i_t = σ(W_i · [h_{t-1}, x_t] + b_i) (4.2)

Here, W_i and b_i are the weight matrix and bias for the input gate.

g_t = tanh(W_g · [h_{t-1}, x_t] + b_g) (4.3)

Here, W_g and b_g are the weight matrix and bias for the tanh layer. The next step is to update the old cell state (C_{t-1}) into the new state (C_t). The new cell state is the combined result of the forget gate and input gate operations, as shown in equation (4.4).

C_t = f_t ∗ C_{t-1} + i_t ∗ g_t (4.4)

The last step decides the output of the current LSTM cell. The output is a filtered version of the current cell state (C_t). It has two parts: the previous state (h_{t-1}) and the input (X_t) go through a sigmoid layer to decide what part of the cell state will be output, and the cell state goes through a tanh layer (to push the values to be between −1 and 1). The multiplication of these two gives the output, which is the decided part of the cell state. The operations are shown in equations (4.5) and (4.6).

o_t = σ(W_o · [h_{t-1}, x_t] + b_o) (4.5)

Here, W_o and b_o are the weight matrix and bias for the output gate's sigmoid layer.

h_t = o_t ∗ tanh(C_t) = Y_t (4.6)

4.5 Model Development

The objective of this study is to predict hurricane evacuation traffic volume over longer time horizons. We have used traffic sensor data and Twitter data as inputs to the proposed LSTM-NN model. We set the prediction problem as: given the traffic data, the Twitter data, or both (X_t) at time t, what is the traffic volume Y_{t+h} after h time intervals, where h represents the forecast horizon?
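The gate equations (4.1)–(4.6) can be sketched as a single cell update in NumPy. This is a minimal sketch for intuition only: the weight shapes, the random initialization, and the sizes n_h and n_x are illustrative assumptions, not the trained model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, C_prev, W, b):
    """One LSTM cell update following equations (4.1)-(4.6). W and b
    hold the four gate parameters keyed by 'f', 'i', 'g', 'o'."""
    z = np.concatenate([h_prev, x_t])      # [h_{t-1}, x_t]
    f = sigmoid(W["f"] @ z + b["f"])       # forget gate        (4.1)
    i = sigmoid(W["i"] @ z + b["i"])       # input gate         (4.2)
    g = np.tanh(W["g"] @ z + b["g"])       # candidate values   (4.3)
    C = f * C_prev + i * g                 # cell state update  (4.4)
    o = sigmoid(W["o"] @ z + b["o"])       # output gate        (4.5)
    h = o * np.tanh(C)                     # hidden state = Y_t (4.6)
    return h, C

rng = np.random.default_rng(0)
n_h, n_x = 4, 6                            # e.g., 6 input features
W = {k: 0.1 * rng.normal(size=(n_h, n_h + n_x)) for k in "figo"}
b = {k: np.zeros(n_h) for k in "figo"}
h, C = lstm_step(rng.normal(size=n_x), np.zeros(n_h), np.zeros(n_h), W, b)
```

Because the output gate and the tanh of the cell state are both bounded, every component of the hidden state stays strictly inside (−1, 1).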

The LSTM-NN input data (X) must be provided as an array with dimensions [samples, timesteps, features]. Our features are multivariate: we use 6 features (time difference from landfall, hour of the day, tweet count in the study area, evacuation-related tweet count in the study area, number of counties ordering mandatory evacuation, and number of counties ordering voluntary evacuation) for a single time period. However, these features are used in different combinations (only sensor data, only Twitter data, a combination of both) to test how the model performs under different conditions of data availability. The timesteps dimension indicates how many time instances we use to predict the output; for example, [X_{t-1}, X_t] can be used to make a prediction. The first step is to find the time horizons for which the model performs reasonably. We iteratively find that a single-layer LSTM with 50 neurons performs reasonably well for all the models. Batch size is a parameter that represents the number of training examples to be considered in one forward or backward pass; studies show that larger batch sizes can degrade the quality of the model [51]. We find that a batch size of 4 performs well on our data for all the forecast horizons. We iteratively run the model for time horizons from 1 hour to 30 hours. For each time horizon, we run the model 5 times to capture the randomness. Along the way, we find the best number of epochs (one forward and one backward pass over all training samples) to ensure that the model does not overfit.
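Framing a feature matrix into the [samples, timesteps, features] shape can be sketched as a sliding window. The function name, the toy arrays, and the window/horizon values are illustrative assumptions; only the 6-feature count and the shape convention come from the text.

```python
import numpy as np

def make_windows(X, y, timesteps, horizon):
    """Shape a (T x F) feature matrix into the [samples, timesteps,
    features] array the LSTM expects, pairing each window ending at
    time t with the target y[t + horizon]."""
    Xs, ys = [], []
    for t in range(timesteps - 1, len(X) - horizon):
        Xs.append(X[t - timesteps + 1 : t + 1])   # window of `timesteps` rows
        ys.append(y[t + horizon])                 # target h steps ahead
    return np.stack(Xs), np.array(ys)

T, F = 20, 6                         # 6 features, as in the text
X = np.arange(T * F, dtype=float).reshape(T, F)
y = np.arange(T, dtype=float)        # toy target series
Xw, yw = make_windows(X, y, timesteps=2, horizon=4)
```

With 20 periods, 2 timesteps, and a 4-step horizon, this yields 15 samples of shape (2, 6), and the first target is y[5].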

We compare the prediction accuracy of the proposed LSTM-NN model with some traditional machine learning algorithms: K-nearest neighbor regression (KNN regression), support vector regression (SVR), and gradient boosting regression (GBR). We iteratively choose the best parameters for these algorithms. In KNN regression, a large number of neighbors can underfit the model and a small number can overfit it; we find that KNN with 3 neighbors performs best in most cases on our sample. For SVR, a 2-degree polynomial kernel with a penalty parameter around 100000 performs best on our data. We iteratively find that GBR with 10 boosting stages and a maximum depth of 4 gives the best parameters for our data. We have used 10-fold cross validation to capture the randomness in the results. We have also created a baseline forecast in which the current traffic volume is predicted as the traffic volume in the next time interval: with forecast horizon h at a given time t, if the current traffic volume is Y_t, the traffic prediction for (t + h) is equal to Y_t (i.e., Y_{t+h} = Y_t).

Unlike other regression problems, in time series forecasting the order of the observations is important to learn the sequence. Thus, keeping the order of the sequence, we have used the hurricane Matthew data and part of the hurricane Irma data as the training set (566 observations). The rest of the hurricane Irma data is used as a validation dataset (150 observations). To evaluate the performance of the implemented models, we have used Root Mean Squared Error (RMSE) and Mean Absolute Percentage Error (MAPE) as performance measures. The equations for the performance measures are given below:

RMSE = sqrt( (1/n) Σ_{t=1}^{n} (Y_t − ŷ_t)² ) (4.7)

MAPE = (1/n) Σ_{t=1}^{n} ( |Y_t − ŷ_t| / Y_t ) × 100% (4.8)

Here, Y_t is the actual traffic volume, ŷ_t is the predicted traffic volume, and n is the number of test observations.
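Equations (4.7) and (4.8), together with the persistence baseline Y_{t+h} = Y_t, can be sketched as below; the toy series and the 1-step horizon are illustrative assumptions.

```python
import numpy as np

def rmse(y, yhat):
    """Root Mean Squared Error, equation (4.7)."""
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def mape(y, yhat):
    """Mean Absolute Percentage Error, equation (4.8)."""
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    return float(np.mean(np.abs(y - yhat) / y) * 100.0)

# persistence baseline: predict Y_{t+h} = Y_t (toy series, h = 1)
y = np.array([100.0, 110.0, 120.0, 130.0])
h = 1
baseline_pred, target = y[:-h], y[h:]
```

For this linearly increasing toy series, the 1-step persistence forecast is always 10 units low, so its RMSE is 10 and its MAPE is about 8.37%.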


Next, we predict the error in the predictions made by our best model. For this, we implement a KNN model to predict the error of the prediction made by the LSTM-NN model, and based on that we calculate a confidence interval for the prediction. We use the validation set inputs as the input and the corresponding squared errors as the target to train the model. For calculating the confidence interval, we assume the error is normally distributed, where the prediction made by the LSTM-NN model is the mean (μ) and the prediction made by the trained KNN is the variance (σ²). As the error is normally distributed, the 5th percentile prediction value = μ − 1.65σ and the 95th percentile prediction value = μ + 1.65σ.
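The interval construction above reduces to a two-line computation. The numbers below are illustrative assumptions (a hypothetical μ and KNN-predicted squared error), not results from the report.

```python
import numpy as np

def prediction_interval(mu, predicted_sq_error, z=1.65):
    """90% interval around the point prediction mu, treating the
    predicted squared error as the variance of a normal error model,
    per the 5th/95th percentile formulas in the text."""
    sigma = np.sqrt(predicted_sq_error)
    return mu - z * sigma, mu + z * sigma

# hypothetical example: mu = 500 vehicles, predicted squared error = 400
lo, hi = prediction_interval(mu=500.0, predicted_sq_error=400.0)
```

Here σ = 20, so the 90% interval is roughly (467, 533) vehicles.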

4.6 Results

We implement the LSTM-NN model for forecasting horizons ranging from 1 hour to 30 hours. We also run the model separately considering 4 types of feature sets: only traffic sensor data, only Twitter data, a combination of both, and only the top-4 important features. We calculate the feature importance using the gradient boosting regressor. Feature importance values for all the available features are shown in Figure 4.6 for different forecast horizons. It shows that for small time horizons (1-5 hours), traffic volume has the highest importance. As the forecast horizon increases (7-30 hours), the importance of the time-difference-from-landfall feature increases. This implies that the time difference from landfall plays a very critical role in predicting evacuation traffic over longer forecasting horizons. Interestingly, the Twitter features have almost no importance for forecast horizons of 1-10 hours, but their importance increases for forecast horizons of 11-15 hours. People tweet before they actually evacuate. In addition, we consider tweet features within the whole study area but measure traffic at two specific exit points of the study area, and it takes time to travel to an exit point from other points within the study area. This suggests that the Twitter features are most useful for predicting traffic volume around 11-15 hours ahead.

The test RMSE and test MAPE for the LSTM-NN model are shown in Figure 4.7. We run the model for different forecast horizons with different combinations of features. Except for the Twitter-only features, in all the combinations the 1-hour forecast horizon has the lowest RMSE and MAPE values. This is expected, since in this case we have the most recent information for our prediction. As the forecast horizon increases, RMSE and MAPE increase, with some exceptions. From forecast horizons of 1 hour to 12 hours, the model trained with the sensor data performs better than the models trained with the other combinations of features, which is consistent with the feature importance finding that traffic volume has higher importance initially. From forecasting horizons of 12 hours to 25 hours, we get lower MAPE for the models trained with the top-4 important features, but not always lower RMSE. At forecasting horizons of 15, 24, and 26 hours, we get both lower RMSE and lower MAPE for the models trained with the top-4 important features than for the model trained with only sensor data.


The LSTM-NN model trained with only Twitter data gives its best prediction at the 15-hour forecasting horizon, where it outperforms the results obtained from the model trained with sensor data. This may be because people very often post about their decisions prior to the actual action. Also, the distance between the sensor locations and other places in the study region is not the same for all areas within the study region; thus, it may take some time for the traffic impacts of those stating evacuation intents on Twitter to materialize.

FIGURE 4.6 Feature Importance for Different Time Horizon



FIGURE 4.7 Validation (a) RMSE (b) MAPE for different Forecast Horizon for LSTM-NN


For SVR and GBR, the trends of the results (RMSE and MAPE) are similar to those of LSTM-NN. For KNN, the model trained with sensor data always gives better results than the models trained with the other combinations of features.

Table 4.1 presents the results of the LSTM-NN model and the other machine learning models for forecasting horizons of 1 hour, 15 hours, and 24 hours. All models are compared against a persistence forecast, since a useful model should perform at least as well as this baseline. For the 1-hour horizon, all models trained with Twitter data fail to outperform the baseline, and among the other feature combinations only the LSTM-NN models beat it. Overall, the LSTM-NN models perform better than the baseline and the other algorithms. Interestingly, for the 15-hour horizon, the LSTM-NN model trained with Twitter data gives the best result; Twitter data alone achieves 30.77% MAPE and 208.12 RMSE, which compares well with the combined-feature and sensor-only predictions. Traffic sensor data are often unavailable, and Twitter data can provide the necessary insight in such cases. Notably, adding Twitter features to the sensor data lowers the MAPE of the 15-hour LSTM-NN prediction. At the 24-hour horizon, the LSTM-NN model trained with the top four important features achieves the best RMSE (193.70) and MAPE (22.20%), which is promising given that it predicts evacuation traffic 24 hours ahead with this accuracy.
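The persistence forecast used as the baseline can be sketched as below; the hourly demand series is a made-up illustration, not data from the study.

```python
import numpy as np

def persistence_forecast(series, horizon):
    """Naive baseline: the forecast for hour t is the observed value at t - horizon."""
    series = np.asarray(series, dtype=float)
    return series[:-horizon]

demand = np.array([100.0, 120.0, 150.0, 170.0, 160.0, 140.0])  # illustrative hourly demand
h = 2
y_pred = persistence_forecast(demand, h)   # forecasts for hours 2..5
y_true = demand[h:]
rmse = float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
print(round(rmse, 2))  # → 38.73
```

A learned model is only worth deploying at a given horizon if it beats this baseline, which is why every row of Table 4.1 is read against the Baseline row.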

TABLE 4.1 Model Performance Summary

Model                          1-hr RMSE  1-hr MAPE  15-hr RMSE  15-hr MAPE  24-hr RMSE  24-hr MAPE
Baseline (persistence)           154.40     17.15%     795.40     110.39%     381.10      79.05%
KNN (Twitter)                    395.49     39.33%     395.38      46.44%     548.17     132.61%
SVR (Twitter)                   1355.49    283.46%     206.79      30.43%     366.30      88.98%
GBR (Twitter)                    290.04     40.22%     297.32      49.45%     316.32      53.59%
LSTM-NN (Twitter)                395.21     99.72%    *162.09     *23.97%     415.87      98.87%
KNN (Twitter and sensor)         352.89     32.18%     736.27      66.27%     541.19     129.14%
SVR (Twitter and sensor)        2247.35    438.10%     696.50     128.55%     442.41      90.77%
GBR (Twitter and sensor)         195.62     39.82%     331.74      55.33%     281.32      58.54%
LSTM-NN (Twitter and sensor)     161.42     15.64%     208.12      30.77%     209.61      29.77%
KNN (sensor)                     219.67     23.20%     308.95      32.24%     292.09      25.21%
SVR (sensor)                     162.40     20.41%     345.53      42.82%     223.55      30.02%
GBR (sensor)                     190.82     39.03%     322.85      48.68%     318.47      59.30%
LSTM-NN (sensor)                *134.06    *12.63%     205.61      40.99%     256.74      29.51%
KNN (important features)         254.43     24.96%     543.34      48.89%     245.18      29.91%
SVR (important features)         170.75     21.57%     226.87      32.66%     239.20      45.17%
GBR (important features)         195.85     39.78%     320.58      53.28%     288.31      57.34%
LSTM-NN (important features)     135.44     13.69%     262.41      32.48%    *193.70     *22.20%

(* lowest value in each column)

Figures 4.8 and 4.9 show the actual data and the LSTM-NN model's predictions on the training and test sets for the 1-hour and 24-hour forecast horizons (both with the important features). Figure 4.8 shows the results when the full Hurricane Matthew data (Figures 4.8(a), 4.8(c)) and a portion of the Hurricane Irma data (Figures 4.8(b), 4.8(d)) are used for training, with the rest of the Irma data used as the test set. Figure 4.9 shows the results when portions of both the Matthew data (Figures 4.9(a), 4.9(c)) and the Irma data (Figures 4.9(b), 4.9(d)) are used for training, with the rest of both datasets used as the test set. For the 24-hour forecast horizon there are no predictions for the first 24 hours, because no training data are available 24 hours before that period (see Figures 4.8(c), 4.8(d), 4.9(c), 4.9(d)). The predictions fit the training data well for both hurricanes at both horizons. On the test set, the 1-hour predictions fit better than the 24-hour predictions (see Figures 4.8 and 4.9), but both capture the trends well: although prediction accuracy decreases as the forecast horizon grows, the implemented model still learns the overall trend. Moreover, the test predictions for Hurricane Irma fit better than those for Hurricane Matthew, likely because Irma provides more data points for training. We also show the 90% confidence interval of the test-set predictions for the 1-hour forecast horizon. The value predicted by the best model (LSTM-NN) always lies within the predicted confidence interval. The interval widens when the demand prediction error is larger and shrinks to almost zero when the LSTM-NN model predicts evacuation demand perfectly. Thus, the confidence interval captures the uncertainty of the LSTM-NN predictions well.
Evacuation demand predictions with associated confidence intervals allow the prediction results to be interpreted more reliably.
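The interval construction is not detailed here, but one common residual-based way to form a 90% prediction interval looks like the sketch below; the residuals and point forecast are synthetic stand-ins, not model output from the study.

```python
import numpy as np

# Take the 5th and 95th percentiles of the model's residuals on held-out data
# and add them to the point forecast to get an empirical 90% interval.
rng = np.random.default_rng(42)
val_residuals = rng.normal(0.0, 25.0, size=500)   # stand-in for (actual - predicted) on validation data

lo, hi = np.quantile(val_residuals, [0.05, 0.95])

point_forecast = 350.0                            # hypothetical LSTM-NN demand prediction
interval = (point_forecast + lo, point_forecast + hi)
print(interval)
```

Because the interval is built from observed errors, it widens exactly when the model has been less accurate, matching the behavior described above.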

FIGURE 4.8 Prediction with 90% Confidence Interval on training and test data when the test data contain only Irma data: (a), (c) Hurricane Matthew; (b), (d) Hurricane Irma

FIGURE 4.9 Prediction with 90% Confidence Interval on training and test data when the test data contain both Matthew and Irma data: (a), (c) Hurricane Matthew; (b), (d) Hurricane Irma

4.7 Conclusions

Traditional approaches for predicting hurricane evacuation demand use survey-based data and work well under a fixed set of expectations, which may not be suitable for real-time traffic prediction. Information from real-time data sources can make evacuation traffic management more dynamic, flexible, and proactive. In this study, we used traffic sensor and Twitter data to predict hurricane evacuation traffic demand over long forecasting horizons. We applied Long Short-Term Memory neural networks (LSTM-NN) to predict evacuation traffic demand for forecasting horizons ranging from 1 hour to 30 hours, and we trained the model on different combinations of features (traffic sensor data only, Twitter data only, both sensor and Twitter data, and the most important features only). Among the modeling approaches, LSTM-NN outperforms the other models in accuracy. Social media features show their best predictive power at the 15-hour forecast horizon, and a model trained on social media data can make reasonable predictions of evacuation traffic when sensor data are unavailable. We also implemented a method to predict the confidence interval of the model's demand predictions, which allows us to measure the reliability of the predicted evacuation traffic demand.
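A horizon-specific training set for such a model can be built with a simple sliding window over the hourly series. The helper below (its name and toy series are ours, not from the study) sketches the idea for a 24-hour horizon with a 6-hour lookback.

```python
import numpy as np

def make_supervised(series, lookback, horizon):
    """Turn an hourly series into (X, y) pairs: the previous `lookback`
    hours are the features, the value `horizon` hours ahead is the target."""
    X, y = [], []
    for t in range(lookback, len(series) - horizon + 1):
        X.append(series[t - lookback:t])
        y.append(series[t + horizon - 1])
    return np.array(X), np.array(y)

hourly_demand = np.arange(48, dtype=float)   # stand-in for hourly traffic counts
X, y = make_supervised(hourly_demand, lookback=6, horizon=24)
print(X.shape, y.shape)                      # → (19, 6) (19,)
```

This windowing also makes clear why no 24-hour predictions exist for the first 24 hours of a series: those targets have no complete lookback window behind them.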

This study, however, has some limitations. We sum the traffic from I-75 and I-95 to obtain the total evacuation traffic; although most evacuation traffic uses these two highways, this assumption does not always hold. The traffic sensor data also suffer from missing values. In addition, choosing a study area for collecting Twitter data may not be possible for many road sections. A more rigorous study area and sample selection may therefore improve our prediction accuracy in the future.

5 CONCLUSIONS AND RECOMMENDATIONS

Social media is a powerful tool for individual users and organizations to communicate during a disaster: disaster-related content that attracts more attention spreads faster. In this study, we investigated the factors that help content gain attention during a hurricane. If a user or an organization adapts its activity and other controllable factors to attract more attention, information is likely to spread faster during a crisis.

Another aspect of social media data is the opportunity to collect users' location traces. During a disaster, these traces make it possible to assess evacuation decisions in response to factors such as evacuation orders, landfall timing, and evacuation destinations. We developed a method to identify evacuation decisions and destinations from Twitter data. Our study shows that the developed input-output hidden Markov model can potentially infer user evacuation decisions from real-time social media data. Whereas traditional evacuation modeling depends mainly on post-disaster data collection, which is costly and limited to small geographic areas, this approach can make evacuation management significantly more dynamic and cost-effective.

Resilience, the ability of a community to recover and adapt after an extreme event, is one of its most important properties. In this study, we used location-based social media data to develop metrics that quantify human mobility resilience. The study reveals that different types of disasters affect human mobility differently depending on the intensity of the event. These findings are important for understanding the nature and magnitude of the perturbation, and the subsequent resilience loss, in human mobility caused by a disaster. They can thus help in understanding the higher-order impacts of a disruptive event on society and the economy, and in policy making, since resilience assessment is critical to building a resilient transportation system.

During a hurricane evacuation, the biggest challenge is managing traffic on the major evacuation routes, where high demand often leads to serious congestion. Traffic operation strategies such as contraflow, shoulder use, and modified signal timing are used to manage this traffic, but these decisions require reliable predictions of upcoming traffic demand. In this study, we developed traffic demand predictions from real-time traffic sensor and Twitter data for forecast horizons from 1 hour to 30 hours. Because the proposed method is built on real-time data, it can make evacuation traffic operations more dynamic and reliable.

Alongside these opportunities, social media poses several challenges due to its data volume and velocity. Social media data are unstructured and mixed with rumors, advertisements, and uninformed opinions, so extracting actionable information for responders demands algorithms or systems that can filter out the noise dynamically. Moreover, location traces are collected only when a user posts, so mobility analyses based on such data may not represent the mobility behavior of the actual population. Social media data are also prone to selection bias, since penetration rates differ across geographic areas and users may not represent the actual population of a region.


APPENDIX: LIST OF PUBLICATIONS

Conference Proceedings

1. Roy, K. and Hasan, S., “Understanding the Effectiveness of Social Media Based Crisis Communication During Hurricane Sandy,” In Proceedings of Transportation Research Board 97th Annual Meeting, Washington D.C., January, 2018.

2. Roy, K. and Hasan, S., “Quantifying Human Mobility Resilience to Extreme Events Using Geo-located Social Media Data,” In Proceedings of Transportation Research Board 97th Annual Meeting, Washington D.C., January, 2018.

3. Roy, K. and Hasan, S., “Modeling the Dynamics of Hurricane Evacuation Decisions from Real-time Twitter Data”. Accepted for presentation at Transportation Research Board 98th Annual Meeting, Washington D.C., January, 2019.

Journal Articles

1. Sadri, A. M., Hasan, S., Ukkusuri, S. V., and Cebrian, M. 2018. Crisis communication patterns in social media during hurricane Sandy. Transportation Research Record. https://doi.org/10.1177/0361198118773896

2. Roy, K. and Hasan, S. “Understanding the Effectiveness of Social Media Based Crisis Communication During Hurricane Sandy”. Under review for a journal publication.

3. Roy, K. and Hasan, S. “Quantifying Human Mobility Resilience to Extreme Events Using Geo-located Social Media Data”. Under review for a journal publication.

4. Roy, K. and Hasan, S. “Modeling the Dynamics of Hurricane Evacuation Decisions from Real-time Twitter Data”. Under preparation for a journal submission.

5. Roy, K. and Hasan, S. “Predicting Hurricane Evacuation Traffic Demand in Real-time Synthesizing Data from Transportation Systems and Social Media”. Under preparation for a journal submission.