

Mining Social Media in Extreme Events

Lessons Learned from the DARPA Network Challenge

Nicklaus A. Giacobe, Hyun-Woo Kim and Avner Faraz

College of Information Sciences and Technology

The Pennsylvania State University

University Park, PA USA

[nxg13, hxk263] @ist.psu.edu, [email protected]

Abstract— The DARPA Network Challenge was a nationwide exercise in the use of social media in extreme events. Teams competed to locate ten red weather balloons that DARPA tethered over public locations across the continental United States for seven to ten hours on Saturday, December 5, 2009. The MIT team won the event, finding all ten locations using monetary incentive and a multi-level marketing payout scheme. This paper outlines the methods used by the 10th place iSchools Caucus team, which used a combination approach of recruiting observers and the use of Open Source Intelligence (OSINT) to find six of the ten locations. Twitter feeds and the publicly available content on competing team websites were captured. Data from these mechanisms were evaluated for content validity using a combination of secondary observers, evaluation of the reputation of the reported observer and confirming true identities and locations of reporting individuals by mining additional data from several social networking sites. These methods may have application in law enforcement, homeland security and extreme events when there is a desire to use humans as soft sensors, but where it is impossible to directly recruit observers or motivate them with financial incentives.

Keywords: Extreme Events, Social Media, Participatory Sensing, Human Sensors.

I. INTRODUCTION

The DARPA Network Challenge was conceived to learn more about social networking, crowd sourcing, and associated technologies to realize the power of the Internet [1][2]. DARPA announced that they would tether ten (10) red weather balloons over public locations across the continental United States on Saturday, December 5, 2009. Balloons would be launched at 9:00 AM EST and taken down at 4:00 PM, local time. East coast balloons were therefore going to be up for seven hours, while balloons on the west coast would be up ten hours. The Challenge was to locate and submit the latitude, longitude and balloon number of all ten balloons. The first to do so was to win the $40,000 prize.

Over 4300 individuals registered on the DARPA site to participate in the Challenge. Some were well-organized teams. Others turned out to be individuals who wanted to participate, but quickly realized the enormity of the task and either joined teams or failed to participate. Fifty-eight (58) teams submitted more than one correct location. The Project Report from DARPA highlights lessons learned from some of the top teams [1]. This paper outlines the lessons learned from the 10th place iSchools Caucus Team.

The iSchools Caucus is a collective of the twenty seven (27) University Colleges and Departments that strives to advance information science and related fields. "While each individual iSchool has its own strengths and specializations, together they share a fundamental interest in the relationships between information, people, and technology." [3] Some member institutions simply forwarded the recruiting messages to their electronic distribution lists to get word about the Challenge out into the public. Others contributed in more active ways.

The iSchools team used an approach that combined Direct Search (recruiting of observers to directly report their findings to the team) with Cyberspace Search (monitoring open communications channels over the Internet). The methods employed by the iSchools team allowed the team to locate six of the ten balloons and may provide interesting lessons for the law enforcement and homeland security community. First and foremost, it must be acknowledged that recruiting and motivating large numbers of people is difficult. However, there are times when it is inappropriate for law enforcement or government entities to directly ask the public to act. In many cases, grass roots organizations will spring up and motivate people to become observers (especially in “lost child” cases, for example). Those organizations are likely to use open communications channels that can be mined using Open Source Intelligence (OSINT) techniques. The lessons learned by the iSchools Team during the DARPA Network Challenge may be helpful in combining Internet resources in a geo-location exercise, evaluating the legitimacy of reported data and identifying deception in the open channel.

The remainder of this paper is organized as follows. Section II outlines previously published work that served as a basis for the team’s tactics and response. Section III provides the initial plan that the team intended to follow. Section IV discusses the results of the team’s efforts in terms of individual case studies on the balloon locations found. The paper concludes with a discussion of lessons learned and anticipated future work.

II. RELATED WORK

Previous research provides the foundation for some parts of the iSchools response. While other teams have their specific perspectives, especially those related to mass communications, the iSchools' interdisciplinary approach is valuable to those attempting to leverage this work in real-world situations.


Using humans as soft sensors has been suggested to address collection of data in urban planning and public health scenarios [Burke] as well as in urban warfare [Hall, Llinas, McNease, Mullen]. Human observers were used as sensors in this experiment because it was impossible to develop and deploy hard sensors with the right capabilities in the short timeframe and wide geographical scope of the Challenge. Observers needed to be distributed across the continental United States, be mobile within a reasonable geography, have the ability to distinguish between a DARPA tethered red weather balloon and other kinds of red balloons, be able to approach the DARPA representative and retrieve a verification document from them, and be able to report their findings back to the central command post via one of the communications methods.

Open Source Intelligence (OSINT) is the process of deliberately discovering, discriminating and analyzing intelligence from unclassified sources [4]. The majority of the Open Source Information (OSIF) collected during the detail was sourced from public sources on the Internet. Maltego is an OSINT data mining and forensics application that was used with custom coded extensions during the detail [5]. The team was able to leverage the advanced data mining and visual representation capabilities of the software to verifiable effect.

The phrase “six degrees of separation” refers to the small world phenomenon [6] that any person in the world can be connected to any other person through five or fewer intermediaries. Acquaintance networks on the Internet have been shown to have the same property in a study that analyzed sixty thousand e-mails seeking out eighteen target people [7] [8]. Understanding this phenomenon allowed the relatively small team to leverage the personal networks of contacts to locate secondary observers to confirm balloon sighting data from untrusted sources.

The Incident Command System [9] seemed to be a logical choice for a structure for the team. The response appeared to have a potential to change over time as different information became available. Because the team was separated by large geographies and potentially had different capabilities to contribute to the effort, the ICS structure would have potentially allowed for different organizations to contribute single elements, task forces or organize into divisions or sectors. In this response, a command structure was developed with an incident commander, planning section, and operations section.

Multi-sensor data fusion techniques hold the promise of being able to combine sensor data from multiple sources and make inferences about the meaning of those data [9]. It seemed likely that the team could be inundated with reports from the website, phone line and email, and that those data would need to be combined with reports from the Intelligence Branch of the operation. The Dempster-Shafer Theory (D-S Theory) of evidence combination provides a method of combining evidence under uncertainty. Simply put, one multiplies the evidence “for” the inference (on a scale of [0..1]) and divides out the evidence that is “against” the inference [10]. Weighted D-S Theory also provides methods for combining data and weighting the trust that one has in the reliability of the sensor itself [11]. Therefore, it seemed plausible to rig up a simple weighting and combination scheme if lots of data had to be evaluated. Data collection analysts could be instructed to provide an estimate of reliability on a scale from 0-10. D-S Theory combination works in a way that is similar to how humans think of evidence, so it seemed a logical choice to use in the Challenge.
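The weighting and combination idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the team's actual tooling: the two-hypothesis frame ("real" versus "fake" sighting), the function names discount and combine, and the mapping of a 0-10 analyst rating to a reliability of rating/10 are all hypothetical choices made for the example. The sketch uses the standard Dempster rule of combination together with Shafer-style discounting of an unreliable source.

```python
from itertools import product

# Hypothetical frame of discernment for one reported sighting.
REAL = frozenset({"real"})
FAKE = frozenset({"fake"})
THETA = REAL | FAKE  # "don't know": mass not committed to either hypothesis


def discount(mass, reliability):
    """Shafer discounting: keep `reliability` (0..1) of each committed mass
    and move the remainder to THETA (ignorance)."""
    out = {s: reliability * m for s, m in mass.items() if s != THETA}
    out[THETA] = 1.0 - reliability + reliability * mass.get(THETA, 0.0)
    return out


def combine(m1, m2):
    """Dempster's rule: multiply masses of intersecting sets, then
    renormalize by dividing out the conflicting (empty-intersection) mass."""
    combined = {}
    conflict = 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + x * y
        else:
            conflict += x * y
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {s: v / (1.0 - conflict) for s, v in combined.items()}


# Two reports say "real"; analysts rated the sources 7/10 and 5/10 (assumed scale).
report_a = discount({REAL: 1.0}, 7 / 10)
report_b = discount({REAL: 1.0}, 5 / 10)
fused = combine(report_a, report_b)
```

With these inputs, two moderately trusted reports that agree reinforce each other, while a contradicting report would shift mass toward "fake" after normalization. Mapping the 0-10 rating linearly to a reliability is only one possible choice.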

III. METHODS

A. Direct Search

The Direct Search Branch of the operation attempted to recruit observers to work on behalf of the team. Messages were sent through the member organizations' marketing teams to email distribution lists, Facebook groups, and Twitter feeds.

1) Recruiting Efforts: Members of the Caucus were asked to send electronic communications to their constituent populations including current students, faculty, staff, alumni and friends. Potential observers were asked to look for the red balloons wherever they would be on December 5, 2009 and report their findings through the iSchools web site. Reports from the participating organizations give some idea of the “reach” of the recruiting message. Unless otherwise specified, the reach is assumed to be from the perspective of the iSchool, not the entire university:

University of Illinois – Email to iSchool alumni; to faculty, staff and students.

Pitt – Email to all alumni; to faculty, staff and students; Webpage article on main page & alumni news page; LinkedIn announcement to iSchool at Pitt group members; and Facebook announcement to iSchool at Pitt group members. Alumni email distributed to 4,674 alumni. 114 opened the DARPA link. Email blasts to 936 students and 41 faculty.

Penn State – 338 fans on Facebook and 377 followers on Twitter. Re-tweets, questions about the Challenge on Twitter, but no activity on Facebook. 2,125 alumni received the e-mail.

UCLA – Sent messages to faculty, staff, students and alumni.

Drexel – Alumni listserv, Facebook, Twitter, electronic newsletter for undergrads and grads, online learning system grad web site release.

A late attempt to organize the observers and gain some idea of their intended locations was made. Observers were asked to identify where they would likely be on December 5, 2009 so that they could be contacted to confirm a location. Unfortunately, that website was not ready prior to the majority of the recruiting messages being sent. Only a handful of pre-registered observers submitted their availability.

2) Reporting Methods: Several reporting methods were developed to provide observers with different ways of contacting the command center. This was done to provide multiple methods in case one method was disabled during the Challenge. Also, it was anticipated that observers would prefer one method over another.

Phone – 814-422-5501 (814-4-BALL-01)

Website – http://balloon.ist.psu.edu

Email – [email protected]

3) OSINT / Cyberspace Search: The Rules of Engagement were presented to the analysts at the start of the day. Intel teams were to scour the open Internet for OSIF (Open Source Information) without hacking, breaking into computer systems or using any other unauthorized means. All incoming OSINT was assigned a credibility rating on a scale of 0-10. The Intel team was divided into groups that monitored both real-time and other sources. Real-time feeds from sites like twitter.com were continuously monitored, while forums, blogs and newsgroups were monitored as they were updated.

4) Web Crawling: A custom search engine crawled the publicly accessible portions of the websites of competing teams. The crawler was seeded with the root of the website of every competing team that could be located. The crawler was set to limit its crawl to pages inside those sites and not to follow external links. Some teams had very open sites, allowing anyone to post, even without prior registration. The crawler indexed the sites in the list several times per hour.
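The "stay inside the seeded sites" rule can be illustrated with a short sketch. The team's crawler is not described in detail, so everything here is an assumption for illustration: the class name LinkCollector, the function internal_links, and the example host names are hypothetical. The sketch only shows the link-filtering step (no network access), using Python's standard html.parser and urllib.parse modules.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the raw href values of all <a> tags in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links(page_url, html, seed_hosts):
    """Return absolute http(s) links whose host is one of the seeded sites.

    Links to any other host are dropped, so the crawl never leaves the
    seeded competitor sites.
    """
    parser = LinkCollector()
    parser.feed(html)
    kept = []
    for href in parser.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        parts = urlparse(absolute)
        if parts.scheme in ("http", "https") and parts.hostname in seed_hosts:
            kept.append(absolute)
    return kept


# Hypothetical page from a seeded site with one internal and two other links.
sample_html = (
    '<a href="/report">report a sighting</a>'
    '<a href="http://other.example.org/x">elsewhere</a>'
    '<a href="mailto:someone@example.com">mail</a>'
)
links = internal_links(
    "http://team.example.com/index.html", sample_html, {"team.example.com"}
)
```

A full crawler would add fetching, a visited set, politeness delays and periodic re-indexing; those parts are omitted here.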

5) Twitter Capture: A tweet crawler using the Twitter API (http://apiwiki.twitter.com/) was implemented to capture certain tweets of interest. 33,000 tweets containing the keywords “darpa” or “balloon” in either body text or hashtags were collected over more than half a month (from November 27 to December 15, 2009). Of these, 6,813 tweets were posted on the day of the Challenge. While Twitter does not show it on the public website, the Twitter API provides geo-tags for tweets, if available. Geo-tags, usually generated by smartphone applications, reveal the exact locations from which tweets have been posted.
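The keyword filter described above can be sketched as follows. The record layout is an assumption: each tweet is modeled as a plain dictionary with hypothetical "text", "hashtags" and "geo" keys, which is not the actual Twitter API response format, and the function names are invented for the example.

```python
KEYWORDS = ("darpa", "balloon")


def is_of_interest(tweet):
    """True if any keyword appears in the body text or in a hashtag
    (case-insensitive substring match)."""
    text = tweet.get("text", "").lower()
    tags = [tag.lower() for tag in tweet.get("hashtags", [])]
    return any(
        keyword in text or any(keyword in tag for tag in tags)
        for keyword in KEYWORDS
    )


def geotagged_matches(tweets):
    """Keep matching tweets that also carry a (lat, lon) geo-tag."""
    return [t for t in tweets if is_of_interest(t) and t.get("geo") is not None]


# Hypothetical captured records; only the geo-tag value comes from the paper.
captured = [
    {"text": "Spotted DARPA balloon #1", "hashtags": [], "geo": (37.7879, -122.4073)},
    {"text": "nice weather today", "hashtags": ["sunny"], "geo": (40.0, -75.0)},
    {"text": "what is this thing?", "hashtags": ["RedBalloon"], "geo": None},
]
matches = [t for t in captured if is_of_interest(t)]
located = geotagged_matches(captured)
```

Substring matching is deliberately loose (it also matches "balloons"), which favors recall over precision; a stricter filter would tokenize the text first.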

After the Challenge, a post-hoc evaluation of the captured tweets extracted tweets containing the actual balloon location names. Some of the tweets have key information such as exact coordinates of the balloons, photos of the balloons, numbers on the balloons, and the specific places where the balloons were located.

IV. RESULTS

A. False / Fake Locations: Royal Oak (MI), Albany (NY) and Providence (RI)

Several balloon sightings were reported on competitor websites and through open tweets on the public timeline on Twitter. Three such cases are presented here because they had substantial data about the location of the balloon as well as a picture of the balloon at the location.

A tweet with a picture of a balloon provided a location in Royal Oak, MI. The tweet included the address of the balloon location. The picture was compared with the Bird’s Eye View (from Bing Maps) of the location. A member of the command staff had a friend in the Detroit, MI area. He called that friend and asked them to go to the address. The observer found an 8-foot red weather balloon at the location and followed the tether to a hardware store. They questioned the store owner, who admitted (after some playful banter, claiming to be a DARPA representative) that he had heard of the Challenge and had put up the balloon to attract attention to his store.

A tweet with a picture of a balloon near Albany, NY was posted. One of the few pre-recruited observers happened to be in Albany. That observer was contacted and the picture was sent to him. He identified the location as being near the Empire State Plaza due to the unique architecture of the building in the background. The observer went to the location, but was unable to find the balloon. He sent pictures of the same angle and background and investigated the area, but could not locate the balloon.

Figure 1 - A fabricated photo (left) posted during the Challenge (http://twitpic.com/s9kun) and photo provided by observer (right)

Another tweet provided a picture of a balloon that was supposed to be in Providence, RI. Because there were several buildings in view in the photograph, they were matched to building pictures of downtown Providence and a general location was estimated. One of the Intelligence Branch analysts had a friend at Brown University, who was called and asked to help. The observer reported back some time later that there was no balloon at that location.

Figure 2 - A fabricated photo of a balloon over Providence, RI


The Albany and Providence photographs were later analyzed in more detail. Note that neither balloon had the DARPA logo (compare to pictures of San Francisco and Santa Barbara). Also, note that the orientation of the balloon and its number pennant are exact matches to the pre-launch photograph of balloon #8. Further investigation of the balloon in the photographs in Providence showed a thin line of white pixels around the balloon that were left over from the hasty fabrication process. The combination of secondary, trusted observers and analysis of the photographs provided the team with a solid assessment that these two reports were indeed fabricated images and therefore not valid balloon locations.

Figure 3 - Pre-Launch photo of balloons released by DARPA. Note the orientation of balloon 8 compared to Figures 1 and 2.

B. Direct Search: Charlottesville, VA

The team received very little data as a result of the efforts of the Direct Search Branch of the Operation. The website received fewer than 1,000 external hits and only two submissions via the website mechanism, both of which were obvious false leads. The email address received only two reports. One was a request from another team to trade information.

The other was a report from an observer who located the balloon in Charlottesville, VA. This observer’s initial report provided the approximate GPS coordinates of the location, which he had worked out after returning home and looking up the location on the Internet. A secondary observer was not available for that area. Therefore, the original observer was asked to return to the site to collect the paperwork from the DARPA representative.

Figure 4 - Balloon picture and photo of DARPA's coordinate sheet for Charlottesville, VA

C. Confirmed with Mapping Sites: Christiana (DE), Portland (OR), Santa Barbara (CA), San Francisco (CA)

For these four cases, balloon locations were confirmed by comparing photographs found on Twitter to geotagged pictures of designated locations provided by online mapping services such as Google Earth (http://earth.google.com), Microsoft Bing Maps (http://maps.bing.com) and Panoramio (http://www.panoramio.com). Google Earth is capable of displaying three dimensional maps of many cities. Microsoft Bing Maps offers a mode of browsing high quality aerial images from four angles, called Bird’s Eye View, covering more than 100 cities in the United States. Panoramio is a Google service that provides pictures on Google Maps that are geotagged for the specific area. These three services turned out to be useful for determining more precise locations by matching man-made and natural features to the background of an image provided by an observer.

For instance, there were several reports on Twitter regarding the balloon sighted at Union Square, San Francisco, California. Figure 5 (left) shows a photo linked by the tweet “This one wasn’t hard to find” posted at 2:30pm local time (PST). At 3:53pm local time (PST), the poster explicitly mentioned the balloon number and the exact location. The balloon was in front of a grey-colored department store building. Google Earth was used to confirm the location based on the buildings in the background, which had the same appearance as those at Union Square. Because Google Earth provides three-dimensional images of the buildings in San Francisco and many other populated cities, analysts could virtually travel around Union Square and easily find the building with the same surroundings as shown in Figure 5.

Figure 5 - 3D Mapping for the San Francisco case. Left: http://twitpic.com/sacnb Right: Google Earth

The same work was performed on the balloon sighted in Waterfront Park in Portland, Oregon. There were three key reports on the balloon from Twitter: “just saw #DARPA balloon in Waterfront Park in Portland, Oregon!” posted at 8:56am (PST), “DARPA red balloon Portland waterfront near Naito and Salmon. Balloon # 9” posted at 1:11pm (PST), and the photo shown in Figure 6 (left) posted at 2:30pm local time (PST). All three of these reporters had set Portland as their location on Twitter. The images posted on Twitter matched the 3D building picture from Google Earth (see Figure 6).


Figure 6 - 3D Mapping for the Portland case. Left: http://twitpic.com/sack9 Right: Google Earth

A secondary observer was sent to the presumed location for verification and provided a picture of the balloon and the certificate, as shown in Figure 7 (top), which contains the exact coordinates of the balloon. A few hours later, another poster provided a manipulated image of a certificate with fake coordinates, as shown in Figure 7 (bottom). This may be a case where techniques for detecting fabricated pictures would be helpful.

Figure 7 - Top: A genuine certificate from a secondary observer. Bottom: A fabricated certificate (note the invalid “80” minutes latitude)

Microsoft Bing Maps was used for confirming the balloon seen in Santa Barbara. An initial report on the balloon was posted at 9:59am local time (PST) saying “found #darpa weather balloon #4 in santa barbara, good luck with the hunt.” After this, a crucial report came out at 1:26pm local time (PST) on Twitter. It contained a picture of the balloon clearly showing the balloon number, as shown in Figure 8 (left), as well as a link to a map of the estimated location. The estimated location later turned out to be very close to the actual location. Using the Bird’s Eye View of Microsoft Bing Maps, a more exact location was estimated at a point where there is a series of palm trees along the road named in the tweet.

Figure 8 - Aerial Mapping for the Santa Barbara case. Left: http://twitpic.com/sa0rx Right: Bird’s Eye View on Bing Maps

Lastly, for the Christiana case, a picture of balloon #7 (Figure 9 upper left) and a picture of the certificate of the balloon (Figure 9 upper right) were posted on Twitter. The coordinates on the certificate, which are in DMS (degrees/minutes/seconds) format, were converted into decimal degrees (degrees + minutes/60 + seconds/3600): from the latitude of 39° 36' 30" North and the longitude of 75° 43' 51" West into the latitude of 39.60833 and the longitude of –75.73083. Bing Maps then provided a bird’s eye view image of the location, as shown in Figure 9 (lower). The aerial image seems to have been taken prior to the Challenge, as the site is under construction, but the image shows surroundings that are similar to those in the balloon picture (Figure 9).
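The conversion of the certificate coordinates is plain arithmetic and can be sketched as below. The function name dms_to_decimal and its range checks are assumptions added for illustration; the checks also show why a certificate listing “80” minutes of latitude (as in the fabricated certificate of Figure 7) cannot be valid, since minutes and seconds must be below 60.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees.

    South and West are returned as negative values. Raises ValueError
    for impossible readings such as 80 minutes.
    """
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("minutes and seconds must be in the range [0, 60)")
    value = degrees + minutes / 60 + seconds / 3600
    limit = 90 if hemisphere in ("N", "S") else 180
    if not 0 <= value <= limit:
        raise ValueError("coordinate out of range")
    return -value if hemisphere in ("S", "W") else value


# Coordinates from the Christiana certificate quoted in the text.
latitude = dms_to_decimal(39, 36, 30, "N")
longitude = dms_to_decimal(75, 43, 51, "W")
```

Rounded to five decimal places these give 39.60833 and -75.73083, matching the values reported for the Christiana balloon.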

Figure 9 - The Christiana case. Upper left: http://tweetphoto.com/5891723 Upper right: http://tweetphoto.com/5891778 Lower: a bird’s eye view on Microsoft Bing Maps

D. Confirmed with OSINT Data Mining: Scottsdale (AZ)

During the intelligence collection phase, raw intel was received about a forum member on fark.com who reported sighting a red balloon a mile from his location while returning from a nearby Jack In The Box for “hangover tacos”.

Analysts tried to verify the poster’s claim and ascertain his location by waiting for the poster to post more information.

However, it was soon discovered that the forum moderators, who had their own team for the Challenge, had been made aware of the post and subsequently tried deception tactics to undermine confidence in the user’s post. The moderators posted false messages claiming that the reported balloon was in Chicago.

Maltego, the deep-web data mining software, was used with custom coded extensions to verify the identity and location of the original poster. The poster’s e-mail address and user name listed on his forum profile were used to run a Maltego analysis to harvest his footprints on the Internet. The analysis yielded the poster’s entire Internet footprint by finding his real name, phone number, employer information and address, along with profiles on Facebook, Flickr, Myspace and LinkedIn.

Figure 10 – Left: Photograph provided on Fark.com of Balloon in Scottsdale, AZ. Right: Maltego Analysis of the subject using email address

All evidence from the Maltego analysis confirmed the location of the poster and his balloon report as Scottsdale, AZ. The Intel team was then able to locate a Jack In The Box in the Scottsdale area that was in the vicinity of a park, as the photograph (Figure 10) indicated. Extensive IMINT (Imagery Intelligence) analysis of landscape details using Google and Bing maps validated the poster’s claim and possible coordinates of the location of the balloon.

V. LESSONS AND CONCLUSION

A. Usefulness of Open Source Intelligence Methods

Using OSINT, monitoring open communications channels and leveraging the networks of people organized by others come with several non-trivial constraints. The first is that, on their own, individuals will rarely provide all of the details required for confirmation, location and validation. Using other Internet resources to confirm exact locations, by comparing photographs to known valid images of locations on mapping websites and 3D image sites, may help to fill the gap. The second constraint is that competitors will employ deceptive techniques. Overcoming deception in this Challenge took the form of leveraging a secondary network of trusted observers to confirm sighting reports. Additional analysis was performed on the provided text and picture data, as well as on hidden metadata, including geotags, and on the reputation of the poster. In one particularly interesting case, extensive data mining revealed the true identity and home location of an observer and ultimately the correct location of a balloon, where a competing team attempted to cover up an unintentional leak of a valid sighting that had been posted by one of their team members.

B. Usefulness of Social Media

Three uses of social media, specifically Twitter, were identified in extreme events. First, social media can be a great information source, especially when it provides geographic information and photos of the event. There was a tweet saying “Spotted DARPA balloon #1 in this very central location” with a link to a photo of balloon #1. It did not say anything more than that, but it did have a geotag (37.7879,-122.4073). These coordinates could be entered into Google Maps to confirm that the location was Union Square in San Francisco. A geotag does not always guarantee that the information is useful, however. The information needs to be evaluated carefully from every possible aspect. For example, there was another tweet claiming “Red balloon sitting in Marina del Rey, CA” with a geotag (33.9741, -118.4317), which points at a place on Lincoln Blvd that can be seen from Marina del Rey. The photo shows a bunch of small red balloons beside a car dealer office that are not from DARPA and were obviously not the balloons in the Challenge.
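A geotag can also be checked against a claimed place automatically. The sketch below is an illustration only: it uses the haversine great-circle distance, the function names are hypothetical, the 1 km threshold is arbitrary, and the Union Square reference point is an approximate coordinate assumed for the example rather than a value from the Challenge data. As the Marina del Rey example shows, a geotag that is consistent with the claimed place still says nothing about whether the balloon itself is genuine.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def geotag_supports_claim(geotag, claimed_place, max_km=1.0):
    """True if the tweet's geotag lies within max_km of the claimed place."""
    return haversine_km(*geotag, *claimed_place) <= max_km


# Approximate reference point for Union Square, San Francisco (assumption).
UNION_SQUARE_APPROX = (37.7880, -122.4075)

balloon_1_geotag = (37.7879, -122.4073)      # geotag quoted in the text
marina_del_rey_geotag = (33.9741, -118.4317)  # geotag quoted in the text

near_union_square = geotag_supports_claim(balloon_1_geotag, UNION_SQUARE_APPROX)
```

In practice the reference point would come from a gazetteer lookup of the place name in the tweet, and the threshold would depend on how precisely the place is defined.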

Figure 11 - Left: http://yfrog.com/3lnhhnj Right: http://twitpic.com/s9k7a

Second, social media can be a good means of crowd sourcing. The winning team was very active in recruiting people on social media by posting a lot of advertising messages. They posted a recruiting message almost every minute for an hour after DARPA launched all balloons. The following figure shows the number of tweets per 30 minutes that had a link to http://balloon.media.mit.edu, where people could sign up to contribute to the MIT team.


Figure 12 - # of tweets/30 min leading people to the website http://balloon.media.mit.edu
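A count of this kind can be produced by flooring each tweet timestamp to the start of its 30-minute bin. The sketch below is a hypothetical illustration, not the script used to produce Figure 12; the function name tweets_per_bin and the sample timestamps are invented for the example, and timestamps are assumed to be naive datetime values in a single time zone.

```python
from collections import Counter
from datetime import datetime, timedelta


def tweets_per_bin(timestamps, minutes=30):
    """Count timestamps per fixed-width bin, keyed by the bin's start time."""
    bin_size = timedelta(minutes=minutes)
    epoch = datetime(1970, 1, 1)
    counts = Counter()
    for ts in timestamps:
        # Floor the timestamp to the start of its bin.
        bin_start = epoch + ((ts - epoch) // bin_size) * bin_size
        counts[bin_start] += 1
    return dict(sorted(counts.items()))


# Invented sample timestamps on the day of the Challenge.
sample = [
    datetime(2009, 12, 5, 10, 5),
    datetime(2009, 12, 5, 10, 29),
    datetime(2009, 12, 5, 10, 30),
    datetime(2009, 12, 5, 11, 10),
]
histogram = tweets_per_bin(sample)
```

Bins with no tweets are simply absent from the result; a plotting step would need to fill them with zeros.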

Third, social media can be used as a means of disruption. A lot of false information appeared on Twitter. One example is a tweet saying “There’s strange red balloon with a number near the Philadelphia Art Museum. Wondering what it is for…” Further tweets came from others involved in the Challenge who tried to contact this person, explain what the Challenge was about and extract additional details. Some of the participants spent significant amounts of time trying to verify whether the report was true. However, the person who posted this tweet probably only wanted other participants to be deceived. It worked on some. It was interesting to note that the poster did not have any followers and did not follow anyone either. The account was abandoned afterwards. Therefore, the reputation of the poster can be used to evaluate reports against attempted deception.

C. Usefulness of ICS

Prior to the date of the Challenge, the Planning Section developed an Incident Action Plan. The Operations Section organized the Direct Search branch to distribute messages and recruit observers through the email lists, Facebook fans and Twitter feeds at the disposal of the various member institutions. On the date of the Challenge, an Open Source Intelligence Division at Penn State did most of the analysis work. It was augmented by a Twitter Capture Task Force and a Custom Crawler Task Force, which developed automated systems that added value to the Cyberspace Search Division in the Operations Section.

Unfortunately, the iSchools team had only one person with previous ICS experience. Had more people at the other institutions been familiar with ICS, the response might have been more robust, and other universities in the caucus could have contributed additional assets to the operation. The ICS structure had the capacity to grow in response to the needs of the incident and would have worked well.

D. Comments on Other Teams

Other teams provided interesting lessons as well. The MIT and GTRI teams succeeded in motivating the masses to work on their behalf, using monetary incentives and mass marketing techniques to recruit observers. In terms of having the “right” network of observers, the Groundspeak Geocachers had a nationwide group of individuals with GPS devices who commonly participate in geolocation challenges. George Hotz must be commended for leveraging his network of followers on Twitter with extremely little advance preparation. All of the top teams offer lessons to be learned.

VI. FUTURE WORK

Entity extraction can help to automatically identify text about time-critical extreme events. Entity types include geographical location names (e.g. San Francisco), people’s names (e.g. John Doe) and so on. GeoNames.org provides several ways to access a worldwide geographical name database that is freely available to everyone under a Creative Commons Attribution 3.0 License. The EMERSE project (http://emerse.ist.psu.edu/) is an ongoing research project that uses named entity recognition and machine learning techniques to systematically recognize and classify the topics of tweets about extreme events such as the Haiti earthquake.
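As a simple illustration of location entity extraction, the sketch below matches tweet text against a small in-memory gazetteer. It is a stand-in under stated assumptions: the gazetteer is a hand-written set of names rather than data loaded from GeoNames.org, the matching is plain case-insensitive string matching rather than the named entity recognition used in EMERSE, and people's names are not handled. All function and variable names are hypothetical.

```python
import re

# Tiny hand-written gazetteer; a real system would load names from a
# geographical database instead.
GAZETTEER = {"San Francisco", "Marina del Rey", "Philadelphia", "Union Square"}


def build_matcher(names):
    """Compile one regex that matches any gazetteer name as whole words.

    Longer names are listed first so that multi-word names are preferred
    over any shorter name that shares the same prefix.
    """
    ordered = sorted(names, key=len, reverse=True)
    pattern = r"\b(?:" + "|".join(re.escape(n) for n in ordered) + r")\b"
    return re.compile(pattern, re.IGNORECASE)


def extract_places(text, names=GAZETTEER):
    """Return gazetteer names found in text, in order of appearance."""
    canonical = {n.lower(): n for n in names}
    matcher = build_matcher(names)
    return [canonical[m.group(0).lower()] for m in matcher.finditer(text)]


if __name__ == "__main__":
    print(extract_places("Red balloon sitting in Marina del Rey, CA"))
```

Dictionary matching of this kind cannot resolve ambiguous names or misspellings, which is one reason statistical named entity recognition is attractive for noisy tweet text.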

VII. ACKNOWLEDGEMENTS

The authors are grateful to the following individuals for their support and advice: John Yen, David Hall, Wade Shumaker, Anthony Maslowski, Gregory Traylor, Gregory O’Neill, Madian Khabsa, Guruprasad, Leilei Zhu from Pennsylvania State University; Maeve Reilly and John Unsworth from University of Illinois at Urbana-Champaign; Martin Weiss from University of Pittsburgh; Jeffrey Stanton from Syracuse University; Gary Marchionini from University of North Carolina at Chapel Hill.

VIII. REFERENCES

[1]   “DARPA Network Challenge Project Report,” [Online document], 2010 Feb 16, [cited 2010 Jun 1], Available: https://networkchallenge.darpa.mil/ProjectReport.pdf

[2]   “Regina Dugan's Address at the 40th Anniversary of the Internet conference at UCLA,” [Online video], 2009 Oct 29, [cited 2010 Jun 1], Available: http://www.youtube.com/watch?v=P_hjpva8gBM

[3]   "About the iSchools," [Online document], 2010, [cited 2010 Jun 1], Available: http://www.ischools.org/site/about/

[4]    R. Steele and A. Chester, “NATO Open Source Intelligence Handbook v1.2,” [Online document], 2001 Nov, [cited 2010 Jun 1], Available: http://www.oss.net/extra/document/?module_instance=3&action=show_category&id=95

[5]   “Maltego Overview,” [Online document], 2010, [cited 2010 Jun 1], Available: http://www.paterva.com/web5/client/overview.php

[6]   S. Milgram, "The small world problem," Psychology Today, vol. 2, pp. 60-67, 1967.

[7]   P. Dodds, et al., "An experimental study of search in global social networks," Science, vol. 301, p. 827, 2003.

[8]   D. J. Watts, Six degrees: the science of a connected age, 1st ed. New York: Norton, 2003.

[9]   G. Bigley and K. Roberts, "The incident command system: High-reliability organizing for complex and volatile task environments," Academy of Management Journal, vol. 44, pp. 1281-1299, 2001.

[10]   D. Hall and S. McMullen, Mathematical techniques in multisensor data fusion. Artech House Publishers, 2004.

[11]   G. Shafer, A mathematical theory of evidence. Princeton, N.J.: Princeton University Press, 1976.

[12]   H. Wu, et al., "Sensor fusion using Dempster-Shafer theory," in Proceedings of the 19th IEEE Instrumentation and Measurement Technology Conference, 2002, pp. 7-12.