HSX: ROLE OF BIG DATA June 2017



WHAT IS BIG DATA?

• Big data refers to extremely large data sets that may be analyzed computationally to reveal patterns, trends, and associations, especially relating to human behavior and interactions.

• Big data also refers to the exponential growth and wide availability of digital data that are difficult or even impossible to manage and analyze using conventional software tools and technologies.

• Big Data analysis incorporates the generation, collection, access, and use of data.

Image Credit: http://trice-bigdata.com/assets/images/BDT_1.jpg


WHAT IS BIG DATA? (CONT.)

Big Data has implications across political, economic, social, technological, legal, and environmental disciplines. It can predict trends, monitor situations in near-real time, find unexpected relationships, and even help fix a problem before it turns into a crisis. Individuals, businesses, and organizations who use big data may have a desire to:

• Maximize marketing and profit potential

• Predict social, economic, health, and environmental trends

• Maximize system efficiency (e.g., utility or computer systems)

• Find patterns or relationships between multiple factors


BIG DATA GROWTH

• More data has been created in the past two years than in the entire previous history of the human race.

• It is estimated that by 2020, approximately 1.7 megabytes of new information will be created every second for every human being on the planet.

• The rate at which we are generating data is outpacing our ability to analyze it.

Source: Thomson Reuters 2012
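
To put the per-second estimate in perspective, the arithmetic can be sketched in Python. Only the 1.7 MB figure comes from the slide; the population value and the decimal unit conversions (1 GB = 1,000 MB, 1 EB = 10^12 MB) are assumptions made for illustration.

```python
# Rough scale check for the "1.7 MB per second per person" estimate.
MB_PER_SECOND_PER_PERSON = 1.7   # figure quoted on the slide
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds
ASSUMED_POPULATION = 7.5e9       # assumption: rough world population, not from the source


def daily_mb_per_person(rate_mb_per_s: float = MB_PER_SECOND_PER_PERSON) -> float:
    """Megabytes created per person per day at the given rate."""
    return rate_mb_per_s * SECONDS_PER_DAY


def daily_exabytes_worldwide(population: float = ASSUMED_POPULATION) -> float:
    """Exabytes created per day worldwide (decimal units: 1 EB = 1e12 MB)."""
    return daily_mb_per_person() * population / 1e12


if __name__ == "__main__":
    print(f"{daily_mb_per_person() / 1000:.1f} GB per person per day")
    print(f"{daily_exabytes_worldwide():.0f} EB per day worldwide (assumed population)")
```

At the quoted rate, each person would account for roughly 147 GB of new data per day; the worldwide total depends entirely on the assumed population.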


THE FOUR V’S OF BIG DATA

Volume

• The amount of information collected is very large, requiring petabytes of storage.

Variety

• Data are generated and collected from numerous sources in the form of structured, unstructured, and semi-structured data.

• Structured data: data that are well defined by a set of rules (e.g., names are always expressed as a series of letters; dates follow a specific pattern).

• Unstructured data: data that have no rules (e.g., a picture, a recording, a social media status [string of variable text]).

Velocity

• The frequency of incoming data that need to be processed.

• Data are constantly being added, changed, or deleted.

Veracity

• How trustworthy are the data?

• Data sets are often incomplete, not standardized, and error-prone.

Value
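
The rules that define structured data under Variety (names as a series of letters, dates in a specific pattern) can be sketched as a small validator. The field names, the ISO-style date pattern, and the ASCII-only name rule are assumptions chosen for illustration; real schemas vary.

```python
import re

# Hypothetical rules for two structured fields, following the slide's examples:
# names are a series of letters; dates follow a specific pattern (YYYY-MM-DD assumed).
NAME_RULE = re.compile(r"[A-Za-z]+(?:[ '-][A-Za-z]+)*")
DATE_RULE = re.compile(r"\d{4}-\d{2}-\d{2}")


def is_structured_record(record: dict) -> bool:
    """Return True if the record's 'name' and 'date' fields satisfy the rules."""
    name = record.get("name", "")
    date = record.get("date", "")
    return bool(NAME_RULE.fullmatch(name)) and bool(DATE_RULE.fullmatch(date))


if __name__ == "__main__":
    print(is_structured_record({"name": "Ada Lovelace", "date": "2017-06-01"}))
    print(is_structured_record({"name": "Ada99", "date": "June 2017"}))
```

Unstructured items such as a photo or a free-text status update have no comparable rule to check, which is part of what makes them harder to process.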


Image source: http://www.ibmbigdatahub.com/sites/default/files/infographic_file/4-Vs-of-big-data.jpg


Image Source: http://www.seventhinc.com/bigdata.html


Image Source: https://bicorner.com/2015/06/25/characteristics-that-make-big-data-big/


BIG DATA ANALYTICS

How do we manage all this data?

• Advances in technology have led to new ways to collect, analyze, and use structured and unstructured data.

• One of the goals of data science is to find value in the massive amounts of data that have been and continue to be collected.

• Advances in data analytics, combined with advancements in data storage and computational power, are increasing the value of the data being collected.

• Logistic regression analysis and social network analysis are two major techniques used to analyze big data.
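
As a rough illustration of logistic regression, the sketch below fits a one-feature model by plain batch gradient descent using only the standard library. The toy data, variable names, learning rate, and step count are all made up for this example; production analyses would normally rely on an established statistics or machine-learning library.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function mapping any real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, steps=20000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by batch gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the mean log loss with respect to w and b.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Made-up toy data: hours of weekly app use (x) and whether the user clicked an ad (y).
hours = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
clicked = [0, 0, 0, 1, 1, 1]

w, b = fit_logistic(hours, clicked)

if __name__ == "__main__":
    for h in (1.0, 3.5):
        print(f"{h} hours -> estimated click probability {sigmoid(w * h + b):.2f}")
```

The fitted model outputs a probability rather than a hard label, so an analyst still has to choose a decision threshold (0.5 is a common default).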


BIG DATA ANALYTICS

Data Mining

• Identify relationships among information but not causality

• Draws on mathematics, computer science, artificial intelligence, and machine learning

• Examples: classification algorithms, clustering algorithms, regression algorithms, association tools, anomaly-detection algorithms, summarization tools

Data Fusion

• Integrate heterogeneous datasets

• Requires systems to communicate and exchange data

• Examples: sensor networks, video/image processing, robotics and intelligent systems

Data Integration

• Broadly combine data repositories and keep the larger set of information

Image and Speech Recognition

• Extract information from large amounts of images, videos, and recorded or broadcast speech

• Examples: scene extractions, facial-recognition technologies, automated speech recognition

Natural Language Processing

• Understand natural human language of input data

Machine Learning

• Learn from input data

Bayesian Analysis

• Combine information about a population parameter with information contained in a sample

Social Network Analysis

• Extract “information from a variety of interconnecting units under the assumption that their relationship is important and that the units do not behave autonomously” (PCAST 2014)

• Use different technologies, such as clustering, association, and data fusion
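
One of the simplest social network analysis measures is degree centrality: the share of other units each unit is directly connected to. The sketch below is a minimal standard-library version; the account names and ties are hypothetical, and real analyses typically use a dedicated graph library and richer measures.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return each node's degree divided by (number of nodes - 1)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops in this sketch
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical friendship ties between five made-up accounts.
ties = [("ana", "ben"), ("ana", "cam"), ("ana", "dee"), ("ben", "cam"), ("dee", "eli")]
scores = degree_centrality(ties)
most_connected = max(scores, key=scores.get)

if __name__ == "__main__":
    print(scores)
    print("most connected:", most_connected)
```

Here the score depends only on who is linked to whom, not on any attribute of the individual accounts, which reflects the assumption that the relationships themselves carry information.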


EXAMPLES OF BIG DATA USE

New analysis techniques have enormous potential to benefit society and contribute to economic growth, yet also present new challenges across political, environmental, social, technological, legal, and economic environments.

Business

• Influence targeted-marketing campaigns

• Shape and drive business strategy through near real-time consumer insights

Social

• Predict disease patterns and outbreaks

• Identify community needs and wants

• Increase crime detection and prevention

Technology

• Contribute to real-time monitoring and diagnostics of machine systems (utility providers, computer systems, etc.)

• Improve smart homes, grids, and cities

• Improved technology, including developments in computational power, data storage, and network and equipment access, along with advances in deep learning (machine learning and pattern recognition), is a major driver of big data use across all disciplines

Environment

• Monitor habitat change

• Predict the weather

Political

• Election campaigns use public data and micro-targeting to isolate and appeal to specific voter groups

• Use social media posts to predict social and political movements

• Data collected on a variety of subjects can inform public policy


EXAMPLES OF BIG DATA USE

The 2014 Ebola Outbreak

• The 2014 Ebola virus epidemic in West Africa claimed several thousand lives.

• The combination of infection in urban areas, poor public health systems, and weak health governance in the countries experiencing the worst of the epidemic, along with the lack of a “proven” safe vaccine or effective drug against Ebola, contributed to the scale and devastation of the outbreak.

• HealthMap, an infectious-disease surveillance tool that uses Big Data analytics, identified the Ebola outbreak nine days before the World Health Organization announced the epidemic. The head of HealthMap suggests that adding more data streams to the algorithm, including cell-phone usage data, could help predict the spread of Ebola virus.

• Additionally, researchers at the Broad Institute described the initial case of Ebola infection in Sierra Leone and, by analyzing viral genome sequence data from patient samples there, provided insight into how the virus arrived in West Africa.

Image Source: BBC News 2016


EXAMPLES OF BIG DATA USE

How Facebook Uses Big Data

• Facebook keeps highly detailed information on its users. User behavior is tracked through:

  – Tracking cookies

  – Facial recognition

  – Tag suggestion

  – “Like” analysis

• Facebook uses Big Data to:

  – Create ‘flashback’ videos and memories

  – Create targeted marketing campaigns (Klosowski 2013)

    • Because of the information people put on Facebook about themselves, advertisers can select very specific parameters for their ads.

    • Example: “Someone engaged to be married, who lives in New York, between the ages of 20-30, who likes swimming, and who drives a BMW.”

  – Analyze social and political movements (Monnappa 2015)

    • “Like” behavior can accurately predict a person’s sexual orientation, intelligence, emotional stability, religion, substance use, age, gender, political affiliation, etc., and influence user actions.

    • Example: In 2010, Facebook users who noticed the “I Voted” sticker on their friends’ profiles were more likely to vote and be vocal about voting after they saw their friends do it. Facebook claims its “I Voted” sticker directly motivated 60,000 people and indirectly motivated 280,000 people to vote.
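
The targeted-marketing example above amounts to a filter over user attributes. The following sketch shows the idea with a hypothetical `Profile` record; the field names and sample data are invented for illustration and do not reflect Facebook's actual data model or advertising API.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Hypothetical user record; field names are illustrative, not Facebook's schema."""
    name: str
    age: int
    city: str
    relationship: str
    interests: set = field(default_factory=set)
    car: str = ""


def matches_campaign(p: Profile) -> bool:
    """Targeting rule mirroring the slide's example audience."""
    return (
        p.relationship == "engaged"
        and p.city == "New York"
        and 20 <= p.age <= 30
        and "swimming" in p.interests
        and p.car == "BMW"
    )


# Made-up sample profiles.
profiles = [
    Profile("A", 27, "New York", "engaged", {"swimming", "cooking"}, "BMW"),
    Profile("B", 34, "New York", "engaged", {"swimming"}, "BMW"),
    Profile("C", 25, "Boston", "single", {"running"}, "Audi"),
]
audience = [p.name for p in profiles if matches_campaign(p)]

if __name__ == "__main__":
    print(audience)
```

Each additional condition narrows the audience, which is why detailed self-reported and inferred attributes are valuable to advertisers.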

Image source: http://ucsdnews.ucsd.edu/pressrelease/facebook_fuels_the_friend_vote

Credit: Facebook.com


LEGAL, TECHNICAL, AND INDIVIDUAL CHALLENGES

• “Scientific knowledge and technologies are primarily being applied to the betterment of society and/or for economic gain, but they are becoming increasingly accessible to a larger number of individuals, including those who have malevolent intent.” -- AAAS 2014

• Vulnerabilities of datasets exist in the form of cyber and data security threats.

• The interconnectedness of real and virtual worlds increasingly exposes individual private information, including personal data stored on a digital device, financial data, health issues, and personal interests.

• It is difficult to anticipate the uses of data that may result from large data-mining processes and therefore the types of analysis and targeting that may follow.

• Datasets can be used to discriminate against individuals and communities. For example, demographic and/or health data could be used to deny health or auto insurance or social services.


BENEFITS ASSESSMENT FRAMEWORK FOR BIG DATA

[Flowchart. Questions shown, in the order extracted:]

• What societal and/or national issues (including national security concerns) need to be addressed and what resources are needed to address those problems?

• Could Big Data capabilities infringe on human rights, freedoms, and liberties?

• What opportunities to address societal and/or national issues (including national security issues) need to be pursued and what resources are needed to pursue those opportunities?

• Do Big Data technologies provide the necessary capabilities to address the resource needs?

• Do Big Data capabilities improve current capabilities for addressing societal and/or national problems (including national security concerns)?

• Could Big Data technologies enhance a nation’s ability to address societal problems (including biological security nationally or transnationally)?

• Could Big Data technologies facilitate coordination and cooperation among security agencies and scientists to address societal and/or national problems (including national security problems)?

[Branches are labeled Yes and No; the two outcomes are BENEFIT and NO BENEFIT.]

After: AAAS 2014, Figure 4

The framework used by the AAAS to determine benefits of big data use


BENEFITS ASSESSMENT FOR BIG DATA

Shifting public concerns about data collection and national security

• In December 2015, a Pew Research Center survey found that 56% of Americans felt the government had not gone far enough to protect the country.

• When high-profile cases pitting national security against an individual’s privacy emerge, the majority of citizens side with national security, although there is a general urge to avoid dramatic invasions of personal privacy.

  – People acknowledge that “the trade-off between offering information about themselves in exchange for something of value are shaped by both the conditions of the deal and the circumstances of their lives. People indicated that their interest and overall comfort level in sharing personal information depends on the company or organization with which they are bargaining and how trustworthy or safe they perceive the firm to be. It also depends on what happens to their data after they are collected, especially if the data are made available to third parties, and on how long the data are retained.” – Rainie and Maniam 2016


BIG DATA TECHNICAL CHALLENGES & CURRENT SOLUTIONS


Image source: AAAS 2014


BIG DATA FUTURE CHALLENGES

• As technology improves, data volumes will continue to grow, presenting continued storage issues.

• Additional tools for analysis will emerge to enable real-time decision-making, which will require improvements to machine learning.

• More companies and programs will try to find ‘value and revenue in their data.’

• Learning institutions and businesses will need to develop new programs to stay relevant.

• Artificial intelligence and smart devices (robots, self-driving cars, virtual automated customer service representatives, etc.) will trend. It will be difficult to keep a human touch.

• Privacy and data leaks will remain a concern.


BIG DATA & PRIVACY - POLICY RECOMMENDATIONS

2014 White House Office of Science and Technology Policy Recommendations (PCAST 2014)

• 1: Policy attention should focus more on the actual uses of big data and less on its collection and analysis. (PCAST 2014)

• 2: Policies and regulation at all levels of government should not embed particular technological solutions, but rather should be stated in terms of intended outcomes. (PCAST 2014)

• 3: With coordination and encouragement from the White House Office of Science and Technology Policy (OSTP), the agencies of the Networking and Information Technology Research and Development program should strengthen U.S. research in privacy-related technologies and in the relevant areas of social science that inform the successful application of those technologies. (PCAST 2014)

• 4: OSTP together with the appropriate educational institutions and professional societies should encourage increased education and training opportunities concerning privacy protection, including career paths for professionals. (PCAST 2014)

• 5: The United States should take the lead both in the international arena and at home by adopting policies that stimulate the use of practical privacy-protecting technologies that exist today. (PCAST 2014)


BIG DATA IN NATIONAL SECURITY AND LIFE SCIENCES – POLICY RECOMMENDATIONS

Recommendations from the 2014 National and Transnational Security Implications of Big Data in the Life Sciences Report (AAAS 2014)

• The U.S. government should actively engage the science and technology communities in evaluating the potential risks and benefits of Big Data to national and transnational biological security. The evaluation of risks and benefits to national security should be a coordinated effort among private, public, and government security and scientific experts, and conducted on a regular basis. (AAAS 2014)

• The U.S. government and the broader scientific and technology communities should develop educational materials and curricula that impart an understanding of the security risks and vulnerabilities associated with Big Data in the life sciences. (AAAS 2014)


BIG DATA IN NATIONAL SECURITY AND LIFE SCIENCES – POLICY RECOMMENDATIONS (CONT.)

• The U.S. government and the broader scientific and technology communities should engage in the development of detailed solution scenarios to identify existing legal, technological, institutional, and individual solutions and gaps in governance that need addressing. This should include support for the development of security strategies that can be integrated in an open source environment where large datasets are collected, aggregated, and analyzed. (AAAS 2014)

• The U.S. government should evaluate legal, technical, institutional, and individual measures to promote the benefits of and to prevent or mitigate risks presented by multidisciplinary science such as Big Data in the life sciences, which involves computer science, data science, mathematics, engineering, bioinformatics and life sciences. This should include a review of standing statutory and other legal frameworks to determine the adequacy, applicability and efficacy for enforcement and a determination of whether new statutory and/or regulatory measures may be required. In addition, this evaluation should include an ongoing review of the available technical solutions and institutional and individual practices for their applicability to addressing the risks of Big Data in the life sciences. (AAAS 2014)


“While big data unquestionably increases the potential of government power to accrue unchecked, it also holds within it solutions that can enhance accountability, privacy, and the rights of citizens. Properly implemented, big data will become an historic driver of progress, helping our nation perpetuate the civic and economic dynamism that has long been its hallmark.”

– PCAST 2014


RESOURCES

•  American Association for the Advancement of Science. National and Transnational Security Implications of Big Data in the Life Sciences. (2014). Prepared by the American Association for the Advancement of Science in conjunction with the Federal Bureau of Investigation and the United Nations Interregional Crime and Justice Research Institute. Accessed 10 April 2017, https://www.aaas.org/report/national-and-transnational-security-implications-big-data-life-sciences.

•  Cukier, Kenneth and Viktor Mayer-Schoenberger. Big Data: A Revolution that will Transform How We Live, Work, and Think. 2014. John Murray Publishers, London, England.

•  PCAST. Fact Sheet: PCAST Report on Big Data and Privacy: A Technological Perspective (2014), May 1. Office of Science & Technology Policy, The White House. Accessed 10 April 2017, https://obamawhitehouse.archives.gov/sites/default/files/microsites/ostp/PCAST/pcast_fact_sheet_on_bdp_report_-_final_formatted.pdf. Full Report is available at: https://obamawhitehouse.archives.gov/sites/default/files/microsites/ostp/PCAST/pcast_big_data_and_privacy_-_may_2014.pdf.

•  SINTEF. "Big Data, for better or worse: 90% of world's data generated over last two years." ScienceDaily, (2013) May 22. Accessed 11 April 2017, www.sciencedaily.com/releases/2013/05/130522085217.htm.

Additional research materials and information sources regarding this topic can be found in the associated Literary & Scholastic Resource List.