Ethics of Big Data Eduardo Felipe Zecca da Cruz

Upload: burton

Post on 24-Feb-2016


Page 1: Ethics of Big Data

Ethics of Big Data

Eduardo Felipe Zecca da Cruz

Page 2: Ethics of Big Data

What is Big Data?

• Stamford, Conn.-based IT research firm Gartner Inc. defines "big data" as "high-volume, velocity and/or variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making and process automation."

• Other experts point out that big data might include unstructured textual information from social media sites, machine-generated log data and a host of other information collected by cloud applications, on-premises applications and websites.

Page 3: Ethics of Big Data

Numbers

• Stored information totaled 0.8 zettabytes, the equivalent of 800 billion gigabytes, in 2009.

• IDC predicts that by 2020, 35 zettabytes of information will be stored globally.

• The average price of an untargeted online ad was $1.98 per thousand views.

• The average price of a targeted ad was $4.12 per thousand views.
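The figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes decimal (SI) units, where one zettabyte is 10^21 bytes and one gigabyte is 10^9 bytes; the variable names are illustrative only and do not come from the slides.

```python
# Unit sizes in bytes (assumption: decimal SI units, not binary units).
ZETTABYTE = 10**21
GIGABYTE = 10**9

# Figures quoted on the slide.
stored_2009_zb = 0.8      # zettabytes stored in 2009
predicted_2020_zb = 35    # zettabytes predicted by IDC for 2020
untargeted_cpm = 1.98     # dollars per thousand views, untargeted ad
targeted_cpm = 4.12       # dollars per thousand views, targeted ad

# Convert zettabytes to gigabytes.
stored_2009_gb = stored_2009_zb * (ZETTABYTE // GIGABYTE)

# How many times larger the 2020 prediction is than the 2009 total.
growth_factor = predicted_2020_zb / stored_2009_zb

# How much more a targeted ad costs relative to an untargeted one.
targeted_premium = targeted_cpm / untargeted_cpm

print(f"2009 total: {stored_2009_gb:.3g} GB")
print(f"Predicted growth 2009-2020: {growth_factor:.2f}x")
print(f"Targeted vs. untargeted ad price: {targeted_premium:.2f}x")
```

Under these assumptions, 0.8 zettabytes does come out to 800 billion gigabytes, the predicted growth is roughly 44-fold, and a targeted ad costs a little over twice as much as an untargeted one.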

Page 4: Ethics of Big Data

Concerns

• Privacy Issues
  • Users could feel monitored for every single action they take on the Internet

• Examples
  • Target Case
    • Sent coupons for baby products to a teenager after analyzing her shopping patterns
    • Revealed to her family that she was pregnant
  • Travel Agency Orbitz Case
    • Began steering Apple users toward pricier travel options after data-crunching revealed that they are generally willing to pay more for travel

Page 5: Ethics of Big Data

Data Brokers

• A data broker is a business that collects personal information about consumers and sells it to other organizations

• Google, Facebook, Amazon and Microsoft are among the companies that collect the most personal information

Page 6: Ethics of Big Data

Hadoop

• Open-source software framework for the distributed storage and processing of very large data sets across clusters of computers

• Created in 2005

• Licensed under the Apache License 2.0

• Inspired by Google’s MapReduce

Page 7: Ethics of Big Data

Hadoop Architecture

• It consists of:
  • Hadoop Common package
    • Provides filesystem and OS-level abstractions
    • Contains the Java Archive (JAR) files and scripts needed to start Hadoop
  • Hadoop Distributed File System (HDFS)
  • Hadoop YARN
  • Hadoop MapReduce
    • Programming model for processing large data sets with a parallel, distributed algorithm on a cluster
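To make the MapReduce programming model concrete, here is a minimal single-process sketch of a word count in Python. It is not Hadoop's API: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical, and everything runs on one machine, whereas Hadoop would distribute the map and reduce tasks across the nodes of a cluster.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence (hypothetical driver)."""
    # In a real cluster each document (or block) would be mapped in parallel.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle_phase(mapped)
    # Each key could likewise be reduced independently on a different node.
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data needs ethics", "big data needs Hadoop"]
    print(word_count(docs))
```

The point of the model is that the map step handles each record independently and the reduce step handles each key independently, which is what allows the work to be spread over many machines.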

Page 8: Ethics of Big Data

Hadoop Cluster

Page 9: Ethics of Big Data

Prominent Users

• Yahoo!
  • Yahoo Search Webmap

• Facebook
  • Claimed to have the largest Hadoop cluster in the world

• As of 2013, more than half of the Fortune 50 used Hadoop

Page 10: Ethics of Big Data

Proposals

• Commercial Privacy Bill of Rights Act of 2011
  • A bill to establish a regulatory framework for the comprehensive protection of personal data for individuals under the aegis of the Federal Trade Commission, and for other purposes.

• Do-Not-Track Online Act of 2011
  • A bill to require the Federal Trade Commission to prescribe regulations regarding the collection and use of personal information obtained by tracking the online activity of an individual, and for other purposes.

Page 11: Ethics of Big Data

Proposals

• Clarity on Practices
  • Let users know when data is being collected

• Simplicity of Settings

• Privacy by Design

• Exchange of Value

Page 12: Ethics of Big Data

References

• http://www.technologyreview.com/news/424104/what-big-data-needs-a-code-of-ethical-practices/

• http://www.forbes.com/sites/emc/2014/03/27/the-ethics-of-big-data/

• http://searchcloudapplications.techtarget.com/feature/Big-data-collection-efforts-spark-an-information-ethics-debate

• http://whatis.techtarget.com/definition/data-broker-information-broker

• http://hadoop.apache.org/

• http://en.wikipedia.org/wiki/Apache_Hadoop#File_system

• http://searchcloudcomputing.techtarget.com/definition/Hadoop