Fran Berman, Data and Society, CSCI 4370/6370 Data and Society Big Data 1 – Lecture 2 1/18/19


Fran Berman, Data and Society, CSCI 4370/6370

Data and Society

Big Data 1 – Lecture 2

1/18/19


Announcements 1/18

• Please fill out the “Why are you here” sheet during break if you didn’t fill it out last time

• Office hours today (and every Friday) AE 218 1-2

• Please sign attendance sheet each time you are here (your participation grade depends partly on attendance).

– If you are on the waiting list and trying to get in the class, please do the same (put a star by your name).

• If you are enrolled and decide to drop the class or decide to leave the waiting list, please let me know ([email protected]).

• CLASS WEDNESDAY JANUARY 23 at 9:00 here.


Wednesday Section | Friday Lecture (first half) | Second half of class | Assignments
January 9: NO CLASS | January 11: INTRO – DATA AND SOCIETY | Fran presentation demo |
January 16: NO CLASS | January 18: BIG DATA 1; Topic groups / Topic materials information | Student presentations |
January 23: Student presentations | January 25: BIG DATA 2 | Student presentations | Op-Ed instructions
January 30: NO CLASS | February 1: DATA AND SCIENCE | Student presentations |
February 6: NO CLASS | February 8: DATA STEWARDSHIP AND PRESERVATION | Student presentations | Group Topics due
February 13: NO CLASS | February 15: INTERNET OF THINGS | Student presentations |
February 20: Student presentations | February 22: DATA AND PRIVACY / FOUNDATIONS | Student presentations | Op-Ed Drafts due
February 27: NO CLASS | March 1: DATA AND PRIVACY / POLICY AND REGULATION | Student presentations | Briefing instructions
March 6: Spring Break | March 8: Spring Break | |
March 13: Student presentations | March 15: DATA AND ENTERTAINMENT [ANDY MALTZ?] | Student presentations | Op-Ed Drafts Returned; Topic Reports 1 due
March 20: TOPICS PRESENTATIONS 1 | March 22: DATA AND DATING | Student presentations |
March 27: Student presentations | March 29: DIGITAL RIGHTS 1 | Student presentations | Op-Ed Finals due
April 3: NO CLASS | April 5: DIGITAL RIGHTS 2 | Student presentations | Briefings due
April 10: Student presentations | April 12: DATA AND ETHICS | Student presentations | Op-Ed Finals returned; Topic Reports 2 due
April 17: Student presentations | April 19: CAREERS IN TECH [KATHY PHAM?] | Student presentations |
April 24: Student presentations | April 26: TOPICS PRESENTATIONS 2 | |


Today

• Data Fundamentals:

– What is big data and what does it tell us?

– What doesn’t big data tell us?

• Data-driven Commerce:

– Data and Target

– Precision Agriculture

• Topic Groups / Topic Instructions

• Break

• Presentations


Lecture 2: Big Data 1

• About Big Data

• Big Data and Business


What is big data?

• Wikipedia: “Broad term for data sets so large or complex that traditional data processing applications are inadequate.”

• McKinsey: “Datasets whose size is beyond the ability of typical database software tools to capture, store, manage and analyze”

• O’Reilly Radar: “Data that exceeds the processing capacity of conventional database systems. The data that is too big, moves too fast, or doesn’t fit the structures of your database architectures. To gain value from this data, you must choose an alternative way to process it.”


What does big data tell us?

• Big data is often noisy, dynamic, heterogeneous, inter-related and untrustworthy. Why do we find it useful?

– General statistics obtained from frequent patterns and correlation analysis can disclose more reliable hidden patterns and knowledge

– Interconnected big data forms large heterogeneous information networks, with which information redundancy can be explored to compensate for missing data, cross check conflicting cases, validate trustworthy relationships, disclose inherent clusters, and uncover hidden relationships and models.


Big data visualization from the Cooper Hewitt Design Museum

(Thanks to Sarah Schattschneider!)

• “Flight Patterns” by Aaron Koblin: https://www.youtube.com/watch?v=ttH7sQ48n5k

• From https://collection.cooperhewitt.org/objects/68743525/: “Flight Patterns is a data visualization project that traces domestic airline traffic during a single 24-hour period over North America. Flight paths, using datasets provided by the Federal Aviation Administration, are rendered as arced trajectories. The result is a stunning visual animation that elegantly renders air traffic data as cartography.”


About Big Data [Strata]

• Value of big data: analytical use, enabling new products

• Ways that big data impacts infrastructure

– Volume: big data calls for scalable storage and a distributed approach to querying

– Velocity: big data infrastructure must adapt to the speed of the input and the need for quick analysis and turnaround. Need for stream processing technologies

– Variety: Source data often “messy”, non-homogeneous, unstructured. Infrastructure must organize and find meaning from it.
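
The velocity point above can be made concrete with a minimal sketch of stream-style processing: a summary is updated one record at a time, so the full stream never has to be stored. The class and function names here are hypothetical, and the update rule is Welford's online algorithm, chosen only for illustration rather than taken from the lecture.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Running count, mean, and variance via Welford's online algorithm."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        # Fold one new reading into the summary in O(1) time and memory.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        # Population variance; 0.0 until at least one value has arrived.
        return self.m2 / self.count if self.count else 0.0


def summarize(stream):
    """Consume any iterable of numbers one item at a time."""
    stats = RunningStats()
    for value in stream:
        stats.update(value)
    return stats


readings = iter([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
stats = summarize(readings)
print(stats.count, stats.mean, stats.variance)
```

Because each update touches only three numbers, the same code works whether the stream has eight readings or eight billion.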


Big Data – Potential for Innovation

[Figure: quadrant diagram. Axes: Things You Know / Things You Don’t Know, and Questions You’re Asking / Questions You Haven’t Thought Of. Labeled regions: Conventional Data Analytics, Data Acquisition, BIG DATA, Data-enabled Exploration.]

Image adapted from NIST. Original credit: Jason Kolb, Applied Data Labs; Modified from the original at: www.applieddatalabs.com/content/new-reality-business-intelligence-and-big-data


How is big data useful in industry?

• (Big) data is used by virtually every industry, often to boost or improve production

• Big data contributing to new ways of creating value:

– Creating transparency

– Enabling experimentation to discover needs, expose variability and improve performance

– Segmenting populations to customize actions

– Replacing / supporting human decision making with automated algorithms

– Supporting new business models, products, services

• Big data becoming a competitive advantage and means of industry growth

• Big data enabling substantial growth in productivity and customer satisfaction.

• Big data enabling new insights and discoveries


Big data can mean big profits


McKinsey’s take on Big Data (circa 2011)


Inferences from Big Data: Correlation and Causation

• Correlation is a statistical measure that indicates the extent to which two or more variables fluctuate together. [http://whatis.techtarget.com/definition/correlation]

• Causation, or causality, is the capacity of one variable to influence another. The first variable may bring the second into existence or may cause the incidence of the second variable to fluctuate.

• Causation is often confused with correlation, which indicates the extent to which two variables tend to increase or decrease in parallel. However, correlation by itself does not imply causation. There may be a third factor, for example, that is responsible for the fluctuations in both variables. [http://whatis.techtarget.com/definition/causation]
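
The "third factor" case above can be illustrated with a small simulation. This is a sketch under invented assumptions: temperature, ice cream sales, and sunburns are hypothetical stand-ins, and the noise levels are arbitrary. Two variables that never influence each other still correlate strongly because both depend on a shared driver, and the correlation largely disappears once that driver is controlled for with a first-order partial correlation.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_correlation(xs, ys, zs):
    """Correlation of xs and ys after controlling for zs."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def simulate(n=2000, seed=0):
    """Hypothetical data: temperature drives both other variables."""
    rng = random.Random(seed)
    temperature = [rng.gauss(0, 1) for _ in range(n)]
    ice_cream = [t + rng.gauss(0, 0.5) for t in temperature]
    sunburns = [t + rng.gauss(0, 0.5) for t in temperature]
    return temperature, ice_cream, sunburns


temperature, ice_cream, sunburns = simulate()
raw = pearson(ice_cream, sunburns)
controlled = partial_correlation(ice_cream, sunburns, temperature)
print(f"raw correlation: {raw:.2f}")
print(f"after controlling for temperature: {controlled:.2f}")
```

In real data the third factor is usually unmeasured, which is why a strong correlation alone cannot settle the causal question.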


Beware of too much inference from Big Data! Correlation vs. Causation

Correlations from Spurious correlations: http://www.tylervigen.com/spurious-correlations


Limitations of Big Data

• From “Eight (No, Nine!) Problems With Big Data”, New York Times, https://www.nytimes.com/2014/04/07/opinion/eight-no-nine-problems-with-big-data.html

1. “… although big data is very good at detecting correlations, …, it never tells us which correlations are meaningful”

2. “ … big data can work well as an adjunct to scientific inquiry but rarely succeeds as a wholesale replacement.”

3. “ … many tools that are based on big data can be easily gamed.”

4. “ … even when the results of a big data analysis aren’t intentionally gamed, they often turn out to be less robust than they initially seem.”

5. “ … whenever the source of information for a big data analysis is itself a product of big data, opportunities for vicious cycles abound [echo chamber effect].”

6. “ … risk of too many correlations.”

7. “ … big data is prone to giving scientific-sounding solutions to hopelessly imprecise questions.”

8. “ …big data is at its best when analyzing things that are extremely common, but often falls short when analyzing things that are less common.”

9. “ … the hype.”
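
Point 6 (the risk of too many correlations) can be seen in a short simulation. The sketch below uses arbitrary assumed settings (40 series, 20 observations each, a 0.5 cutoff). It generates series of pure random noise, compares every pair, and counts how many look strongly correlated by chance alone.

```python
import itertools
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spurious_pairs(num_series=40, length=20, threshold=0.5, seed=1):
    """Count pairs of unrelated random series with |r| >= threshold."""
    rng = random.Random(seed)
    series = [[rng.gauss(0, 1) for _ in range(length)]
              for _ in range(num_series)]
    total = 0
    strong = 0
    for a, b in itertools.combinations(series, 2):
        total += 1
        if abs(pearson(a, b)) >= threshold:
            strong += 1
    return strong, total


strong, total = spurious_pairs()
print(f"{strong} of {total} unrelated pairs have |r| >= 0.5")
```

The number of pairs grows roughly with the square of the number of variables, so the more variables a dataset has, the more chance "findings" it will contain.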


Data-Driven Commerce


Predictive Analytics

• Retailers highly interested in the buying habits of their customers: what you like, what you need, which coupons will help draw you to their store, etc.

• Retailers also use highly sophisticated models of human behavior: buying behavior, formation of habits, etc. to help determine how to best draw customers

• Many retailers hiring statisticians, mathematicians, data scientists to improve the bottom line through strategic marketing, including Target


Predictive Analytics at Target

• Target develops profile of customer information for each customer

– Information indexed by a unique guest ID number: credit card information, name, email address, purchases, demographic information as available, etc.

– Information is collected by Target or bought from other sources (information available includes ethnicity, job history, magazines you read, whether you’ve declared bankruptcy or gotten divorced, what kinds of topics you talk about online, etc.)

• Retailers know that at major life events, old routines fall apart and usual brand loyalties and buying habits are in flux: graduating from college, birth of a child, moving to a new area / town, etc.

• Target wanted to focus on the life event of having a child

– New parents will develop new buying routines for diapers, toys, lotion, baby food, clothes, etc.

– If Target can change the buying habits of new parents before the birth of the baby, they are pre-competitive and can win big


Marketing to Pregnant Women

• Target statistician Andrew Pole analyzed data from customers who had signed up in Target’s baby registry

• Analyses identified ~25 products that, when analyzed together, contributed to a “pregnancy prediction” score (e.g. unscented lotion, vitamin supplements, etc.). Score also estimated due date.

• Target used the pregnancy prediction score and estimated due date to decide which customers to send baby product coupons to, which coupons to send, and when

• Anecdote:
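
A score built from a set of purchases, as described above, can be sketched as a weighted sum. Everything specific in this sketch is an assumption for illustration: the product list beyond the two examples named on the slide, the weights, the 0.5 threshold, and the function names are all invented. It does not reproduce Target's model and leaves out the due-date estimate.

```python
# Hypothetical signal products and weights, invented for illustration only.
SIGNAL_WEIGHTS = {
    "unscented lotion": 0.30,
    "vitamin supplements": 0.25,
    "cotton balls": 0.15,
    "large tote bag": 0.10,
}


def prediction_score(purchases, weights=SIGNAL_WEIGHTS):
    """Sum the weights of distinct signal products bought, capped at 1.0."""
    score = sum(weights.get(item, 0.0) for item in set(purchases))
    return min(score, 1.0)


def select_for_coupons(customers, threshold=0.5):
    """Return guest IDs whose score meets the (assumed) threshold."""
    return [guest_id for guest_id, purchases in customers.items()
            if prediction_score(purchases) >= threshold]


customers = {
    "guest-1": ["unscented lotion", "vitamin supplements", "bread"],
    "guest-2": ["bread", "cotton balls"],
}
print(select_for_coupons(customers))
```

No single purchase is decisive in a score like this. It is the combination of otherwise ordinary items that crosses the threshold.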


Minimizing the “creepiness factor”

• Behavioral research and data analysis helping drive much more in-depth predictive analytics

• Combining prediction and analysis with marketing infrastructure:

• Target had the capacity to send customers customized ad books. Once it is determined that they are potentially pregnant, seemingly random pregnancy and baby products can be included with other ads that accurately target the consumer.

• Company began to mix baby products with other things (e.g. lawn mowers, wineglasses, etc.)

• Customers found this less creepy and used the baby coupons


Personalized marketing

• Soon after the new ad campaign, Target’s “Mom and Baby” sales greatly increased; Target’s total revenues grew from $44B in 2002 to $67B in 2010

• Similar data mining approach being used in many, many stores and businesses: department stores, Facebook, Google, etc.

• Key issues about privacy remain and your rights within the burgeoning market for data about you are yet to be sorted out.


21 Things “Big Data” Knows about You (Forbes) -- 1

http://www.forbes.com/sites/bernardmarr/2016/03/08/21-scary-things-big-data-knows-about-you/#23aec89b66a7

1. Your browser knows what you’ve searched for.

2. Google also knows your age and gender — even if you never told them. They make a pretty comprehensive ads profile of you, including a list of your interests (which you can edit) to decide what kinds of ads to show you.

3. Facebook knows when your relationship is going south. Based on activities and status updates on Facebook, the company can predict (with scary accuracy) whether or not your relationship is going to last.

4. Google knows where you’ve travelled, especially if you have an Android phone.

5. And the police know where you’re driving right now — at least in the U.K., where closed circuit televisions (CCTV) are ubiquitous. Police have access to data from thousands of networked cameras across the country, which scan license plates and take photographs of each car and their driver. In the U.S., many cities have traffic cameras that can be used similarly.

6. Your phone also knows how fast you were going when you were traveling. (Be glad they don’t share that information with the police!)

7. Your phone has also probably deduced where you live and work.

8. The Internet knows where your cat lives. When we publish photos of our cats on Instagram and other social media sites, we also share hidden metadata about the geographic location where each photo was taken.

9. Your credit card company knows what you buy. Of course your credit card company knows what you buy and where, but this has raised concerns that what you buy and where you shop might impact your credit score. They can use your purchasing data to decide if you’re a credit risk.

10. Your grocery store knows what brands you like. For every point a grocery store or pharmacy doles out, they’re collecting mountains of data about your purchasing habits and preferences. The chains are using the data to serve up personalized experiences when you visit their websites, personalized coupon offers, and more.


21 Things “Big Data” Knows about You (Forbes) -- 2

http://www.forbes.com/sites/bernardmarr/2016/03/08/21-scary-things-big-data-knows-about-you/#23aec89b66a7

11. HR knows when you’re going to quit your job. An HR software company called Workday is testing out an algorithm that analyzes text in documents and can predict from that information which employees are likely to leave the company.

12. Target knows if you’re pregnant. (Sometimes even before your family does.)

13. YouTube knows what videos you’ve been watching. And even what you’ve searched for on YouTube.

14. Amazon knows what you like to read, Netflix knows what you like to watch. Even your public library knows what kinds of media you like to consume.

15. Apple and Microsoft know what you ask Siri and Cortana.

16. Your child’s Barbie doll is also telling Mattel what she and your child talk about.

17. Police departments in some major cities, including Chicago and Kansas City, know you’re going to commit a crime before you do it.

18. Your auto insurance company knows when and where you drive, and they may penalize you for it, even if you’ve never filed a claim.

19. Data brokers can help unscrupulous companies identify vulnerable consumers. For example, they may identify a population as a “credit-crunched city family” and then direct advertisements at you for payday loans.

20. Facebook knows how intelligent you are, how satisfied you are with your life, and whether you are emotionally stable or not – simply based on a big data analysis of the ‘likes’ you have clicked.

21. Your apps may have access to a lot of your personal data. Angry Birds gets access to your contact list in your phone and your physical location. Bejeweled wants to know your phone number. Some apps even access your microphone to record what’s going on around you while you use them.


Precision Agriculture – the data-optimized farm


What is Precision Agriculture?

• [Wikipedia] Precision agriculture is a farming management concept based on observing, measuring and responding to inter- and intra-field variability in crops. The goal of precision agriculture research is to define a decision support system (DSS) for whole farm management with the goal of optimizing returns on inputs while preserving resources.

• Data-generating technologies used to map/monitor crop yield, terrain features, organic matter content, moisture levels, nitrogen levels, pH, etc.

Top image (vegetation density) shows the color variations determined by crop density where dark blues and greens indicate lush vegetation and reds show areas of bare soil. Middle image (water deficit) is a map of water deficit. Greens and blues indicate wet soil and reds are dry soil. Bottom image (crop stress) shows where crops are under serious stress (indicated by red and yellow pixels). https://en.wikipedia.org/wiki/Precision_agriculture, NASA Earth Observatory.


How does precision agriculture work?

• Farmers collect data on weather, soil, fertilizer and water uptake, productivity, etc. to optimize crop yield and minimize resources (water, fertilizer, soil additives, etc.)

• Technologies:

– GPS-based computer mapping: yield and crop data used to customize crop management across and within fields. Farmers can add resources to lower-yielding areas or reduce inputs and cut their losses.

– Guidance systems (GPS-guided or auto-steered combines and tractors): Systems reduce operator errors by determining precise field locations and compensating for operator fatigue. Systems reduce over- and under-application of sprays and improve the alignment and seeding of field crop rows, enabling improved harvesting.

– Variable-rate technology: Customized seeding and application of fertilizer, chemicals, and pesticides accomplished with machine attachments that can vary the rate of application from GPS controls.

• What’s the benefit? Application of precise, targeted amounts of water, fertilizer, pesticides, chemicals, etc. to specific crops and crop areas can

– Increase productivity, maximize yields, increase profit

– Reduce added nutrients and other crop inputs when not needed, benefiting crops, soil and groundwater and reducing costs.
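
The variable-rate idea above can be sketched as a per-zone calculation: each field zone gets fertilizer in proportion to how far its measured soil nitrogen falls short of a target, rather than one uniform rate. The target level, conversion factor, rate cap, and zone readings below are all invented numbers, not agronomic recommendations, and the function name is hypothetical.

```python
def application_rates(nitrogen_by_zone, target_ppm=25.0,
                      kg_per_ppm=2.0, max_rate=40.0):
    """Return fertilizer rate (kg/ha) per zone.

    The rate is proportional to the nitrogen shortfall below the target
    and is clamped to the range [0, max_rate]. All defaults are assumed
    values for illustration.
    """
    rates = {}
    for zone, measured_ppm in nitrogen_by_zone.items():
        shortfall = max(target_ppm - measured_ppm, 0.0)
        rates[zone] = min(shortfall * kg_per_ppm, max_rate)
    return rates


# Hypothetical soil nitrogen readings (ppm) for four zones of one field.
readings = {"A": 10.0, "B": 25.0, "C": 30.0, "D": 0.0}
for zone, rate in application_rates(readings).items():
    print(f"zone {zone}: {rate:.1f} kg/ha")
```

Zones already at or above the target receive nothing, which is where the savings on inputs and the reduced runoff described above come from.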


Precision Corn

Cost savings and efficient resource management on U.S. corn farms using

• Yield mapping

• GPS and soil mapping

• Guidance system (GPS-enabled combine harvester)

• Variable rate technologies (differential application of water, fertilizer, etc.)

Graph and info from: “Sequential Adoption and Cost Savings from Precision Agriculture,” by David Schimmelpfennig and Robert Ebel, Journal of Agricultural and Resource Economics and https://www.ers.usda.gov/amber-waves/2016/may/cost-savings-from-precision-agriculture-technologies-on-us-corn-farms/


USDA study (data gathered between 1996 and 2013) results

USDA Precision Agriculture Study found that:

• Adoption varies

– Adoption rates vary significantly across precision agriculture technologies and type of crop

• Size of the farm matters

– Largest corn farms (>2900 acres) have double the adoption rates of all farms

– Average size farms benefit from 3 major precision agriculture technologies

• Other cost-of-operations factors matter

– Both precision agriculture technology adoption and farm size influence production costs (labor higher on large farms)

– Labor, machinery, soil testing all impact uptake of precision agriculture


More next time:

• Big data and astronomy

• Big data and elections

• Big data challenges and misuse


Lecture 2 Sources (not on Slides)

• “Big data: The next frontier for innovation, competition and productivity”, Report from the McKinsey Global Institute, http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation

• “What is big data?” O’Reilly Radar, http://radar.oreilly.com/2012/01/what-is-big-data.html

• “How Target figured out a teen girl was pregnant before her father did,” Forbes, http://www.forbes.com/sites/kashmirhill/2012/02/16/how-target-figured-out-a-teen-girl-was-pregnant-before-her-father-did/#5df063ed34c6

• “How Companies Learn your secrets”, The New York Times, http://www.nytimes.com/2012/02/19/magazine/shopping-habits.html?pagewanted=all&_r=0

• “Precision Agriculture”, Wikipedia

• “Farm Profits and Adoption of Precision Agriculture”, USDA, https://www.ers.usda.gov/publications/pub-details/?pubid=80325


Topic Group Instructions


How you’ll be graded

Grade Distribution (points)

• Presentation 1: 15

• Presentation 2: 15

• Op-Ed Final: 15

• Topic Report and Presentation (Group): 35

• Briefing: 10

• Participation / Attendance: 10


Instructions for Topic Groups

• Topics Materials (groups of 4; group presentation / 15 points; group Topic Report / 15 points; coordination / at most 5 points, based on individual assessments):

– Get/read Bruce Schneier’s book (didn’t order it at the Bookstore, should be available through Amazon, etc.)

– Fran will form Topic Groups on January 18.

– Choose a topic from Schneier’s book and submit a one-pager describing your topic to Fran by February 8 (more detail later). Choose your Topic Report date preference (March 20 or April 26). (Fran may rebalance …)

– Assignment components:

• Jointly written topic description / Feb. 8 – 0 points

• Jointly written Topic Report (6-8 pages) – all must contribute – 15 points

• Joint 15 minute talk (+ 5 min Q&A) (all must contribute) – 15 points

• Individual assessment of group dynamics and coordination – used for 0-5 point coordination grade

– Topic reports will be provided on the web for the class to read.

Group grade (35 points):

• Joint Presentation: 15 points, usual rubric

• Joint written Topic Report: 15 points, group grade, information provided on Jan. 18

• Group coordination: at most 5 points for coordination based on materials and individual assessments


More Detail / Group Report

• Fran will read numbers. Remember your number.

• Group Topic reports:

– Pick a topic from Schneier’s book that you will describe in some depth. Group topics due to Fran February 8 (or before if you’re ready)

• Topic = problem that needs to be addressed and potential ways of addressing it

– Write a report on the topic. The report should have the following format:

• Introduction (What is the problem? What are the consequences of not addressing the problem?)

• Supporting detail (Why is it a problem? What factors contribute to the problem?)

• Potential solution(s) (What could we do to address the problem? Why would this approach / these approaches address the problem?)

• Necessary infrastructure (What policy, agreements, software, hardware, algorithms, data might be needed to support the solution?)

• Metrics of success (How will we know if the solution successfully addresses the problem?)

• Next steps (What should we do next if we want to address the problem?)

• References (not counted in page count)


More Detail / Group Report

• Fran will read numbers. Your group is your number mod 4.

• Group Topic reports:

– Pick a topic from Schneier’s book that you will describe in some depth. Group topics due to Fran February 8 (or before if you’re ready)

• Topic = problem that needs to be addressed and potential ways of addressing it

• Basics:

– Who is your audience? Non-specialists / professionals

– What is your purpose? Describe a problem and a proposed solution compellingly and with reasonable supporting evidence and technical details

– Format:

• Joint report (one copy for everyone in the group, one grade for everyone in the group)

• 6-8 pages (doesn’t include references)

• 12 point font, 1.5 spacing

• Turn in hardcopy to Fran at the beginning of class on March 15 (first groups) or April 12 (second groups)
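
The "your group is your number mod 4" rule above is a one-line calculation. A minimal sketch follows, with a hypothetical function name and example numbers (the slide does not say how numbers are handed out):

```python
from collections import defaultdict


def assign_groups(student_numbers, num_groups=4):
    """Map each group label (number mod num_groups) to its members."""
    groups = defaultdict(list)
    for number in student_numbers:
        groups[number % num_groups].append(number)
    return dict(groups)


# Hypothetical example: students numbered 1 through 8.
for group, members in sorted(assign_groups(range(1, 9)).items()):
    print(f"group {group}: {members}")
```

Numbers that are exact multiples of 4 land in group 0.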


Report Format

• The report should have the following format:

– Introduction (What is the problem? What are the consequences of not addressing the problem?)

– Supporting detail (Why is it a problem? What factors contribute to the problem?)

– Potential solution(s) (What could we do to address the problem? Why would this approach / these approaches address the problem?)

– Necessary infrastructure (What policy, agreements, software, hardware, algorithms, data, etc. might be needed to support the solution?)

– Metrics of success (How will we know if the solution successfully addresses the problem?)

– Next steps (What should we do next if we want to address the problem?)

– References (not counted in page count)

Expectations for Topic Groups

• It is expected that you will meet with your group to discuss possible topics, apportion work, discuss issues

– When / how often you meet is up to you

– Use the project to learn how to collaborate/negotiate – this is an exercise in writing, presenting and working successfully in groups.

• It is expected that everyone will have a job for the written report and presentation

– Workloads should be equivalent

• Recommendations:

– Meet often (1-2+ times a week)

– Don’t paste components together – make sure it comes across with a unified voice

– Focus on professional, compelling materials. Tell a good story, spell check, include charts, graphs, visuals as needed.

– Create and revise the report and talk multiple times to get the best product.

Break

Presentations

Presentation Articles for January 23

• “The Most Famous Person to Die in 2018, According to Data Science”, Huffington Post, https://www.huffingtonpost.com/entry/most-famous-celebrity-death-2018_us_5c26634ee4b08aaf7a903312 (Clarisse B.)

• “Just don’t call it Privacy”, New York Times, https://www.nytimes.com/2018/09/22/sunday-review/privacy-hearing-amazon-google.html (N. Agu)

• “Want to Stop Students from Using Their Smartphones in class? Ironically, there’s an App for That”, Washington Post, https://www.washingtonpost.com/dc-md-va/2018/09/18/want-stop-students-zoning-out-class-with-their-smartphones-ironically-theres-an-app-that/?utm_term=.683cc4d445a0 (Heather S.)

• “Giving Viewers What They Want,” NY Times, http://www.nytimes.com/2013/02/25/business/media/for-house-of-cards-using-big-data-to-guarantee-its-popularity.html?pagewanted=all&_r=1& (Rahul D.)

Presentation Articles for January 25

• “The $3B map: scientists pool oceans of data to plot Earth’s final frontier”, Reuters, https://www.reuters.com/article/us-oceans-rights-science/the-3-billion-map-scientists-pool-oceans-of-data-to-plot-earths-final-frontier-idUSKBN1O504M (Korryn R.)

• “Crowdsourcing Snow Depth Data with Citizen Scientists”, Eos, Earth and Space Science News, https://eos.org/project-updates/crowdsourcing-snow-depth-data-with-citizen-scientists (Grace R.)

• “Biden’s Leading the Polls, but that doesn’t mean much yet”, FiveThirtyEight, https://fivethirtyeight.com/features/bidens-leading-the-iowa-polls-but-that-doesnt-mean-much-yet/ (Kevin B.)

• “Electronic Voting was going to be the Future. Now paper’s making a comeback”, CNET, https://www.cnet.com/news/electronic-voting-machines-were-going-to-be-the-future-now-paper-ballots-make-a-comeback/ (Rufeng M.)

Presentation Articles for February 1

• “Investigating Glaciers in Depth”, ScienceDaily, https://www.sciencedaily.com/releases/2018/10/181024095348.htm (Chris T.)

• “The polar vortex has fractured, and the eastern U.S. faces a punishing stretch of winter weather”, Washington Post, https://www.washingtonpost.com/weather/2019/01/15/polar-vortex-has-fractured-eastern-us-faces-punishing-stretch-winter-weather-just-underway/?utm_term=.d1b70d0443ea&wpisrc=nl_most&wpmm=1 (Noah Z.)

• “How an unlikely family history site transformed cold case investigations”, New York Times, https://www.nytimes.com/2018/10/15/science/gedmatch-genealogy-cold-cases.html (Samad F.)

• “Genetics has learned a ton – mostly about white people. That’s a problem.”, Vox, https://www.vox.com/science-and-health/2018/10/22/17983568/dna-tests-precision-medicine-genetics-gwas-diversity-all-of-us (Milena G.)

Presentation Articles for Today

• “Digital Divide is Wider Than We Think, Study Says”, New York Times, https://www.nytimes.com/2018/12/04/technology/digital-divide-us-fcc-microsoft.html (Jessie O.)

• “Google and the flu: How Big Data will Help us Make Gigantic Mistakes”, the Guardian, https://www.theguardian.com/technology/2014/apr/05/google-flu-big-data-help-make-gigantic-mistakes (Rachel R.)

• “10 Examples of Predictive Customer Experience Outcomes Powered by AI”, Forbes, https://www.forbes.com/sites/blakemorgan/2018/12/20/10-examples-of-predictive-customer-experience-outcomes-powered-by-ai/#711136515d0b (Raz R.)

• “5 Ways Facebook Shared Your Data”, New York Times, https://www.nytimes.com/2018/12/19/technology/facebook-data-sharing.html (Ayushi B.)