confessions (and lessons) of a "recovering" data broker

Confessions (and Lessons) of a "Recovering" Data Broker: Responsible Innovation in the Age of Big Data, Big Brother, and the Coming Skynet Terminators Jim Adler Vice President, Products Metanautix [email protected] @jim_adler http://jimadler.me Usenix Security Conference Washington, DC Aug 15 2013

Upload: metanautix

Post on 24-Jan-2015

DESCRIPTION

Jim Adler

TRANSCRIPT

Page 1: Confessions (and Lessons) of a "Recovering" Data Broker

Confessions (and Lessons) of a "Recovering" Data Broker: Responsible Innovation in the Age of Big Data, Big Brother, and the Coming Skynet Terminators

Jim Adler
Vice President, Products
Metanautix

[email protected]
@jim_adler
http://jimadler.me

Usenix Security Conference
Washington, DC

Aug 15 2013

Page 2: Confessions (and Lessons) of a "Recovering" Data Broker

About Metanautix

• Mission: Next generation big data management & analysis
  – Transparency into the big data supply chain within and across organizations

• Engineering led, product company
  – Team of veterans (10-20+ years experience)
  – Built massive data analysis systems at Google, FB, MSFT, AMZN, Pixar
  – Solidly funded by top VCs
  – Working with lighthouse customers

• Started Nov 2012, slowly emerging from stealth mode

• Stay tuned!

Page 3: Confessions (and Lessons) of a "Recovering" Data Broker

I am not an attorney.

Confession

[Venn diagram labels: Intelligence, Social Ineptitude, Obsession; Dork, Nerd, Geek, Dweeb]

Page 4: Confessions (and Lessons) of a "Recovering" Data Broker

You can often do more from the inside than the outside.

• 20B public records

• 30M visitors per month

• 50M+ reports sold

Confession

Page 5: Confessions (and Lessons) of a "Recovering" Data Broker

Listen to your toughest critics

Lesson

Page 6: Confessions (and Lessons) of a "Recovering" Data Broker

Can I have a little narcissism with my voyeurism?

• What does my background check say?

• Privacy controls
  – Suppress single address or phone number

• Comment on your own public profile

Lesson

Page 7: Confessions (and Lessons) of a "Recovering" Data Broker

Data from Feb 2013

Data from 40,000 BC thru 2002

20 Exabytes (20M TB)

Lots of data created, transferred, & stored.

Confession

Page 8: Confessions (and Lessons) of a "Recovering" Data Broker

The “Public” Data Supply Chain of You

Confession

[Diagram labels: Government; Commercial; Self-Reported; Big Data Engines; Data Uses]

[Diagram items: Search; Blogs; Criminal Records; Risk; Civil Suits; Marketing; Addresses; Directory; Phone Numbers; Payments; Background; Resumes; Public Posts; Names]

Page 9: Confessions (and Lessons) of a "Recovering" Data Broker

Lots of uses for your data … some regulated

• FIND OWNER OF DOG’S RELATIVE FOR TRANSPLANT
• SINGLES CURIOUS ABOUT THE PEOPLE THEY MEET
• PARENTS ENSURING THEIR KIDS’ SAFETY
• GENEALOGISTS CULTIVATING THEIR FAMILY TREE
• BUSINESSES THAT NEED TO UPDATE CONTACT INFORMATION ON CUSTOMERS
• FINDING LONG-LOST FRIENDS, MILITARY BUDDIES, ROOMMATES, OR CLASSMATES
• ANYONE CURIOUS ABOUT WHO'S EMAILING OR CALLING THEM
• THOSE LEGALLY ENTANGLED LOOKING FOR COURT RECORDS
• ANYONE WHO NEEDS ADDRESS HISTORIES FOR PASSPORTS
• SOCIAL NETWORKERS LOOKING TO EXPAND THEIR FRIENDS LIST
• PROFESSIONALS LEARNING ABOUT COLLEAGUES AT CONFERENCES
• RECONNECTING OUT-OF-TOUCH FAMILY MEMBERS
• ONLINE SHOPPERS VERIFYING ONLINE SELLERS
• INVESTIGATIVE JOURNALISTS RUNNING DOWN LEADS
• SALES PROFESSIONALS LOOKING FOR NEW PROSPECTS
• NETWORKERS SEEKING BUSINESS OPPORTUNITIES
• LAW ENFORCEMENT
• NON-PROFIT ORGANIZATIONS LOOKING FOR SUPPORTERS
• FIANCÉS AND THEIR CURIOUS FAMILY MEMBERS
• SOCIAL WORKERS WHO NEED TO KNOW MORE ABOUT THEIR CLIENTS
• CALLER ID OF HARASSING PHONE CALLS
• ALUMNI GROUPS ARRANGING REUNIONS
• ADOPTED KIDS SEEKING THEIR BIOLOGICAL PARENTS
• AIRLINES TRYING TO RETURN LOST LUGGAGE
• CHECKING OUT A PROSPECTIVE SOCIAL NETWORK CONNECTION
• CHECKING OUT A PROSPECTIVE DATE
• CHECKING OUT A PROSPECTIVE TENANT
• FINDING PEOPLE THAT HAVE THE SAME ILLNESS AS YOU
• RESEARCHING A PROSPECTIVE EMPLOYEE
• ANYONE RETRIEVING COURT RECORDS
• LAWYERS NEEDING QUICK ACCESS TO COURT RECORDS
• BANKING SERVICES
• RESEARCH
• SHARING
• LEARNING ABOUT A BUSINESS

[Diagram label: REGULATED]

Confession

Page 10: Confessions (and Lessons) of a "Recovering" Data Broker

Billions of Records, Millions of People

Jim Adler, Houston, TX, Age 70
Jim Adler, Redmond, WA, Age 50
Jim Adler, Denver, CO, Age 48
Jim Adler, McKinney, TX, Age 57
Jim Adler, Canaan, NH, Age 59
Jim Adler, Hastings, NE, Age 32

213 records linked to the correct 37 Jim Adlers

Philip Collins: 375 People
Jim Adler: 213 Records, 37 People
Randolph Hutchins: 5 People
Gwen Fleming: 2 People
Carol Brooks: 9800 Records, 1250 People

We don’t know you all that well

Confession
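The slide does not say how records are linked to people. As a rough illustration only, the sketch below groups records that share a name plus at least one identifying value (address or phone) using union-find. The function name `link_records`, the field names, and every record are hypothetical; a production linker would be far more careful about typos, moves, and shared households.

```python
from collections import defaultdict


def link_records(records, keys=("address", "phone")):
    """Group record indexes that share a name and at least one key value.

    Toy record-linkage sketch (an assumption, not the speaker's method).
    """
    parent = list(range(len(records)))

    def find(i):
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    seen = {}  # (name, key, value) -> index of the first record carrying it
    for idx, rec in enumerate(records):
        for key in keys:
            value = rec.get(key)
            if value is None:
                continue
            token = (rec["name"].lower(), key, value.lower())
            if token in seen:
                union(idx, seen[token])
            else:
                seen[token] = idx

    clusters = defaultdict(list)
    for idx in range(len(records)):
        clusters[find(idx)].append(idx)
    return sorted(clusters.values())


# Hypothetical records: three belong to one person, one to a namesake.
records = [
    {"name": "Jim Adler", "address": "1 Elm St, Houston, TX", "phone": None},
    {"name": "Jim Adler", "address": "1 Elm St, Houston, TX", "phone": "555-0100"},
    {"name": "Jim Adler", "address": None, "phone": "555-0100"},
    {"name": "Jim Adler", "address": "9 Oak Ave, Redmond, WA", "phone": None},
]
people = link_records(records)
print(len(records), "records linked to", len(people), "people")
```

The third record has no address in common with the first, yet both land in the same cluster because the second record bridges them. That chaining is also how a linker can merge two different people by mistake, which fits the slide's point that brokers don't know you all that well.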

Page 11: Confessions (and Lessons) of a "Recovering" Data Broker

Opt-out doesn’t always mean deletion

Confession

Jane Hampton 06/23/1998 123 Main Peoria, IL

Jane Hampton 123 Main Peoria, IL [email protected]

Jane Hampton [email protected]

Jane Hampton 06/23/1998 123 Main Peoria, IL

Jane Hampton 123 Main Peoria, IL [email protected]

Jane Hampton [email protected]

Page 12: Confessions (and Lessons) of a "Recovering" Data Broker

When towns were small, personal anonymity was low …

“The only thing worse than being talked about, is not being talked about.”

− Oscar Wilde

Lesson

Page 13: Confessions (and Lessons) of a "Recovering" Data Broker

“Good Fences Make Good Neighbors”

− Robert Frost

Urban populations grew along with personal anonymity…

Lesson

Page 14: Confessions (and Lessons) of a "Recovering" Data Broker

… we’re suffering from Privacy Vertigo.

Confession

[Chart: Privacy Expectations (0 to 120) by year, 1850 to 2030, divided into the “Rockwell” Era, the “Good Fences” Era, and the “Privacy Vertigo” Era]

Chart annotations:
• Urban Density ↓, Personal Anonymity ↓
• Urban Density ↑, Personal Anonymity
• Online Density ↓, Personal Anonymity ↓

Page 15: Confessions (and Lessons) of a "Recovering" Data Broker

Privacy is deeply cultural.

Lesson

Discretion

Disclosure

Page 16: Confessions (and Lessons) of a "Recovering" Data Broker

EU Rights versus US Torts

EU “Right of Personality”

• Inviolable right of dignity
• Germany: BGB, 1900, post WWII
• Germany: “Source Right”
• Esra “kiss & tell” plaintiff won, book banned

US “Privacy Torts”

• Statutory harm
• US: Brandeis & Warren, 1890
• US: Prosser’s Torts, 1960
• US: Sectoral privacy regime
• Bonome “kiss & tell” case dismissed

http://paulschwartz.net/pdf/SchwartzPeifer_Prosser_FINAL.pdf

Page 17: Confessions (and Lessons) of a "Recovering" Data Broker

How to unpack Privacy? Think PPP.

[Diagram labels: PRIVACY; PLACES; PLAYERS; PERILS]

Lesson

Page 18: Confessions (and Lessons) of a "Recovering" Data Broker

Sometimes you’re in a public place when you think you’re in a private place.

“Gaydar”

A 2009 MIT study found it was possible to predict men’s sexual orientation by analyzing the gender and sexuality of their social network contacts – even if the rest of the information on their profile was set to private.

Confession

Page 19: Confessions (and Lessons) of a "Recovering" Data Broker

“To Serve Man” is a Cookbook.

Confession

“If you’re not paying for the product, you are the product.”

− Claire Wolfe (paraphrased)

Page 20: Confessions (and Lessons) of a "Recovering" Data Broker

In privacy contexts, Power matters.

Lesson

[Chart: Privacy Rights versus Power Disparity. Relationships plotted: Peer to Peer; Corporation & Customer/Employee; Government & Citizen; Your God & You]

Page 21: Confessions (and Lessons) of a "Recovering" Data Broker

Target knows you’re pregnant and when you’re due. So, what’s so perilous?

Confession

Page 22: Confessions (and Lessons) of a "Recovering" Data Broker

Secrets are power equalizers.

Confession

Page 23: Confessions (and Lessons) of a "Recovering" Data Broker

Mapping Places-Players-Perils Cases

[Quadrant chart. Axes: MORE PRIVATE PLACES; MORE PLAYER POWER GAP]

• Public Places, Powerful Players
• Private Places, Powerful Players
• Private Places, Weaker Players
• Public Places, Weaker Players

Page 24: Confessions (and Lessons) of a "Recovering" Data Broker

Places-Players-Perils Cases

[Quadrant chart. Axes: MORE PRIVATE PLACES; MORE PLAYER POWER GAP]

Page 25: Confessions (and Lessons) of a "Recovering" Data Broker

A head in the clouds < 20 years

Prediction

[Chart: Cost per Month (000s), log scale from $1 to $100,000, by Year]

2012: $27,100
2014: $13,500
2016: $6,800
2018: $3,400
2020: $1,700
2022: $850
2024: $420
2026: $210
2028: $100
2030: $53
2032: $26
2034: $13
2036: $7
2038: $3
2040: $2
2042: $1

• 20,000 TFlops
• 2,500 Terabytes
• Less than $700K per year

Chris Westbury, University of Alberta
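The chart's data labels roughly halve every two years, starting at $27,100 (in thousands per month) in 2012. A minimal sketch of that trend follows, assuming a clean two-year halving period and reading "Less than $700K per year" as the target annual budget. Both are interpretations of the slide, and the function names are illustrative.

```python
def projected_monthly_cost_k(year, base_year=2012, base_cost_k=27_100.0,
                             halving_years=2):
    """Monthly cost in thousands of dollars, assuming it halves every two years."""
    return base_cost_k * 0.5 ** ((year - base_year) / halving_years)


def first_year_under(annual_budget_k, start=2012, end=2042, step=2):
    """First charted year whose projected annual cost is under the budget."""
    for year in range(start, end + 1, step):
        if 12 * projected_monthly_cost_k(year) < annual_budget_k:
            return year
    return None


crossover = first_year_under(700.0)
print(crossover, round(projected_monthly_cost_k(crossover)))
```

Under these assumptions the annual cost first drops below $700K in 2030, at about $53K per month. That is within 20 years of the talk and matches the chart's $53 label for 2030.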

Page 26: Confessions (and Lessons) of a "Recovering" Data Broker

“Watch your thoughts, they become words.
Watch your words, they become actions.
Watch your actions, they become habits.
Watch your habits, they become your character.
Watch your character, it becomes your destiny.”

– Lao Tzu

“… the essential crime that contained all others in itself. Thoughtcrime, they called it.”

– George Orwell

Big data inferences are not thoughtcrimes

Lesson

Page 27: Confessions (and Lessons) of a "Recovering" Data Broker

Felon Classifier Overview

[Pipeline diagram labels: 250 M Defendants; Cleaning; Linking; Sampling; Feature Extraction; 15K Labels; 15K Predictors; Learner; Model]

Objective: If someone has minor offenses on their criminal record, do they also have any felonies?

Today’s Bloomberg piece: http://bloom.bg/1eMtnug

Page 28: Confessions (and Lessons) of a "Recovering" Data Broker

How does the Felon Classifier work?

Inputs: Gender, Eye Color, Tattoos, Criminal Offenses
Output: Score. Over Threshold of 3.5? Likely Felon? (YES / NO)

Options shown for each input:
• Gender: Male, Female
• Eye Color: Hazel, Blue, Brown, Green
• Tattoos: 2+, < 2
• Criminal Offenses: Traffic only, 4 or fewer misdemeanors, 8 or fewer misdemeanors

Example 1: Female (-0.5), Hazel (+1.7), Tattoos 2+ (+1.3), 4 or fewer misdemeanors (+1.9). Score: 4.4

Example 2: Male (+0.1), Brown (+1.2), Tattoos < 2 (+0.1), Traffic only (-0.5). Score: 0.9

widget: http://jimadler.me
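The two worked examples on this slide are consistent with a simple additive score: each selected feature value contributes a signed weight, and the sum is compared with the 3.5 threshold. The sketch below reproduces only the weights printed on the slide. The function names and the dictionary layout are illustrative assumptions, not the actual model code.

```python
THRESHOLD = 3.5  # threshold shown on the slide


def score(contributions):
    """Sum the signed weights of the selected feature values."""
    return sum(contributions.values())


def likely_felon(contributions, threshold=THRESHOLD):
    """True when the additive score is over the threshold."""
    return score(contributions) > threshold


# Weights copied from the slide's two examples.
first_example = {
    "eye color: hazel": 1.7,
    "tattoos: 2+": 1.3,
    "gender: female": -0.5,
    "offenses: 4 or fewer misdemeanors": 1.9,
}
second_example = {
    "gender: male": 0.1,
    "offenses: traffic only": -0.5,
    "eye color: brown": 1.2,
    "tattoos: < 2": 0.1,
}

for name, example in [("first", first_example), ("second", second_example)]:
    print(name, round(score(example), 1), likely_felon(example))
```

The first example sums to 4.4, which is over the threshold. The second sums to 0.9, which is under it.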

Page 29: Confessions (and Lessons) of a "Recovering" Data Broker

Classifiers depend on policy as much as technology

[Chart: False Negative Rate (0% to 100%) versus False Positive Rate (0% to 20%); axis annotations: ANARCHY, TYRANNY]

Operating points shown:
• Threshold: -1.82, FP Rate: 19%, FN Rate: 0%
• Threshold: 0.66, FP Rate: 5%, FN Rate: 22%
• Threshold: 1.1, FP Rate: 1%, FN Rate: 40%

Confession
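The operating points on this slide show the usual trade-off: raising the threshold lowers the false positive rate and raises the false negative rate. Here is a minimal sketch of how both rates can be computed at a given threshold. The scores and labels are made-up toy data, not the talk's dataset, and `error_rates` is an illustrative name.

```python
def error_rates(scores, labels, threshold):
    """Return (false positive rate, false negative rate) at a score threshold.

    A record is flagged when its score is over the threshold.
    """
    false_pos = sum(1 for s, y in zip(scores, labels) if s > threshold and not y)
    false_neg = sum(1 for s, y in zip(scores, labels) if s <= threshold and y)
    negatives = sum(1 for y in labels if not y)
    positives = sum(1 for y in labels if y)
    return false_pos / negatives, false_neg / positives


# Hypothetical scored records; True marks a record that really has a felony.
scores = [-2.0, -1.0, 0.0, 0.5, 0.8, 1.0, 1.5, 2.0]
labels = [False, False, False, True, False, True, True, True]

for threshold in (-3.0, 0.7, 1.2):
    fp_rate, fn_rate = error_rates(scores, labels, threshold)
    print(f"threshold {threshold:+.1f}: FP {fp_rate:.0%}, FN {fn_rate:.0%}")
```

The code only measures the trade-off. Choosing which point on the curve to operate at is the policy decision the slide title refers to.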

Page 30: Confessions (and Lessons) of a "Recovering" Data Broker

Privacy, Reasonable Suspicion, & Probable Cause

• Courts have upheld profiling
  – US v Sokolow

• Predictive information never enough
  1. Reliable
  2. Efficient
  3. Particularized
  4. Detailed
  5. Timely
  6. Corroborated

• More
  • Andrew Guthrie Ferguson, Predictive Policing
    http://ssrn.com/abstract_id=2050001
  • Bernard Harcourt, Rethinking Racial Profiling
    http://www.law.uchicago.edu/files/files/rethinking_racial_profiling.pdf
  • The “Not Ready For Prime Time” Classifier
    http://jimadler.me/post/47374264398/the-not-ready-for-prime-time-classifier

Page 31: Confessions (and Lessons) of a "Recovering" Data Broker

NYC Stop & Frisk Found Unconstitutional

Ruling

“The city … believes that blacks and Hispanics should be stopped at the same rate as their proportion of the local criminal suspect population.”

− US District Judge Shira Scheindlin

100% greater chance that a minority stop is justified over a random stop

NYC assumes 87%

Minorities in NYC

If it’s not ok to stop & frisk 95% of the general population for nothing,why is it ok to stop & frisk 90% of minorities for nothing?

P(A|M) = Chance a minority should be arrested
P(~A|M) = Chance a minority should not be arrested
P(M|A) = Chance that someone arrested is a minority
P(M) = Chance someone is a minority
P(A) = Chance someone is arrested

Bayes’ Rule
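The formula itself did not survive in this transcript. With the slide's definitions (A: should be arrested, M: is a minority), the standard statement of Bayes' rule and its complement is:

```latex
\[
P(A \mid M) = \frac{P(M \mid A)\,P(A)}{P(M)},
\qquad
P(\lnot A \mid M) = 1 - P(A \mid M)
\]
\[
\frac{P(A \mid M)}{P(A)} = \frac{P(M \mid A)}{P(M)}
\]
```

One reading of the slide's figures, offered as an interpretation rather than the speaker's stated derivation: "95% of the general population for nothing" gives P(~A) = 0.95, so P(A) = 0.05, and "90% of minorities for nothing" gives P(~A|M) = 0.90, so P(A|M) = 0.10. The ratio P(A|M) / P(A) is then 0.10 / 0.05 = 2, which matches the slide's "100% greater chance".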

Page 32: Confessions (and Lessons) of a "Recovering" Data Broker

Big brother is watching (duh)

[Quadrant chart. Axes: MORE PRIVATE PLACES; MORE PLAYER POWER GAP]

“We’re being asked to trust without being able to verify.”

− Alex Howard

Pres. Obama calls for more transparency in FISA court and surveillance laws

NSA chief announces plan to replace 1,000 sysadmins with machines

Confession

Page 33: Confessions (and Lessons) of a "Recovering" Data Broker

Technology grows exponentially. Wisdom grows linearly.

• Gov’t doesn’t trust people (at least sysadmins) but does trust machines

• Little Transparency

• Wisdom is hard to come by

• Sentient (?) brain in the cloud in < 20 years

Lesson

Wisdom

Knowledge

Information

Data

Page 34: Confessions (and Lessons) of a "Recovering" Data Broker

Hilary Mason’s Maxim

Math + Code = Awesome

Quants: Making a killing on Wall Street but still can’t impress the chicks

Weakonomics.com

Lesson

Page 35: Confessions (and Lessons) of a "Recovering" Data Broker

Corollary to Mason’s Maxim

Values * (Math + Code) = Awesome

Lesson

Page 36: Confessions (and Lessons) of a "Recovering" Data Broker
Page 37: Confessions (and Lessons) of a "Recovering" Data Broker

Eclectic generalists drive innovation

• Norio Ohga: Sony President, 74 min CD
• Jeff Jonas: Big Data, Privacy by Design
• Mark Zuckerberg: Real Names
• Steve Jobs: ‘nuff said
• Richard Feynman: Physics

Lesson

Page 38: Confessions (and Lessons) of a "Recovering" Data Broker

We’re making this up as we go.

Austin Alleman @allemanau

Innocence Frontier Regulation Innovation

Confession

Page 39: Confessions (and Lessons) of a "Recovering" Data Broker

“Can’t we all just get along?”

− Rodney King

Plea

[Diagram labels: Geeks; Wonks; Suits; Social Entrepreneur; High-Tech Mercenary; Responsible Innovator; Traditional Capitalist]

Page 40: Confessions (and Lessons) of a "Recovering" Data Broker

“No one here gets out alive.”

− Jim Morrison

Lesson

[Diagram labels: Geeks; Wonks; Suits; Adapt, Invent, Test; Question, Scrutinize, Criticize; Watch, Listen, Learn]

Page 41: Confessions (and Lessons) of a "Recovering" Data Broker

Questions!

Jim Adler

[email protected]