Post on 17-Jan-2016

Cyberspace Law Committee Meeting, August 3, 2012

Big Data

Lois Mermelstein
The Law Office of Lois D. Mermelstein
lois@loismermelstein.com
512-222-8589

Ted Claypoole
Womble Carlyle
tclaypoole@wcsr.com
704-331-4910

What Is Big Data?

✤ Data that exceeds the processing capacity of conventional database systems.

✤ Too much data

✤ It moves too fast

✤ It’s too diverse

How’d we get here?

✤ Storage capacity, processing speed, and bandwidth are growing exponentially

✤ Networking is expanding exponentially

✤ And you can buy all the pieces - data, infrastructure, processing

source: http://radar.oreilly.com/2011/08/building-data-startups.html

Crunching Big Data - Volume

✤ Turn 12 terabytes of tweets/day into improved product sentiment analysis

✤ Convert 350 billion annual meter readings to better predict power consumption

✤ Generate Facebook recommendations based on your friends’ interests

Crunching Big Data - Velocity

✤ Time-sensitive analysis and decision-making - to catch important events as they happen

✤ Needed when there’s too much input data to keep (so some is tossed) or when decisions must be made immediately

✤ Examples:

✤ Scrutinize 5 million trade events/day to identify potential fraud

✤ Analyze 500 million daily call detail records in real-time to predict customer churn faster
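One common way to "toss some" input while keeping a fair picture of the whole stream is reservoir sampling. The sketch below is illustrative only and is not taken from the presentation: the function name `reservoir_sample`, the sample size, and the stand-in event stream are all assumptions.

```python
import random

def reservoir_sample(stream, k, seed=None):
    """Keep a uniform random sample of k items from a stream of unknown length.

    Hypothetical helper for illustration (classic "Algorithm R").
    """
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Replace an existing item with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = item
    return sample

# Stand-in for a day's worth of events; real input would arrive incrementally.
events = range(1_000_000)
kept = reservoir_sample(events, k=1000, seed=42)
print(len(kept))
```

Memory use stays fixed at k items regardless of how many events arrive, which is why this kind of approach suits high-velocity input.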

Crunching Big Data - Variety

✤ Not just names/addresses in a customer database

✤ Want to analyze text, sensor data, audio, video, location data, click streams, log files, and anything else that’s available

✤ Principle: when you can, keep everything - there might be something useful in what you throw away

Unexpected Consequences

✤ Anonymous AOL searcher isn’t (NYT, 8/9/2006)

✤ Anonymous Netflix users aren’t, when compared with IMDb database (Wired, 12/13/2007)

✤ For many, browsing history is unique and repeatable (8/1/2012)

✤ Target knows when you’re pregnant (NYT, 2/19/2012)
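The Netflix/IMDb example is a linkage attack: an "anonymous" data set is matched against a public one on whatever attributes they share. The toy sketch below only illustrates the idea. It is not the method the researchers used, and the data, user names, and the `best_match` helper are hypothetical.

```python
# Hypothetical "anonymous" ratings keyed by an opaque ID: (title, stars) pairs.
anonymous = {
    "user_17": {("Movie A", 5), ("Movie B", 2), ("Movie C", 4)},
    "user_42": {("Movie A", 3), ("Movie D", 5), ("Movie E", 1)},
}

# Hypothetical public reviews posted under real names.
public = {
    "alice": {("Movie A", 5), ("Movie C", 4), ("Movie F", 3)},
    "bob": {("Movie D", 5), ("Movie E", 1)},
}

def best_match(anon_ratings, public_profiles):
    """Return the public profile sharing the most identical ratings, with its score."""
    scores = {
        name: len(anon_ratings & ratings)
        for name, ratings in public_profiles.items()
    }
    name = max(scores, key=scores.get)
    return name, scores[name]

for anon_id, ratings in anonymous.items():
    print(anon_id, best_match(ratings, public))
```

Even a handful of overlapping ratings can be enough to single out one candidate, since few people share the same combination.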

Lessons to (Re)learn

✤ Correlation isn't causation

✤ But correlation may be all you need

✤ You can't hide in the crowd

Personally Identifiable Information

PII as a mathematical function

How many points of data do you need?

Pineda v. Williams-Sonoma Stores, Inc. (Cal., Feb. 10, 2011)

HIPAA De-Identified Data

Re-Identifying De-Identified Data
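One way to picture "how many points of data do you need" is to count how many records become unique as more fields are combined. The sketch below uses made-up records and a hypothetical helper, `unique_fraction`. It assumes ZIP code, birth year, and sex as the fields purely for illustration, and is not a legal test for PII or for HIPAA de-identification.

```python
from collections import Counter

# Hypothetical "de-identified" records: no names, only quasi-identifiers.
records = [
    {"zip": "78701", "birth_year": 1975, "sex": "F"},
    {"zip": "78701", "birth_year": 1975, "sex": "M"},
    {"zip": "78701", "birth_year": 1980, "sex": "F"},
    {"zip": "78702", "birth_year": 1975, "sex": "F"},
    {"zip": "78702", "birth_year": 1975, "sex": "F"},
    {"zip": "78702", "birth_year": 1980, "sex": "M"},
]

def unique_fraction(rows, fields):
    """Fraction of rows whose combination of the given fields appears only once."""
    keys = [tuple(row[f] for f in fields) for row in rows]
    counts = Counter(keys)
    unique = sum(1 for key in keys if counts[key] == 1)
    return unique / len(rows)

for fields in (["zip"], ["zip", "birth_year"], ["zip", "birth_year", "sex"]):
    print(fields, round(unique_fraction(records, fields), 2))
```

In this toy data set nobody is unique on ZIP code alone, but each added field singles out more records, which is the sense in which identifiability behaves like a function of the number of data points.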

Escaping Regulatory Requirements

Privacy

Fair Credit Reporting

Redlining

Employment Discrimination

Single Transaction Owned By:

Retailer

Wholesale vendor

Manufacturer

Shipping Company

Customer’s Bank

Customer’s ISP

Retailer’s Bank

Merchant Card Processor

Phone company/Hardware/Software

Government Using Big Data

Law Enforcement

Copyright Issues

Who owns the data?

Who owns the derivative works?

Combined data?
