Towards a Modern Approach to Privacy-Aware Government Data Releases Micah Altman David O’Brien & Alexandra Wood MIT Libraries Berkman Center for Internet & Society Open Data: Addressing Privacy, Security, and Civil Rights Challenges 19th Annual BCLT/BTLJ Symposium April 2015


Page 1: Micah Altman David O’Brien & Alexandra Wood These opinions are our own. They are not the opinions of MIT, Brookings, Berkman any of the project funders, nor (with the exception of

Towards a Modern Approach to Privacy-Aware Government Data Releases

Micah Altman David O’Brien & Alexandra Wood MIT Libraries Berkman Center for Internet & Society

Open Data: Addressing Privacy, Security, and Civil Rights Challenges

19th Annual BCLT/BTLJ Symposium

April 2015


Disclaimer

These opinions are our own. They are not the opinions of MIT, Brookings, Berkman, any of the project funders, nor (with the exception of co-authored previously published work) our collaborators.


Collaborators & Co-Conspirators

Collaborators

● The Privacy Tools for Research Data Project <privacytools.seas.harvard.edu>

● Research Support from Sloan Foundation; National Science Foundation (Award #1237235); Microsoft Corporation


Related Work

● Vadhan, S., et al. 2011. “Re: Advance Notice of Proposed Rulemaking: Human Subjects Research Protections.”

● Altman, M., D. O’Brien, S. Vadhan, A. Wood. 2014. “Big Data Study: Request for Information.”

● O’Brien, et al. 2015. “Integrating Approaches to Privacy Across the Research Lifecycle: When Is Information Purely Public?” (Mar. 27, 2015). Berkman Center Research Publication No. 2015-7.

● Wood, et al. 2014. “Integrating Approaches to Privacy Across the Research Lifecycle: Long-Term Longitudinal Studies” (July 22, 2014). Berkman Center Research Publication No. 2014-12.

Preprints and reprints available from: informatics.mit.edu


Goals

1. Examine critical use cases

2. Develop a framework for systematically analyzing privacy in releases of data

3. Produce a guide for selecting among new legal and technical tools for privacy protection


Use Cases for Government Data Releases

● Freedom of Information Act/Privacy Act

● Open Government/E-Government Initiatives

● Traditional Public and Vital Records

● Official Statistics


Recent Examples

● E-Government Data: Occupational Safety and Health Administration release of workplace injury records

● Open Government Data: Open cities data


Public Release of Workplace Injury Records


Benefits from Public Data Availability

● Transparency as a democratic principle

● Accountability of institutions

● Economic and social welfare benefits

● Data for research and scientific progress


Scope of Information Made Public

● All collected data not protected by FOIA, the Privacy Act, or OSHA reporting regulations

● Redaction of names, addresses, dates of birth, and gender

● Information to be released includes job title, date and time of incident, and descriptions of injury or illness and where and how it occurred


[Figure: OSHA rulemaking mockup of proposed web display of injury/illness reports]


Re-identification Risks

● Individuals can be identified despite redaction of directly identifying fields or attributes

● Robust de-identification of microdata is a very difficult problem, and free-form text fields are especially challenging
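A minimal sketch of why redacting direct identifiers may not be enough: it counts how many records remain unique on the fields that are still released. The records, field names (`job_title`, `incident_date`, `site`), and helper functions are invented for illustration and are not taken from the OSHA proposal.

```python
from collections import Counter

# Hypothetical redacted injury records: names, addresses, dates of birth,
# and gender are already removed. All values are invented for illustration.
records = [
    {"job_title": "Welder", "incident_date": "2014-03-02", "site": "Plant A"},
    {"job_title": "Welder", "incident_date": "2014-03-02", "site": "Plant A"},
    {"job_title": "Forklift Operator", "incident_date": "2014-03-05", "site": "Plant A"},
    {"job_title": "Plant Nurse", "incident_date": "2014-04-11", "site": "Plant B"},
]

# Fields assumed (for this sketch) to be knowable by an outside observer.
QUASI_IDENTIFIERS = ("job_title", "incident_date", "site")


def group_sizes(rows, keys):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def unique_records(rows, keys):
    """Return the records whose quasi-identifier combination appears exactly once."""
    sizes = group_sizes(rows, keys)
    return [row for row in rows if sizes[tuple(row[k] for k in keys)] == 1]


singled_out = unique_records(records, QUASI_IDENTIFIERS)
print(len(singled_out), "of", len(records), "records are unique on the released fields")
```

A count like this is only a rough screening step. It says nothing about free-form text fields, which can single out a person even when the structured fields do not.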


Information Sensitivity

● OSHA identifies “privacy concern cases” as injuries or illnesses related to sexual assault, mental health, or infectious diseases

● There are other situations in which details regarding an injury or illness may be sensitive, such as those related to drug or alcohol abuse, that are not included among the privacy concern cases


Review, Reporting, and Accountability

● Lack of review mechanisms, such as systematic redactions of sensitive information before release

● Lack of accountability for harm arising from misuse of disclosed data


Framework for Modern Privacy Analysis


Observations

Privacy is not a simple function of the presence or absence of specific fields, attributes, or keywords in a released set of data.

Other factors, including what one can learn or infer about individuals from a data release as a whole or when linked with other information, may lead to harm.
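The linkage concern can be illustrated with a small sketch, assuming an outside party holds some auxiliary information such as a news item or a staff page. Both datasets, the field names, and the `link` helper are hypothetical.

```python
# Released records (hypothetical): direct identifiers have been removed.
released = [
    {"job_title": "Welder", "incident_date": "2014-03-02", "injury": "burn"},
    {"job_title": "Plant Nurse", "incident_date": "2014-04-11", "injury": "needlestick"},
]

# Auxiliary information (hypothetical) that an outside party might already hold.
auxiliary = [
    {"name": "J. Doe", "job_title": "Plant Nurse", "date_absent": "2014-04-11"},
]


def link(released_rows, aux_rows):
    """Join the two sources on job title and date; return (name, injury) pairs."""
    matches = []
    for aux in aux_rows:
        candidates = [
            r for r in released_rows
            if r["job_title"] == aux["job_title"]
            and r["incident_date"] == aux["date_absent"]
        ]
        # Exactly one candidate means the released record is tied to a named person.
        if len(candidates) == 1:
            matches.append((aux["name"], candidates[0]["injury"]))
    return matches


print(link(released, auxiliary))
```

Neither dataset names an injured person on its own. The inference becomes possible only when the two are combined, which is why a field-by-field review of a single release can miss the risk.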


Observations

Redaction, pseudonymization, coarsening, and hashing are often neither adequate nor appropriate practices, and releasing less information is not always a better approach to privacy.

Simple redaction of information that has been identified as sensitive is often not a guarantee of privacy protection and may also reduce the usefulness of the information. In addition, the act of redacting certain fields of a record may reveal the fact that a record contains sensitive information.
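As one illustration of why hashing can fall short, the sketch below assumes a release that replaces each date of birth with an unsalted SHA-256 digest. Because the set of plausible dates is small, the digest can be reversed by trying every date. The date range and function names are assumptions made for this example.

```python
import hashlib
from datetime import date, timedelta


def pseudonymize(value: str) -> str:
    """Unsalted SHA-256 hex digest, a common but weak form of pseudonymization."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def recover_birth_date(digest, start=date(1920, 1, 1), end=date(2015, 12, 31)):
    """Recover a hashed date of birth by hashing every date in a plausible range."""
    day = start
    while day <= end:
        if pseudonymize(day.isoformat()) == digest:
            return day
        day += timedelta(days=1)
    return None  # the digest does not correspond to any date in the range


published = pseudonymize("1975-06-14")   # what the hypothetical release contains
print(recover_birth_date(published))     # the original date, found by enumeration
```

The search covers only about 35,000 candidate dates, so it finishes almost instantly. The same weakness applies to any hashed field whose possible values can be enumerated.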


Observations

Naïve use of any data sharing model, including a more advanced model, is unlikely to provide adequate protection.

Thoughtful analysis with expert consultation is necessary in order to evaluate the sensitivity of the data collected, to quantify the associated re-identification risks, and to design useful and safe release mechanisms.


Framework for Privacy Analysis

● Benefits from public data availability

● Scope of information made public

● Re-identification risks

● Information sensitivity

● Review, reporting, and information accountability


Privacy Interventions at Any Stage


Data Sharing Models


Data Management Approaches

● Access controls (including tiered access models)

● Secure data enclaves

● Personal data stores

● Audit systems

● Information accountability/operational policy

● Risk assessments


Legal & Regulatory Approaches

● Notice and consent

● Data sharing agreements

● Transparency and audit requirements

● Data minimization requirements

● Accountability for misuse, including civil and criminal penalties and private rights of action


Statistical & Computational Approaches

● Contingency tables

● Synthetic data

● Data visualizations

● Interactive mechanisms

● Multiparty computations

● Functional and homomorphic encryption
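To make the first item in this list concrete, here is a rough sketch of releasing a contingency table instead of microdata, with small cells suppressed. The threshold `MIN_CELL`, the field names, and the data are hypothetical. The slides do not prescribe a threshold, and suppression alone should not be read as a privacy guarantee.

```python
from collections import Counter

MIN_CELL = 5  # hypothetical suppression threshold, chosen only for this example


def contingency_table(rows, row_key, col_key):
    """Cross-tabulate records: count rows for each (row_key, col_key) value pair."""
    return Counter((r[row_key], r[col_key]) for r in rows)


def suppress_small_cells(table, min_cell=MIN_CELL):
    """Replace counts below min_cell with None so that rare cases are not published."""
    return {cell: (n if n >= min_cell else None) for cell, n in table.items()}


# Invented microdata: 7 + 2 + 6 records.
rows = (
    [{"industry": "Construction", "injury": "fracture"}] * 7
    + [{"industry": "Construction", "injury": "burn"}] * 2
    + [{"industry": "Healthcare", "injury": "needlestick"}] * 6
)

table = suppress_small_cells(contingency_table(rows, "industry", "injury"))
for cell, count in sorted(table.items()):
    print(cell, "suppressed" if count is None else count)
```

If row or column totals are published alongside the table, a suppressed cell can sometimes be recovered by subtraction. This is one reason naive use of even an aggregate release model calls for expert review.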


Selecting Appropriate Controls


Data Structure

● Logical structure (e.g., single relation, multiple relational, network/graph, semi-structured, geospatial, aggregate table)

● Source

● Unit of observation

● Attribute measurement type (e.g., continuous/discrete; ratio/interval/ordinal/nominal scale; associated schema/ontology)

● Performance characteristics (e.g., dimensionality/number of measures, number of observations/volume, sparseness, heterogeneity/variety, frequency of updates/velocity)

● Quality characteristics (e.g., measurement error, metadata, completeness, total error)

Analysis Type

● Form of output (e.g., summary scalars, summary table, model parameters, data extract, static data publication, static visualization, dynamic visualization, statistical/model diagnostics)

● Analysis methodology (e.g., contingency tables/counting queries, summary statistics/function estimation, regression models/GLM, general model-based statistical estimation/MLE/MCMC, bootstraps/randomization/data partitioning, data mining/heuristics/custom algorithms)

● Analysis goal (e.g., rule-based, theory formation, existence proof, verification, descriptive inference, forecasting, causal inference, mechanistic inference)

● Utility/loss/quality measure (e.g., entropy, mean squared error, realism, validity of descriptive/predictive/causal statistical inference)


References

● Salil Vadhan, et al., Comments to the Department of Health and Human Services and the Food and Drug Administration, Re: Advance Notice of Proposed Rulemaking: Human Subjects Research Protections, Docket No. HHS-OPHS-2011-0005 (Oct. 26, 2011), available at http://privacytools.seas.harvard.edu/files/commonruleanprm.pdf.

● Micah Altman, David O’Brien, & Alexandra Wood, Comments to the Occupational Safety and Health Administration, Re: Proposed Rule: Improve Tracking of Workplace Injuries and Illnesses, OSHA-2013-0023-1207 (March 10, 2014), available at http://www.regulations.gov/#%21documentDetail;D=OSHA-2013-0023-1207.

● Micah Altman, David O’Brien, Salil Vadhan, & Alexandra Wood, Comments to the White House Office of Science and Technology Policy, Re: Big Data Study; Request for Information (March 31, 2014), available at http://privacytools.seas.harvard.edu/files/whitehousebigdataresponse1.pdf.

● David O’Brien, et al., Integrating Approaches to Privacy Across the Research Lifecycle: When Is Information Purely Public?, Berkman Center Research Publication No. 2015-7 (March 27, 2015), available at http://ssrn.com/abstract=2586158 or http://dx.doi.org/10.2139/ssrn.2586158.

● Alexandra Wood, et al., Integrating Approaches to Privacy Across the Research Lifecycle: Long-Term Longitudinal Studies, Berkman Center Research Publication No. 2014-12 (July 22, 2014), available at http://ssrn.com/abstract=2469848 or http://dx.doi.org/10.2139/ssrn.2469848.


Questions

E-mail: Micah Altman, [email protected]

Web: privacytools.seas.harvard.edu
