TRANSCRIPT
Formal Methods for Privacy
Formal Methods 2009, Eindhoven, The Netherlands
November 4, 2009
Jeannette M. Wing
Assistant Director, Computer and Information Science and Engineering Directorate, and
President’s Professor of Computer Science, Carnegie Mellon University
Broader Context: Trustworthy Systems
• Trustworthy =
Reliability
Security
Privacy
Usability
Broader Context: Trustworthy Systems
• Trustworthy =
• Reliability: Does it do the right thing?
• Security: How vulnerable is it to attack?
• Privacy: Does it protect a person’s information?
• Usability: Can a human use it easily?
Technical Progress: Reliability
• Formal definitions, theories, models, logics, languages, algorithms, etc. for stating and proving notions of correctness.
• Tools for analyzing systems—from code to architecture—for desired and undesired properties
• Use of languages, tools, etc. in industry.
– “Reliable” [= “good enough”] systems in practice: telephony, the Internet, desktop software, your automobile
• Examples:
– Strongly typed programming languages rule out entire classes of errors.
– Database systems are built to satisfy the ACID properties: atomicity, consistency, isolation, durability.
– Byzantine fault tolerance: n ≥ 3t + 1 nodes tolerate t faulty ones.
– Impossibility results, e.g., distributed consensus with 1 faulty node.
Current challenge: Nature and scale of systems and their operating environments are more complex, forcing us to revisit these fundamental results. E.g., cyber-physical systems, safety-critical systems.
Technical Progress: Security
• Formal definitions, theories, models, logics, languages, algorithms, etc. for stating and proving notions of security.
• Tools for analyzing systems—from code to architecture—for desired and undesired properties.
• Use of languages, tools, etc. in industry.
– “Secure” [= “secure enough”] systems in practice: telephony, the Internet, desktop software, your automobile (today)
• Examples:
– Cryptography
– Systems designed to satisfy the informal CIA properties (confidentiality, integrity, availability)
– Logic of authentication [Burrows, Abadi, Needham 89]; logic for access control [Lampson, Abadi, Burrows, Wobber 92]
Current challenges: (1) Assumptions have changed; revisit the blue. (2) Fill in the gray. (3) The nature and scale of systems and their operating environments are more complex, forcing us to revisit the fundamentals. E.g., today’s crypto rests (mostly) on RSA, i.e., the hardness of factoring.
Technical Progress: Privacy
??
Examples:
– Privacy-preserving data mining [Dwork, Nissim 04]
– Private matching [Li, Tygar, Hellerstein 05]
– Differential privacy [Dwork et al. 06]
– Privacy policy language [Barth, Datta, Mitchell, Nissenbaum 06]
– Privacy in statistical databases [Fienberg et al. 04, 06]
– Non-interference, confidentiality [Goguen, Meseguer 82; Tschantz, Wing 08]
Aspects to Privacy
Philosophical
Societal, Legal, Political
Technical
Philosophical Views on Privacy Rights
Two Approaches
• Descriptive
– Attempts to answer the question “What is privacy?”
– Gives a precise definition or analysis of concepts
– Need not justify ethical questions as part of analysis
• Prescriptive
– Explains why we value privacy and why privacy should be respected by others (individuals, organizations, governments)
• Often intertwined, masking the distinctions
Descriptive (What is Privacy?)
• Restrictions on the access of other people to an individual’s information.
vs.
• An individual’s control over personal information.
Prescriptive (Value of Privacy)
• Fundamental human right
vs.
• Instrumental: the value of privacy derives from other, more fundamental values that privacy enables
– E.g., autonomy, fairness, intimate relations
Philosophical Views on Privacy Violations[Solove 2006]
Model
Data Subject
Data
Data Holder
Four Classes
• Invasions
• Information collection
• Information processing
• Information dissemination
Invasions
• Physical intrusions
– Trespassing
– Blocking passage
• Decisional interference
– Interfering with personal decisions
• E.g., use of contraceptives, abortion, sodomy
Perhaps reducible to violations of other rights:
o property and security (intrusions)
o autonomy and liberty (decisional interference)
Information Collection
• Surveillance
• Interrogation
Both are ways of making observations.
Makes people uncomfortable about how collected information will be used, even if never used. Puts people in awkward positions of having to refuse to answer questions. Even in the absence of violations, collection should be controlled in order to prevent other violations, e.g., blackmail.
Information Processing (I)
• Aggregation
– Combines diffuse pieces of information
– Enables inferences otherwise unavailable
• Identification
– Links information with a person
– Enables inferences, alters how a person is treated
• Insecurity
– Makes information more available to those unauthorized to have it
– Leads to identity theft, distortion of data
• Secondary Uses
– Makes information available for purposes not originally intended
• Exclusion
– Inability of the data subject to know what records are kept, to view them, to know how they are used, or to correct them
Information Processing (II)
• All types create uncertainty on the part of the data subject
• Even in the absence of abuse, uncertainty can cause people to live in fear of how information may be used
Information Dissemination
• Disclosure: making private information known outside the group of individuals expected to know it
• Exposure: embarrassing information shared, stripping the data subject of dignity
• Appropriation (related to distortion): associates a person with a cause/product he did not agree to endorse
• Increased accessibility: the data holder makes previously available information more easily acquirable
• Confidentiality breach: a trusted data holder provides information about the data subject, e.g., doctor-patient, lawyer-client, priest-parishioner
• Blackmail: threat of dissemination unless a demand is met, creating a power relationship with no social benefits
• Distortion: presentation of false information about a person, harming not just the subject but also third parties no longer able to accurately judge the subject’s character
Technology Raises New Privacy Concerns
Technology vs Legal
• The courts often lead the examination of these questions.
– Cameras [Warren and Brandeis 1890], wiretapping, aerial observation, tracking devices, hidden video cameras, thermal imaging, …
– Leads to new regulations in the US (e.g., HIPAA), France (e.g., SAFARI), Canada (e.g., PIPEDA), …
• Often flip-flopping decisions, e.g., US Total Information Awareness data mining program
• Technological advances represent progress, but often raise new privacy concerns.– How does the new technology affect privacy?– How should we mitigate ill effects?
Technology Helps Preserve Privacy
Diversity of Technical Approaches
• Technology can make some violations impossible
– Cryptography
• One-time pads guarantee perfect secrecy
• Voting schemes that prevent coercion attacks
– Formal verification
• Secure operating system kernels
• Technology can mitigate attacks of intrusion
– Intrusion detection systems, spam filters
• Technology can preserve degrees of privacy
– Onion routing for anonymity
– Privacy-preserving data mining
• Technology can provide assurance
– Logics of knowledge for reasoning about secrecy and anonymity
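The claim above that one-time pads guarantee perfect secrecy is easy to make concrete. Below is a minimal sketch (function name and message are invented for illustration); real use additionally requires that the pad be truly random, kept secret, and never reused:

```python
import secrets

def otp_xor(message: bytes, key: bytes) -> bytes:
    # XOR each message byte with the corresponding pad byte.
    assert len(key) == len(message), "the pad must be as long as the message"
    return bytes(m ^ k for m, k in zip(message, key))

msg = b"attack at dawn"
key = secrets.token_bytes(len(msg))   # fresh, uniformly random, used once
ciphertext = otp_xor(msg, key)

# XOR is its own inverse, so applying the same pad decrypts:
assert otp_xor(ciphertext, key) == msg
```

Because every ciphertext is equally likely under a uniformly random pad, the ciphertext carries no information about the plaintext; this is Shannon's perfect-secrecy argument.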
Computer Scientists Have Focused Primarily on Disclosure and Aggregation
• Invasions
– Physical
– Decisional
• Information collection
– Surveillance
– Interrogation
• Information processing
– Aggregation
– Identification
– Insecurity
– Secondary uses
– Exclusion
• Information dissemination
– Confidentiality breach
– Disclosure
– Exposure
– Distortion
– Appropriation (related to distortion)
– Increased accessibility
– Blackmail
Future work: Many other aspects of privacy have not been addressed.
Recent Work on Disclosure and Aggregation
• Linkage attacks
• Anonymizing search query logs, databases
– k-anonymity
• Deleting private information
– Vanishing data, unvanishing data
• Statistical approaches
– Releasing tables of data: frequencies or aggregations of individual counts
– Releasing micro-data: sanitization of individual responses
• Semantic approaches
– Differential privacy
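k-anonymity, listed above as a defense against linkage attacks, can be checked mechanically: a release is k-anonymous when every combination of quasi-identifier values is shared by at least k records. A minimal sketch (the table, column names, and generalized values are all invented for illustration):

```python
from collections import Counter

def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs in at
    least k rows, so no record is unique within its group."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())

# Toy table with generalized ZIP codes and age ranges:
rows = [
    {"zip": "152**", "age": "30-39", "diagnosis": "flu"},
    {"zip": "152**", "age": "30-39", "diagnosis": "cold"},
    {"zip": "153**", "age": "40-49", "diagnosis": "flu"},
]
print(is_k_anonymous(rows, ["zip", "age"], 2))  # False: the (153**, 40-49) group has one row
```

Note that k-anonymity alone does not prevent linkage with the sensitive attribute when a group's sensitive values are all identical, which is one reason the stronger semantic approaches below were developed.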
Vanishing Data: Overcoming New Risks to Privacy (University of Washington) [USENIX June 2009]
Yesterday (Data on Personal/Company Machines)
Private Data
Tomorrow (Data in “The Cloud”)
Data passes through ISPs
Data stored in the “Cloud”
Sept 18: Unvanishing Data (Princeton and Rice) breaks Vanishing Data
Differential Privacy
• Add noise to the value of the statistic.
• The adversary cannot learn about any one individual.
Motivation: Dalenius [1977] proposed the privacy requirement that an adversary with aggregate information learns nothing about any of the data subjects that he could not have known without the aggregate information.
Dwork [2006] proved that this requirement is impossible to achieve. Dwork et al. [2006] introduced differential privacy, a tunable probabilistic requirement.
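A common instantiation of "add noise to the statistic" is the Laplace mechanism for counting queries. The sketch below is an illustration under stated assumptions, not Dwork et al.'s own presentation; the function name and data are invented:

```python
import random

def dp_count(records, predicate, epsilon):
    """Release a counting query with Laplace noise of scale 1/epsilon.
    A count has sensitivity 1 -- adding or removing one person changes the
    true answer by at most 1 -- so Laplace(1/epsilon) noise yields
    epsilon-differential privacy. The difference of two iid
    Exponential(epsilon) draws is distributed Laplace(0, 1/epsilon)."""
    true_count = sum(1 for r in records if predicate(r))
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return true_count + noise

ages = [23, 35, 41, 29, 52, 37]                     # made-up records
noisy = dp_count(ages, lambda a: a >= 30, epsilon=0.5)
print(noisy)  # roughly 4, perturbed by noise of scale 2; smaller epsilon = more noise
```

The "tunable" aspect is epsilon: it trades accuracy of the released statistic against how much any single individual's presence can shift the output distribution.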
Opportunities for Formal Methods
Generic Formal Methods
• Models
• Logics
• Policy Languages
• Abstraction and Refinement
• Policy Composition
• Code-Level Analysis
• Automated Tools
Models
Traditional– Reliability: System + Environment– Security: System + Adversary
Privacy– Data Subject + Data Holder + Adversary
Traditional– Algebraic, compositional
Privacy– Trust is not transitive
• X trusts Y, Y trusts Z does not imply X trusts Z
Future Work: New models to capture new relationships
[Figure: Data Subject → Data → Data Holder; trust chain X → Y → Z]
Logics
Traditional• Properties are assertions over traces, e.g.,
sequences of states and/or transitions
Privacy• Some information-flow properties cannot be
expressed as trace properties [McLean 1994]– E.g., non-interference, “secrecy”
Future Work: Logics for specifying and reasoning about such
non-trace properties and other privacy properties
Example traces: s1 s2 s3 s4 s5 …; t1 t2 t3 t4 t5 …
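The point that non-interference is not a trace property can be made concrete: checking it requires comparing *pairs* of runs, so no single trace can witness a violation. A toy sketch for deterministic systems (both example systems are invented for illustration):

```python
def noninterferent(system, high_inputs, low_inputs):
    """A deterministic system is non-interferent when its low-observable
    output is independent of the high (secret) input: for every low input,
    all high inputs must yield the same output. The quantification over
    pairs of runs is what places this outside ordinary trace properties."""
    return all(
        len({system(h, low) for h in high_inputs}) == 1
        for low in low_inputs
    )

def ok_system(high, low):
    return low * 2                   # output ignores the secret

def leaky_system(high, low):
    return low * 2 + (high % 2)      # output depends on the secret's parity

print(noninterferent(ok_system, range(4), range(4)))     # True
print(noninterferent(leaky_system, range(4), range(4)))  # False
```

An assertion over individual traces could only inspect one run at a time; the leak in `leaky_system` is visible only by comparing runs that differ in the high input.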
Policy Languages
• Informal
– Corporate privacy policies
• Formal notation, informal semantics
– Enterprise Privacy Authorization Language (EPAL)
– Platform for Privacy Preferences (P3P)
• Formal policy language by Barth et al.
– Linear temporal logic, with semantics based on Nissenbaum’s “contextual integrity”
Future Work: A richer set of formal privacy policy languages with tool support
Policy Composition (Open)
Given: components A and B, privacy policies P1 and P2
• If A ⊨ P1 and B ⊨ P2, does A ∥ B ⊨ P1 ∧ P2?
Example: National Science Foundation (NSF) and National Institutes of Health (NIH) reviewer policies conflict.
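One way to see why composition is hard is to model each policy as a predicate over the facts a system reveals. The sketch below is a deliberately simplified illustration loosely echoing the NSF/NIH example; all policy contents, fact names, and the inference step are invented:

```python
def p1(revealed):                  # "reviewer identities stay confidential"
    return "reviewer_identity" not in revealed

def p2(revealed):                  # "panel membership is made public"
    return "panel_roster" in revealed

a_reveals = set()                  # component A reveals nothing
b_reveals = {"panel_roster"}       # component B publishes its roster

# Each component satisfies its own policy in isolation:
assert p1(a_reveals) and p2(b_reveals)

# The composition reveals the union of facts. If readers can infer who
# reviewed what from the published roster, the conjunction P1 ∧ P2 fails:
composed = a_reveals | b_reveals
inferable = composed | ({"reviewer_identity"} if "panel_roster" in composed else set())
print(p1(inferable) and p2(inferable))  # False
```

The failure is not a bug in either component: it arises from information that only becomes derivable once the components run together, which is why per-component verification does not compose.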
Privacy-Specific Needs
Statistical/Quantitative Reasoning
Traditional
– Correctness: yes/no
– Security: information flow, yes/no
Statistical aggregation
– A “small” amount of flow may be acceptable
• E.g., an average weight hides the weights of individuals
Future work: Combine traditional formal methods with statistical models/methods
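The averaging example above cuts both ways: an exact average hides individuals only from an adversary without side information. A toy differencing sketch (the weights are made-up values) shows why quantitative, not yes/no, reasoning is needed:

```python
weights = [70, 82, 65, 90]             # the last entry is the target's secret
average = sum(weights) / len(weights)  # the only statistic released

# An adversary who knows everyone's weight except the target's can
# subtract the known values back out of the exact aggregate:
known = weights[:-1]
recovered = average * len(weights) - sum(known)
print(recovered)  # 90.0: the exact average leaks the remaining individual
```

This is precisely the attack that motivates adding noise (as in differential privacy): the question is not whether information flows, but how much flows per individual.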
Broader Context: Trustworthy Systems
• Trustworthy =
Reliability
Security
Privacy
Usability
Tradeoffs among all four.
Privacy and Usability
Clicking Your Way Through Privacy (Firefox)
Do You Read These? What Are They Saying?
This privacy statement goes on for seven screenfuls!
Windows Media Player 10 Privacy Statement
Privacy: A Few Questions to Ponder
1. What does privacy mean?
2. How do you state a privacy policy? How can you prove your system satisfies it?
3. How do you reason about privacy? How do you resolve conflicts among different privacy policies?
4. Are there things that are impossible to achieve wrt some definition of privacy?
5. How do you implement practical mechanisms to enforce different privacy policies? As they change over time?
6. How do you measure privacy? (Is that a meaningful question?)
Postlude
• My talk is based on a paper with my student, Michael Tschantz.
• My talk is a bit US-centric. Please share your views on privacy with me.
• Our paper has 77 references. Please let me know of work we missed.
Thank you!
Credits
• Copyrighted material is used under Fair Use. If you are the copyright holder and believe your material has been used unfairly, or if you have any suggestions, feedback, or support, please contact: [email protected]
• Except where otherwise indicated, permission is granted to copy, distribute, and/or modify all images in this document under the terms of the GNU Free Documentation license, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled “GNU Free Documentation license” (http://commons.wikimedia.org/wiki/Commons:GNU_Free_Documentation_License)
• The inclusion of a logo does not express or imply the endorsement by NSF of the entities' products, services or enterprises