Privacy Preserving Data Publishing
Prof. Ravi Sandhu
Executive Director and Endowed Chair
March 29, 2013
© Ravi Sandhu World-Leading Research with Real-World Impact!
CS 6393 Lecture 9
Domingo-Ferrer 2007
Database Privacy
3 independent dimensions: Respondent, Owner, User
Fung et al 2010
PPDP
[Figure, built up over four slides: Record Owners -> Data Collection -> Data Publisher / Data Collector -> Data Publication -> Data Recipient / Data Miner]
Figure labels: Trusted, Willing, Data Utility, Privacy Exposure, Potential Attacker
Related but not Synonymous
Privacy preserving data mining (PPDM)
  How to do data mining when the publisher has modified the data to obscure sensitive information?
  How to modify the data to obscure sensitive information without losing the ability to data mine?
  Techniques are often tied to the data mining task.
  The term PPDM is used even when no data mining as such is being done.
Records in a Table
Single table
Each record pertains to a distinct owner (typically)
4 kinds of attributes (disjoint):
  Explicit identifier
  Quasi identifier (QID)
  Sensitive attributes
  Non-sensitive attributes
Anonymization techniques
  Modified quasi identifier (QID’)
  Add noise
  Generate synthetic data “similar” to original
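The four disjoint attribute kinds can be made concrete with a small sketch. Everything below is an assumption for illustration: the column names, the values, and the helper names `columns_of` and `drop_explicit_identifiers` do not come from the lecture.

```python
# Hypothetical table: every column name and value here is invented for illustration.
RECORDS = [
    {"name": "Alice", "job": "Writer", "sex": "Female", "age": 30,
     "disease": "Flu", "newsletter": True},
    {"name": "Bob", "job": "Engineer", "sex": "Male", "age": 35,
     "disease": "Hepatitis", "newsletter": False},
]

# Each column belongs to exactly one of the four kinds (they are disjoint).
ATTRIBUTE_KIND = {
    "name": "explicit_identifier",
    "job": "quasi_identifier",
    "sex": "quasi_identifier",
    "age": "quasi_identifier",
    "disease": "sensitive",
    "newsletter": "non_sensitive",
}


def columns_of(kind):
    """Return the column names classified under the given kind."""
    return [col for col, k in ATTRIBUTE_KIND.items() if k == kind]


def drop_explicit_identifiers(records):
    """Remove explicit identifiers; the QID columns are left untouched."""
    explicit = set(columns_of("explicit_identifier"))
    return [{c: v for c, v in r.items() if c not in explicit} for r in records]
```

Dropping the explicit identifier is only a first step: the quasi identifier columns are still present, which is why the anonymization techniques listed above modify the QID as well.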
Privacy Definition: Absolute Privacy
Absolute Privacy, Dalenius 1977
  Access to published data should not enable the attacker to learn anything extra about any target victim compared to no access to the database, even in the presence of any attacker’s background knowledge obtained from other sources.
Impossible, Dwork 2006
  Even if the published data does not include the target victim’s record, the attacker can still learn something about the target victim from the published data and background knowledge.
Privacy Definition: Differential Privacy
Differential Privacy, Dwork 2006
  Compare risk to target victim’s privacy with or without presence of target victim’s record in published database.
  Risk should not substantially increase if the record is included.
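The "with or without the record" comparison is usually formalized by requiring that the probability of any output changes by at most a factor of e^epsilon when one record is added or removed. The sketch below is not from the slides: it uses the Laplace mechanism on a count query, a common textbook choice, and the names `noisy_count`, `laplace_density` and `density_ratios` are hypothetical helpers.

```python
import math
import random


def laplace_noise(scale, rng):
    """Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(records, predicate, epsilon, rng):
    """Count matching records, then add noise with scale 1/epsilon.

    A count changes by at most 1 when one record is added or removed,
    so the sensitivity is 1 (an assumption tied to count queries only).
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


def laplace_density(x, center, scale):
    """Density of the noisy answer x when the true count is `center`."""
    return math.exp(-abs(x - center) / scale) / (2.0 * scale)


def density_ratios(count_with, count_without, epsilon, outputs):
    """Ratio of output densities with and without the victim's record."""
    scale = 1.0 / epsilon
    return [
        laplace_density(x, count_with, scale)
        / laplace_density(x, count_without, scale)
        for x in outputs
    ]
```

For true counts that differ by one, every ratio returned by `density_ratios` stays between e^(-epsilon) and e^epsilon, which is the sense in which the risk does not substantially increase when the record is included.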
Privacy Definition: Uninformative Principle
Uninformative Principle, Machanavajjhala et al 2006
  Difference between prior and posterior beliefs is small
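A rough numeric reading of prior versus posterior belief, under loud assumptions: the attacker's prior is taken to be the table-wide frequency of a sensitive value, and the posterior is the frequency inside the group of records matching the victim's quasi identifier. The table and the function names are invented for illustration; real attackers may hold other background knowledge.

```python
from fractions import Fraction

# Hypothetical published table with an already generalized QID (job, sex).
PUBLISHED = [
    {"job": "Professional", "sex": "Male", "disease": "Flu"},
    {"job": "Professional", "sex": "Male", "disease": "Flu"},
    {"job": "Professional", "sex": "Male", "disease": "HIV"},
    {"job": "Artist", "sex": "Female", "disease": "HIV"},
    {"job": "Artist", "sex": "Female", "disease": "HIV"},
    {"job": "Artist", "sex": "Female", "disease": "HIV"},
]


def belief(records, sensitive_value):
    """Fraction of the given records that carry the sensitive value."""
    if not records:
        return Fraction(0)
    hits = sum(1 for r in records if r["disease"] == sensitive_value)
    return Fraction(hits, len(records))


def prior_and_posterior(published, victim_qid, sensitive_value):
    """Belief before and after narrowing the table to the victim's QID group."""
    prior = belief(published, sensitive_value)
    group = [r for r in published
             if all(r[a] == v for a, v in victim_qid.items())]
    return prior, belief(group, sensitive_value)
```

In this toy table an attacker who knows the victim is a female artist moves from a belief of 2/3 to certainty that she has HIV, so the difference is not small and the principle is violated.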
Privacy Definition: Linkage Attacks
Record linkage
Attribute linkage
Table linkage
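Record linkage can be sketched as a join on the quasi identifier between the published table and an external table that carries names. The two tables, the QID choice and the function `link_records` are assumptions made for this example.

```python
# Hypothetical data: the published table has no names, the external one does.
QID = ("job", "sex", "age")

PUBLISHED = [
    {"job": "Engineer", "sex": "Male", "age": 35, "disease": "Hepatitis"},
    {"job": "Lawyer", "sex": "Male", "age": 38, "disease": "HIV"},
    {"job": "Writer", "sex": "Female", "age": 30, "disease": "Flu"},
    {"job": "Writer", "sex": "Female", "age": 30, "disease": "HIV"},
]

EXTERNAL = [
    {"name": "Bob", "job": "Engineer", "sex": "Male", "age": 35},
    {"name": "Alice", "job": "Writer", "sex": "Female", "age": 30},
]


def link_records(published, external, qid=QID):
    """Map each external person to the published records sharing their QID."""
    index = {}
    for row in published:
        index.setdefault(tuple(row[a] for a in qid), []).append(row)
    return {
        person["name"]: index.get(tuple(person[a] for a in qid), [])
        for person in external
    }
```

Here Bob matches exactly one published record, so his record is linked outright. Alice matches two records with different diseases, so her exact record stays ambiguous even though the attacker has narrowed the candidates.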
Table Linkage
[Figure: a published table and a public table; the published table is known to be a subset of the public table]
Probability that Alice is in (c) is 4/5
Probability that Bob is in (c) is 3/4
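One way to arrive at such presence probabilities is to divide the number of published records in the victim's QID group by the number of public records in the same group. The sketch assumes the published table is a subset of the public one and that every public record in a group is equally likely to be present. The group sizes are hypothetical, chosen only so that the results match the 4/5 and 3/4 on the slide; the slide's actual tables are not in the transcript.

```python
from collections import Counter
from fractions import Fraction


def presence_probability(published_qids, public_qids, victim_qid):
    """Share of the victim's public QID group that appears in the published table."""
    in_published = Counter(published_qids)[victim_qid]
    in_public = Counter(public_qids)[victim_qid]
    if in_public == 0:
        raise ValueError("victim's QID group is absent from the public table")
    return Fraction(in_published, in_public)


# Hypothetical generalized QID groups and sizes.
PROFESSIONAL_MALE = ("Professional", "Male")
ARTIST_FEMALE = ("Artist", "Female")

PUBLISHED_QIDS = [PROFESSIONAL_MALE] * 3 + [ARTIST_FEMALE] * 4
PUBLIC_QIDS = [PROFESSIONAL_MALE] * 4 + [ARTIST_FEMALE] * 5

alice = presence_probability(PUBLISHED_QIDS, PUBLIC_QIDS, ARTIST_FEMALE)
bob = presence_probability(PUBLISHED_QIDS, PUBLIC_QIDS, PROFESSIONAL_MALE)
```

With these counts `alice` is 4/5 and `bob` is 3/4.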
Anonymization Techniques
Generalization and suppression
Anatomization and permutation
Perturbation
Generalization
Full domain generalization
  Generalize to same level in tree
Subtree generalization
Sibling generalization
Cell generalization
Local recoding versus global recoding for above
Multi-dimensional generalization
  Generalize Engineer,Male -> Engineer,Any
  Generalize Engineer,Female -> Professional,Female
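A minimal sketch of two of these schemes, full domain generalization and multi-dimensional generalization. The taxonomy trees are assumed (the slides' trees are not in the transcript), all leaves are assumed to sit at the same depth, and the helper names are hypothetical. Only the two multi-dimensional rules come from the slide.

```python
# Assumed taxonomy trees, stored as child -> parent maps with root "Any".
JOB_PARENT = {
    "Engineer": "Professional", "Lawyer": "Professional",
    "Dancer": "Artist", "Writer": "Artist",
    "Professional": "Any", "Artist": "Any",
}
SEX_PARENT = {"Male": "Any", "Female": "Any"}

RECORDS = [
    {"job": "Engineer", "sex": "Male"},
    {"job": "Engineer", "sex": "Female"},
    {"job": "Dancer", "sex": "Female"},
]


def generalize(value, parent, steps):
    """Climb `steps` levels up the tree; the root maps to itself."""
    for _ in range(steps):
        value = parent.get(value, value)
    return value


def full_domain(records, attribute, parent, steps):
    """Generalize every value of one attribute to the same tree level."""
    return [{**r, attribute: generalize(r[attribute], parent, steps)}
            for r in records]


# The two rules from the slide: each QID combination gets its own mapping.
SLIDE_RULES = {
    ("Engineer", "Male"): ("Engineer", "Any"),
    ("Engineer", "Female"): ("Professional", "Female"),
}


def multidimensional(records, rules, qid=("job", "sex")):
    """Rewrite whole QID tuples; tuples without a rule are left unchanged."""
    out = []
    for r in records:
        key = tuple(r[a] for a in qid)
        out.append({**r, **dict(zip(qid, rules.get(key, key)))})
    return out
```

Full domain generalization treats one attribute uniformly, while the multi-dimensional rules generalize sex for male engineers but job for female engineers, which a single-attribute scheme cannot express.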
Suppression
Record suppression
Value suppression (globally)
Cell suppression (local value suppression)
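The three suppression granularities can be contrasted in a short sketch. The `*` marker, the sample table and the function names are assumptions for illustration, not part of the lecture.

```python
SUPPRESSED = "*"  # assumed placeholder for a suppressed value

# Hypothetical table.
TABLE = [
    {"job": "Engineer", "sex": "Male", "disease": "Flu"},
    {"job": "Dancer", "sex": "Female", "disease": "HIV"},
    {"job": "Engineer", "sex": "Female", "disease": "Flu"},
]


def suppress_records(records, should_drop):
    """Record suppression: remove entire records."""
    return [r for r in records if not should_drop(r)]


def suppress_value(records, attribute, value):
    """Value suppression: hide every occurrence of a value in the table."""
    return [
        {**r, attribute: SUPPRESSED} if r[attribute] == value else dict(r)
        for r in records
    ]


def suppress_cells(records, attribute, row_indexes):
    """Cell suppression: hide the attribute only in the chosen rows."""
    chosen = set(row_indexes)
    return [
        {**r, attribute: SUPPRESSED} if i in chosen else dict(r)
        for i, r in enumerate(records)
    ]
```

For example, `suppress_value(TABLE, "job", "Engineer")` hides both Engineer entries, while `suppress_cells(TABLE, "job", [0])` hides only the first and leaves the other instance visible.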