
Post on 18-Jan-2016








On Your Social Network De-anonymizablity: Quantification and Large Scale Evaluation with Seed Knowledge

NDSS 2015, Shouling Ji, Georgia Institute of Technology

Fengli Zhang



• Introduction • Motivation • Contribution • De-anonymization Quantification • Evaluation • Conclusion


• As social networks have become deeply integrated into people’s lives, they produce a significant amount of social data that contains their users’ detailed personal information
• To protect users’ privacy, data owners usually anonymize their data before it is shared, transferred, or published: naïve ID removal, k-anonymization, differential privacy

• Existing anonymization schemes have vulnerabilities. Structure based de-anonymization attacks can break the privacy of social networks effectively based only on the data’s structural information.

De-anonymization Attack


• Question 1: Why are social networks vulnerable to structure based de-anonymization attacks?
• Question 2: How de-anonymizable is a social network?
• Question 3: How many users within a social network can be successfully de-anonymized?


• Provide the first theoretical quantification of the perfect and partial de-anonymizability of social networks in general scenarios, where the social network can follow an arbitrary network model
• Implement the first large scale evaluation of the perfect and partial de-anonymizability of 24 real world social networks
• Find that, in addition to the structural information associated with known seed users, the other structural information (the structural information among anonymized users) is also useful in improving structure based de-anonymization attacks

Data Model
• Anonymized data: G^a = (V^a, E^a)
• Auxiliary data: G^u = (V^u, E^u)
• De-anonymization scheme σ: σ is a mapping such that if i ∈ V^a, then σ(i) ∈ V^u
• Seed mapping S: S = {(i, σ(i)) | i ∈ V^a, σ(i) ∈ V^u}, Λ = |S|
• Conceptual underlying graph G
• Sampling rate s
• Measurement

System Model

• Edge difference between the anonymized data and the auxiliary data under σ
• For a mapping (i, σ(i) = j) ∈ σ
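The data model and the edge difference can be illustrated with a minimal sketch. It assumes that the anonymized and auxiliary graphs are each obtained by keeping every edge of the underlying graph independently with probability s, and that the edge difference under σ is the number of edges that disagree once anonymized users are mapped through σ (a symmetric difference). The exact formula on the slide is not recoverable here, and all function names below are hypothetical.

```python
import itertools
import random


def er_graph(n, p, rng):
    """Erdos-Renyi G(n, p) as a set of undirected edges (u, v) with u < v."""
    return {(u, v) for u, v in itertools.combinations(range(n), 2)
            if rng.random() < p}


def sample_edges(edges, s, rng):
    """Keep each edge independently with probability s (the sampling rate)."""
    return {e for e in edges if rng.random() < s}


def edge_difference(anon_edges, aux_edges, sigma):
    """Count edges that disagree after mapping anonymized users through sigma.

    Assumption: 'edge difference' is taken as the size of the symmetric
    difference between the mapped anonymized edge set and the auxiliary one.
    """
    mapped = {tuple(sorted((sigma[u], sigma[v]))) for u, v in anon_edges}
    return len(mapped ^ aux_edges)


rng = random.Random(0)
n = 30
g = er_graph(n, 0.3, rng)          # conceptual underlying graph G
anon = sample_edges(g, 0.8, rng)   # anonymized data (node labels kept for simplicity)
aux = sample_edges(g, 0.8, rng)    # auxiliary data

truth = {i: i for i in range(n)}   # correct mapping in this toy setup
perm = list(range(n))
rng.shuffle(perm)
guess = {i: perm[i] for i in range(n)}  # an arbitrary candidate mapping

print(edge_difference(anon, aux, truth), edge_difference(anon, aux, guess))
```

Intuitively, a structure based attack looks for a mapping with a small edge difference; the sketch only evaluates a given mapping and does not search for one.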

De-anonymization Quantification

• Graph G: Erdős-Rényi (ER) model; general model
• Quantification: seed based perfect de-anonymization; structure based perfect de-anonymization

De-anonymization Quantification

Error Toleration Quantification: We define an anonymized network with n users to be (1 − ϵ)-de-anonymizable if at least (1 − ϵ)n of its users are perfectly de-anonymizable. That is, at most ϵn incorrect de-anonymizations are allowed.
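As a small illustration of the error budget, the sketch below checks whether a given de-anonymization result stays within ϵn incorrect mappings. It tests one concrete mapping against a known ground truth, which is a simplification of the definition above (the definition concerns what is achievable, not one particular outcome); the function name and the dictionary representation are assumptions.

```python
def within_error_budget(sigma, truth, eps):
    """Return True if sigma makes at most eps * n incorrect de-anonymizations.

    sigma and truth map anonymized user -> auxiliary user; truth is the
    (hypothetical) ground-truth mapping over all n users. Users missing from
    sigma count as incorrect.
    """
    n = len(truth)
    incorrect = sum(1 for i, j in truth.items() if sigma.get(i) != j)
    return incorrect <= eps * n


truth = {i: i for i in range(10)}
result = dict(truth)
result[0], result[1] = 1, 0  # two users swapped, so two errors

print(within_error_budget(result, truth, 0.1))  # budget of 1 error
print(within_error_budget(result, truth, 0.2))  # budget of 2 errors
```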



• Suffixes
 - -S: using seed information
 - -A or no suffix: using overall structural information
 - e.g. Twitter-A, Twitter-S
• Seed mappings are chosen randomly
 - High-degree users are not given preference
 - Represents the general scenarios
• Two-part evaluation
 - Evaluation of perfect de-anonymizability
 - Evaluation of (1 − ϵ)-de-anonymizability
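The random seed selection described above might look like the following sketch, which draws seed users uniformly at random with no preference for high-degree users. The function name and the dictionary-based ground truth are assumptions made for illustration only.

```python
import random


def pick_random_seeds(truth, num_seeds, rng):
    """Pick num_seeds seed mappings uniformly at random from the ground truth.

    truth maps anonymized user -> auxiliary user. Degree is deliberately
    ignored, so high-degree users get no preference.
    """
    chosen = rng.sample(sorted(truth), num_seeds)
    return {i: truth[i] for i in chosen}


rng = random.Random(42)
truth = {i: 100 + i for i in range(20)}  # toy ground-truth mapping
seeds = pick_random_seeds(truth, 5, rng)
print(len(seeds))
```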

Evaluation: Perfect De-anonymizability [1/3]

Evaluation: Perfect De-anonymizability [2/3]

Evaluation: Perfect De-anonymizability [3/3]

Evaluation: Partial De-anonymizability [1/2]

Evaluation: Partial De-anonymizability [2/2]

Evaluation Overview

Conclusion & Limitation

• Provides the theoretical foundation for the existing de-anonymization attacks with seed information
• De-anonymization based on the overall structural information is more powerful, and it can perfectly de-anonymize a social network even without any seed information
• Does not specifically consider how to design structural data anonymization techniques to defend against such de-anonymization attacks
• Does not explicitly involve a noise model, because there is no proper scheme for adding noise while preserving data utility
