an automated social graph de-anonymization technique · pokec: fp vs tp (self-validation) scheme 1...
TRANSCRIPT
![Page 1: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/1.jpg)
An Automated Social GraphDe-anonymization Technique
Kumar Sharad 1 George Danezis 2
1 2
November 3, 2014
Workshop on Privacy in the Electronic Society, Scottsdale, Arizona, USA
![Page 2: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/2.jpg)
2
This Talk
1 The Art of Data Anonymization
2 The D4D Challenge
3 An Ad-hoc Attack
4 Learning De-anonymization
5 Results
![Page 3: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/3.jpg)
3
The Art of Data Anonymization
![Page 4: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/4.jpg)
4
Releasing Anonymized Data
Motivation: Process data without jeopardizing privacy.
Popular: Randomize identifiers and/or perturb data.
Pros: Cheap, preserves utility, provides legal immunity.
Cons: Practiced as an art form.
![Page 5: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/5.jpg)
5
The Data for Development(D4D) Challenge
![Page 6: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/6.jpg)
6
The D4D Challenge1
Introduced by a large Telco for research related to socialdevelopment in Ivory Coast.
Four datasets of anonymized call patterns released.
Datasets include: Antenna-to-antenna calls, individualtrajectories of varying spatial resolution and call graphs.
Ivory Coast facts:
Population – 22.4 million.Mobile phone users – 17.3 million.Telco subscribers – 5 million.A country fraught with civil war.
1http://www.d4d.orange.com/
![Page 7: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/7.jpg)
7
Timeline
July 2012: A preliminary version of the datasets madeavailable to us for evaluation.
September 2012: We provide feedback depicting weaknessesof the scheme, specifically the anonymized call graphs.
Late 2012: The challenge goes live after strengthening theanonymization. Released under strict NDA.
![Page 8: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/8.jpg)
8
The Dataset 4: Anonymized Call Graphs
2-hop communication network (egonet) of an individual.
Vertices represent users and edges their interactions.
Scheme 1 (pre-review):
8300 egonets.Edge attributes: call volume, duration and directionality.
Scheme 2 (post-review):
5000 egonets.All edges between 2-hop nodes are removed.Edge attributes: redacted.
![Page 9: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/9.jpg)
9
Scheme 1 vs. Scheme 2: Illustrated
ego
1-hop
1-hop
1-hop
2-hop
2-hop
2-hop 2-hop
Scheme 1: Pre-review
ego
1-hop
1-hop
1-hop
2-hop
2-hop
2-hop 2-hop
Scheme 2: Post-review
![Page 10: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/10.jpg)
9
Scheme 1 vs. Scheme 2: Illustrated
ego
1-hop
1-hop
1-hop
2-hop
2-hop
2-hop 2-hop
Scheme 1: Pre-review
ego
1-hop
1-hop
1-hop
2-hop
2-hop
2-hop 2-hop
Scheme 2: Post-review
![Page 11: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/11.jpg)
9
Scheme 1 vs. Scheme 2: Illustrated
ego
1-hop
1-hop
1-hop
2-hop
2-hop
2-hop 2-hop
Scheme 1: Pre-review
ego
1-hop
1-hop
1-hop
2-hop
2-hop
2-hop 2-hop
Scheme 2: Post-review
![Page 12: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/12.jpg)
10
Anonymization Strategy
Individuals picked at random.
Identifiers randomized in each egonet.
Tries to conceal the larger graph.
Hope: Facilitate analysis while preserving privacy.
Anonymity strengthened by redacting information.
![Page 13: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/13.jpg)
11
How to Evaluate Anonymization Schemes?
Option 1: We believe the scheme is secure.
Hard to merge the egonets.Difficulty of linking egonets should be quantifiable.
Option 2: We believe the scheme is insecure.
Show that a significant fraction of egonets can be re-linked.Discern real world identities.Recover full communication graph.
Gap: Lack of an attack does not imply security.
![Page 14: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/14.jpg)
11
How to Evaluate Anonymization Schemes?
Option 1: We believe the scheme is secure.
Hard to merge the egonets.Difficulty of linking egonets should be quantifiable.
Option 2: We believe the scheme is insecure.
Show that a significant fraction of egonets can be re-linked.Discern real world identities.Recover full communication graph.
Gap: Lack of an attack does not imply security.
![Page 15: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/15.jpg)
11
How to Evaluate Anonymization Schemes?
Option 1: We believe the scheme is secure.
Hard to merge the egonets.Difficulty of linking egonets should be quantifiable.
Option 2: We believe the scheme is insecure.
Show that a significant fraction of egonets can be re-linked.Discern real world identities.Recover full communication graph.
Gap: Lack of an attack does not imply security.
![Page 16: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/16.jpg)
12
An Ad-hoc Attack
![Page 17: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/17.jpg)
13
Ad-hoc Attack on Scheme 1
Transformation into egonets preserves an important variant.
The degree of egos and 1-hop nodes is preserved.
Degrees of the 1-hop sub-graph of 1-hop nodes is preserved.
Can be used as a stable signature.
![Page 18: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/18.jpg)
14
Ad-hoc Attack on Scheme 1: Illustrated
ego
1-hop
1-hop
1-hop
2-hop
2-hop
2-hop 2-hop
![Page 19: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/19.jpg)
14
Ad-hoc Attack on Scheme 1: Illustrated
ego
1-hop
1-hop
1-hop
2-hop
2-hop
2-hop 2-hop
A
![Page 20: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/20.jpg)
14
Ad-hoc Attack on Scheme 1: Illustrated
ego
1-hop
1-hop
1-hop
2-hop
2-hop
2-hop 2-hop
A
deg: 1
deg: 3
deg: 2
deg: 2
![Page 21: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/21.jpg)
14
Ad-hoc Attack on Scheme 1: Illustrated
ego
1-hop
1-hop
1-hop
2-hop
2-hop
2-hop 2-hop
A
deg: 1
deg: 3
deg: 2
deg: 2
sig: [1, 2, 2, 3]
![Page 22: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/22.jpg)
14
Ad-hoc Attack on Scheme 1: Illustrated
ego
1-hop
1-hop
1-hop
2-hop
2-hop
2-hop 2-hop
A
deg: 1
sig: [1, 2, 2, 3]
Bdeg: 1
deg: 4
deg: 1
deg: 1
sig: [1, 1, 1, 4]
![Page 23: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/23.jpg)
15
Success Rate: Scheme 1
100% match for identical node pairs (theoretical).
Over 99.9% mismatch for non-identical node pairs.
![Page 24: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/24.jpg)
16
Learning De-anonymization
![Page 25: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/25.jpg)
17
Security Economics: Attacking a Class of Schemes
Scheme 2 defeats the ad-hoc attack.
A piecemeal approach towards de-anonymization does notscale.
Defeating an instance of anonymization is not generalizable.
Can we generalize attacks?
![Page 26: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/26.jpg)
18
A Machine Learning Approach
Traditional approach:
1 An anonymization strategy is designed.2 Manually construct an attack.3 Strategy is tweaked.4 GO TO 2.
Machine learning approach:
1 An anonymization strategy is designed.2 Generate training and test data based on the algorithm.3 Extract features.4 Train the model.5 Evaluate the performance
![Page 27: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/27.jpg)
19
The Model for D4D Learning Task
Original Call Graph
Anonymization Process
Anonymized Egonets
Training Set
Known node pairs
Evaluation Set
Identical node pair?
![Page 28: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/28.jpg)
19
The Model for D4D Learning Task
Original Call Graph
Anonymization Process
Anonymized Egonets
Training Set
Known node pairs
Evaluation Set
Identical node pair?
![Page 29: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/29.jpg)
19
The Model for D4D Learning Task
Original Call Graph
Anonymization Process
Anonymized Egonets
Training Set
Known node pairs
Evaluation Set
Identical node pair?
![Page 30: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/30.jpg)
19
The Model for D4D Learning Task
Original Call Graph
Anonymization Process
Anonymized Egonets
Training Set
Known node pairs
Evaluation Set
Identical node pair?
![Page 31: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/31.jpg)
19
The Model for D4D Learning Task
Original Call Graph
Anonymization Process
Anonymized Egonets
Training Set
Known node pairs
Evaluation Set
Identical node pair?
![Page 32: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/32.jpg)
20
Node Features
Must distinguish identical and non-identical node pairs.
Feature vector purely based on topology (no edge weights ordirectionality).
Too generic: high false positives.
Too specific: low true positives.
Extend the signature by quantizing it.
![Page 33: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/33.jpg)
21
Internals: Feature Vector
Feature vector of a node with neighbors of degrees –[1, 1, 3, 3, 5, 6, 7, 13, 16, 20, 21, 30, 65, 69, 72, 1030, 1100].
c0 = 8 c1 = 4 c2 = 0
size = 15
70 bins. . .
. . .c4 = 3
. . .
. . .c69 = 2
![Page 34: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/34.jpg)
21
Internals: Feature Vector
Feature vector of a node with neighbors of degrees –[1, 1, 3, 3, 5, 6, 7, 13, 16, 20, 21, 30, 65, 69, 72, 1030, 1100].
c0 = 8 c1 = 4 c2 = 0
size = 15
70 bins. . .
. . .c4 = 3
. . .
. . .c69 = 2
![Page 35: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/35.jpg)
22
Internals: Feature Vector
Feature vector of a node with neighbors of degrees –[ 1, 1, 3, 3, 5, 6, 7, 13 , 16, 20, 21, 30, 65, 69, 72, 1030, 1100].
c0 = 8 c1 = 4 c2 = 0
size = 15
70 bins. . .
. . .c4 = 3
. . .
. . .c69 = 2
![Page 36: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/36.jpg)
23
Internals: Feature Vector
Feature vector of a node with neighbors of degrees –[1, 1, 3, 3, 5, 6, 7, 13, 16, 20, 21, 30 , 65, 69, 72, 1030, 1100].
c0 = 8 c1 = 4 c2 = 0
size = 15
70 bins. . .
. . .c4 = 3
. . .
. . .c69 = 2
![Page 37: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/37.jpg)
24
Internals: Feature Vector
Feature vector of a node with neighbors of degrees –[1, 1, 3, 3, 5, 6, 7, 13, 16, 20, 21, 30, 65, 69, 72, 1030, 1100].
c0 = 8 c1 = 4 c2 = 0
size = 15
70 bins. . .
. . .c4 = 3
. . .
. . .c69 = 2
![Page 38: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/38.jpg)
25
Internals: Feature Vector
Feature vector of a node with neighbors of degrees –[1, 1, 3, 3, 5, 6, 7, 13, 16, 20, 21, 30, 65, 69, 72 , 1030, 1100].
c0 = 8 c1 = 4 c2 = 0
size = 15
70 bins. . .
. . .c4 = 3
. . .
. . .c69 = 2
![Page 39: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/39.jpg)
26
Internals: Feature Vector
Feature vector of a node with neighbors of degrees –[1, 1, 3, 3, 5, 6, 7, 13, 16, 20, 21, 30, 65, 69, 72, 1030, 1100 ].
c0 = 8 c1 = 4 c2 = 0
size = 15
70 bins. . .
. . .c4 = 3
. . .
. . .c69 = 2
![Page 40: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/40.jpg)
27
Internals: Random Forest
400 trees trained.
Identical node pair types: 1-hop, 1,2-hop and 2-hop.
4 random forests trained: 1 per category + 1 generic
Prediction: Aggregate the decision of all trees.
![Page 41: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/41.jpg)
28
Results
![Page 42: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/42.jpg)
29
Evaluation: Datasets
Evaluation does NOT use D4D datasets.
Ethical concernsLack of ground truth.
Publicly available datasets used
D4D (5M nodes) – 5000 egonets released.Epinions (75K nodes) – 100 egonets extracted.Pokec (1.6M nodes) – 1000 egonets extracted.
![Page 43: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/43.jpg)
30
Pokec Dataset: ROC Curves
0.0 0.2 0.4 0.6 0.8 1.0
False Positive
0.0
0.2
0.4
0.6
0.8
1.0
Tru
e P
osi
tive
1-hop: AUC = 0.952
1,2-hop: AUC = 0.914
2-hop: AUC = 0.802
Complete: AUC = 0.793
Pokec: Scheme 1 (self-validation)
0.0 0.2 0.4 0.6 0.8 1.0
False Positive
0.0
0.2
0.4
0.6
0.8
1.0
Tru
e P
osi
tive
1-hop: AUC = 0.978
1,2-hop: AUC = 0.930
2-hop: AUC = 0.984
Complete: AUC = 0.891
Pokec: Scheme 2 (self-validation)
![Page 44: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/44.jpg)
31
Pokec: FP vs TP (self-validation)
Scheme 1
False Positive 0.01% 0.1% 1% 10% 25%
1-hop 27.50 42.92 51.04 88.75 93.961,2-hop 5.25 11.58 36.16 73.24 88.682-hop 0.00 12.55 23.15 49.14 69.96Complete 0.01 10.44 20.48 47.60 68.36
Scheme 2
False Positive 0.01% 0.1% 1% 10% 25%
1-hop 4.20 16.26 49.89 97.20 99.581,2-hop 0.79 6.41 28.32 73.88 94.662-hop 1.62 12.12 50.42 99.96 99.99Complete 0.68 6.12 21.14 64.12 86.10
![Page 45: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/45.jpg)
32
Pokec: FP vs TP (self-validation)
Scheme 1
False Positive 0.01% 0.1% 1% 10% 25%
1-hop 27.50 42.92 51.04 88.75 93.961,2-hop 5.25 11.58 36.16 73.24 88.682-hop 0.00 12.55 23.15 49.14 69.96Complete 0.01 10.44 20.48 47.60 68.36
Scheme 2
False Positive 0.01% 0.1% 1% 10% 25%
1-hop 4.20 16.26 49.89 97.20 99.581,2-hop 0.79 6.41 28.32 73.88 94.662-hop 1.62 12.12 50.42 99.96 99.99Complete 0.68 6.12 21.14 64.12 86.10
![Page 46: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/46.jpg)
33
Claim of Generality
Random forest uncovers artifacts and invariants of theanonymization algorithm not merely quirks of the input data.
Learning de-anonymization allows it to attack previouslyunseen data (x-validation).
Ideal: training and test distributions are close.
De-anonymization is successful for a variety of schemes.
![Page 47: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/47.jpg)
34
Pokec: FP vs TP (x-validation)
Scheme 1
False Positive 0.01% 0.1% 1% 10% 25%
1-hop 19.38 27.29 34.79 57.92 76.251,2-hop 2.98 10.10 26.52 70.37 90.722-hop 1.71 4.18 18.84 39.12 52.52Complete 1.89 4.05 16.83 36.81 50.76
Scheme 2
False Positive 0.01% 0.1% 1% 10% 25%
1-hop 2.11 5.40 12.29 28.29 60.261,2-hop 0.18 2.08 14.34 49.25 70.762-hop 3.02 13.57 45.45 99.80 100.00Complete 1.00 5.61 19.22 56.90 72.76
![Page 48: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/48.jpg)
35
Pokec: FP vs TP (x-validation)
Scheme 1
False Positive 0.01% 0.1% 1% 10% 25%
1-hop 19.38 27.29 34.79 57.92 76.251,2-hop 2.98 10.10 26.52 70.37 90.722-hop 1.71 4.18 18.84 39.12 52.52Complete 1.89 4.05 16.83 36.81 50.76
Scheme 2
False Positive 0.01% 0.1% 1% 10% 25%
1-hop 2.11 5.40 12.29 28.29 60.261,2-hop 0.18 2.08 14.34 49.25 70.762-hop 3.02 13.57 45.45 99.80 100.00Complete 1.00 5.61 19.22 56.90 72.76
![Page 49: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/49.jpg)
36
Concluding Remarks
What TP rate is acceptable?
What rate of de-anonymization is secure?
Lower bound on attack performance but cheaper evaluations.
What are the definitive set of features?
![Page 50: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/50.jpg)
37
Summary
Ad-hoc attack works but limited.
Better: Construct attacks by using machine learning.
Generic: Attack works even on learning from a differentdataset.
![Page 51: An Automated Social Graph De-anonymization Technique · Pokec: FP vs TP (self-validation) Scheme 1 False Positive 0.01% 0.1% 1% 10% 25% 1-hop 27:50 42:92 51:04 88:75 93:96 1,2-hop](https://reader033.vdocuments.us/reader033/viewer/2022051916/60089813cb67952f0e4ad330/html5/thumbnails/51.jpg)
38
Contact
Paper: research.ksharad.com
Authors
Kumar Sharad� [email protected]
� ksharad.com
George Danezis� [email protected]
� cs.ucl.ac.uk/staff/G.Danezis