Online testing of user profile resilience against inference attacks in social networks

Younes Abid, Abdessamad Imine, Michael Rusinowitch

INRIA/LORIA
Supported by MAIF Foundation

09/02/2018

Music and politics on Facebook in the 2014 Midterm Elections by Winter Mason

Introduction

- OSN data can be correlated to infer sensitive information [Heatherly et al., 2013].
- Information leakage can have a harmful impact [e.g. Cambridge Analytica].
- Most users do not know how to protect their personal data.
- We aim to design a tool that helps users protect their privacy.

Sensitive information:

- Links: friendship, group membership, ...
- Attribute values: age, relationship status, preferred politician, ...

Privacy attacks:

- Disclosure attacks: bypass the visibility settings
- Inference attacks: correlation techniques to guess information

SONSAI: SOcial Networks Sensitive Attribute Inference system

- Checks user security against link disclosure and attribute inference attacks on Facebook in a few minutes
- Identifies the origin of the risks to help the user take action

Inputs:

- Facebook ID of the target
- Sensitive attribute

Outputs:

- Disclosed links between the target and other users
- Inferred values of sensitive attributes of the target
- Attributes most correlated to the sensitive one

Comparison with related works on attribute inference

- [Ryu et al., 2013] use several statistical measures.
- [Heatherly et al., 2013] use Bayesian classification.
- [Estivill-Castro et al., 2014] use decision trees.
- Like [Perozzi et al., 2014], we use random walks to perform inference.

However,

1. We first perform link disclosure attacks and collect hidden information in the local network around a targeted user.
2. We design cleansing techniques to select the most relevant attributes (to speed up sensitive attribute inference).
3. We perform online analysis on sparse datasets (rather than offline on large datasets).

SONSAI: SOcial Networks Sensitive Attribute Inference system

Figure: Architecture of SONSAI.

SONSAI Architecture

Link disclosure

Requests

- Friendship requests: /friendship/<IDT>/<ID2>
  - returns true if T and u2 are friends and u2 publishes his friend list
- Mutual friend requests: /browse/mutual_friends/?uid=<ID1>&node=<ID2>
  - returns a list of common friends between T and u2 if u2 publishes his friend list
  - only common friends that publish their friend list are returned
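
The behaviour of these two requests can be simulated offline on a toy graph. The sketch below does not call Facebook; `FRIENDS`, `PUBLISHES_LIST`, `friendship_request`, `mutual_friends_request` and `disclose_links` are hypothetical names, and the data is invented purely to illustrate the visibility rules stated above.

```python
# Toy friendship graph (invented data): user -> set of friends.
FRIENDS = {
    "T": {"a", "b", "c"},
    "a": {"T", "b"},
    "b": {"T", "a", "c"},
    "c": {"T", "b"},
}
# Users whose friend list is publicly visible; T and c hide theirs.
PUBLISHES_LIST = {"a", "b"}


def friendship_request(target, other):
    """True only if the two are friends and `other` publishes a friend list."""
    return other in PUBLISHES_LIST and other in FRIENDS[target]


def mutual_friends_request(target, other):
    """Common friends of `target` and `other`, filtered by the visibility rules."""
    if other not in PUBLISHES_LIST:
        return set()
    common = FRIENDS[target] & FRIENDS[other]
    # Only common friends that publish their own friend list are returned.
    return {u for u in common if u in PUBLISHES_LIST}


def disclose_links(target, candidates):
    """Collect the friends of `target` revealed by either request."""
    disclosed = set()
    for other in candidates:
        if friendship_request(target, other):
            disclosed.add(other)
        disclosed |= mutual_friends_request(target, other)
    return disclosed


links = disclose_links("T", ["a", "b", "c"])
print(sorted(links))
```

In this toy setting the target hides its friend list, yet the links to `a` and `b` are revealed because those two users publish theirs; the link to `c` stays hidden.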

- We have noticed that:
  - 49% of users publish at least one membership to a small group (maximum 50 members)
  - The average density of small groups is 0.3, where $\text{density} = \frac{|\text{existing links}|}{|\text{possible links}|}$
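
As a worked illustration of the density measure, the sketch below computes it for a small invented group. It assumes friendship is undirected, so a group of n members has n(n-1)/2 possible links; `group_density` and the sample data are hypothetical.

```python
from itertools import combinations


def group_density(members, friendships):
    """Ratio of existing friendship links to possible links inside a group."""
    members = list(members)
    possible = len(members) * (len(members) - 1) // 2
    if possible == 0:
        return 0.0
    existing = sum(
        1 for u, v in combinations(members, 2) if frozenset((u, v)) in friendships
    )
    return existing / possible


members = ["u1", "u2", "u3", "u4", "u5"]
# Undirected links stored as frozensets so that (u, v) and (v, u) coincide.
friendships = {frozenset(p) for p in [("u1", "u2"), ("u2", "u3"), ("u4", "u5")]}

density = group_density(members, friendships)  # 3 existing links out of 10 possible
print(density)
```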

Attack strategy

1. Discover groups around the target

2. Disclose links between users

- Hidden links can be disclosed using allowed requests.
- Online attacks succeed on more than 50% of targets that mask their links.
- A user's privacy is in the hands of her groups and friends.

Attribute inference attacks: Cleanser

Attribute representation

1. Large number of attributes → select the most relevant ones by comparing their graph structure to the sensitive attribute graph
2. Large number of attribute values → cluster similar values

Reducing the number of attribute values by clustering

- Maximize similarity between values inside each cluster.
  - $\text{similarity}(v_1, v_2) = \frac{|\text{likes}(v_1) \cap \text{likes}(v_2)|}{|\text{likes}(v_1) \cup \text{likes}(v_2)|}$
- Clusters must be of similar size.
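
The similarity between two attribute values is a Jaccard index over the sets of users who like them. The sketch below only illustrates that measure; it assumes `likes(v)` is the set of users liking value `v`, and the page names and users are invented. The clustering procedure itself is not specified on the slide and is not reproduced here.

```python
def jaccard(a, b):
    """Jaccard index of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Invented data: attribute value -> users who like it.
likes = {
    "page_A": {"u1", "u2", "u3"},
    "page_B": {"u2", "u3", "u4"},
    "page_C": {"u5"},
}

sim_ab = jaccard(likes["page_A"], likes["page_B"])  # 2 shared users out of 4
sim_ac = jaccard(likes["page_A"], likes["page_C"])  # no shared user
print(sim_ab, sim_ac)
```

Under this measure `page_A` and `page_B` would be candidates for the same cluster, while `page_C` would not.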

Random walker: Transforms the graph model into a text model

Figure: Example of multi graph random walk.
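
A minimal sketch of how random walks can turn a graph into a text corpus, in the spirit of the random walker described above. It assumes uniform transitions between neighbours, which the slides do not specify; the node labels, edges and function names are all hypothetical.

```python
import random

# Invented multi-graph: users linked to attribute values and to each other.
EDGES = [
    ("user:u1", "music:rock"),
    ("user:u2", "music:rock"),
    ("user:u2", "politician:p1"),
    ("user:u3", "politician:p1"),
    ("user:u1", "user:u3"),  # friendship link
]


def build_adjacency(edges):
    """Undirected adjacency lists built from an edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    return adj


def random_walk(adj, start, length, rng):
    """One walk of `length` nodes; each step picks a neighbour uniformly."""
    walk = [start]
    while len(walk) < length:
        walk.append(rng.choice(adj[walk[-1]]))
    return walk


def generate_corpus(adj, walks_per_node, length, seed=0):
    """Each walk becomes one 'sentence' whose 'words' are node labels."""
    rng = random.Random(seed)
    return [
        random_walk(adj, node, length, rng)
        for node in sorted(adj)
        for _ in range(walks_per_node)
    ]


adj = build_adjacency(EDGES)
corpus = generate_corpus(adj, walks_per_node=2, length=5)
print(len(corpus), corpus[0])
```

Each sentence mixes user nodes and attribute-value nodes, so a word-embedding model trained on the corpus places users and values in the same vector space.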

Word2Vec: Computes vectors for sensitive values and users

Ranker: Ranks sensitive values w.r.t. target user closeness

- Uses cosine similarity to rank the values of the sensitive attribute.
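
The ranking step can be sketched as follows, assuming the target user and each sensitive value already have embedding vectors. The two-dimensional vectors and the names `cosine` and `rank_values` are hypothetical and chosen only to keep the example readable.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_values(target_vec, value_vecs):
    """Sensitive values sorted from closest to farthest from the target."""
    return sorted(
        value_vecs,
        key=lambda name: cosine(target_vec, value_vecs[name]),
        reverse=True,
    )


target = [1.0, 0.0]
values = {
    "value_a": [0.9, 0.1],   # nearly aligned with the target
    "value_b": [0.0, 1.0],   # orthogonal
    "value_c": [-1.0, 0.0],  # opposite direction
}
ranking = rank_values(target, values)
print(ranking)
```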

Experiments

For each dataset we:

- select 10% of profiles that publish their preferences concerning the sensitive attribute and at least one other attribute,
- remove all the sensitive preferences of the selected profiles.

The experiments consist in inferring the removed preferences by analysing the modified dataset.
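
The hold-out protocol above can be sketched as follows. This is an illustration under stated assumptions: the 10% is taken among eligible profiles, selection is uniformly random, and profiles are plain dictionaries mapping attribute names to sets of values. `hold_out` and the sample data are hypothetical.

```python
import random


def hold_out(profiles, sensitive, fraction=0.10, seed=0):
    """Hide the sensitive attribute of a random fraction of eligible profiles.

    A profile is eligible if it publishes the sensitive attribute and at
    least one other attribute. Returns (modified_profiles, removed_values).
    """
    eligible = sorted(
        uid
        for uid, attrs in profiles.items()
        if attrs.get(sensitive)
        and any(vals for name, vals in attrs.items() if name != sensitive)
    )
    rng = random.Random(seed)
    count = round(fraction * len(eligible))
    targets = set(rng.sample(eligible, count))

    modified, removed = {}, {}
    for uid, attrs in profiles.items():
        if uid in targets:
            removed[uid] = attrs[sensitive]  # ground truth kept for evaluation
            modified[uid] = {n: v for n, v in attrs.items() if n != sensitive}
        else:
            modified[uid] = dict(attrs)
    return modified, removed


# Invented data: 20 eligible profiles plus one that hides the sensitive attribute.
profiles = {
    f"u{i}": {"politician": {f"p{i % 3}"}, "music": {"rock"}} for i in range(20)
}
profiles["u20"] = {"music": {"jazz"}}

modified, removed = hold_out(profiles, "politician")
print(sorted(removed))
```

The inference pipeline would then be run on `modified`, and its predictions compared against `removed`.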

Results

- Processing times (in seconds):

  Inference     Dataset   Cleansing   Random walk   Word2Vec   Ranking
  Gender        D1        423         34            50         0.12
  Gender        D2        243         25            30         0.12
  Politicians   D1        782         523           924        1
  Politicians   D2        574         451           733        1

- Politicians:

  Selection method                          Accuracy in AUC   # targets       # deleted preferences
  Correlation-based selection (23 graphs)   0.79              252             409
  Random selection (23 graphs)              0.41              204 (average)   351 (average)

- Relationship status: the accuracy is higher than 0.7 AUC if the target publishes his preferences concerning at least 4 attributes.
- Gender: the accuracy is higher than 0.83 AUC in dataset D1 and higher than 0.67 AUC in dataset D2 if the target publishes his preferences concerning at least 2 attributes.
- Friendship is highly correlated to political preferences; however, it is not correlated to gender or relationship status.

Conclusions

- SONSAI enables users to predict sensitive links/attributes in social networks from a relatively small amount of data and limited computing resources.
- Correlated attributes are automatically brought to light by SONSAI without using any semantic information.

Perspectives

- Semantic information is useful for processing data. We have noticed that:
  - Some types of attributes are semantically very close and can be clustered.
- The architecture of SONSAI can be extended to a 3-tier architecture for better privacy:
  1. Data layer: collects and anonymises data
  2. Process layer: cleanses data and carries out attacks
  3. Presentation layer: displays results for the identified user

Defining sensitive subjects

- All information left unpublished by a given user is sensitive for him [Ryu et al., 2013, Vidyalakshmi et al., 2016].
- Google: “confidential medical facts, racial or ethnic origins, political or religious beliefs or sexuality”
- French law of January 6, 1978: “racial or ethnic origin, political, philosophical or religious opinions, trade union membership, health and sex life”

Our definition

- We have analysed 18 subjects through a survey.
- A sensitive subject meets at least 2 of the following criteria:
  - Low discussion frequency
  - High anonymity frequency
  - High avoidance frequency
- Statistics related to the sensitive subjects (in %):

  Subject    Discussions on    Discussions on        Anonymity   Avoidance
             social networks   forums and websites
  Money      0.94              54.42                 25          10.14
  Religion   5.63              26.05                 14.28       33.33
  Shopping   1.88              66.05                 9.09        0
  Dating     5.16              24.19                 0           21.74
  Health     17.37             66.05                 9.09        5.8
  Politics   25.82             54.42                 25          50.72

Crawling criteria for sampling:

- Node type (page, group, profile)
- Node distance from target
- Number of paths between node and target

1. Selecting relevant attributes w.r.t. a sensitive attribute

- Transform bipartite graphs into unipartite weighted graphs using the Jaccard index: $\text{weight}(u_1, u_2) = \frac{|\text{likedby}(u_1) \cap \text{likedby}(u_2)|}{|\text{likedby}(u_1) \cup \text{likedby}(u_2)|}$
- Compare unipartite weighted graphs using Hamming distance
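
The two steps above can be sketched on toy data. Two assumptions are made here that the slide does not state: `likedby(u)` is read as the set of values liked by user `u`, and the Hamming distance is computed after binarising edge weights at a threshold. All names and data are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def project(liked_by, users):
    """Unipartite weighted graph: user pair -> Jaccard weight of liked values."""
    return {
        (u, v): jaccard(liked_by.get(u, set()), liked_by.get(v, set()))
        for u, v in combinations(users, 2)
    }


def hamming(g1, g2, threshold=0.5):
    """Assumption: binarise edges at `threshold` and count differing pairs."""
    return sum((g1[p] >= threshold) != (g2[p] >= threshold) for p in g1)


users = ["u1", "u2", "u3"]
politicians = {"u1": {"p1"}, "u2": {"p1"}, "u3": {"p2"}}  # sensitive attribute
music = {"u1": {"rock"}, "u2": {"rock"}, "u3": {"jazz"}}
sports = {"u1": {"golf"}, "u2": {"tennis"}, "u3": {"golf"}}

sensitive_graph = project(politicians, users)
distances = {
    name: hamming(sensitive_graph, project(graph, users))
    for name, graph in {"music": music, "sports": sports}.items()
}
print(distances)
```

In this invented example `music` has the same structure as the sensitive attribute and would be kept as relevant, while `sports` differs on two user pairs.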

Hyper-parameters:

- Continuous bag-of-words (CBOW) algorithm
- Context window is limited to 3
- Vector dimension of nodes:
  - 128 for attributes with a small number of values (gender, relationship status, ...)
  - 512 for attributes with a huge number of values (politicians, health, ...)
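
These settings can be written down as a small configuration helper. The slides do not say which Word2Vec implementation is used; the keyword names below (`sg`, `window`, `vector_size`) follow gensim 4.x and are an assumption, as is the hypothetical helper `word2vec_params`. The block builds the settings only and does not import gensim.

```python
def word2vec_params(many_values):
    """Keyword arguments mirroring the hyper-parameters listed above.

    `many_values` is True for attributes with a huge number of values
    (e.g. politicians) and False for small ones (e.g. gender).
    """
    return {
        "sg": 0,  # in gensim, 0 selects CBOW and 1 selects skip-gram
        "window": 3,  # context window limited to 3
        "vector_size": 512 if many_values else 128,
    }


gender_params = word2vec_params(many_values=False)
politician_params = word2vec_params(many_values=True)
print(gender_params, politician_params)
```

With gensim installed, such a dictionary could be passed as `Word2Vec(sentences=corpus, **params)`, where `corpus` is a list of random-walk sentences.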

Details about dataset 1:

  # Crawled profiles      15 012
  # Discovered profiles   3 353 590
  # Pages                 1 022 847
  # Types of pages        1 926
  # Groups                135 381
  # Relationship status   11

Details about dataset 2:

  # Crawled profiles      6 550
  # Discovered profiles   1 010 966
  # Pages                 298 604
  # Types of pages        1 293
  # Groups                29 062
  # Relationship status   11

Estivill-Castro, V., Hough, P., and Islam, M. Z. (2014). Empowering users of social networks to assess their privacy risks. In 2014 IEEE International Conference on Big Data, Big Data 2014, Washington, DC, USA, October 27-30, 2014, pages 644–649.

Heatherly, R., Kantarcioglu, M., and Thuraisingham, B. (2013). Preventing private information inference attacks on social networks. IEEE Transactions on Knowledge and Data Engineering, 25(8):1849–1862.

Perozzi, B., Al-Rfou, R., and Skiena, S. (2014). Deepwalk: online learning of social representations. In The 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’14, New York, NY, USA - August 24 - 27, 2014, pages 701–710.

Ryu, E., Rong, Y., Li, J., and Machanavajjhala, A. (2013). curso: protect yourself from curse of attribute inference: a social network privacy-analyzer. In Proceedings of the 3rd ACM SIGMOD Workshop on Databases and Social Networks, DBSocial 2013, New York, NY, USA, June, 23, 2013, pages 13–18.

Vidyalakshmi, B. S., Wong, R. K., and Chi, C. (2016). User attribute inference in directed social networks as a service. In IEEE International Conference on Services Computing, SCC 2016, San Francisco, CA, USA, June 27 - July 2, 2016, pages 9–16.