Privacy Vulnerability of Published Anonymous Mobility Traces

Chris Y. T. Ma, David K. Y. Yau, Nung Kwan Yip (Purdue University), Nageswara S. V. Rao (Oak Ridge National Laboratory)

Post on 14-Jan-2016

Page 1: Privacy Vulnerability of Published Anonymous Mobility Traces

Privacy Vulnerability of Published Anonymous Mobility Traces

Chris Y. T. Ma, David K. Y. Yau, Nung Kwan Yip (Purdue University)

Nageswara S. V. Rao (Oak Ridge National Laboratory)

Page 2: Privacy Vulnerability of Published Anonymous Mobility Traces

Motivation: Collecting mobility traces

Mobile network applications
◦ traffic monitoring, road surface sensing, radiation and chemical detection

Mobility traces are collected and published to assist the design, analysis, and evaluation of mobile networks
◦ E.g., CRAWDAD

Page 3: Privacy Vulnerability of Published Anonymous Mobility Traces

Motivation: Privacy vulnerability

Measures are carried out to protect the privacy of the participants
◦ Traces are identified using a random but consistent and unique identifier that is not correlated to the real ID
◦ Spatial and temporal granularities are reduced

<11:32:12, Chris Ma, (41.89840,-87.61999)>

<11:30~11:35, ID-271, (41.89~41.90,-87.62~-87.61)>
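The two records above can be related by a small sketch. This is not the authors' tooling: the class name `TraceAnonymizer`, the 0.01-degree cell size, and the 5-minute slot are assumptions chosen to mirror the example record, and the published ID format is hypothetical.

```python
import math
import secrets


class TraceAnonymizer:
    """Pseudonymize IDs and coarsen time/location before publication (illustrative)."""

    def __init__(self, cell_deg=0.01, slot_s=300):
        self.cell_deg = cell_deg   # assumed spatial granularity, in degrees
        self.slot_s = slot_s       # assumed temporal granularity, in seconds
        self._pseudonyms = {}      # real ID -> published ID

    def pseudonym(self, real_id):
        # Random (uncorrelated with the real ID), but consistent and unique.
        if real_id not in self._pseudonyms:
            used = set(self._pseudonyms.values())
            new_id = "ID-" + secrets.token_hex(4)
            while new_id in used:
                new_id = "ID-" + secrets.token_hex(4)
            self._pseudonyms[real_id] = new_id
        return self._pseudonyms[real_id]

    def publish(self, t_seconds, real_id, lat, lon):
        # Replace the exact time and position by the index of the slot/cell containing them.
        slot = t_seconds // self.slot_s
        cell = (math.floor(lat / self.cell_deg), math.floor(lon / self.cell_deg))
        return slot, self.pseudonym(real_id), cell


anonymizer = TraceAnonymizer()
t = 11 * 3600 + 32 * 60 + 12          # 11:32:12 as seconds since midnight
record = anonymizer.publish(t, "Chris Ma", 41.89840, -87.61999)
```

Here `record[0]` is the index of the 11:30~11:35 slot, and `record[2]` is the index pair of the cell covering latitude 41.89~41.90 and longitude -87.62~-87.61.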

Page 4: Privacy Vulnerability of Published Anonymous Mobility Traces

Motivation: Privacy vulnerability

These measures are not enough!
◦ Participants can be openly observed
◦ Participants may leak their location information (snapshots of time and location pairs, termed side information)
   - web blogs, status in social networks, tweets, casual conversations, etc.

An adversary who tries to identify the complete trace (movement history) of one or more participants may succeed with high probability

Page 5: Privacy Vulnerability of Published Anonymous Mobility Traces

Our contributions

Comprehensive study of attack strategies
◦ Various ways for side information collection
◦ Analytically proved the optimality of attack strategy
◦ Quantitative simulation results

Privacy implications of characteristics of real traces and synthetic traces
◦ Synthetic nodes are more sparsely placed
   - More easily identified but more difficult to meet with

Page 6: Privacy Vulnerability of Published Anonymous Mobility Traces

Agenda

Problem formulation

Analytical derivation

Experimental analysis

Conclusion

Page 7: Privacy Vulnerability of Published Anonymous Mobility Traces

Problem formulation - trace sampling and publication

<t, R.B., (x,y)> <t’, IDi, (x’,y’)>

Page 8: Privacy Vulnerability of Published Anonymous Mobility Traces

Problem formulation

An adversary tries to identify the complete movement history of the participant(s)
◦ collects side information and compares with the published traces

Possible attack scenarios
◦ Adversary infers the location of a victim indirectly (passive adversary)
◦ Adversary observes the movement of the victims physically (active adversary)

Page 9: Privacy Vulnerability of Published Anonymous Mobility Traces

Passive Adversary - infers snapshots of victim

Special case: reference times are sampling times

Page 10: Privacy Vulnerability of Published Anonymous Mobility Traces

Passive Adversary - infers snapshots of victim

General case: reference times are not sampling times

Page 11: Privacy Vulnerability of Published Anonymous Mobility Traces

Passive Adversary - infers snapshots of victim

General case: reference times are not sampling times

Infers the possible location of the node at reference times using a general mobility model (preference of the nodes, physical constraints)
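As a rough illustration of inferring a location at a reference time that falls between two sampling times, the sketch below swaps in a deliberately simple stand-in for the "general mobility model": straight-line, constant-speed movement between samples, plus a maximum-speed reachability check as the physical constraint. The function names and the parameter `v_max` are assumptions for illustration only.

```python
import math


def interpolate(sample_a, sample_b, t_ref):
    """Point estimate at t_ref, assuming straight-line constant-speed movement.

    Each sample is a (time, x, y) tuple with sample_a's time <= t_ref <= sample_b's time.
    """
    (ta, xa, ya), (tb, xb, yb) = sample_a, sample_b
    if not ta <= t_ref <= tb:
        raise ValueError("t_ref must lie between the two sampling times")
    if tb == ta:
        return (xa, ya)
    w = (t_ref - ta) / (tb - ta)
    return (xa + w * (xb - xa), ya + w * (yb - ya))


def is_reachable(sample_a, sample_b, t_ref, point, v_max):
    """Physical constraint: can the node be at `point` at t_ref without exceeding v_max?"""
    (ta, xa, ya), (tb, xb, yb) = sample_a, sample_b
    px, py = point
    from_a = math.hypot(px - xa, py - ya) <= v_max * (t_ref - ta)
    to_b = math.hypot(xb - px, yb - py) <= v_max * (tb - t_ref)
    return from_a and to_b


a, b = (0, 0.0, 0.0), (10, 10.0, 0.0)
estimate = interpolate(a, b, 4)
plausible = is_reachable(a, b, 4, estimate, v_max=2.0)
implausible = is_reachable(a, b, 4, (4.0, 9.0), v_max=1.0)
```

A model that also captures node preferences (for example, a transition matrix over cells) would replace `interpolate` with a distribution over locations rather than a single point.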

Page 12: Privacy Vulnerability of Published Anonymous Mobility Traces

Passive Adversary - infers snapshots of victim

General case: reference times are not sampling times

Infers the possible location of the node at reference times using a general mobility model

Page 13: Privacy Vulnerability of Published Anonymous Mobility Traces

Passive Adversary - infers snapshots of victim

General case: reference times are not sampling times

Page 14: Privacy Vulnerability of Published Anonymous Mobility Traces

Attack approaches of passive adversary

Use of Bayesian approach to determine the trace that gives the best match with the inferred location information

[Figure: published traces and noisy side information]

Page 15: Privacy Vulnerability of Published Anonymous Mobility Traces

Attack approaches of passive adversary

For the special case (reference time = sampling time), with the assumption that noise is i.i.d.,

For the general case, with the assumptions that noise is i.i.d. and movement is Markovian,
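The expressions that followed these two statements are not present in this transcript. As a hedged sketch of what the special case typically leads to, and not a recovery of the slide's formula, assume N published traces T_1, ..., T_N with a uniform prior, let x_i(t_k) be the location of trace T_i at sampling time t_k, let s_k be the side-information location at t_k, and let f be the density of the i.i.d. noise. All of these symbols are assumptions. Bayes' rule then gives:

```latex
P(T_i \mid s_1, \dots, s_n)
  = \frac{\prod_{k=1}^{n} f\bigl(s_k - x_i(t_k)\bigr)}
         {\sum_{j=1}^{N} \prod_{k=1}^{n} f\bigl(s_k - x_j(t_k)\bigr)}
```

The adversary would then pick the trace T_i that maximizes this posterior. The denominator is the same for every i, so only the product in the numerator matters for the choice.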

Page 16: Privacy Vulnerability of Published Anonymous Mobility Traces

Attack approaches of passive adversary

Maximum Likelihood Estimator (MLE) approach

Minimum Square (MSQ) approach

Basic (BAS) approach

Weighted Exponential (EXP) approach

• When noise is Gaussian, MLE and MSQ are equivalent

[Figures: three plots with axis label "Distance" and origin 0]
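The equivalence of MLE and MSQ under Gaussian noise can be checked numerically. The sketch below is illustrative only: the data layout (a dict from time to an (x, y) position), the isotropic 2-D Gaussian noise model with standard deviation `sigma`, and all function names are assumptions. BAS and EXP are not sketched, because the slides do not define them.

```python
import math


def sum_sq_dist(trace, side_info):
    """Sum of squared distances between a trace and the side information."""
    return sum(
        (trace[t][0] - x) ** 2 + (trace[t][1] - y) ** 2
        for t, (x, y) in side_info.items()
    )


def gaussian_log_lik(trace, side_info, sigma):
    """Log-likelihood of the side information under i.i.d. isotropic 2-D Gaussian noise."""
    n = len(side_info)
    return -n * math.log(2 * math.pi * sigma ** 2) - sum_sq_dist(trace, side_info) / (
        2 * sigma ** 2
    )


def msq_attack(traces, side_info):
    # MSQ: pick the trace with the minimum sum of squared distances.
    return min(traces, key=lambda tid: sum_sq_dist(traces[tid], side_info))


def mle_attack(traces, side_info, sigma):
    # MLE: pick the trace with the maximum likelihood under the assumed noise model.
    return max(traces, key=lambda tid: gaussian_log_lik(traces[tid], side_info, sigma))


traces = {
    "A": {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0)},
    "B": {0: (0.0, 1.0), 1: (1.0, 2.0), 2: (2.0, 3.0)},
    "C": {0: (5.0, 5.0), 1: (5.0, 4.0), 2: (5.0, 3.0)},
}
side_info = {0: (0.2, 0.9), 2: (1.8, 2.6)}   # noisy snapshots of the victim

msq_pick = msq_attack(traces, side_info)
mle_pick = mle_attack(traces, side_info, sigma=1.0)
```

The Gaussian log-likelihood is a constant minus the squared-distance sum divided by 2·sigma², so maximizing one is the same as minimizing the other for any sigma. MSQ, unlike MLE, never needs to know sigma.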

Page 17: Privacy Vulnerability of Published Anonymous Mobility Traces

Active Adversary - observes victims physically

Adversary is one of the participants

Page 18: Privacy Vulnerability of Published Anonymous Mobility Traces

Active Adversary - observes victims physically

Adversary stays at a (popular) position

Page 19: Privacy Vulnerability of Published Anonymous Mobility Traces

Active Adversary - observes victims physically

Adversary travels between popular locations

Page 20: Privacy Vulnerability of Published Anonymous Mobility Traces

Problem formulation

Why the two different cases?
◦ Active
   - Needs to consider how to collect the side information physically as time evolves
   - Adversary tries to identify as many victims as possible: plot of k-anonymity as function of time
◦ Passive
   - Snapshots of victim are inferred (not collected) and less accurate in general
   - Adversary tries to identify one victim only: plot of correctness as function of pieces of side information

Page 21: Privacy Vulnerability of Published Anonymous Mobility Traces

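Attack strategy of active adversary

Algorithm of the attack (in action)

[Figure: candidate trace IDs for each real ID, shown in three successive states along a time axis marked t1, t2]

real ID   trace IDs
1         A, B, C
2         A, B, C
3         A, B, C

real ID   trace IDs
1         A, B, C
2         A, B, C
3         A, B, C

real ID   trace IDs
1         A, B
2         A, B
3         A, B, C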
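The candidate sets in this figure can be reproduced with a small elimination loop. This is a sketch under assumptions, not the authors' algorithm: the cell labels, the `published_cells` layout, and the function names are hypothetical, and only positive sightings are used (a victim seen in a cell keeps only the traces that are in that cell at that time).

```python
def observe(candidates, published_cells, seen_real_ids, cell, t):
    """Narrow the candidate trace IDs of every victim seen in `cell` at time `t`."""
    present = {tid for tid, cells in published_cells.items() if cells.get(t) == cell}
    for rid in seen_real_ids:
        candidates[rid] &= present
    return candidates


def k_anonymity(candidates):
    """Number of traces each victim is still indistinguishable among."""
    return {rid: len(tids) for rid, tids in candidates.items()}


# Published (anonymous) traces: trace ID -> {time step: cell label}.
published_cells = {
    "A": {1: "X", 2: "Y"},
    "B": {1: "X", 2: "Z"},
    "C": {1: "W", 2: "Z"},
}

# Initially every real ID could correspond to any published trace.
candidates = {rid: set(published_cells) for rid in (1, 2, 3)}

# At t = 1 the adversary is in cell X and sees participants 1 and 2 there.
observe(candidates, published_cells, seen_real_ids=[1, 2], cell="X", t=1)
anonymity = k_anonymity(candidates)
```

After this single observation, participants 1 and 2 are narrowed to {A, B} while participant 3 still has {A, B, C}, which matches the last state listed above. A fuller version could also propagate eliminations: since traces A and B must belong to participants 1 and 2, participant 3 can only be C.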

Page 22: Privacy Vulnerability of Published Anonymous Mobility Traces

Experimental analysis

Basic information
◦ Real traces
   - 536 San Francisco taxicabs
   - 2348 Shanghai Grid buses
◦ Synthetic traces
   - Using map size and average speed computed from taxi cab traces
   - Random waypoint (with different maximum trip lengths)
   - Random walk
◦ Spatial granularity = 1 km
◦ Temporal granularity = 1 minute (unless stated otherwise)
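For readers unfamiliar with the synthetic models, here is a minimal random waypoint generator with a bounded trip length, mapped onto 1 km cells at one sample per time step. It is a sketch, not the generator used in the study: the map size, speed, trip length, and all names below are placeholder assumptions.

```python
import math
import random


def random_waypoint(steps, size_km, speed_km_per_step, max_trip_km, rng):
    """Generate steps + 1 positions on a square map of side size_km."""
    x, y = rng.uniform(0, size_km), rng.uniform(0, size_km)  # random initial location

    def pick_waypoint(cx, cy):
        # Rejection-sample a destination within max_trip_km that lies inside the map.
        while True:
            r = rng.uniform(0, max_trip_km)
            a = rng.uniform(0, 2 * math.pi)
            wx, wy = cx + r * math.cos(a), cy + r * math.sin(a)
            if 0 <= wx <= size_km and 0 <= wy <= size_km:
                return wx, wy

    wx, wy = pick_waypoint(x, y)
    positions = [(x, y)]
    for _ in range(steps):
        d = math.hypot(wx - x, wy - y)
        if d <= speed_km_per_step:
            # Waypoint reached within this step: stop there and choose the next one.
            x, y = wx, wy
            wx, wy = pick_waypoint(x, y)
        else:
            x += speed_km_per_step * (wx - x) / d
            y += speed_km_per_step * (wy - y) / d
        positions.append((x, y))
    return positions


def to_cells(positions, cell_km=1.0):
    """Reduce spatial granularity to square cells of side cell_km."""
    return [(int(px // cell_km), int(py // cell_km)) for px, py in positions]


trace = random_waypoint(
    steps=60, size_km=10.0, speed_km_per_step=0.5, max_trip_km=3.0, rng=random.Random(0)
)
cells = to_cells(trace)
```

Lowering `max_trip_km` keeps each node near its random starting point, which is one way to see why such traces end up far from each other.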

Page 23: Privacy Vulnerability of Published Anonymous Mobility Traces

Characteristics of the traces

Distance between traces

Real traces are closer to each other on average
◦ Bus traces have a broader range

For synthetic traces, the shorter the trip length, the further away they are from each other in general
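The slides do not define "distance between traces". One plausible reading, used here purely as an assumption, is the mean Euclidean distance between two traces over their common sampling times, averaged over all pairs of traces.

```python
import math
from itertools import combinations


def trace_distance(trace_a, trace_b):
    """Mean Euclidean distance over the sampling times both traces share."""
    common = sorted(set(trace_a) & set(trace_b))
    if not common:
        raise ValueError("traces share no sampling times")
    return sum(math.dist(trace_a[t], trace_b[t]) for t in common) / len(common)


def mean_pairwise_distance(traces):
    """Average of trace_distance over all unordered pairs of traces."""
    pairs = list(combinations(traces, 2))
    return sum(trace_distance(traces[a], traces[b]) for a, b in pairs) / len(pairs)


# Toy data: traces map time step -> (x, y) in km.
clustered = {
    "a": {0: (0.0, 0.0), 1: (1.0, 0.0)},
    "b": {0: (0.0, 1.0), 1: (1.0, 1.0)},
}
spread = {
    "a": {0: (0.0, 0.0), 1: (1.0, 0.0)},
    "b": {0: (0.0, 5.0), 1: (1.0, 5.0)},
}
d_clustered = mean_pairwise_distance(clustered)
d_spread = mean_pairwise_distance(spread)
```

Under this measure, a set of traces with a smaller value has more nodes travelling near each other, and therefore more traces that could plausibly be confused with one another.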

Page 24: Privacy Vulnerability of Published Anonymous Mobility Traces

Significant observations

• Lack of preferred locations and random initial location of the synthetic traces
   – Nodes are more sparsely distributed in the network
• Implications:
   – For adversary in general
      • Can easily identify the trace of a synthetic node since no other traces share similar path
   – For active adversary
      • May take longer time to meet with each synthetic node

Page 25: Privacy Vulnerability of Published Anonymous Mobility Traces

Attack performance

Passive adversary (special case)

Special case: side information inferred at sampling times of traces

Correct assumption of noise (Gaussian)

Cab traces

Observations
◦ MLE, MSQ perform equally well
◦ BAS gives the fewest wrong conclusions initially

Page 26: Privacy Vulnerability of Published Anonymous Mobility Traces

Attack performance

Passive adversary (special case)

Random waypoint traces

Most efficient attack
◦ traces have very different paths

Page 27: Privacy Vulnerability of Published Anonymous Mobility Traces

Attack performance

Passive adversary (special case)

Incorrect assumption of noise
◦ Assumption: Uniform
◦ Actual: Gaussian

Cab traces

Observations
◦ MLE performs much worse

Page 28: Privacy Vulnerability of Published Anonymous Mobility Traces

Attack performance

Passive adversary (general case)

General case – side information at times different from trace sampling times

Worst case scenario – all times are different

Infer the location of the victim using the mobility model

Gaussian noise (no noise as best performance bound)

Cab traces

Page 29: Privacy Vulnerability of Published Anonymous Mobility Traces

Summary

Passive adversary

For passive adversary
◦ MLE and MSQ give the best performance among the four approaches in terms of the fraction of correct conclusions
◦ Since MLE relies on knowledge of the type of noise and its magnitude, MSQ is the preferred, more robust attack approach

Page 30: Privacy Vulnerability of Published Anonymous Mobility Traces

Attack performance

Active adversary as one of mobile nodes

Higher attack efficiency for real traces
◦ Mobile nodes more likely to visit the same set of locations at the same time
◦ Synthetic nodes more sparsely distributed in the network

1 time step = 1 minute

Page 31: Privacy Vulnerability of Published Anonymous Mobility Traces

Attack performance

Active adversary who stays at one of the cells

[Figure panels: cabs, buses, random waypoint, random walk]

Observations
◦ Comparing real traces and synthetic traces
   - Attacks on real traces are more efficient: k-anonymity drops more quickly
◦ Popular cells in real traces and random waypoint traces are more aggregated together
◦ Being at a popular cell does not necessarily result in higher attack efficiency

Page 32: Privacy Vulnerability of Published Anonymous Mobility Traces

Attack performance

Active adversary who moves among popular cells

[Figure panels: cabs, buses, random waypoint, random walk]

The ability to move among popular cells improves attack efficiency
◦ Improvement is more significant if node movements are more localized
◦ Visiting more cells does not necessarily improve efficiency

Page 33: Privacy Vulnerability of Published Anonymous Mobility Traces

Conclusion

Study how privacy leaks through trace publication
◦ Under different adversary strategies to collect side information
◦ Using different mobile traces with different characteristics

Experimentally show that the adversary is able to identify the trace of a victim from the published set with high probability