privacy preserving log file processing in mobile network environment
1 © Nokia Solutions and Networks 2014
Privacy Preserving Log File Processing in Mobile Network Environment
Shankar Lal
16-06-2015
Presentation outline
•Introduction and background review
•Cases of privacy breach
•Statistical analysis over Network trace
•Continuous fields anonymisation through Differential Privacy
•Discrete fields anonymisation through ℓ-diversity
•IP address anonymisation
•Future work and conclusion
Introduction and Objectives of this work
Background review
•Data Privacy
•Need for privacy in user data
•Sharing of network trace data
•Tradeoff between data utility and data privacy
Privacy Laws
•PII (Personally Identifiable Information): US privacy law
•Personal Data: EU Data Protection Directive
IP address as Personal Data
•Arguments on both sides
•The EU considers it personal data (the UK is an exception)
•The US considers it non-personal
Cases of privacy breach from anonymised data sets
There's No Such Thing As An Anonymized Dataset
Netflix anonymous data set and user privacy breach
AOL anonymous data set of user queries
Identification of the medical record of a former governor of Massachusetts
William Weld, former governor of Massachusetts
Statistical analysis over Network traces
Sample of a Network Log file
Statistical Analysis on network trace I
Statistical Analysis on network trace II
Most used protocols
Most used packet lengths
Source and destination IP class count
IP class packet length distribution
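The counts behind plots such as "most used protocols" and "most used packet lengths" can be sketched with a frequency count. This is a minimal illustration only: the record layout and the field names `protocol` and `length` are assumptions, not taken from the trace used in the talk.

```python
from collections import Counter

# Hypothetical parsed trace records; the field names are assumptions.
records = [
    {"protocol": "TCP", "length": 60},
    {"protocol": "TCP", "length": 1414},
    {"protocol": "DNS", "length": 92},
    {"protocol": "TCP", "length": 60},
    {"protocol": "UDP", "length": 404},
]

def most_common_values(rows, field, n=3):
    """Return the n most frequent values of `field` with their counts."""
    return Counter(row[field] for row in rows).most_common(n)

top_protocols = most_common_values(records, "protocol")
top_lengths = most_common_values(records, "length")
```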
Functional dependencies between fields
Why are the packet length and timestamp fields sensitive?
Certain security incidents have a fixed packet length:
•Slammer worm: 404 bytes
•Nachi worm: 92 bytes
A timestamp together with an IP address reveals that communication took place between the parties.
Privatizing network trace
Privacy Enhancing Technologies (PETs)
•Hashing
•Encryption
•Randomization and Tokenization
•k-anonymity
New Inclusions:
•Differential Privacy
•ℓ-diversity
k-anonymity
Main idea:
•Generalization
•Suppression
•Perturbation
Example on network data set
Sample Data set 2-anonymous data set
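A data set is k-anonymous when every combination of quasi-identifier values is shared by at least k records. The check below is a minimal sketch of that property. The rows and the field names (`src_ip`, `protocol`, `length`) are hypothetical and do not reproduce the sample table from the slide.

```python
from collections import Counter

def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs in at least k rows."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())

# Hypothetical trace after generalising the source address (last octet masked).
rows = [
    {"src_ip": "10.0.0.*", "protocol": "TCP", "length": 60},
    {"src_ip": "10.0.0.*", "protocol": "TCP", "length": 1414},
    {"src_ip": "10.0.1.*", "protocol": "UDP", "length": 92},
    {"src_ip": "10.0.1.*", "protocol": "UDP", "length": 404},
]

two_anonymous = is_k_anonymous(rows, ["src_ip", "protocol"], k=2)
three_anonymous = is_k_anonymous(rows, ["src_ip", "protocol"], k=3)
```

Here each (source address, protocol) group holds two records, so the rows are 2-anonymous but not 3-anonymous.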
Differential Privacy: Anonymisation of continuous fields
Differential Privacy
A differentially private algorithm guarantees that the probability that data set D1 produces output C is very close to the probability that data set D2 produces the same output, where D1 and D2 differ in a single record.
Laplace noise calculation:
Scale parameter b = Δf / ϵ
Mean μ = 0
Δf = sensitivity of the function
ϵ = privacy parameter
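For reference, the usual formal statement of this guarantee and the Laplace density it relies on can be written as follows. The symbol M for the randomised algorithm is introduced here for illustration and does not appear on the slide. The inequality is required for every set of outputs C and every pair of data sets D1, D2 that differ in a single record.

```latex
\Pr[M(D_1) \in C] \;\le\; e^{\epsilon} \, \Pr[M(D_2) \in C]

\mathrm{Lap}(x \mid \mu, b) = \frac{1}{2b} \exp\!\left(-\frac{|x - \mu|}{b}\right),
\qquad b = \frac{\Delta f}{\epsilon}, \quad \mu = 0
```

A smaller ϵ gives a larger scale b, so the noise is wider and the two probabilities are forced closer together.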
Probability density plots of Laplace distributions
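The Laplace noise described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the implementation used in the talk: noise is added to each record value independently, the sensitivity value is a hypothetical choice, and the helper names `laplace_noise` and `privatise` are made up for this example.

```python
import random

def laplace_noise(scale, rng):
    """Draw one zero-mean Laplace sample with the given scale b.

    The difference of two independent exponential variables with mean b
    follows a Laplace(0, b) distribution.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def privatise(values, sensitivity, epsilon, seed=None):
    """Add Laplace noise with scale b = sensitivity / epsilon to each value."""
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    rng = random.Random(seed)
    b = sensitivity / epsilon
    return [v + laplace_noise(b, rng) for v in values]

# Hypothetical packet lengths; sensitivity=1.0 is an assumption.
lengths = [60, 1414, 92, 404]
noisy_strong = privatise(lengths, sensitivity=1.0, epsilon=0.01, seed=1)  # b = 100
noisy_weak = privatise(lengths, sensitivity=1.0, epsilon=0.1, seed=1)     # b = 10
```

With the same sensitivity, ϵ = 0.01 yields ten times the noise scale of ϵ = 0.1, which is the privacy versus utility trade-off the two parameter values illustrate.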
Noise addition through Differential Privacy
Original distribution | ϵ = 0.01 | ϵ = 0.1
Packet length field
Noise addition through Differential Privacy
ϵ = 0.01 | ϵ = 0.1 | Original distribution
Timestamp field
Comparison between original and noisy data
Packet length | Timestamp
ℓ-diversity: Anonymisation of discrete fields
ℓ-diversity
A q-block is ℓ-diverse if it contains at least ℓ “well-represented” values for the sensitive attribute (in other words, the sensitive attribute is diverse within the block).
Example on network data set
Sample Data set 3-diverse Data set
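"Well-represented" can be made precise in several ways. The sketch below uses the simplest reading, distinct ℓ-diversity, where each q-block must contain at least ℓ different sensitive values. The rows and field names are hypothetical and do not reproduce the slide's sample table.

```python
from collections import defaultdict

def is_l_diverse(rows, quasi_identifiers, sensitive, l):
    """Distinct l-diversity: every q-block holds at least l distinct
    values of the sensitive attribute."""
    blocks = defaultdict(set)
    for row in rows:
        key = tuple(row[q] for q in quasi_identifiers)
        blocks[key].add(row[sensitive])
    return all(len(values) >= l for values in blocks.values())

# Hypothetical trace: source address is the quasi-identifier,
# protocol is treated as the sensitive attribute.
rows = [
    {"src_ip": "10.0.0.*", "protocol": "TCP"},
    {"src_ip": "10.0.0.*", "protocol": "DNS"},
    {"src_ip": "10.0.0.*", "protocol": "SSH"},
    {"src_ip": "10.0.1.*", "protocol": "UDP"},
    {"src_ip": "10.0.1.*", "protocol": "ICMP"},
    {"src_ip": "10.0.1.*", "protocol": "TLS"},
]

three_diverse = is_l_diverse(rows, ["src_ip"], "protocol", l=3)
four_diverse = is_l_diverse(rows, ["src_ip"], "protocol", l=4)
```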
ℓ-diversity technique
Equivalence class creation
Equivalence class name     Protocol name  Protocol name  Protocol name  Protocol name
Transport Protocols        TCP            UDP            *              *
Management Protocols       DNS            ICMP           DHCP           ARP
Security Protocols         TLS            SSL            SSH            HTTPS
Mobile Networks Protocols  SSMP           GTP            GTPv2          UCP
Other Protocols            *              *              *              *
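One plausible way to use the table above is as a lookup that generalises a protocol name to its equivalence class. The sketch below encodes the table as listed. The function name `generalise_protocol`, the exact-match lookup, and the fallback to "Other Protocols" for unlisted names are assumptions made for illustration.

```python
# Equivalence classes as listed in the table; "*" entries are wildcards
# and are therefore not enumerated here.
EQUIVALENCE_CLASSES = {
    "Transport Protocols": {"TCP", "UDP"},
    "Management Protocols": {"DNS", "ICMP", "DHCP", "ARP"},
    "Security Protocols": {"TLS", "SSL", "SSH", "HTTPS"},
    "Mobile Networks Protocols": {"SSMP", "GTP", "GTPv2", "UCP"},
}

# Reverse index: protocol name -> equivalence class name.
PROTOCOL_TO_CLASS = {
    protocol: class_name
    for class_name, protocols in EQUIVALENCE_CLASSES.items()
    for protocol in protocols
}

def generalise_protocol(protocol):
    """Return the equivalence class of a protocol, or "Other Protocols"
    when the name is not listed (assumed fallback)."""
    return PROTOCOL_TO_CLASS.get(protocol, "Other Protocols")

generalised = [generalise_protocol(p) for p in ["DNS", "GTPv2", "HTTP"]]
```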
5-diverse data set
Other Noise addition techniques
Zero Mean noise addition
Noise addition by summing LSBs technique
Example: 1414 → 1414 + 9 = 1423, where 9 is the sum of the LSBs
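The slide does not spell out which digits count as the LSBs. The example of 1414 becoming 1423 is consistent with adding the sum of the three least significant decimal digits (4 + 1 + 4 = 9), so the sketch below assumes that reading. Both the `digits=3` default and the function name are assumptions, not the author's stated method.

```python
def add_lsb_digit_sum(value, digits=3):
    """Add to `value` the sum of its `digits` least significant decimal digits.

    Assumed interpretation of "summing LSBs": for 1414 the last three
    digits are 4, 1, 4, whose sum is 9.
    """
    tail = abs(value) % (10 ** digits)
    noise = sum(int(d) for d in str(tail))
    return value + noise

perturbed = add_lsb_digit_sum(1414)
```

Under this reading the perturbation is deterministic: the same input always maps to the same output.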
IP address Anonymisation
Anonymising IP addresses
Method 1: Last octet obfuscation
Method 2: Transformation to IP class
The goal is to anonymise IP addresses while preserving network topology information.
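Both methods can be sketched with Python's `ipaddress` module. The slides do not say what value replaces the last octet, so the `"0"` placeholder is an assumption, as are the function names. The class boundaries are the classful IPv4 ranges determined by the first octet.

```python
import ipaddress

def obfuscate_last_octet(address, placeholder="0"):
    """Method 1: replace the last octet of an IPv4 address.

    The first three octets are kept, so hosts on the same /24 network
    still map to the same anonymised address.
    """
    ip = ipaddress.IPv4Address(address)  # raises ValueError on bad input
    octets = str(ip).split(".")
    octets[-1] = placeholder
    return ".".join(octets)

def ip_class(address):
    """Method 2: map an IPv4 address to its classful class (A to E)."""
    first_octet = int(ipaddress.IPv4Address(address)) >> 24
    if first_octet < 128:
        return "A"
    if first_octet < 192:
        return "B"
    if first_octet < 224:
        return "C"
    if first_octet < 240:
        return "D"
    return "E"

masked = obfuscate_last_octet("192.168.1.77")
address_class = ip_class("192.168.1.77")
```

Method 1 keeps subnet-level structure, while Method 2 discards everything except the address class, so it hides more and preserves less topology.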
Final Anonymised data set
Anonymised network trace
Conclusion and Future Work
Conclusion
•Preserving user privacy in a network trace.
•Analyzing functional dependencies between the fields.
•Packet length and timestamp anonymisation by Differential Privacy and ℓ-diversity techniques.
•Deciding the best values of the privacy parameter ϵ.
•IP address anonymisation by the last octet obfuscation method.
Future work
•Framework for calculating the best value of epsilon
•Re-identification testing
•Feature extraction/Clustering
•Anomaly detection/Malware Analysis
Thank you
Questions?