Post on 22-Dec-2015
Spyros Kotoulas and Ronny Siebes
Vrije Universiteit Amsterdam, The Netherlands
{kot,ronny}@few.vu.nl
Scalable discovery of private resources
Outline
Discovery systems
Privacy in discovery systems
Distributed obfuscated indexes
Example
Design choices
Evaluation
Summary
Discovery systems
A discovery system stores content, or pointers to it, and retrieves them according to a query.
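Discovery systems: P2P storage and centralized index
[Diagram: user 1 stores a description in the central index. User 2 sends a query to the index and receives a pointer as the answer. User 2 then sends get(object) to user 1, who replies with send(object).]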
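Discovery systems: P2P storage and P2P index
[Diagram: user 1 stores a description in the P2P index. User 2 sends a query to the index and is told to ask user 1. User 2 then sends get(object) to user 1, who replies with send(object).]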
Discovery systems
They all have one problem in common
The object or description is shown to at least one third party, which may be malicious.
Discovery systems
[Diagram: the same flow as before (store description, query, ask user 1, get(object), send(object)), with a third party on the path that can:]
•sniff
•censor
•modify
•track
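Distributed obfuscated index
[Diagram, step 1: the index is partitioned by key range: [a..c] [d..f] [g..k] [l..m] [n..v] [w..z]. User 1 is registered as usr1 under the key "nu" in the [n..v] partition. User 2 has the query "nukes" and looks up the key "nu".]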
Distributed obfuscated index
[Diagram, step 2: the [n..v] partition answers user 2: "ask usr1 about NU".]
Distributed obfuscated index
[Diagram, step 3: user 2 contacts user 1 directly: "I'm usr2, anything on NUK?"]
Distributed obfuscated index
[Diagram, step 4: user 1 answers user 2: "no, nothing on NUK".]
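The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the index is a single in-memory dictionary rather than a distributed structure, the key is a fixed two-character prefix, and the names ObfuscatedIndex, Peer, make_key and discover are all hypothetical. The sketch also assumes that the asker reveals a slightly longer prefix ("nuk") to the candidate peer than it revealed to the index ("nu"), as the slides suggest.

```python
from collections import defaultdict


def make_key(term, key_length=2):
    """Obfuscated key: here simply a short prefix of the term (assumption)."""
    return term.lower()[:key_length]


class ObfuscatedIndex:
    """Toy in-memory stand-in for the distributed index (hypothetical)."""

    def __init__(self):
        self._peers_by_key = defaultdict(set)

    def publish(self, key, peer_id):
        # The index only ever sees the obfuscated key, never the full term.
        self._peers_by_key[key].add(peer_id)

    def lookup(self, key):
        return set(self._peers_by_key.get(key, ()))


class Peer:
    def __init__(self, peer_id, terms):
        self.peer_id = peer_id
        self.terms = {t.lower() for t in terms}

    def publish_to(self, index):
        for term in self.terms:
            index.publish(make_key(term), self.peer_id)

    def answer(self, asker_id, revealed_prefix):
        # A real peer could first decide whether it trusts asker_id.
        return any(t.startswith(revealed_prefix) for t in self.terms)


def discover(index, peers, asker, term, reveal_length=3):
    """Steps 1 to 4: look up the key, then ask each candidate peer directly."""
    candidates = index.lookup(make_key(term)) - {asker.peer_id}
    revealed = term.lower()[:reveal_length]
    return sorted(
        pid for pid in candidates
        if peers[pid].answer(asker.peer_id, revealed)
    )
```

In this sketch a peer that published "nutrition" shares the key "nu" with the query "nukes", so it is returned as a candidate by the index but answers "no" when asked directly, which mirrors the exchange in the diagram.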
Choosing the key
The uniqueness of a key is a trade-off between scalability and privacy.
If keys are too unique, there is no privacy, because an attacker can mount a dictionary attack.
If keys are too common, the system will not scale, because each key maps to too many peers.
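A small sketch may make the privacy side of this trade-off concrete. It assumes, purely for illustration, that keys are plain prefixes and that the attacker holds a word list; the function name candidate_terms and the word list are hypothetical.

```python
def candidate_terms(key, dictionary):
    """Dictionary words an attacker cannot tell apart given only the key."""
    key = key.lower()
    return sorted(w for w in dictionary if w.lower().startswith(key))


# Hypothetical attacker word list.
WORDS = ["nukes", "nutrition", "nurse", "number", "network"]

# A short key leaves several candidate terms; a longer key narrows them down.
short_key_candidates = candidate_terms("nu", WORDS)
long_key_candidates = candidate_terms("nuk", WORDS)
```

The fewer candidates remain for a key, the easier the dictionary attack. The same count, taken over the terms published by all peers, indicates how many peers a single key maps to, which is the scalability side of the trade-off.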
Three ways to make the key: secure hash prefix (e.g. SHA-1)
For example, always take the first 25% of the characters of a hash of a word in a description:
Logica CMG, Microsoft, Xerox
hashes: x41a4c1, x7bh349, xaa4652
keys: x41, x7b, xaa
advantage: the variance in the number of matching terms is small
disadvantage: no substring matching on the published part
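A minimal sketch of a hash-prefix key, using SHA-1 from Python's standard hashlib module. The function name hash_prefix_key, the lowercasing of the term, and the rounding of the 25% cut are assumptions made for illustration. The hash values on the slide are shortened for display; a real SHA-1 hex digest has 40 characters, so a 25% prefix has 10.

```python
import hashlib


def hash_prefix_key(term, ratio=0.25):
    """Return the leading `ratio` share of the SHA-1 hex digest of `term`."""
    digest = hashlib.sha1(term.lower().encode("utf-8")).hexdigest()
    length = max(1, int(len(digest) * ratio))  # 40 hex chars -> 10 at 25%
    return digest[:length]


keys = {word: hash_prefix_key(word) for word in ["Logica CMG", "Microsoft", "Xerox"]}
```

Because the hash spreads terms roughly uniformly over the key space, each key attracts a similar number of terms, but two terms with a common textual prefix no longer share a key.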
Three ways to make the key: fixed ratio prefix
For example, always take the first 20% of the characters of a word in a description:
Logica CMG, Microsoft, Xerox
keys: lo, mi, xe
advantage: substring matching on the published part
disadvantage: the variance in the number of matching terms is big (popular prefixes vs. unpopular ones)
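A minimal sketch of a fixed-ratio prefix key. The slides do not say how the 20% cut is rounded; rounding up with a minimum of two characters is an assumption chosen here because it reproduces the three example keys. The names fixed_ratio_key and key_matches are hypothetical.

```python
def fixed_ratio_key(term, percent=20, min_length=2):
    """Return the leading `percent` share of the characters of `term`.

    Integer arithmetic avoids floating-point surprises when rounding up.
    `min_length` is an assumed lower bound, not taken from the slides.
    """
    word = term.lower()
    length = (len(word) * percent + 99) // 100  # ceiling of len * percent / 100
    return word[:max(min_length, length)]


def key_matches(query, key):
    """Prefix matching on the published part: the key is a prefix of the query."""
    return query.lower().startswith(key)
```

With these assumptions, "Logica CMG", "Microsoft" and "Xerox" map to "lo", "mi" and "xe". Since the key is plain text, a partial query such as "micro" can still be matched against the published key "mi", which is not possible with a hash prefix.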
Three ways to make the key: information value based prefix
Based on the frequency of the prefixes in a training set, determine a desired prefix length. For example, some prefixes are more common than others (compare “pro” with “xxf”).
Logica CMG, Microsoft, Xerox
keys: logic, mic, x
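One way to read this rule is: extend the prefix until it is rare enough in the training set. The sketch below follows that reading. The selection rule, the threshold max_share, the function names and the training words are all assumptions made for illustration; the slides do not specify how the desired prefix length is derived.

```python
from collections import Counter


def train_prefix_counts(words):
    """Count how many training words start with each possible prefix."""
    counts = Counter()
    for word in words:
        word = word.lower()
        for n in range(1, len(word) + 1):
            counts[word[:n]] += 1
    return counts


def information_value_key(term, counts, total, max_share=0.2):
    """Shortest prefix of `term` shared by at most `max_share` of training words."""
    word = term.lower()
    for n in range(1, len(word) + 1):
        prefix = word[:n]
        if counts[prefix] / total <= max_share:
            return prefix
    # No prefix is rare enough: falls back to the whole word (leaks the term).
    return word


# Hypothetical training set.
TRAINING = ["program", "project", "process", "product", "protect",
            "provide", "microsoft", "mint", "xerox", "logic"]
COUNTS = train_prefix_counts(TRAINING)
```

With this training set, common prefixes such as "pro" are extended until they become distinctive, while a rare initial letter such as "x" is already short enough, which matches the pattern of the example keys. The fallback to the whole word exposes the full term and would need a cap in any real design.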