[ieee 2013 convegno nazionale aeit: innovation and scientific and technical culture for development...
A Secret Sharing Scheme for Anonymous DNS Queries
Giuseppe Di Bella1, Cettina Barcellona1, Ilenia Tinnirello1
1Università degli Studi di Palermo, Italy.
Abstract—Since its adoption in the early 90’s, several privacy concerns have emerged about the Domain Name System (DNS). By collecting the DNS queries performed by each user, it is possible to characterize habits, interests and other sensitive data of the users. Usually, users resolve their url requests by querying the DNS server belonging to their Internet Service Provider (ISP) and therefore they assume they can trust it. However, different DNS servers can be used, thereby revealing sensitive data to a partially untrusted entity that can collect and sell this data for several purposes (targeted advertising, user profiling, etc.).
In this paper we address the possibility of integrating tools in the current DNS architecture to enhance users' privacy when they decide to use a DNS server different from the one made available by their ISP, while allowing the DNS servers to collect statistics about queries in order to optimize their operations. The main feature of the proposed architecture is to guarantee Sender Anonymity from the DNS point of view, without obfuscating the actual queries. This is possible by applying a Secret Sharing scheme to the urls to be resolved in an overlay network consisting of clients using the same DNS, and by disseminating the shares of each query to multiple nodes, randomly selected from this network, which in turn act as proxies to reach the DNS.
I. INTRODUCTION
Undoubtedly, the DNS [1] represents one of the crucial
building blocks of the Internet, because it tremendously sim-
plifies the web surfing operations and all the other server-
based services. Indeed, DNS has been designed with the
goal to give clients the possibility to reach web servers
by specifying urls (easier to memorize as human-readable)
rather than IP addresses. Although web servers could be
reached by directly specifying the IP address, nowadays it
is not realistic to think about the web without an automatic
mechanism for address resolution. Since its inception, several
optimizations for making DNS faster and more efficient, as
well as for supporting different security improvements, have
been considered. However, in comparison with other aspects,
privacy issues have received much less attention.
In this paper we deal with a solution for enhancing end-user
privacy in DNS queries, which in our opinion is a very critical
aspect in a context (i.e., the web) where the amount of user
data collected for commercial reasons has grown rapidly.
The privacy leakage we want to prevent stems from the simple
observation that, whenever a client needs to translate a url
into an IP address, it has to reveal to its DNS server not only
the url to be converted into the desired IP address, but also
the IP address of the user to which the query answer has to be
sent.
This problem has not been deeply investigated in the literature,
and has been only partially considered by DNS providers.
Changing DNS provider is a simple operation that each user can
carry out by configuring their device appropriately, but in
this way user privacy can be threatened. For example, OpenDNS
[2] and FoolDNS [3] have recently been proposed as DNS servers
that promise users not to perform any profiling operation and
to delete the history of resolved urls. However, users do not
have an actual contractual agreement with these DNS providers,
such as the contract they have with their ISP. Therefore,
technical solutions that support privacy-preserving
functionalities by design, and that make configurable what can
and cannot be disclosed about users' navigation, are of
critical importance.
Our main goal is achieving Sender Anonymity against a
generic DNS server while preserving the ability of the DNS to
extract some relevant statistics and perform aggregated
analysis on user data. The proposed solution is based on a
well-known cryptographic technique, called Secret Sharing, and
on an overlay network of clients using the same DNS server. The
rest of the paper is organized as follows: in section II we
discuss research work related to anonymity in generic network
communications (including the DNS case); then, after a brief
introduction to secret sharing and sender anonymity techniques
in section III, in section IV we show how our system works in
two different versions: a basic solution and an enhanced one;
a simplified performance evaluation is discussed in section V,
while section VI concludes the paper and introduces some
alternative scenarios where our system can be used.
II. RELATED WORKS
Some privacy-preserving solutions that try to respond to the
risks of private data leakage have been considered for different
Internet services, including the DNS. In [4] and [5], after an
analysis of users' privacy risks for DNS queries, the authors
propose a method for hiding the queried urls by means of
random noise and Private Information Retrieval (PIR) [6].
Some variants of the aforementioned approach have also been
implemented in [7] on network nodes based on GNU/Linux
environments, in order to experimentally evaluate the benefits,
overheads (e.g. the additional bandwidth consumption) and
limitations of the privacy-preserving functionalities.
In particular, the solution shown in [4] proposes to generate
a set of n queries containing the right one, so that neither
the DNS nor an attacker is able to identify the real one: they
can only guess it with probability 1/n. As argued in [7], this
guess probability is a very optimistic estimate once active
attackers are taken into account.
A slightly different approach from [4] is shown in [5], with the
aim of reducing the bandwidth required by a server response.
Here the authors propose to create two different query sets Q1
and Q2, of length n and n + 1 respectively, where only one query
is the real one. These sets are sent to two different servers,
each of which resolves all the IP addresses and computes their
bitwise exclusive-or. Each output is sent back to the user, who
obtains the final output by computing the bitwise exclusive-or
of the received strings. Although this approach reduces the
required bandwidth while preserving privacy, as clarified
in [7], it still does not bridge the security gap. For these
reasons the authors of [7] propose to distribute the query
among several servers and to construct different ranges of
queries for each server. This solution safeguards against
server-side and channel-side attacks. While in [4], [5], [7]
the main focus is on preserving privacy in the communication
between clients and the DNS server, in this paper we consider
the additional requirement of not hiding the queries from the
DNS, in order to allow the usual caching and statistical
operations.
A different solution, based on sender anonymity for generic
web transactions, has been proposed in [8]. The key idea of
this solution is to exploit a network of peers, named crowd,
acting as proxies that forward a request to a web server
while preserving sender anonymity. Each peer in a crowd can
make a request to the final server or just forward the request
to another peer. When a node needs to make a request to a
server, it picks a peer at random from the crowd (possibly
the originating node itself) and forwards the request
message to it. The receiving peer decides with probability p
to forward the request again to another peer or, with
probability (1 − p), to forward the request to the server. In
this way the request travels among different nodes and finally
arrives at the server. This solution achieves sender anonymity
and guarantees sender-receiver unlinkability (i.e.
it is impossible for an attacker to find any relation between
the sender and the receiver of a message).
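The forwarding rule of [8] summarized above can be illustrated with a short simulation. This is only a sketch under stated assumptions: the function name `crowds_path`, the peer labels and the seeded random generator are hypothetical and are not taken from [8].

```python
import random

def crowds_path(peers, initiator, p, rng):
    """Return the list of peers a request visits before reaching the server.

    The initiator first hands the request to a peer picked uniformly at
    random from the crowd (possibly itself). Each receiving peer then
    forwards it to another random peer with probability p, or to the
    server with probability 1 - p.
    """
    if not 0 <= p < 1:
        raise ValueError("p must be in [0, 1) so that the path terminates")
    path = [initiator, rng.choice(peers)]
    while rng.random() < p:
        path.append(rng.choice(peers))
    return path

rng = random.Random(0)               # seeded only to make runs repeatable
peers = ["A", "B", "C", "D", "E"]    # hypothetical crowd members
path = crowds_path(peers, "A", p=0.6, rng=rng)
print(" -> ".join(path + ["server"]))
```

Note that the number of hops in this rule is unbounded: it follows a geometric distribution governed by p.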
The crowd approach has been improved in [9], where the
authors use secret sharing techniques to guarantee receiver
anonymity as well. In [9] the request initiator defines
a path of nodes to reach the final receiver and, through the
crowd, sends each node a particular message by means of
shares crossing different nodes. For each node of the path, a
(2, 2) secret sharing scheme is applied for addressing the next
node: one share is sent directly by the previous node, while
the second one is sent by the initiator to a random peer, which
in turn reaches the node of the path by using the (p, 1 − p)
forwarding mechanism. The two shares are then combined by
each node of the path to get the IP address of the next node
and the message to be forwarded. This mechanism guarantees
that only the node at the end of the path knows that it is the
receiver, and that only this node is able to see the request.
Our solution is inspired by the crowd approach, applied
to the case of DNS queries: the idea is to exploit sender
anonymity in order to prevent DNS servers from linking each
query to a specific user, and secret sharing to achieve query
confidentiality, i.e. enabling the exact reconstruction of the
requested url only at the DNS side, thus allowing the DNS to
collect statistics about resolved requests (obviously in an
anonymous form). In section IV we detail our envisioned
scenario and provide an overview of the architecture tailored
to DNS operations, justifying some relevant design choices.
III. BACKGROUND
The privacy-preserving DNS architecture proposed in this
paper is based on Secret Sharing and Sender Anonymity. Before
presenting the details of our system, we briefly introduce the
secret sharing technique and what we mean by anonymity.
Secret Sharing is a method by which a dealer can split a
secret among K parties, such that all the authorized subsets
of these K parties can recover the secret by aggregating their
shares, whereas the remaining subsets cannot recover it or get
any information about it. Among the several schemes proposed in
the literature since its introduction in [10] and [11], those
in which the authorized subsets are the ones containing at
least a certain number of parties are referred to as
(N, K)-threshold schemes, where N ≤ K and N is the threshold
value, i.e. the minimum number of shares to aggregate in
order to recover the secret. In the case N = K, i.e. a
scheme in which all the shares need to be aggregated in order
to reconstruct the secret, the construction of the shares,
although possible with the same schemes proposed for N ≤ K,
can be made simpler by choosing K − 1 random shares and
computing the final one as the bitwise exclusive-or of the
secret and the other shares. This last technique is the one
used in this paper, as we detail in section IV.
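A minimal sketch of the (K, K) exclusive-or construction described above is given here. The helper names `xor_bytes`, `split_secret` and `recover_secret`, and the example host name, are illustrative assumptions and do not come from the paper.

```python
import secrets
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise exclusive-or of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def split_secret(secret: bytes, k: int) -> list[bytes]:
    """Split `secret` into k shares; all k are needed to recover it."""
    if k < 2:
        raise ValueError("need at least two shares")
    # K - 1 shares are uniformly random strings as long as the secret.
    shares = [secrets.token_bytes(len(secret)) for _ in range(k - 1)]
    # Last share = secret XOR r_1 XOR ... XOR r_{K-1}.
    shares.append(reduce(xor_bytes, shares, secret))
    return shares

def recover_secret(shares: list[bytes]) -> bytes:
    """XOR all the shares together to get the secret back."""
    return reduce(xor_bytes, shares)

shares = split_secret(b"www.example.org", 3)
assert recover_secret(shares) == b"www.example.org"
```

In this sketch every share has the same length as the secret, so the length of the shared string is visible to whoever holds a share; padding to a fixed length would be one way to hide it.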
To define anonymity briefly, we say that Sender Anonymity is
achieved when it is impossible, or nearly impossible, to
establish who the sender of a message is, whereas Receiver
Anonymity is achieved when it is impossible, or nearly
impossible, to establish who the receiver of a message is. We
say "or nearly" because different degrees of anonymity are
addressed in the literature, as detailed for example in [8]
and [9]. In this paper our goal is to achieve sender anonymity
with regard to the DNS server in the basic scheme
(section IV-A), and also against other peers in the enhanced
scheme (section IV-B). Guaranteeing receiver anonymity is
instead out of the scope of this paper, because it is well
known that each request is directed to the DNS server.
IV. SYSTEM OVERVIEW
Figure 1 summarizes our privacy-preserving DNS architecture:
a group of users, possibly belonging to different ISPs
but using the same DNS server, is organized in an overlay
network, through which DNS queries are forwarded before
reaching the privacy-enabled server.
Before the forwarding operation, each query is split into K
random shares by using a secret sharing technique (in the
figure, the query is split into two shares). The shares are
generated by choosing K − 1 random values $r_1, \dots, r_{K-1}$
and by computing the last share as
$r_K = s \oplus r_1 \oplus \cdots \oplus r_{K-1}$, where s is
the secret, i.e. the query. Each share is then forwarded to a
randomly picked node in the overlay network, which is
responsible for forwarding the share to the privacy-enabled
DNS. One of the randomly picked nodes can be the query
initiator itself, as discussed in the next subsection.
Fig. 1. Our reference scenario.
The DNS recovers the query by combining, through bitwise
exclusive-or, the shares belonging to the same query, resolves
the IP address as in common hierarchical DNS systems, and sends
back shares of the desired IP address (two in the figure) to
the proxy nodes from which the query has been forwarded. The
user collects the reply shares from the proxy nodes and is
finally able to connect to the desired IP address. Following
this scheme, the DNS server does not know who the query
initiator is (thus guaranteeing sender anonymity), but it
knows the url to be resolved, i.e. what the user is looking
for. This information can be used for computing statistics. On
the other side, the forwarding nodes do not know the requested
url but only that a specific user made a DNS query. This
condition guarantees query confidentiality.
Two different threat models are considered. The first one
is the honest-but-curious model, according to which all the
involved parties follow the protocol without arbitrarily
modifying the messages passing through them or introducing
inconsistent values. We build the scheme to cope with
adversaries able to compromise up to K − 1 of the nodes that
receive shares of the same query, as we detail later on. The
second threat model takes into account malicious users, i.e.
users that can cooperate with the DNS server to reveal the
query initiator to it.
A. Basic Scheme Operations
When a given user needs to resolve a url, it picks K random
nodes on the overlay network (with K greater than or equal to
2). With probability (1 − p) all the nodes $U_1, \dots, U_K$ are
different from the query initiator, while with probability p
one of the nodes is the query initiator itself. This feature
has been introduced so that an eavesdropper monitoring all
the randomly selected nodes is not deterministically able to
disclose the requested url (when the initiator keeps one share,
the eavesdropper can collect only K − 1 shares), while
preventing the DNS server from deterministically assuming
that one of the nodes forwarding the url shares is the query
initiator.
Once the peers are chosen, the user applies a (K, K) secret
sharing scheme to generate the shares of the query. Each
share is sent to a different node $U_j$, which forwards it to
the default DNS server. In order to allow the query
reconstruction, all the shares belonging to the same query
need to be labeled with the same query identifier. Obviously,
this identifier should be related neither to the query
initiator nor to the url itself, to prevent leakage of
information to the DNS or to the peer nodes. For this reason,
we chose to create the identifier as a random value generated
by the query initiator. Each j-th share $q_{ij}$ of a given
i-th url query is then composed as:
(random), share(url, j)
where the first part of the query is equal for all the shares
of the same url. When all the K shares are received by the
privacy-preserving DNS, the DNS is able to recover the secret
url as $\bigoplus_{j=1}^{K} q_{ij}$.
After this phase, the DNS resolves the query into an IP
address making use, if needed, of its hierarchical network. In
order to send the reply back to the query initiator, the
privacy-preserving DNS server applies the same secret sharing
scheme to the resolved IP address. To each proxy node $U_j$,
it forwards the share of the reply $r_{ij}$ as:
(random), share(IP, j)
where (random) is the same identifier used for the query.
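The labeling and reassembly steps of the basic scheme can be sketched as follows. The sketch assumes that the number of shares K is a system parameter known to both the clients and the DNS, which the paper does not state explicitly; the names `make_shares` and `ShareCollector`, the 8-byte identifier and the example host name are likewise hypothetical.

```python
import secrets
from collections import defaultdict
from functools import reduce

K = 3  # shares per query; assumed to be known to clients and to the DNS

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def make_shares(query_id: bytes, payload: bytes, k: int = K):
    """Return k messages (query_id, share) whose shares XOR to payload."""
    parts = [secrets.token_bytes(len(payload)) for _ in range(k - 1)]
    parts.append(reduce(xor_bytes, parts, payload))
    return [(query_id, part) for part in parts]

class ShareCollector:
    """DNS-side buffer: groups shares by identifier until all k arrive."""

    def __init__(self, k: int = K):
        self.k = k
        self.pending = defaultdict(list)

    def receive(self, message):
        query_id, share = message
        self.pending[query_id].append(share)
        if len(self.pending[query_id]) < self.k:
            return None  # still waiting for the remaining shares
        return reduce(xor_bytes, self.pending.pop(query_id))

# Client side: a random identifier, unrelated to the user and to the url.
query_id = secrets.token_bytes(8)
messages = make_shares(query_id, b"www.example.org")

# DNS side: shares may arrive in any order through different proxies.
dns = ShareCollector()
results = [dns.receive(m) for m in reversed(messages)]
print(results[-1])
```

The reply direction can reuse `make_shares` with the same identifier and the resolved IP address as payload, with the query initiator running the same reassembly.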
Security Analysis. We consider that the user data are
disclosed only when both the query content and the user IP are
known by an external attacker. Our system is robust against
a passive attacker that corrupts up to K − 1 peer nodes
involved in the same query. Indeed, in this condition query
confidentiality is still guaranteed because the attacker is not
able to recover the secret url from K − 1 shares only (while
the sender identity is obviously known to the attacker, which
can observe the IP of the query initiator). If the attacker
is able to corrupt all the K nodes selected for a query, the
url can be recovered and user data are disclosed. However,
since the K nodes are randomly selected at each query, in
practice this case occurs when the attacker is able to corrupt
all the nodes in the overlay network. Even in this case, with
probability p, one of the shares is managed by the query
initiator itself and url recovery is impossible.
B. Enhanced Scheme Operations
The basic scheme described in the previous section provides
conditional privacy, i.e. privacy is conditional on the
number of entities that can collude on a given user
query. In this section, we consider an extension of this scheme
able to deal with collaborative users, i.e. malicious users who
may reveal the sender of a received share to the DNS.
The extension is based on a simple idea: rather than using
a two-hop forwarding scheme that sends each share to the
DNS by means of one proxy node, the query initiator draws
a random number of proxy nodes that each share has to
cross before reaching the DNS. The goal of this extension is
to try to prevent the collaborative peers from understanding,
and hence revealing, whether the managed share originates from
the actual query initiator or not. By "try to prevent" we mean
that a collaborative user has a small probability of identifying
the query initiator, where this probability can be tuned by
some parameters which we detail in the following.
The idea of using a random number of peers a message has
to pass through before reaching a generic recipient node has
been introduced in [8] and [9], where the random number
of hops is not bounded (being dependent on a probability
to forward the message to the recipient or to another proxy
node). However, in a system like a DNS, in which performance
in terms of users waiting time is of primary importance, it
is necessary to limit the number of proxy nodes to avoid a
dramatic impact on the url resolving time.
We designed our scheme as follows. Each j-th query share
$q_{ij}$ of a given url is composed as:
(counter_j), (random), share(url, j)
where the counter value is randomly chosen by the query
initiator in the interval [1, C]. Each peer decrements the
counter before forwarding the message. If the counter is
higher than zero, the share is forwarded to another proxy peer,
otherwise it is forwarded to the DNS server. The reconstruction
of the url follows the same method as in the basic scheme. For
the reply, each proxy node has to store the address of the
previous node from which it has received a given query share.
The shares of the resolved IP sent by the DNS are then able to
reach the query initiator by passing through the same nodes
used in the direct path, in reverse order.
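The counter-driven forwarding can be sketched as below. The function name `forward_share`, the peer labels and the rule that a node never forwards a share to itself are assumptions made for illustration; the paper does not specify how each proxy picks the next hop.

```python
import random

def forward_share(initiator, peers, counter, rng):
    """Return the chain of nodes a share crosses before the DNS.

    `counter` is the value written into the share by the initiator.
    Every proxy decrements it before forwarding: a positive result
    means "hand over to another proxy", zero means "send to the DNS".
    """
    if counter < 1:
        raise ValueError("counter must be at least 1")
    chain = [initiator]
    while True:
        # Assumption: the next hop is any peer other than the current holder.
        proxy = rng.choice([n for n in peers if n != chain[-1]])
        chain.append(proxy)      # proxy stores chain[-2] as its previous hop
        counter -= 1
        if counter == 0:
            return chain         # the last proxy forwards to the DNS

rng = random.Random(1)               # seeded only to make runs repeatable
peers = ["A", "B", "C", "D", "E"]    # hypothetical overlay members
C = 4
counter = rng.randint(1, C)          # initiator draws the counter in [1, C]
chain = forward_share("A", peers, counter, rng)
reply_route = list(reversed(chain))  # reply shares retrace the same nodes
print(chain, reply_route)
```

With a counter equal to 1 the chain contains a single proxy, which is the basic scheme.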
Security Analysis. It is worth pointing out that, because each
user knows the upper bound C, if a collaborative user receives
a message with the counter field set to C, it understands that
the sender is also the initiator. Let q be the probability that
a given node in the overlay network cooperates with the DNS
server; the probability of this leakage event for one
share is then q · 1/C, i.e. the probability that the selected
node is a collaborative node and that the drawn counter is
equal to the maximum possible value C. Since each query
has to be forwarded to K different peers, the number x of
collaborative users among the K selected peers has a
binomial distribution with parameters K and q. Therefore, the
query leakage probability $P_l$, i.e. the probability that the
maximum counter value C is drawn for at least one malicious
user, is given by:
$$P_l = 1 - \sum_{x=0}^{K} \binom{K}{x} q^x (1-q)^{K-x} \left(\frac{C-1}{C}\right)^x \qquad (1)$$
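Equation (1) can be evaluated numerically with the sketch below. The function names are hypothetical. The closed form 1 − (1 − q/C)^K is not stated in the paper: it follows from applying the binomial theorem to the sum in (1), and is included here only as a cross-check.

```python
from math import comb

def leakage_probability(K: int, q: float, C: int) -> float:
    """Eq. (1): probability that at least one colluding first-hop
    proxy receives a share whose counter equals the bound C."""
    no_leak = sum(
        comb(K, x) * q**x * (1 - q) ** (K - x) * ((C - 1) / C) ** x
        for x in range(K + 1)
    )
    return 1 - no_leak

def leakage_closed_form(K: int, q: float, C: int) -> float:
    """Same quantity after applying the binomial theorem to the sum."""
    return 1 - (1 - q / C) ** K

def min_hops(K: int, q: float, target: float = 0.05) -> int:
    """Smallest bound C whose leakage probability is below `target`."""
    if target <= 0:
        raise ValueError("target must be positive")
    C = 1
    while leakage_probability(K, q, C) >= target:
        C += 1
    return C

print(leakage_probability(2, 0.1, 4))  # K = 2, q = 0.1, C = 4
print(min_hops(4, 0.5))                # K = 4, q = 0.5
```

For C = 1 the sum reduces to (1 − q)^K, so the function returns 1 − (1 − q)^K, the probability of selecting at least one cooperating proxy.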
To avoid this leakage event, we consider an alternative
scheme in which the maximum counter value C is not fixed, but
is chosen randomly by the query initiator in a range
[C′, C″] that can be different for each share. This ensures
that the first-hop peer receiving one share cannot be sure
that the sender is the initiator.
It is straightforward to note that sender anonymity is
still ensured in this scheme because the counter is different
for each share. Nevertheless, in these conditions a single
node may collect all the shares related to a query, due to the
random choice of the relay forwarding nodes. In order to avoid
this event, which is unlikely if the number of involved peers
is high, each query initiator should select 1 as the counter
value for at least 2 shares (where K > 2). This feature
guarantees that no peer acting as a random forwarding node can
collect all the other K − 1 shares, since two shares are sent
to the DNS without additional forwarding nodes.
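One possible reading of this variant is sketched below. It is an interpretation, not the authors' specification: the sketch assumes that the initiator draws a separate bound in [C′, C″] for every share and then a counter in [1, bound], and that the two shares forced to counter 1 are picked at random. The name `draw_counters` and its parameters are hypothetical.

```python
import random

def draw_counters(K, c_low, c_high, rng):
    """Draw one counter per share for the randomized-bound variant.

    For every share the initiator first draws its own bound in
    [c_low, c_high] and then the counter in [1, bound]. Two shares,
    chosen at random, are then forced to counter 1 so that they reach
    the DNS through a single proxy.
    """
    if K <= 2:
        raise ValueError("the variant assumes K > 2")
    if not 1 <= c_low <= c_high:
        raise ValueError("need 1 <= c_low <= c_high")
    counters = []
    for _ in range(K):
        bound = rng.randint(c_low, c_high)      # per-share maximum
        counters.append(rng.randint(1, bound))  # per-share counter
    for j in rng.sample(range(K), 2):
        counters[j] = 1
    return counters

counters = draw_counters(5, 2, 6, random.Random(0))
print(counters)
```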
V. SYSTEM EVALUATION
Our architecture has been designed in order to minimize the
additional complexity to be supported at the DNS side and
the additional delay due to the forwarding operations. Under
the hypothesis that the privacy-preserving DNS caches the
most common queries, the address resolution delay is basically
given by the round trip time needed to reach the DNS server
and to send back the resolved IP address to the query initiator.
For the sake of simplicity, we also assume that the
transmission delay between any two peers of the network,
including the transmission delay between one peer and the DNS
server, is fixed and equal to T, and that each node can be
involved in a single transmission or reception operation at a
given time.
In the basic scheme, K being the number of proxy nodes,
the time required to reach the DNS is (K + 1)T.
Indeed, after the query initiator completes the transmission
of the first share towards the first proxy node, it can transmit
the second one in parallel with the transmission between the
first proxy node and the DNS server. The third share can be
transmitted in parallel with the transmission between the
second proxy and the DNS server, and so on, until the last proxy
node performs the last forwarding operation towards the
DNS. Similarly, on the opposite path from the DNS
to the query initiator another (K + 1)T transmission time is
needed. It follows that the address resolution time is
2(K + 1)T (linearly increasing with K), while a higher number
of shares can provide a higher conditional security.
In the enhanced scheme, with the maximum
number of forwarding hops fixed to C for each share, the
maximum delay from the query initiator to the DNS is
(K + C)T. By assuming also in this case that peer nodes
can transmit in parallel if the destination nodes are different,
and by neglecting the probability that a proxy selects one of
the proxy nodes selected by the query initiator as a forwarder
node, the maximum delay is given by the time for transmitting
the K-th share to the DNS server. Since the K-th proxy node
receives such a share after a time K · T, the maximum delivery
time for the last share is given by a further C transmission
times. Note that by the end of this time, any previous share
has surely been delivered to the DNS server. The address
resolution time is therefore linearly increasing with K + C,
while a low number of intermediate forwarding hops C increases
the information leakage probability about the identity of the
query initiator.
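The two delay expressions can be put side by side with a few lines of code. The function names and the per-hop delay value are illustrative assumptions; only the formulas 2(K + 1)T and (K + C)T come from the text.

```python
def basic_resolution_time(K: int, T: float) -> float:
    """Round-trip address resolution time of the basic scheme: 2 (K + 1) T."""
    return 2 * (K + 1) * T

def enhanced_max_one_way_delay(K: int, C: int, T: float) -> float:
    """Worst-case initiator-to-DNS delay of the enhanced scheme: (K + C) T."""
    return (K + C) * T

T = 0.02  # assumed per-hop delay in seconds (illustrative value only)
for K in (2, 3, 5):
    print(K, basic_resolution_time(K, T), enhanced_max_one_way_delay(K, 4, T))
```

With C = 1 the enhanced one-way bound equals (K + 1)T, half of the basic round-trip time, as expected since C = 1 is the basic scheme.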
Figure 2 plots the information leakage probability as a
function of the maximum number of forwarding hops C, varying
the number of shares K (whose value affects the
conditional security of the scheme). The figure refers to a case
in which the cooperation probability is fairly high (q = 0.1).
However, even with such a probability, for K = 2 it is enough
to consider 4 hops to have a leakage probability lower than
5%. The case C = 1 corresponds to the basic scheme, in which
the share sent to the proxy nodes is immediately forwarded
to the DNS server. In such a case, the leakage probability is
equal to the probability of selecting at least one cooperative
node, i.e. 1 − (1 − q)^K, since the query initiator identity is
obviously known to all the proxy nodes.
[Figure 2: per-query leakage probability (logarithmic axis, 10^-2 to 10^0) versus number of hops C (1 to 10), with curves for K = 2, 3, 4, 5, 7, 10.]
Fig. 2. Information leakage probability for different numbers of shares K and q = 0.1 as a function of the maximum number of forwarding hops C.
Figure 3 plots the C value required for guaranteeing a
leakage probability lower than 5% as a function of the number
of shares K for different values of the cooperation probability
q. This value can grow dramatically in case of high
cooperation probabilities. For example, for K = 4, the
required C value changes from 1 when q = 0.01 to 40 when
q = 0.5. Therefore, the scheme parameters have to be selected
carefully as a trade-off between address resolution delay and
information leakage probability.
VI. CONCLUSIONS AND FUTURE WORKS
In this paper we show how to apply a secret sharing scheme
to perform privacy-preserving DNS queries by exploiting
a peer overlay network in which each peer can act as a proxy.
This solution could be very interesting in a context where
the DNS server provided by the ISP is not available or the
user selects an untrusted DNS server. Similarly to [8] and [9],
our system achieves sender anonymity, avoiding leakage of
sensitive user data in two different threat models. Unlike
[7], the DNS server can also benefit from adopting this
mechanism because only the query initiator is hidden from the
DNS, while the request itself is in the clear and can be used
for caching or other statistical operations.
Although our system has so far been considered for a
scenario in which the DNS server is untrusted because external
to the ISP, we are also considering some scheme extensions
able to guarantee sender anonymity against the ISP while
using its trusted DNS. Indeed, because of its role, the ISP
is always able to monitor the user traffic and collect all
the shares to be sent to the DNS. Therefore, the presented
scheme cannot prevent the ISP from identifying the query
initiator and the requested url. However, in a multi-provider
scenario, in which for example the user can simultaneously
use two different providers by means of two different
access technologies (e.g. ADSL and 3G access), the
scheme can be easily generalized in order to send at least one
different share through each of the available providers. This
mechanism guarantees that no provider is able to identify
the requested url by monitoring the user traffic, since only a
subset of the shares is transmitted in its access network. This
new scenario, however, needs to be investigated in depth and is
currently still a work in progress.
[Figure 3: number of hops C (logarithmic axis, 1 to 100) versus number of shares K (2 to 10), with curves for q = 0.01, 0.02, 0.05, 0.1, 0.2, 0.5.]
Fig. 3. Number of hops C as a function of the number of shares K, for different cooperation probability values.
REFERENCES
[1] RFC 1034. [Online]. Available: http://tools.ietf.org/html/rfc1034
[2] OpenDNS. [Online]. Available: http://www.opendns.com/
[3] FoolDNS. [Online]. Available: http://www.fooldns.com/
[4] F. Zhao, Y. Hori, and K. Sakurai, “Analysis of privacy disclosure in DNS query,” in Multimedia and Ubiquitous Engineering, 2007. MUE ’07. International Conference on, 2007, pp. 952–957.
[5] ——, “Two-servers PIR based DNS query scheme with privacy-preserving,” in Intelligent Pervasive Computing, 2007. IPC. The 2007 International Conference on, 2007, pp. 299–302.
[6] B. Chor, E. Kushilevitz, O. Goldreich, and M. Sudan, “Private information retrieval,” J. ACM, vol. 45, no. 6, pp. 965–981, Nov. 1998. [Online]. Available: http://doi.acm.org/10.1145/293347.293350
[7] S. Castillo-Perez and J. Garcia-Alfaro, “Evaluation of two privacy-preserving protocols for the DNS,” in Information Technology: New Generations, 2009. ITNG ’09. Sixth International Conference on, 2009, pp. 411–416.
[8] M. K. Reiter and A. D. Rubin, “Crowds: anonymity for web transactions,” ACM Trans. Inf. Syst. Secur., vol. 1, no. 1, pp. 66–92, Nov. 1998. [Online]. Available: http://doi.acm.org/10.1145/290163.290168
[9] S. Rass, R. Wigoutschnigg, and P. Schartner, “Crowds based on secret-sharing,” in Availability, Reliability and Security (ARES), 2011 Sixth International Conference on, 2011, pp. 359–364.
[10] G. R. Blakley, “Safeguarding cryptographic keys,” in AFIPS Conf. Proc., vol. 48, 1979, pp. 313–317.
[11] A. Shamir, “How to share a secret,” in Communications of the ACM, vol. 22, November 1979, pp. 612–613.