[ieee 2012 ieee 14th international conference on communication technology (icct) - chengdu, china...

Privacy-Preserving Multi-set Operations

Meishan Huang, Bogang Lin

College of Mathematics and Computer Science, Fuzhou University, Fuzhou, China

Key Lab of Information Security of Network System, Fuzhou University, Fujian Province

E-mail: [email protected], [email protected]

Abstract: We consider several multiset operations in the secure two-party setting, where Alice and Bob each hold a multiset and want to perform private computations over the two multisets without revealing any private information; that is, neither party learns more than what can be deduced from the result. We design methods to compute the union, intersection and element reduction operations efficiently and securely, apply these techniques to solve multiset operation problems in the semi-honest setting, and consider their extension to the malicious setting.

Keywords: multiset operation; privacy-preserving; secure computation; characteristic bit-string

I. INTRODUCTION

Secure multiparty computation remains a focus of cryptographic research. Much work addresses set operations in the secure multiparty setting, specifically: computing the union of sets and its size [1], intersecting sets and calculating the size of the intersection [1,2,9], testing disjointness of sets [3], determining the inclusion relation of two sets [4,10], and variants related to intersection and the inclusion relation [1-4,10].

General secure multiparty computation protocols can solve any multiparty computation problem, and therefore also set operations. However, general solutions emphasize versatility and pay less attention to efficiency. Many researchers have therefore designed secure protocols for specific set problems. Because such protocols are purpose-built, they naturally have some limitations.

A protocol to determine the inclusion relation of two sets was proposed in [1]; it requires a threshold encryption scheme, and its decryption process is complex. A private matching solution was proposed in [2]. It can intersect sets and calculate the cardinality of the result in the secure two-party setting. Private matching is actually an unfair version of intersection. However, the authors did not deal with specific issues such as intersecting multisets.

Three secure protocols to determine whether sets are disjoint were presented in [3], where sets are represented in two forms, characteristic bit-strings and polynomials, and a new encryption notion called Superposed Encryption was put forward.

A protocol was proposed to determine the inclusion relation of two sets [4]. Strictly speaking, it does not solve exactly that problem, because it leaks the size of the intersection. When the cardinality is 1, the set intersection problem becomes private equality testing, which many researchers have considered [5-8]. For the malicious setting, a protocol was proposed to determine the inclusion relation of sets [9], along with secure protocols for related problems; before that, there were no specific solutions for these problems other than general secure multiparty computation. Three protocols with different security levels to determine the inclusion relation of sets were proposed in [10]; they differ from the one in [4], and the first two apply the idea of testing disjointness of sets from [3].

In this paper, we consider various multiset operations in the secure two-party setting, drawing on ideas from [3]. We establish a basic framework for multiset operations, including union, intersection and element reduction, and apply a number of cryptographic techniques and protocols, such as additively homomorphic cryptography, Yao's millionaire protocol, an equality-testing protocol and a permutation protocol. With these we design secure protocols that determine the inclusion relation of two multisets and compute over-threshold multiset-union, threshold multiset-union, multiset-intersection and cardinality multiset-intersection efficiently in the semi-honest model. Furthermore, we use commitment schemes and zero-knowledge proofs to guarantee the security of the protocols in the malicious adversary model.

The rest of this paper is organized as follows. Section 2 gives the detailed problem descriptions, the adversary models, and the necessary cryptographic tools. Section 3 gives an overview of our protocols and the intuition behind them. Section 4 and Section 5 present protocols for the semi-honest and malicious settings respectively. Finally, Section 6 concludes the paper and lays out future work.

II. PROBLEMS AND MODELS

A. Problem Definition

Assume two participants A and B, each holding a private input multiset (in a multiset, the same element can appear several times). That is, A has a private multiset $S_A = \{a_1, \ldots, a_{N_A}\}$ and B holds $S_B = \{b_1, \ldots, b_{N_B}\}$, where $a_i, b_j \in [N] = \{1, \ldots, N\}$ for $1 \le i \le N_A$ and $1 \le j \le N_B$, and $N_A$ and $N_B$ are slightly less than $N$. By engaging in the protocols for the problems defined below, particular player(s) will learn the specified answer.

___________________________________ 978-1-4673-2101-3/12/$31.00 ©2012 IEEE

Inclusion-Relation. A learns whether $S_A \supseteq S_B$ holds, without either private multiset being revealed.

Over-Threshold Multiset-Union. A learns which elements appear in the combined private inputs of the players at least a threshold number $t$ of times, and the number of times these elements appear in the players' private inputs. For example, suppose an element $a$ appears in the combined private inputs of the players 15 times. If $t = 10$, then A learns that $a$ has appeared 15 times. However, if $t = 16$, then no player learns $a$ or the number of times it has appeared.

Threshold Multiset-Union. This problem is a variant of over-threshold multiset-union; the difference is that A does not learn the number of occurrences of these elements in $S_A \cup S_B$.

Multiset-Intersection. A learns the intersection $S_A \cap S_B$ of the private input multisets, with no other information revealed about the players' private inputs.

Cardinality Multiset-Intersection. A learns the size $|S_A \cap S_B|$ of the intersection of the private input multisets.

B. Adversary Models

In this paper, we present protocols for the aforementioned problems in two standard adversary models. We first describe our protocols in the so-called semi-honest setting (parties are assumed to follow the protocol specifications), and then consider their extension to the malicious setting (players may deviate arbitrarily from the protocol specifications). We give only an intuitive notion of each; these notions are formalized in the literature.

Honest-But-Curious Adversaries. In this model, all parties act according to their prescribed actions in the protocol. Security in this model is straightforward, particularly in our case, where only one party (the client C; in our protocols this role is played by A, with B as the server S) learns an output.

• The client's security – indistinguishability: Given that the server S gets no output from the protocol, the definition of C's privacy requires simply that the server cannot distinguish between cases in which the client has different inputs.

• The server's security – comparison to the ideal model: The definition ensures that the client does not get more or different information than the output of the function. This is formalized by considering an ideal implementation where a trusted third party (TTP) gets the inputs of the two parties and outputs the defined function. We require that in the real implementation of the protocol (that is, one without a TTP) the client C does not learn different information than in the ideal implementation.

Malicious adversaries. In this model, an adversary may behave arbitrarily. In particular, we cannot hope to prevent parties from (i) refusing to participate in the protocol, (ii) substituting an input with an arbitrary value, and (iii) prematurely aborting the protocol. Informally, the definition is based on a comparison to the ideal model with a TTP, where a corrupt party may give arbitrary input to the TTP. The definition is also limited to the case where at least one of the parties is honest: for any strategy that the dishonest party can play in the real execution, there is a strategy that it could play in the ideal model, such that the real execution is computationally indistinguishable from the execution in the ideal model.

C. Cryptographic Primitives

Here is a brief introduction to several common cryptographic primitives that are used in our proposed protocols.

Additively Homomorphic Cryptography

In this paper we use a semantically secure, additively homomorphic public-key cryptosystem. Let $E_{pk}(\cdot)$ denote the encryption function with public key $pk$. The cryptosystem supports the following two operations, which can be performed without knowledge of the private key: (1) for any plaintexts $a$ and $b$, given the encryptions $E_{pk}(a)$ and $E_{pk}(b)$, we can efficiently compute the encryption of $a+b$, denoted $E_{pk}(a+b) := E_{pk}(a) +_h E_{pk}(b)$; (2) for any plaintext $a$, given a constant $c$ and the encryption $E_{pk}(a)$, we can efficiently compute the encryption of $c \cdot a$, denoted $E_{pk}(c \cdot a) := c \times_h E_{pk}(a)$.
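The two homomorphic operations can be illustrated with a toy instance. The paper does not name a concrete scheme, so the sketch below assumes the Paillier cryptosystem as one possible instantiation; the function names (`keygen`, `encrypt`, `decrypt`, `add_h`, `mul_h`) and the tiny primes are illustrative only and offer no security.

```python
import math
import random

def keygen(p=1789, q=1861):
    """Toy Paillier key generation; p and q are tiny illustrative primes."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                      # standard simple choice of generator
    mu = pow(lam, -1, n)           # modular inverse (Python 3.8+)
    return (n, g), (lam, mu)

def encrypt(pk, m):
    """E_pk(m) = g^m * r^n mod n^2 for a random r coprime to n."""
    n, g = pk
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pk, sk, c):
    """D_sk(c) = L(c^lam mod n^2) * mu mod n, with L(u) = (u - 1) / n."""
    n, _ = pk
    lam, mu = sk
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def add_h(pk, c1, c2):
    """E(a) +_h E(b): multiplying ciphertexts adds the plaintexts."""
    n, _ = pk
    return (c1 * c2) % (n * n)

def mul_h(pk, k, c):
    """k x_h E(a): raising a ciphertext to k multiplies the plaintext by k."""
    n, _ = pk
    return pow(c, k, n * n)

pk, sk = keygen()
print(decrypt(pk, sk, add_h(pk, encrypt(pk, 3), encrypt(pk, 4))))
print(decrypt(pk, sk, mul_h(pk, 5, encrypt(pk, 6))))
```

Neither `add_h` nor `mul_h` touches the secret key, which is the property the protocols in Section IV rely on when B combines ciphertexts on A's behalf.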

Yao’s Millionaire Problem

This is another protocol used as a primitive in our solutions. The purpose of the protocol is to compare two private numbers (i.e., determine which is larger). This private comparison problem was first proposed by Yao and is referred to as Yao's Millionaire Problem (because two millionaires wish to know who is richer without revealing any other information about their net worth). The early cryptographic solution by Yao has communication complexity that is exponential in the number of bits of the numbers involved. Many researchers have since studied this problem, including [13-16].

Equality-Testing Protocol

The Equality-Testing Protocol [5] is used to compare two private values and determine whether they are equal; many researchers have studied this problem [6,8].

Permutation Protocol

The Permutation Protocol [5] solves the two-party permutation problem in the secure two-party setting: A holds a private vector $X = (x_1, \ldots, x_n)$, and B holds a permutation function $\pi$ and his own private vector $R = (r_1, \ldots, r_n)$. A wants to learn $\pi(X + R)$ but must not learn $\pi$ or any $r_i$, and B must not learn any $x_i$.
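To make the target output concrete, the following minimal sketch computes $\pi(X + R)$ in the clear, as a trusted third party would. It is only the ideal functionality, not the secure protocol of [5]; the function name `permuted_sum` and the convention that `perm[j]` gives the source index of output position `j` are assumptions made for illustration.

```python
def permuted_sum(x, r, perm):
    """Return pi(X + R): output position j holds x[perm[j]] + r[perm[j]]."""
    if not (len(x) == len(r) == len(perm)):
        raise ValueError("X, R and the permutation must have the same length")
    return [x[i] + r[i] for i in perm]

# A's vector X, B's blinding vector R and B's permutation (as an index list).
print(permuted_sum([1, 2, 3], [10, 20, 30], [2, 0, 1]))
```

In the real protocol A would receive only this output vector, so the blinding vector $R$ hides the individual $x_i + r_i$ positions from being matched back to $X$ without knowledge of $\pi$.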

Vector Dominance Protocol

The Vector Dominance Protocol was presented to solve the secure two-party vector dominance problem. Let $A = (a_1, a_2, \ldots, a_n)$ and $B = (b_1, b_2, \ldots, b_n)$; if $a_i > b_i$ for all $i = 1, 2, \ldots, n$, then we say that $A$ dominates $B$. Alice has a vector $A$ and Bob has a vector $B$. Alice wants to know whether $A$ dominates $B$. Note that in the case where $A$ does not dominate $B$, neither Alice nor Bob should learn the relative ordering of any individual $a_i$, $b_i$ pair (i.e., whether $a_i < b_i$ or not). To meet the needs of our protocols, we propose a modified version in which only one party learns the result.

III. OVERVIEW AND INTUITION

In this section, we give the overview and the intuition of our protocols.

A. Characteristic Bit-string Representation of Multiset

Consider the situation where the number of distinct elements in an input multiset is very close to the size of the alphabet. In this case, the characteristic bit-string representation may achieve better efficiency. Suppose player A has a collection of values $S_A = \{a_1, \ldots, a_{N_A}\}$ and B has the collection $S_B = \{b_1, \ldots, b_{N_B}\}$, where $a_i, b_j \in [N]$ ($1 \le i \le N_A$, $1 \le j \le N_B$) and $[N] = \{1, \ldots, N\}$. Because each element can appear several times, we call each collection a multiset. The two multisets are stored by the players in the form of their characteristic bit-strings of length $N$: $S_X$ is represented by a bit-string $Bit_X = (bit_1^X, \ldots, bit_N^X)$ such that $bit_j^X = a$ iff element $j$ appears $a$ times in $S_X$, where $X \in \{A, B\}$.
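The representation can be sketched in a few lines. This is a minimal illustration of the definition above, assuming elements are integers in $\{1, \ldots, N\}$; the helper name `to_count_vector` is hypothetical.

```python
from collections import Counter

def to_count_vector(multiset, n):
    """Return (bit_1, ..., bit_N), where bit_j is the multiplicity of j."""
    counts = Counter(multiset)
    for element in counts:
        if element < 1 or element > n:
            raise ValueError(f"element {element} is outside [1, {n}]")
    # Position j-1 of the list corresponds to element j of the alphabet [N].
    return [counts[j] for j in range(1, n + 1)]

print(to_count_vector([1, 3, 3, 5], 5))
```

Note that, despite the name "bit-string", each entry is a multiplicity rather than a single bit, so the vector always has length $N$ regardless of how many elements the multiset holds.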

B. Characteristic Bit-string Representation of Multiset Operations

Most multiset operations can be seen as combinations of three basic operations: intersection, union and element reduction. Representing multisets as characteristic bit-strings, we describe the three basic operations in that form, requiring that no operation result reveal private information about the input characteristic bit-strings. Let $S_A = \{a_1, \ldots, a_{N_A}\}$ and $S_B = \{b_1, \ldots, b_{N_B}\}$ be the two multisets, and let $Bit_A = (bit_1^A, \ldots, bit_N^A)$ and $Bit_B = (bit_1^B, \ldots, bit_N^B)$ be the corresponding characteristic bit-strings.

• Intersection. If element $a$ appears $b_A \ge 0$ times in $S_A$ and $b_B \ge 0$ times in $S_B$, then it appears $\min(b_A, b_B)$ times in $S_A \cap S_B$. The characteristic bit-string of $S_A \cap S_B$ is $(\min(bit_1^A, bit_1^B), \ldots, \min(bit_N^A, bit_N^B))$.

• Union. If element $a$ appears $b_A \ge 0$ times in $S_A$ and $b_B \ge 0$ times in $S_B$, then it appears $(b_A + b_B)$ times in $S_A \cup S_B$. The characteristic bit-string of $S_A \cup S_B$ is $((bit_1^A + bit_1^B), \ldots, (bit_N^A + bit_N^B))$.

• Removing duplicate elements. Reducing the number of occurrences of each element in $S$ by $d$ yields the multiset denoted $Rd_d(S)$. An element $a$ that appears $b$ times in $S$ appears $\max\{b - d, 0\}$ times in $Rd_d(S)$. The characteristic bit-string of $Rd_d(S)$ is $(\max\{bit_1^S - d, 0\}, \ldots, \max\{bit_N^S - d, 0\})$.
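On plaintext vectors the three operations are element-wise, as the following minimal sketch shows. The function names are hypothetical, and the code ignores privacy entirely; it only fixes what the secure protocols are meant to compute.

```python
def intersection(bit_a, bit_b):
    """Characteristic vector of S_A ∩ S_B: element-wise minimum."""
    return [min(a, b) for a, b in zip(bit_a, bit_b)]

def union(bit_a, bit_b):
    """Characteristic vector of S_A ∪ S_B: element-wise sum."""
    return [a + b for a, b in zip(bit_a, bit_b)]

def reduce_elements(bit_s, d):
    """Characteristic vector of Rd_d(S): subtract d, floored at zero."""
    return [max(s - d, 0) for s in bit_s]

bit_a = [2, 0, 3]
bit_b = [1, 4, 3]
print(intersection(bit_a, bit_b))
print(union(bit_a, bit_b))
print(reduce_elements(union(bit_a, bit_b), 3))
```

Composing `reduce_elements` with `union`, as in the last line, is the plaintext counterpart of the expression $Rd_{t-1}(S_A \cup S_B)$ used by the threshold protocols.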

IV. PROTOCOLS FOR THE HONEST-BUT-CURIOUS CASE

We present protocols for the problems of: inclusion-relation, over-threshold multiset-union, threshold multiset-union, multiset-intersection and cardinality multiset-intersection in this section.

A. Inclusion-Relation

To determine the inclusion relation of two multisets, we use a new variant of the vector dominance protocol. The first protocol we present, for the inclusion-relation problem, is given in Fig. 1.

Protocol: Inclusion-Relation-HBC

Input: There are 2 honest-but-curious players A and B (at most one party behaves dishonestly), with private input multisets $S_A$ and $S_B$ respectively. $Bit_A = (bit_1^A, \ldots, bit_N^A)$ is the characteristic bit-string of $S_A$ and $Bit_B = (bit_1^B, \ldots, bit_N^B)$ is the characteristic bit-string of $S_B$. Party A holds the secret key sk, and pk is the corresponding public key of a homomorphic cryptosystem.

Output: A will learn whether $S_A \supseteq S_B$ holds.

Step 1:

A disguises $Bit_A$ to get the disguised input $A' = (a'_1, \ldots, a'_{4N})$ and B gets $B' = (b'_1, \ldots, b'_{4N})$, where, writing $a_i = bit_i^A$ and $b_i = bit_i^B$,

$$A' = (2a_1, \ldots, 2a_N,\ (2a_1+1), \ldots, (2a_N+1),\ -2a_1, \ldots, -2a_N,\ -(2a_1+1), \ldots, -(2a_N+1)),$$

$$B' = ((2b_1+1), \ldots, (2b_N+1),\ 2b_1, \ldots, 2b_N,\ -(2b_1+1), \ldots, -(2b_N+1),\ -2b_1, \ldots, -2b_N).$$

Let $V_A = (\underbrace{1, \ldots, 1}_{2N}, \underbrace{0, \ldots, 0}_{2N})$.

Step 2:

B generates a random permutation $\pi$ and a random vector $R$. Using the Permutation Protocol, A gets $A'' = \pi(A' + R)$, and B computes $B'' = \pi(B' + R)$ and $V'_A = \pi(V_A)$.

Step 3:

A and B use Yao's Millionaire protocol as a subroutine to compare $A''_i$ and $B''_i$ for $i = 1, \ldots, 4N$, where $A''_i$ (resp., $B''_i$) is the $i$th element of vector $A''$ (resp., $B''$). At the end, A gets $U = \{u_1, \ldots, u_{4N}\}$, where $u_i = 1$ if $A''_i > B''_i$, otherwise $u_i = 0$.

Step 4:

A and B use a private equality-testing protocol to compare $U$ with $V'_A$: if $U = V'_A$, then $Bit_A$ dominates $Bit_B$; otherwise, $Bit_A$ does not dominate $Bit_B$.

Step 5:

If $Bit_A$ dominates $Bit_B$, A learns that $S_A \supseteq S_B$ holds; otherwise $S_A \supseteq S_B$ is false.

Figure 1. Inclusion-Relation-HBC
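The flow of Fig. 1 can be traced on plaintext values. The sketch below replaces every secure subroutine (permutation protocol, millionaire comparisons, equality test) with a direct computation, so it shows only why the disguise works, not how privacy is achieved. The signs in the disguise vectors are not legible in the source; the code assumes the pattern $(2a,\ 2a+1,\ -2a,\ -(2a+1))$ for A and $(2b+1,\ 2b,\ -(2b+1),\ -2b)$ for B, under which $U = V'_A$ exactly when every $a_i > b_i$. All function names are hypothetical.

```python
import random

def disguise_a(a):
    """A' = (2a_i), (2a_i + 1), (-2a_i), (-(2a_i + 1)); assumed sign pattern."""
    return ([2 * x for x in a] + [2 * x + 1 for x in a]
            + [-2 * x for x in a] + [-(2 * x + 1) for x in a])

def disguise_b(b):
    """B' = (2b_i + 1), (2b_i), (-(2b_i + 1)), (-2b_i); assumed sign pattern."""
    return ([2 * y + 1 for y in b] + [2 * y for y in b]
            + [-(2 * y + 1) for y in b] + [-2 * y for y in b])

def dominates(a, b):
    """Plaintext trace of Steps 1-4: True iff a_i > b_i for every i."""
    n = len(a)
    a1, b1 = disguise_a(a), disguise_b(b)
    v_a = [1] * (2 * n) + [0] * (2 * n)
    # Step 2: B's random permutation and blinding vector, applied in the clear.
    perm = list(range(4 * n))
    random.shuffle(perm)
    r = [random.randrange(1_000_000) for _ in range(4 * n)]
    a2 = [a1[i] + r[i] for i in perm]
    b2 = [b1[i] + r[i] for i in perm]
    v2 = [v_a[i] for i in perm]
    # Step 3: one comparison per position (Yao's protocol in the real scheme).
    u = [1 if x > y else 0 for x, y in zip(a2, b2)]
    # Step 4: equality test between U and the permuted V_A.
    return u == v2

print(dominates([2, 3], [1, 1]))
print(dominates([1, 1], [1, 0]))
```

Under this sign pattern each index contributes exactly two ones to $U$ whatever the ordering of $a_i$ and $b_i$, so the number of ones in $U$ carries no information; only the positions, which $\pi$ hides from A, distinguish the cases.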

B. Over-Threshold Multiset-Union

An over-threshold multiset-union protocol allows A to learn which elements appear in the combined private inputs of the players at least a threshold number $t$ of times, and the number of times these elements appear in the players' private inputs. A protocol for over-threshold multiset-union is given in Fig. 2.

Protocol: Over-threshold-HBC

Input: There are 2 honest-but-curious players A and B (at most one party behaves dishonestly), each with a private input multiset, $S_A$ from A and $S_B$ from B. $Bit_A = (bit_1^A, \ldots, bit_N^A)$ is the characteristic bit-string of $S_A$ and $Bit_B = (bit_1^B, \ldots, bit_N^B)$ is the characteristic bit-string of $S_B$. Party A holds the secret key sk, and pk is the corresponding public key of a homomorphic cryptosystem. The threshold number of repetitions at which an element appears in the output is $t$.

Output: A will learn which elements appear in the combined private inputs of the players at least a threshold number $t$ of times, and the number of times these elements appear.

Step 1:

A computes $Bit_c = (c_1, c_2, \ldots, c_N)$, where $c_j = E_{pk}(bit_j^A)$.

B computes $Bit_{c'} = (c'_1, c'_2, \ldots, c'_N)$, where $c'_j = E_{pk}(bit_j^B)$.

A disguises $Bit_A$ to get the disguised input $A' = (a'_1, \ldots, a'_{4N})$, and B gets $B' = (b'_1, \ldots, b'_{4N})$, where, writing $a_i = bit_i^A$ and $b_i = bit_i^B$,

$$A' = (2a_1, \ldots, 2a_N,\ (2a_1+1), \ldots, (2a_N+1),\ -2a_1, \ldots, -2a_N,\ -(2a_1+1), \ldots, -(2a_N+1)),$$

$$B' = ((2b_1+1), \ldots, (2b_N+1),\ 2b_1, \ldots, 2b_N,\ -(2b_1+1), \ldots, -(2b_N+1),\ -2b_1, \ldots, -2b_N).$$

Let $V_A = (\underbrace{1, \ldots, 1}_{2N}, \underbrace{0, \ldots, 0}_{2N})$.

B generates a random permutation $\pi$ and computes its inverse function $\pi^{-1}$.

Step 2:

Players engage in the permutation protocol, where A inputs $Bit_c$ and B inputs $Bit_{c'}$ and $\pi$. At the end, A gets $Bit_p = \pi(Bit_c +_h Bit_{c'}) = \pi(c_i +_h c'_i)$, $i = 1, \ldots, N$.

Step 3:

A decrypts $Bit_p$ and compares $Bit_i^p$ with $t-1$ for $i = 1, \ldots, N$. A gets $U = (u_1, \ldots, u_N)$, where $u_i = Bit_i^p$ if $Bit_i^p > (t-1)$, otherwise $u_i = 0$.

Step 4:

Players engage in the permutation protocol, where A inputs $U$ and B inputs $Bit_0 = 0$ and $\pi^{-1}$. At the end, A gets $Result = \pi^{-1}(U + Bit_0)$.

Step 5:

A computes $Bit_R = D_{sk}(Result)$. For $i = 1, \ldots, N$, $Bit_i^R \ne 0$ indicates that element $i$ appears $D_{sk}(Result_i) \ge t$ times in $S_A \cup S_B$.

Figure 2. Over Threshold Multiset-Union-HBC
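The output that A should end up with can be stated as a plaintext function, which is useful as a reference when checking an implementation of Fig. 2. The sketch below is the ideal functionality only (what a trusted third party would hand to A); the function name and the dictionary return type are assumptions for illustration.

```python
def over_threshold_union(bit_a, bit_b, t):
    """Map each element whose combined count exceeds t - 1 to that count."""
    result = {}
    for element, (a, b) in enumerate(zip(bit_a, bit_b), start=1):
        total = a + b              # multiplicity in the multiset union
        if total > t - 1:          # same test as Step 3 of the protocol
            result[element] = total
    return result

# Element 3 appears 7 + 8 = 15 times in the combined inputs.
bit_a = [3, 0, 7]
bit_b = [2, 1, 8]
print(over_threshold_union(bit_a, bit_b, 10))
print(over_threshold_union(bit_a, bit_b, 16))
```

This mirrors the example in Section II: with 15 combined occurrences, the element and its count are revealed for $t = 10$, and nothing is revealed for $t = 16$.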

C. Threshold Multiset-Union

Because threshold multiset-union differs from over-threshold multiset-union only in what A learns, the protocol for the threshold multiset-union problem follows the over-threshold multiset-union protocol of Fig. 2 through Steps 1-3. To prevent A from learning the number of times the elements appear in $Rd_{t-1}(S_A \cup S_B)$, we make a small modification. In Step 3*, A gets $U = (u_1, \ldots, u_N)$, where $u_i = 1$ if $Bit_i^p > (t-1)$, otherwise $u_i = 0$. Then in Step 4*, at the end of the permutation protocol, the Result that A gets is actually the ciphertext of $((bit_i^A + bit_i^B) > (t-1))\ ?\ 1 : 0$. For $i = 1, \ldots, N$, if $D_{sk}(Result_i) = 1$, A learns that element $i$ appears at least $t$ times in $S_A \cup S_B$.

D. Multiset-Intersection

The protocol for the multiset-intersection problem is given in Fig. 3.

Protocol: Multiset-Intersection-HBC

Input: There are 2 honest-but-curious players A and B (at most one party behaves dishonestly), each with a private input multiset, $S_A$ from A and $S_B$ from B. $Bit_A = (bit_1^A, \ldots, bit_N^A)$ is the characteristic bit-string of $S_A$ and $Bit_B = (bit_1^B, \ldots, bit_N^B)$ is the characteristic bit-string of $S_B$. Party A holds the secret key sk, and pk is the corresponding public key of a homomorphic cryptosystem.

Output: A will learn $S_A \cap S_B$.

Step 1:

A and B use Yao's Millionaire protocol as a subroutine to compare $Bit_i^A$ and $Bit_i^B$ for $i = 1, \ldots, N$, where $Bit_i^A$ (resp., $Bit_i^B$) is the $i$th element of vector $Bit_A$ (resp., $Bit_B$). At the end, A gets $U_A = (u_1^A, \ldots, u_N^A)$, where $u_i^A = 0$ if $Bit_i^A > Bit_i^B$, otherwise $u_i^A = Bit_i^A$. B gets $U_B = (u_1^B, \ldots, u_N^B)$, where $u_i^B = Bit_i^B$ if $Bit_i^A > Bit_i^B$, otherwise $u_i^B = 0$.

Step 2:

A computes $E_{pk}(U_A) = (E_{pk}(u_1^A), \ldots, E_{pk}(u_N^A))$ and sends it to B.

B computes $E_{pk}(U_B) = (E_{pk}(u_1^B), \ldots, E_{pk}(u_N^B))$.

Step 3:

Upon receiving $E_{pk}(U_A)$, B computes $E_{pk}(U_A) +_h E_{pk}(U_B)$, which is equivalent to

$$E_{pk}(U_A) +_h E_{pk}(U_B) = E_{pk}(U_A + U_B) = (E_{pk}(u_1^A + u_1^B), \ldots, E_{pk}(u_N^A + u_N^B)).$$

Then B sends $E_{pk}(U_A) +_h E_{pk}(U_B)$ to A.

Step 4:

A decrypts $E_{pk}(U_A) +_h E_{pk}(U_B)$ and gets $((u_1^A + u_1^B), \ldots, (u_N^A + u_N^B))$, that is, $(\min(bit_1^A, bit_1^B), \ldots, \min(bit_N^A, bit_N^B))$, the characteristic bit-string of $S_A \cap S_B$.

Figure 3. Multiset-Intersection-HBC
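The key observation in Fig. 3 is that the two vectors produced in Step 1 add up to the element-wise minimum. A minimal plaintext sketch of that split follows; the comparison operator is not legible in the source, so a strict comparison is assumed here (a non-strict one yields the same sum), and the function name `split_shares` is hypothetical.

```python
def split_shares(bit_a, bit_b):
    """Return (U_A, U_B) as in Step 1, computed in the clear.

    Where A's count is larger, B keeps its own (smaller) count and A keeps 0;
    otherwise A keeps its own count and B keeps 0.
    """
    u_a = [0 if a > b else a for a, b in zip(bit_a, bit_b)]
    u_b = [b if a > b else 0 for a, b in zip(bit_a, bit_b)]
    return u_a, u_b

bit_a = [2, 0, 3]
bit_b = [1, 4, 3]
u_a, u_b = split_shares(bit_a, bit_b)
combined = [x + y for x, y in zip(u_a, u_b)]
print(u_a, u_b, combined)
```

In the protocol the final addition happens under encryption (Step 3), so B never sees $U_A$ in the clear and A sees only the decrypted sum.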

Note that the two parties take part in Yao's Millionaire protocol, which may leak extra information. For example, A may get some information about $S_B$ from the comparison between $Bit_i^A$ and $Bit_i^B$. However, this does not matter, because A can deduce the extra information from the result. So the use of the subroutine does not lower the security of the protocol.

E. Cardinality Multiset-Intersection

Protocol: Cardinality Multiset-Intersection-HBC

Input: There are 2 honest-but-curious players A and B (at most one party behaves dishonestly), each with a private input multiset, $S_A$ from A and $S_B$ from B. $Bit_A = (bit_1^A, \ldots, bit_N^A)$ is the characteristic bit-string of $S_A$ and $Bit_B = (bit_1^B, \ldots, bit_N^B)$ is the characteristic bit-string of $S_B$. Party A holds the secret key sk, and pk is the corresponding public key of a homomorphic cryptosystem.

Output: A will learn $|S_A \cap S_B|$.

Step 1:

A and B use Yao's Millionaire protocol as a subroutine to compare $Bit_i^A$ and $Bit_i^B$ for $i = 1, \ldots, N$, where $Bit_i^A$ (resp., $Bit_i^B$) is the $i$th element of vector $Bit_A$ (resp., $Bit_B$). At the end, A gets $U_A = (u_1^A, \ldots, u_N^A)$, where $u_i^A = 0$ if $Bit_i^A > Bit_i^B$, otherwise $u_i^A = Bit_i^A$. B gets $U_B = (u_1^B, \ldots, u_N^B)$, where $u_i^B = Bit_i^B$ if $Bit_i^A > Bit_i^B$, otherwise $u_i^B = 0$.

Step 2:

A computes $Sum_A = E_{pk}\left(\sum_{i=1}^{N} u_i^A\right)$ and sends it to B.

B computes $Sum_B = E_{pk}\left(\sum_{i=1}^{N} u_i^B\right)$.

Step 3:

Upon receiving $Sum_A$, B computes $Sum_A +_h Sum_B$. The expression is equivalent to

$$Sum_A +_h Sum_B = E_{pk}\left(\sum_{i=1}^{N} u_i^A\right) +_h E_{pk}\left(\sum_{i=1}^{N} u_i^B\right) = E_{pk}\left(\sum_{i=1}^{N} u_i^A + \sum_{i=1}^{N} u_i^B\right) = E_{pk}\left(\sum_{i=1}^{N} \min(bit_i^A, bit_i^B)\right).$$

B sends $Sum_A +_h Sum_B$ to A.

Step 4:

A decrypts $Sum_A +_h Sum_B$ to get $\sum_{i=1}^{N} \min(bit_i^A, bit_i^B)$, which is $|S_A \cap S_B|$.

Figure 4. Cardinality Multiset-Intersection-HBC
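As a small worked instance of the sums in Fig. 4 (the vectors are made up for illustration, and the comparison in Step 1 is assumed to be strict), take $Bit_A = (2, 0, 3)$ and $Bit_B = (1, 4, 3)$. Step 1 gives $U_A = (0, 0, 3)$ and $U_B = (1, 0, 0)$, so

$$\sum_{i=1}^{3} u_i^A + \sum_{i=1}^{3} u_i^B = 3 + 1 = 4 = \min(2,1) + \min(0,4) + \min(3,3) = |S_A \cap S_B|.$$

A therefore decrypts the single value 4 and learns nothing about which elements contribute to it.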

The cardinality multiset-intersection protocol, given in Fig. 4, is essentially a combination of the over-threshold protocol in Fig. 2 and the multiset-intersection protocol in Fig. 3.

F. Security and Correctness

A protocol is correct if the particular player(s) learn the appropriate answer at its termination. Each of these protocols is secure in the honest-but-curious model; no player gains information that it would not gain when using its input in the ideal model. Due to space constraints, we prove correctness and security only for the over-threshold multiset-union protocol and omit the proofs for the other protocols.

Theorem. Assuming that the additively homomorphic cryptosystem is semantically secure, in the over-threshold multiset-union protocol of Fig. 2, with overwhelming probability, A learns each element $a$ that appears at least $t$ times in the union of the players' private inputs, as well as the number of times it so appears. No party learns any more information than would be gained by using the same private inputs in the ideal model with a trusted third party.

Proof.

(Correctness) According to the protocol, at the end of Step 4 A actually gets the ciphertext of the characteristic bit-string of $Rd_{t-1}(S_A \cup S_B)$, namely $((bit_i^A + bit_i^B) > (t-1))\ ?\ (bit_i^A + bit_i^B) : 0$ for $i = 1, \ldots, N$. For $i = 1, \ldots, N$, if $(bit_i^A + bit_i^B) > (t-1)$, then A learns that element $i$ appears $bit_i^A + bit_i^B$ times in $S_A \cup S_B$; otherwise, element $i$ does not appear as many as $t$ times, and A does not learn how many times it appears, because its count is represented by 0 (here 0 does not mean no appearance). $Bit_i^R \ne 0$ indicates that element $i$ appears $D_{sk}(Result_i) \ge t$ times in $S_A \cup S_B$.

(Security) In some special cases, A can infer some elements of B's multiset. We prove that, in general, the participants' private inputs are not leaked during the execution of the protocol.

Given the adversary's input and output, we construct a simulator (A and B are simulated by the simulator) that simulates the execution of the protocol. If the simulation results and the information that can be seen in the real execution of the protocol are computationally indistinguishable, then security is achieved.

When A is dishonest. The simulator uses A's input $S_A$ and output Result as its own input. First it constructs $Bit_A = (bit_1^A, \ldots, bit_N^A)$, then constructs $Bit_{B'} = (bit_1^{B'}, \ldots, bit_N^{B'})$ based on $Bit_R = D_{sk}(Result)$. For $i = 1, \ldots, N$, $Bit_i^{B'} = t - Bit_i^A$ if $Bit_i^R \ne 0$, otherwise $Bit_i^{B'} = 0$. The simulated steps are as follows:

Simulated Step 1:

Player A executes the key generation algorithm of a public-key cryptosystem to obtain $pk'$ and $sk'$, and sends $pk'$ to B.

Simulated Step 2:

A computes $\hat{Bit}_c = (\hat{c}_1, \hat{c}_2, \ldots, \hat{c}_N)$, where $\hat{c}_j = E_{pk'}(bit_j^A)$.

B computes $\hat{Bit}_{c'} = (\hat{c}'_1, \hat{c}'_2, \ldots, \hat{c}'_N)$, where $\hat{c}'_j = E_{pk'}(bit_j^{B'})$; then B generates a random permutation $\hat{\pi}$ and computes its inverse function $\hat{\pi}^{-1}$.

Simulated Step 3:

The simulator computes $\hat{Bit}_p = \hat{\pi}(\hat{Bit}_c +_h \hat{Bit}_{c'}) = \hat{\pi}(\hat{c}_i +_h \hat{c}'_i)$, $i = 1, \ldots, N$, and sends it to A.

Simulated Step 4:

A decrypts $\hat{Bit}_p$ and compares $\hat{Bit}_i^p$ with $t-1$ for $i = 1, \ldots, N$. A gets $\hat{U} = (\hat{u}_1, \ldots, \hat{u}_N)$, where $\hat{u}_i = \hat{Bit}_i^p$ if $\hat{Bit}_i^p > t-1$, otherwise $\hat{u}_i = 0$.

Simulated Step 5:

The simulator computes $Result' = \hat{\pi}^{-1}(\hat{U} + Bit_0)$ and sends it to A.

Simulated Step 6:

A computes $\hat{Bit}_R = D_{sk'}(Result')$. $\hat{Bit}_i^R \ne 0$ ($i = 1, \ldots, N$) indicates that element $i$ appears $\hat{Bit}_i^R \ge t$ times in $S_A \cup S_B$.

Simulation results: $S_A$, $\hat{Bit}_p$ in simulated Step 3, and the Result' in simulated Step 5.

Obviously , the AS and Result' in simulation are identical

with those in the real execution of the protocol, because B̂S is conducted by simulator based on the A's output in the real execution of the protocol, Result' is the ciphertext of the correct answer, which is computationally indistinguishable with the Result in real execution. So we can say every ciphertext in the simulation is computationally indistinguishable with the corresponding one in real execution thanks to the semantic security of cryptography.

When B is dishonest. All B can see in the simulation is

ˆ' 1 2ˆ ˆ ˆ' , ' ,..., 'c NBit c c c� , where 'ˆ' ( )Bj pk jc E bit� , which makes

the simulation simple. Assuming that the additively homomorphic cryptography is semantically secure, and then all information B can see in the simulation is computationally indistinguishable with the one in real execution.

Figure 5 The simulation for over-threshold multiset-intersection protocol in malicious setting

The simulation for the over-threshold multiset-intersection protocol in the malicious setting is shown in Figure 5. To sum up, no matter whether A or B is the adversary, the simulation results are computationally indistinguishable from the corresponding ones in the real execution. This means that all the information the adversary obtains in the protocol can be computed from its own input and output. In other words, the adversary learns nothing from the execution of the protocol beyond what can be inferred from its input and output.
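To make the message flow of the simulated steps concrete, the following Python sketch runs the same sequence (encrypt the characteristic bit-strings, combine them homomorphically, permute, apply the threshold, undo the permutation, decrypt) with a toy Paillier cryptosystem. It is an illustration under stated assumptions, not the protocol itself: the choice of Paillier, the tiny primes, the component-wise ciphertext product, the test "count at least t", the re-encryption of U by A in Step 5, and all function names are assumptions introduced here.

```python
import math
import random

def keygen(p=1009, q=1013):
    """Toy Paillier keys from two small primes (insecure; illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # inverse of lam mod n; valid because g = n + 1
    return n, (lam, mu)

def encrypt(n, m, rng):
    """E(m) = (1 + n)^m * r^n mod n^2 for a random r coprime to n."""
    n2 = n * n
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = rng.randrange(1, n)
    return pow(1 + n, m, n2) * pow(r, n, n2) % n2

def decrypt(n, sk, c):
    """D(c) = L(c^lam mod n^2) * mu mod n, with L(x) = (x - 1) // n."""
    lam, mu = sk
    return (pow(c, lam, n * n) - 1) // n * mu % n

def over_threshold_flow(bit_a, bit_b, t, seed=0):
    """Hypothetical end-to-end run of Steps 1-6 on multiplicity vectors."""
    rng = random.Random(seed)  # seeded only to make the demo repeatable
    n, sk = keygen()           # Step 1: A generates the key pair
    n2 = n * n
    size = len(bit_a)

    # Step 2: both characteristic bit-strings are encrypted under A's key,
    # and B draws a random permutation.
    c_a = [encrypt(n, m, rng) for m in bit_a]
    c_b = [encrypt(n, m, rng) for m in bit_b]
    perm = list(range(size))
    rng.shuffle(perm)

    # Step 3: multiplying ciphertexts adds the plaintext counts (assumption),
    # then the positions are permuted before the vector goes to A.
    combined = [x * y % n2 for x, y in zip(c_a, c_b)]
    bit_p = [combined[perm[k]] for k in range(size)]

    # Step 4: A decrypts and keeps only counts that reach the threshold.
    plain = [decrypt(n, sk, c) for c in bit_p]
    u = [m if m >= t else 0 for m in plain]

    # Step 5: assumed here: A re-encrypts U, and the permutation is undone.
    enc_u = [encrypt(n, m, rng) for m in u]
    result = [0] * size
    for k in range(size):
        result[perm[k]] = enc_u[k]

    # Step 6: A decrypts the result in the original element order.
    return [decrypt(n, sk, c) for c in result]

print(over_threshold_flow([2, 0, 1, 3], [1, 0, 0, 2], t=3))  # [3, 0, 0, 5]
```

Because the encryption is randomized, two runs with different seeds produce different ciphertexts for the same inputs, which is the property the indistinguishability argument relies on; the decrypted output stays the same.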

V. THE MALICIOUS CASE

Protocols secure against malicious players largely follow those secure against honest-but-curious players. We use two techniques to handle malicious players: zero-knowledge proofs [17,18] and commitment schemes [19,20]. In the malicious setting, every party is required to: 1) commit to its own private input; and 2) use zero-knowledge proofs to show that all of its calculations are correct, which the other party can verify. Due to lack of space, we omit the protocols secure in the malicious setting; more detail can be found in [1-3].
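The commit-then-verify step of item 1) can be sketched as follows. This is a minimal sketch assuming a simple hash-based commitment built from SHA-256 and a random nonce; it is not the universally composable schemes of [19,20], and the function names `commit` and `verify` as well as the encoding of the bit-string are hypothetical choices made for illustration. The zero-knowledge proofs of item 2) are not covered.

```python
import hashlib
import hmac
import secrets

def _encode(bits):
    """Serialize a characteristic bit-string (list of counts) to bytes."""
    return ",".join(str(b) for b in bits).encode("ascii")

def commit(bits):
    """Return (commitment, nonce); the nonce stays secret until opening."""
    nonce = secrets.token_bytes(32)  # fixed length, so nonce || data is unambiguous
    digest = hashlib.sha256(nonce + _encode(bits)).digest()
    return digest, nonce

def verify(commitment, nonce, bits):
    """Check that the opened (nonce, bits) pair matches the commitment."""
    expected = hashlib.sha256(nonce + _encode(bits)).digest()
    return hmac.compare_digest(commitment, expected)

commitment, nonce = commit([2, 0, 1, 3])
print(verify(commitment, nonce, [2, 0, 1, 3]))  # True
print(verify(commitment, nonce, [2, 0, 1, 4]))  # False
```

A party would send `commitment` before the protocol starts and reveal `nonce` and its input only if an opening is required; changing the input after committing makes `verify` fail.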

VI. CONCLUSION

We consider several multiset operations in the secure two-party setting. In this paper, we design some methods and apply them to solve multiset operation problems, including inclusion relation, over-threshold multiset-intersection, threshold multiset-intersection, multiset-intersection and cardinality multiset-intersection. Moreover, our techniques can be used to solve other multiset operation problems that are not considered in this paper, by using them to construct a dedicated protocol for each such problem.

REFERENCES

[1] KISSNER L, SONG D. Privacy-preserving set operations[C] Advances in Cryptology - CRYPTO 2005. Springer Berlin Heidelberg, 2005: 241-257.

[2] FREEDMAN M, NISSIM K, PINKAS B. Efficient private matching and set intersection[C] Advances in Cryptology - EUROCRYPT 2004. Springer Berlin Heidelberg, 2004: 1-19.

[3] KIAYIAS A, MITROFANOVA A. Testing disjointness of private datasets[C] Financial Cryptography and Data Security. Springer Berlin Heidelberg, 2005: 578-578.

[4] Li Shundong, Si Tiange and Dai Yiqi. Secure Multi-Party Computation of Set-Inclusion and Graph-Inclusion [J]. JOURNAL OF COMPUTER RESEARCH AND DEVELOPMENT, 2005(10):4-10.

[5] ATALLAH M, DU Wenliang. Secure Multi-party computational geometry [J]. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2005, 20(2): 258-263.

[6] NAOR M, PINKAS B. Oblivious transfer and polynomial evaluation[C] Proceedings of the thirty-first annual ACM symposium on Theory of computing, Atlanta, Georgia, United States. New York, NY, USA: ACM, 1999: 245-254.

[7] LIPMAA H. Verifiable homomorphic oblivious transfer and private equality test[C] Proc. of Asiacrypt 2003, 2003: 416-433.

[8] FAGIN R, NAOR M, WINKLER P. Comparing information without leaking it[J]. Commun. ACM, 1996, 39(5): 77-85.

[9] KISSNER L, SONG D. Private and threshold set-intersection[R]. Technical Report CMU-CS-05-113, 2005.

[10] LI Rong-Hua, WU Chuan-Kun, ZHANG Yu-Qing. Secure Computation Protocol for Testing the Inclusion Relation of Sets [J]. CHINESE JOURNAL OF COMPUTERS, 2009, 32(7): 1337-1345.

[11] GOLDREICH O. Secure Multi-Party computation[C] Proceedings of the 2001 workshop on New security p, 2001: 13-22.

[12] YAO A C. Protocols for secure computations [C] Foundations of Computer Science, 1982. SFCS'08. 2, 1982: 160-164.

[13] GOLDREICH O, MICALI S, WIGDERSON A. How to play any mental game[C] Proceedings of the nineteenth annual ACM symposium, New York, New York, United States. New York, NY, USA: ACM, 1987: 218-229.

[14] IOANNIDIS I, GRAMA A. An efficient protocol for Yao’s millionaires’ problem[C] In Proceedings of the 36th Annual Hawaii International Conference on System Sciences, 2003.

[15] SCHOENMAKERS B, TUYLS P. Practical two-party computation based on the conditional gate[C] Advances in Cryptology - ASIACRYPT 2004. Springer Berlin Heidelberg, 2004: 129-145.

[16] ZHA Jun, SU Jin-hai, YAN Shao-ge, et al. Efficient Solution to Yao's Millionaires' Problem [J]. COMPUTER ENGINEERING, 2010, 36(14): 124-126.

[17] GOLDWASSER S, MICALI S, RACKOFF C. The knowledge complexity of interactive proof-systems[C] Proceedings of the seventeenth annual ACM symposium on Theory of computing, Providence, Rhode Island, United States. New York, NY, USA: ACM, 1985: 291-304.

[18] BLUM M, FELDMAN P, MICALI S. Non-interactive zero-knowledge and its applications[C] Proceedings of the twentieth annual ACM symposium on Theory of computing, Chicago, Illinois, United States. New York, NY, USA: ACM, 1988: 103-112.

[19] CANETTI R, FISCHLIN M. Universally composable commitments[C] Lecture Notes in Computer Science, 2001: 19-40.

[20] DAMGÅRD I, NIELSEN J B. Perfect hiding and perfect binding universally composable commitment schemes with constant expansion factor[C] Advances in Cryptology - CRYPTO 2002. Springer Berlin Heidelberg, 2002: 3-42.