
Upload: sudharaka-palamakumbura

Post on 21-Aug-2015

Database Query Privacy using Homomorphic Encryptions

Sudharaka Palamakumbura and Hamid Usefi
Department of Mathematics and Statistics
Memorial University of Newfoundland,
St. John’s, NL, Canada, A1C 5S7
{sudharakap, usefi}@mun.ca

Abstract: Homomorphic encryption is a novel encryption method in that it enables computation over encrypted data. This has a wide range of real-world ramifications, such as allowing a remote server to blindly compute a search result without learning the content of the query. In this paper we summarize how SQL queries can be made secure using a homomorphic encryption scheme, based on the ideas of Gahi et al. [1]. Gahi’s model is based on the DGHV scheme, a homomorphic encryption scheme that acts on plaintext bits and produces the corresponding ciphertexts. We use Gahi’s blueprint to propose an improved model based on the more recent ring-based homomorphic encryption scheme of Brakerski et al. [2]. Our method is more general in the sense that it can be used with any fully homomorphic encryption scheme rather than being restricted to the DGHV scheme as in Gahi’s method.

I. INTRODUCTION

With the advent of the digital era in the 1980s there was rapid growth in communication and networking all over the globe. This was followed by a trend of miniaturization of digital equipment, and a present-day smartphone has the same computational power as the whole of NASA had in the 1970s [3]. The fast-moving trend of digitalization has made most services accessible from a distance. For example, a recent study shows that 74% of smartphone users use a location-based service (such as Google Maps) to find directions and other location-based information [4]. Moreover, the adoption of such services in healthcare is becoming increasingly common, with cloud-based health record and genomic data management tools such as Microsoft Health. The widespread adoption of these services poses a threat to users, since their personal data such as location, health records and sometimes even genomic data is shared on the web without any guarantee of privacy.

The privacy of data sent via the web can be protected if the data is encrypted before sending. However, encryption also makes server-side computation impossible unless the data center is provided with the decryption key. This problem can be addressed by fully homomorphic encryption schemes, which enable operations on encrypted data such that the result, when decrypted, equals the result of the corresponding operations on the plaintexts.

This work aims to improve upon a method proposed by Gahi et al. [1] to homomorphically encrypt database queries. Their work uses the DGHV fully homomorphic encryption scheme [5], which operates on plaintext bits. The disadvantage of this model is that it requires a large number of computations to perform even a simple operation such as integer multiplication. We propose an improvement to Gahi’s method by employing the ring-based homomorphic encryption scheme proposed by Brakerski et al. [2]. The advantage of this new method is that the ring-based scheme works on blocks of data (such as integers) rather than single bits, and hence the number of computations is greatly reduced compared to the previous scheme. Also, our scheme can be generalized for use with any fully homomorphic encryption scheme rather than being restricted to the DGHV scheme.

II. HOMOMORPHIC ENCRYPTION

The idea of fully homomorphic encryption was first proposed by Rivest et al. [6] in 1978. In 2009, Craig Gentry published the first fully homomorphic encryption scheme, based on ideal lattices [7]. Since then numerous fully homomorphic encryption schemes have been introduced, along with optimizations to make them more practical.

In this section we briefly describe the two fully homomorphic encryption schemes that we use in this work.

A. DGHV Scheme

Shortly after Gentry introduced the first fully homomorphic encryption scheme, van Dijk et al. introduced a much simpler and more practical fully homomorphic encryption scheme (hereafter referred to as the DGHV scheme) based on Gentry’s original blueprint [5]. The DGHV scheme works as follows.

Let $\lambda$ be the security parameter and set $N = \lambda$, $P = \lambda^2$ and $Q = \lambda^5$. The scheme is based on the following algorithms:

KeyGen($\lambda$): The key generation algorithm randomly chooses a $P$-bit integer $p$ as the secret key.

Encrypt($m$): The bit $m \in \{0, 1\}$ is encrypted as $c \leftarrow m' + pq$, where $m' \equiv m \pmod 2$ and $q$, $m'$ are random $Q$-bit and $N$-bit numbers respectively.

Decrypt($c$): Output $(c \bmod p) \bmod 2$, where $(c \bmod p)$ is the integer $c'$ in $(-p/2, p/2)$ such that $p$ divides $c - c'$.


This scheme is homomorphic with respect to addition and multiplication, and it decrypts correctly as long as the noise level does not exceed $p/2$ in absolute value.
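The three algorithms above can be tried out with a toy Python sketch. This is an illustration under stated assumptions rather than a secure or complete implementation: the value λ = 4 is far too small for any real security, the secret key is forced to be odd (as in the DGHV paper [5]; the description above only asks for a P-bit integer), and the function names are our own.

```python
import secrets

LAMBDA = 4                                  # toy security parameter (illustrative only)
N, P, Q = LAMBDA, LAMBDA**2, LAMBDA**5      # N = lambda, P = lambda^2, Q = lambda^5

def keygen():
    # Random P-bit secret key p; the top bit is set so p has exactly P bits,
    # and the low bit is set so p is odd (an assumption of this sketch).
    return secrets.randbits(P) | (1 << (P - 1)) | 1

def encrypt(p, m):
    # m' is a random number below 2^N with the same parity as the bit m.
    noise = 2 * secrets.randbits(N - 1) + m
    q = secrets.randbits(Q)                 # random multiplier of up to Q bits
    return noise + p * q

def decrypt(p, c):
    # Centered remainder of c modulo p, lying in (-p/2, p/2), then reduced modulo 2.
    r = c % p
    if r > p // 2:
        r -= p
    return r % 2

p = keygen()
for a in (0, 1):
    for b in (0, 1):
        ca, cb = encrypt(p, a), encrypt(p, b)
        assert decrypt(p, ca + cb) == (a ^ b)   # ciphertext addition acts as XOR
        assert decrypt(p, ca * cb) == (a & b)   # ciphertext multiplication acts as AND
```

With these toy parameters the noise after one multiplication is at most 15 · 15 = 225, well below p/2 ≥ 2^14, so a single addition or multiplication always decrypts correctly; longer chains of multiplications would eventually exceed the bound.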

B. Ring Based Fully Homomorphic Encryption

This encryption scheme was introduced by Brakerski et al. [2] and operates on the polynomial ring $R = \mathbb{Z}[X]/(X^n + 1)$, where $n$ is a power of 2. The key generation and encryption functions make use of two distributions $\chi_{key}$ and $\chi_{err}$ on $R$ for generating small elements. $\chi_{key}$ is the uniform distribution and is used in key generation; $\chi_{err}$ is the discrete Gaussian distribution and is used to sample small noise polynomials. Specific details can be found in [2]. The scheme is based on the following algorithms.

KeyGen($n, q, t, \chi_{key}, \chi_{err}$): Given the degree $n$ and the moduli $q$ and $t$, this algorithm generates the public and private keys $(pk, sk) = (h, f)$, where $f = [t f' + 1]_q$ and $h = [t g f^{-1}]_q$. Here the key generation algorithm samples small polynomials $f', g \leftarrow \chi_{key}$ such that $f$ is invertible modulo $q$.

Encrypt($h, m$): Given a message $m \in R$, output $c = [\lfloor q/t \rfloor [m]_t + e + hs]_q \in R$, where the Encrypt algorithm samples small error polynomials $s, e \leftarrow \chi_{err}$, $[\cdot]_q$ denotes reduction of the coefficients of a polynomial in $R$ modulo $q$, and $\lfloor \cdot \rfloor$ denotes the floor function.

Decrypt($f, c$): Given a ciphertext $c$, this algorithm outputs

$$m = \left[ \left\lfloor \frac{t}{q} \cdot [fc]_q \right\rceil \right]_t \in R.$$

This encryption scheme is homomorphic with respect to addition and multiplication of plaintexts modulo $t$. We can use it to encrypt real numbers instead of dealing with individual bits as in [5]. For example, let $z$ be an integer with binary representation $z = (\pm 1) \sum_{i=0}^{l} 2^i z_i$, where $z_i \in \{0, 1\}$ and $l = \lceil \log_2 |z| \rceil$. Then as long as $l < n$ we can encode $z$ as $\sum_{i=0}^{l} z_i X^i$. For instance, $z = 20 = 2^2 + 2^4$ has the polynomial representation $X^2 + X^4$. Since any real number can be approximated to a given precision by rational numbers, any real number can be encoded as a polynomial in this way.
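The integer encoding described above can be sketched in a few lines of Python. The function names are our own, trailing zero coefficients are dropped, and carrying the sign on every coefficient is an assumption of this sketch, since the text does not say how the factor (±1) is stored. No reduction modulo X^n + 1 or modulo t is performed here.

```python
def encode(z):
    """Return the coefficient list [z_0, z_1, ...] of the polynomial sum z_i X^i."""
    sign = -1 if z < 0 else 1
    bits = bin(abs(z))[2:][::-1]            # binary digits, least significant first
    return [sign * int(b) for b in bits]

def decode(coeffs):
    """Evaluate the polynomial at X = 2 to recover the integer."""
    return sum(c * 2**i for i, c in enumerate(coeffs))

def poly_mul(a, b):
    """Schoolbook product of two coefficient lists (no modular reduction)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

print(encode(20))                               # [0, 0, 1, 0, 1], i.e. X^2 + X^4
print(decode(poly_mul(encode(20), encode(3))))  # 60
```

Multiplying the encodings and evaluating at X = 2 recovers the integer product, which is what lets the ring-based scheme work on whole integers. In the actual scheme this only holds while the degree of the product stays below n and its coefficients stay below t.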

III. EVALUATING DATABASE QUERIES USING DGHV

Having these homomorphic encryption schemes in mind, we now give a brief description of Gahi’s [1] model for homomorphically evaluating database queries. This method uses the DGHV scheme. Suppose we need to retrieve one or more records from the database. We send the database a query encrypted under the DGHV scheme. Let $c_i$ be the i-th component of the encrypted query $c$ (the encryption of the i-th query bit) and $v_i$ be the i-th bit of a record $R$ in database $D$. Only the query sent to the database is encrypted; the database records do not need to be encrypted in order to use this scheme. The server computes the following product for each record,

$$I_r = \prod_i (1 + c_i + v_i) \quad \text{for every } R \in D \tag{1}$$

where $r$ denotes the index of record $R$.

If $c = \mathrm{Enc}(v)$ then $c_i = \mathrm{Enc}(v_i)$ for every $i$. Since plaintext arithmetic is modulo 2, each factor $1 + c_i + v_i$ is then an encryption of $1 + 2v_i \equiv 1$, that is, $1 + c_i + v_i = \mathrm{Enc}(1)$, and so $I_r = \mathrm{Enc}(1)$. Otherwise at least one factor is an encryption of 0 and $I_r = \mathrm{Enc}(0)$. Hence for each record in the database we have an $I_r$ value equal to Enc(1) or Enc(0), depending on whether or not the search query matches the corresponding record. Secondly, we calculate the partial sums of the $I_r$ values,

$$S_r = \sum_{i \le r} I_i \quad \text{for every } R \in D \tag{2}$$

As a toy example, consider a database of 5 records, each encoded with 4 bits. If the query sent by the user is (Enc(1), Enc(1), Enc(0), Enc(0)), the corresponding $I_r$ and $S_r$ values are as shown in Table I.

TABLE I
EXAMPLE DATABASE AND CORRESPONDING $I_r$ AND $S_r$ VALUES

Database Records    $I_r$     $S_r$
(1, 1, 0, 0)        Enc(1)    Enc(1)
(1, 0, 1, 0)        Enc(0)    Enc(1)
(1, 1, 0, 0)        Enc(1)    Enc(2)
(1, 1, 0, 1)        Enc(0)    Enc(2)
(1, 0, 0, 0)        Enc(0)    Enc(2)
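The two steps above can be checked on the toy database with a plaintext simulation. The sketch below is only a stand-in for the homomorphic computation: it works on plain bits modulo 2 where the server would operate on DGHV ciphertexts, it keeps the partial sums as ordinary integers, and the function names are our own.

```python
def match_indicator(query_bits, record_bits):
    # I_r = prod_i (1 + c_i + v_i) mod 2: equals 1 only if every bit agrees.
    result = 1
    for c, v in zip(query_bits, record_bits):
        result = (result * (1 + c + v)) % 2
    return result

def partial_sums(indicators):
    # S_r = I_1 + ... + I_r, kept as a running integer total.
    total, sums = 0, []
    for i_r in indicators:
        total += i_r
        sums.append(total)
    return sums

records = [(1, 1, 0, 0), (1, 0, 1, 0), (1, 1, 0, 0), (1, 1, 0, 1), (1, 0, 0, 0)]
query = (1, 1, 0, 0)

I = [match_indicator(query, r) for r in records]
S = partial_sums(I)
print(I)   # [1, 0, 1, 0, 0]
print(S)   # [1, 1, 2, 2, 2]
```

A single differing bit contributes a factor 1 + 1 + 0 = 2 ≡ 0 (mod 2), which zeroes the whole product; this is why $I_r$ acts as an equality test.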

Using these partial sums we can then calculate the sequence $(I'_{r,j})_j$ corresponding to each record as follows,

$$I'_{r,j} = I_r \prod_i (1 + j_i + S_{r,i}) \quad \text{for every } R \in D \tag{3}$$

where $S_{r,i}$ is the i-th bit of $S_r$ and $j_i$ is the i-th bit of the integer $j$, with $j \le r$. These sequences have the property that $I'_{r,j} = \mathrm{Enc}(1)$ whenever $I_r = \mathrm{Enc}(1)$ and $S_r = \mathrm{Enc}(j)$, and $I'_{r,j} = \mathrm{Enc}(0)$ otherwise.

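For our example above we would have,

$(I'_1) = (\mathrm{Enc}(1))$

$(I'_2) = (\mathrm{Enc}(0), \mathrm{Enc}(0))$

$(I'_3) = (\mathrm{Enc}(0), \mathrm{Enc}(1), \mathrm{Enc}(0))$

$(I'_4) = (\mathrm{Enc}(0), \mathrm{Enc}(0), \mathrm{Enc}(0), \mathrm{Enc}(0))$

$(I'_5) = (\mathrm{Enc}(0), \mathrm{Enc}(0), \mathrm{Enc}(0), \mathrm{Enc}(0), \mathrm{Enc}(0))$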
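These sequences can be reproduced with a plaintext simulation of equation (3). As before, plain bits modulo 2 stand in for ciphertexts and the function names are our own; letting j run from 1 to r is an assumption of this sketch, since the text only requires j ≤ r.

```python
def bits(x, width):
    # Little-endian list of the low `width` bits of x.
    return [(x >> i) & 1 for i in range(width)]

def position_sequence(i_r, s_r, r):
    # I'_{r,j} = I_r * prod_i (1 + j_i + S_{r,i}) mod 2, for j = 1..r.
    width = max(1, r.bit_length())          # enough bits for any value up to r
    s_bits = bits(s_r, width)
    seq = []
    for j in range(1, r + 1):
        value = i_r
        for j_i, s_i in zip(bits(j, width), s_bits):
            value = (value * (1 + j_i + s_i)) % 2
        seq.append(value)
    return seq

I = [1, 0, 1, 0, 0]     # match indicators of the toy example
S = [1, 1, 2, 2, 2]     # their partial sums
for r, (i_r, s_r) in enumerate(zip(I, S), start=1):
    print(r, position_sequence(i_r, s_r, r))
```

The inner product is the same bitwise equality test as in equation (1), applied to j and $S_r$, so the single 1 in a matching record's sequence marks the position that record should occupy in the result.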

And therefore,

$$(R') = \sum_k \mathrm{Enc}(R_k)\,(I'_k) \tag{4}$$

where $R_k$ is the k-th record in $D$, gives us a sequence containing only the encrypted records that match our search query. In the above example we would get

$(R') = (\mathrm{Enc}(R_1), \mathrm{Enc}(R_3), \mathrm{Enc}(0), \mathrm{Enc}(0))$


At this point the sequence $(R')$ contains all the records that match our query, followed by trailing encryptions of zero which we do not need. Hence a second sum is calculated on the server side to determine the number of useful terms in the sequence,

$$n = \sum_r I_r$$

This result can be returned to the user and decrypted to obtain the number of records that match the search query. Hence the sequence $(R')$ can be truncated at the appropriate point and returned to the user for decryption.
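The packing and truncation steps can be illustrated with a plaintext sketch. It is not the homomorphic computation itself: records are read as plain 4-bit integers, the indicator $I'_{k,j}$ is computed directly as "$I_k = 1$ and $S_k = j$", and the truncation is done in the clear, whereas in the scheme the count n is encrypted and has to be decrypted by the user first. The function name is our own.

```python
def collect_matches(records, I, S):
    # R'_j = sum_k R_k * I'_{k,j}, where I'_{k,j} is 1 iff I_k = 1 and S_k = j.
    size = len(records)
    packed = [0] * size
    for r_k, i_k, s_k in zip(records, I, S):
        for j in range(1, size + 1):
            packed[j - 1] += r_k * (i_k if s_k == j else 0)
    count = sum(I)                       # n = sum_r I_r
    return packed, count

records = [0b1100, 0b1010, 0b1100, 0b1101, 0b1000]   # the toy database as integers
I = [1, 0, 1, 0, 0]
S = [1, 1, 2, 2, 2]

packed, n = collect_matches(records, I, S)
print(n)             # 2
print(packed[:n])    # [12, 12], i.e. records 1 and 3 (both 1100)
```

Every matching record lands in its own slot because the partial sum $S_k$ increases by one at each match, so no two matching records share the same j.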

An update query can be performed by

$$R_{new} = (1 + I_r) R + I_r U \quad \text{for every } R \in D$$

where $U$ is the new value that we wish to insert whenever the query matches $R$ (that is, whenever $I_r = \mathrm{Enc}(1)$).

A deletion of a record can be performed by

$$R_{new} = (1 + I_r) R \quad \text{for every } R \in D$$
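Both formulas act as a bitwise selector, which a short plaintext sketch makes visible. Plain bits modulo 2 again stand in for ciphertexts, and the function names and the replacement value U used below are illustrative choices of ours.

```python
def update(record_bits, i_r, new_bits):
    # R_new = (1 + I_r) R + I_r U, evaluated bit by bit modulo 2.
    return tuple(((1 + i_r) * r + i_r * u) % 2
                 for r, u in zip(record_bits, new_bits))

def delete(record_bits, i_r):
    # R_new = (1 + I_r) R: a matching record is overwritten with zeros.
    return tuple(((1 + i_r) * r) % 2 for r in record_bits)

U = (0, 1, 1, 1)                        # hypothetical replacement value
print(update((1, 1, 0, 0), 1, U))       # (0, 1, 1, 1): match, record replaced by U
print(update((1, 0, 1, 0), 0, U))       # (1, 0, 1, 0): no match, record unchanged
print(delete((1, 1, 0, 0), 1))          # (0, 0, 0, 0)
```

When $I_r$ encrypts 1 the factor (1 + I_r) is 2 ≡ 0 (mod 2), so the old record vanishes and only U (or nothing, for a deletion) remains; when $I_r$ encrypts 0 the record passes through unchanged.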

To perform all these operations without exceeding the maximum permitted noise ($p/2$), it is necessary to choose the parameters $N$, $P$ and $Q$ appropriately.

IV. IMPROVEMENT USING RING BASED HOMOMORPHIC ENCRYPTION

The main drawback of Gahi’s method above is that it requires an enormous number of homomorphic operations, since the scheme works on bitwise encryptions. We propose to improve this scheme using the ring-based fully homomorphic encryption introduced by Brakerski et al. [2]. The major advantage is that Brakerski’s method works on plaintext and ciphertext blocks, and thus the number of homomorphic operations required can be greatly reduced if this scheme is incorporated in the above model instead of the DGHV scheme.

We begin by defining the value $F_i$ for the i-th record (denoted by $R_i$) in the database. Here Enc(x) stands for the encryption of $x$ under the ring-based scheme.

$$F_i = \frac{\prod_{R_k \ne R_i} \mathrm{Enc}(m - R_k)}{\prod_{R_k \ne R_i} \mathrm{Enc}(R_i - R_k)},$$

where each of the products above is over all the records $R_k$ such that $R_k \ne R_i$. Note that if the database is sorted then computing the $F_i$ is easy; however, this might compromise some privacy. If the database is not sorted, a list of the records equal to $R_i$ has to be computed prior to computing $F_i$. This involves a loop that checks whether $R_k = R_i$ for each record and excludes those $R_k$ values from the computation of $F_i$.

Let Enc(m) be a message (encrypted under the ring-based scheme) sent by the user to the database for comparison. Since we are dealing with a fully homomorphic encryption scheme, we can compute the values $\mathrm{Enc}(m - R_j)$ as $\mathrm{Enc}(m) - \mathrm{Enc}(R_j)$. In fact, since all the $R_i$ are known to the server, the denominator can be reduced to a simpler form using the homomorphic property of the encryption scheme, so that only a single encryption is performed for the denominator.

TABLE II
EXAMPLE DATABASE AND CORRESPONDING $F_i$ AND $G_i$ VALUES

Database Records    $F_i$     $G_i$
(0, 0, 1, 0)        Enc(0)    Enc(0)
(1, 0, 1, 1)        Enc(1)    Enc(1)
(1, 0, 0, 1)        Enc(0)    Enc(1)
(1, 0, 1, 1)        Enc(1)    Enc(2)
(1, 1, 0, 0)        Enc(0)    Enc(2)

The simplified form is

$$F_i = \frac{\prod_{R_k \ne R_i} \mathrm{Enc}(m - R_k)}{\mathrm{Enc}\left(\prod_{R_k \ne R_i} (R_i - R_k)\right)}$$

As we can see, the value $F_i$ is similar to $I_r$ in equation (1). That is, whenever $m = R_i$ (the query equals the record we are comparing against) we have $F_i = \mathrm{Enc}(1)$, and $F_i = \mathrm{Enc}(0)$ otherwise. Note that here we are assuming that the query is contained somewhere in the database.
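The behaviour of $F_i$, including the role of the assumption that the query occurs in the database, can be seen in a plaintext sketch using exact rational arithmetic. Plain integers stand in for encrypted blocks, and the record values and the function name are hypothetical choices of ours.

```python
from fractions import Fraction

def match_value(m, records, i):
    # F_i = prod_{R_k != R_i} (m - R_k) / prod_{R_k != R_i} (R_i - R_k)
    r_i = records[i]
    num, den = 1, 1
    for r_k in records:
        if r_k != r_i:                  # records equal to R_i are excluded
            num *= m - r_k
            den *= r_i - r_k
    return Fraction(num, den)

records = [3, 7, 3, 9]                  # hypothetical block values

# Query present in the database: F takes the values 1, 0, 1, 0.
print([match_value(3, records, i) for i in range(len(records))])

# Query absent from the database: the values are neither 0 nor 1.
print([match_value(5, records, i) for i in range(len(records))])
```

When m equals some other record $R_j$, the numerator contains the factor $m - R_j = 0$, which forces $F_i$ to 0. When m equals $R_i$, numerator and denominator coincide. For a query that is not in the database neither happens, and the result is an arbitrary rational value.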

Now we can calculate the $G_i$, which are the partial sums of the $F_i$ values, similar to equation (2).

$$G_i = \sum_{j \le i} F_j \quad \text{for every } R_i \in D$$

Our counterpart to equation (3) would be

$$F'_{i,k} = F_i \left( \frac{\prod_{j \ne k} (G_i - \mathrm{Enc}(j))}{\mathrm{Enc}\left(\prod_{j \ne k} (k - j)\right)} \right)$$

where $k \le i$. It can be seen that $F'_{i,k} = \mathrm{Enc}(1)$ if $F_i = \mathrm{Enc}(1)$ and $G_i = \mathrm{Enc}(k)$ are both satisfied. Hence the sequences $(F'_{i,k})_{k=1}^{l}$ have the property that whenever $F_i = \mathrm{Enc}(1)$ (i.e., the i-th record matches the query) we have an Enc(1) at the k-th position of the sequence, where $G_i = \mathrm{Enc}(k)$. All other entries of the sequence are encryptions of zero. These sequences are identical to Gahi’s sequences $(I'_{r,j})$ given above. Hence replacing $I'_{r,j}$ by $F'_{i,k}$ in equation (4) gives us a sequence containing the encrypted records that match our search query, and the rest of the process is identical to Gahi’s method above.

To further illustrate our scheme, let us consider an example like the one above, with 5 records, each with 4 bits of data. Also let our encryption scheme encrypt 2 bits at a time. Then if the search query is (Enc(2), Enc(3)), the corresponding $F_i$ and $G_i$ values are given in Table II. The resulting sequences $(F'_i)$ have the same form as in Gahi’s scheme,

$(F'_1) = (\mathrm{Enc}(0))$

$(F'_2) = (\mathrm{Enc}(1), \mathrm{Enc}(0))$

$(F'_3) = (\mathrm{Enc}(0), \mathrm{Enc}(0), \mathrm{Enc}(0))$

$(F'_4) = (\mathrm{Enc}(0), \mathrm{Enc}(1), \mathrm{Enc}(0), \mathrm{Enc}(0))$

$(F'_5) = (\mathrm{Enc}(0), \mathrm{Enc}(0), \mathrm{Enc}(0), \mathrm{Enc}(0), \mathrm{Enc}(0))$
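The sequences listed above can be reproduced with a plaintext simulation of the whole pipeline. Several points are assumptions of this sketch rather than statements from the text: each 4-bit record is read as one integer instead of two 2-bit blocks, the index j in the products runs over 1..i, plain rationals stand in for ciphertexts, and the function names are our own.

```python
from fractions import Fraction
from itertools import accumulate
from math import prod

def match_value(m, records, i):
    # F_i: ratio of products over all records different from R_i.
    others = [r for r in records if r != records[i]]
    return Fraction(prod(m - r for r in others),
                    prod(records[i] - r for r in others))

def position_sequence(f_i, g_i, i):
    # F'_{i,k} = F_i * prod_{j != k} (G_i - j) / prod_{j != k} (k - j), k = 1..i.
    seq = []
    for k in range(1, i + 1):
        js = [j for j in range(1, i + 1) if j != k]     # assumed range of j
        seq.append(f_i * Fraction(prod(g_i - j for j in js),
                                  prod(k - j for j in js)))
    return seq

records = [0b0010, 0b1011, 0b1001, 0b1011, 0b1100]  # example records as integers
m = 0b1011                                          # query blocks (2, 3), bits 10 11

F = [match_value(m, records, i) for i in range(len(records))]
G = list(accumulate(F))                             # partial sums G_i
for i, (f_i, g_i) in enumerate(zip(F, G), start=1):
    print(i, [int(x) for x in position_sequence(f_i, g_i, i)])
```

The inner fraction is a Lagrange-style selector: it equals 1 when $G_i = k$ and 0 when $G_i$ is any other value in 1..i, which plays the role of the bitwise comparison in equation (3). This is also where the divisions discussed in the conclusion enter the scheme.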


V. COMPARISON OF OUR SCHEME VS. GAHI’S SCHEME

The main advantage of our scheme is that it can be used with any fully homomorphic encryption scheme rather than being restricted to the DGHV scheme. This gives the flexibility to use our method with block-based encryption schemes, which reduces the number of encryption steps. For example, the complexity of Gahi’s scheme is $O(mn\lambda^{10})$, where $m$, $n$ and $\lambda$ are the number of records in the database, the number of bits used in encoding plaintexts and ciphertexts, and the security parameter respectively (the complexity of the DGHV scheme is $O(\lambda^{10})$ [5]). On the other hand, using a scheme such as Brakerski’s, which is based on the Learning with Errors (LWE) problem, results in a complexity of $O(mkp^3L^2)$, where $k$ and $m$ are the number of blocks (per record) and the number of records respectively (Brakerski’s scheme has complexity $O(p^3L^2)$ [8]). Here $p$ and $L$ are the depth and dimension parameters as defined in [9], and both are functions of the security parameter $\lambda$. Hence by choosing the parameters judiciously such that $kp^3L^2 \le n\lambda^{10}$, a more efficient scheme can be obtained using Brakerski’s encryption.

VI. CONCLUSION

In this paper we have introduced an improvement to Gahi’s [1] query privacy scheme by employing the ring-based homomorphic encryption scheme proposed by Brakerski et al. [2]. This improves upon the previous method in that, instead of relying specifically on the DGHV scheme, we can now use the same blueprint with more efficient encryption schemes such as that of Brakerski, which can encrypt blocks of data (e.g., integers and real numbers). In fact this method is a generalization of Gahi’s method, in that it does not depend on the particular homomorphic encryption scheme in use.

However, the tradeoff in this method is that we introduce divisions, which were not present in Gahi’s scheme. Homomorphic division can only be carried out by certain bitwise circuits which are quite inefficient in practice [10], [11]. Therefore further research should focus on improving the proposed method by employing encrypted division techniques with low complexity, such as the garbled circuit method of Lazzeretti et al. [12]. Furthermore, unlike Gahi’s scheme, if the query sent to the database is not contained in the database at all, our scheme fails and returns a random encrypted result. Hence this scheme should be improved so that it can recognize whether or not the query is contained in the database.

ACKNOWLEDGMENT

This research is supported by the Research & Development Corporation of Newfoundland and Labrador and the Natural Sciences and Engineering Research Council of Canada.

REFERENCES

[1] Y. Gahi, M. Guennoun, and K. El-Khatib, “A secure database system using homomorphic encryption schemes,” in The Third International Conference on Advances in Databases, Knowledge, and Data Applications, January 2011.

[2] Z. Brakerski and V. Vaikuntanathan, “Fully homomorphic encryption from ring-LWE and security for key dependent messages,” Advances in Cryptology - CRYPTO, vol. 6841, pp. 505–524, 2011.

[3] M. Kaku, Physics of the Future, 1st ed. Anchor Books, February 2012.

[4] K. Zickuhr, “Location based services,” http://www.pewinternet.org/2013/09/12/location-based-services/, September 2013.

[5] M. van Dijk, C. Gentry, S. Halevi, and V. Vaikuntanathan, “Fully homomorphic encryption over the integers,” Advances in Cryptology - EUROCRYPT 2010, vol. 6110, pp. 24–43, 2010.

[6] R. L. Rivest, A. Shamir, and L. Adleman, “A method for obtaining digital signatures and public-key cryptosystems,” Communications of the ACM, vol. 21, no. 2, pp. 120–126, 1978.

[7] C. Gentry, “Fully homomorphic encryption using ideal lattices,” in STOC ’09 Proceedings of the forty-first annual ACM symposium on Theory of computing, 2009, pp. 169–178.

[8] C. Gentry, A. Sahai, and B. Waters, “Homomorphic encryption from learning with errors: Conceptually-simpler, asymptotically-faster, attribute-based,” Advances in Cryptology, vol. 8042, pp. 75–92, 2013.

[9] C. Gentry, A. Sahai, and B. Waters, “Homomorphic encryption from learning with errors: Conceptually-simpler, asymptotically-faster, attribute-based,” Advances in Cryptology - CRYPTO 2013, no. 8042, pp. 75–92, 2013.

[10] M. Naehrig, K. Lauter, and V. Vaikuntanathan, “Can homomorphic encryption be practical?” in CCSW ’11 Proceedings of the 3rd ACM workshop on Cloud computing security workshop, 2011, pp. 113–124.

[11] K. Lauter, A. Lopez-Alt, and M. Naehrig, “Private computation on encrypted genomic data,” Tech. Rep. MSR-TR-2014-93, June 2014. [Online]. Available: http://research.microsoft.com/apps/pubs/default.aspx?id=219979

[12] R. Lazzeretti and M. Barni, “Division between encrypted integers by means of garbled circuits,” in 2011 IEEE International Workshop on Information Forensics and Security (WIFS). IEEE, 2011, pp. 1–6.