Download - CSE Identity-Based Distributed Provable Data
-
8/20/2019 CSE Identity-Based Distributed Provable Data
1/13
Identity-Based Distributed Provable DataPossession in Multicloud Storage
Huaqun Wang
Abstract—Remote data integrity checking is of crucial importance in cloud storage. It can make the clients verify whether their
outsourced data is kept intact without downloading the whole data. In some application scenarios, the clients have to store their
data on multicloud servers. At the same time, the integrity checking protocol must be efficient in order to save the verifier’s cost. From
the two points, we propose a novel remote data integrity checking model: ID-DPDP (identity-based distributed provable data
possession) in multicloud storage. The formal system model and security model are given. Based on the bilinear pairings, a concrete
ID-DPDP protocol is designed. The proposed ID-DPDP protocol is provably secure under the hardness assumption of the standard
CDH (computational Diffie-Hellman) problem. In addition to the structural advantage of elimination of certificate management,
our ID-DPDP protocol is also efficient and flexible. Based on the client’s authorization, the proposed ID-DPDP protocol can realize
private verification, delegated verification, and public verification.
Index Terms—Cloud computing, provable data possession, identity-based cryptography, distributed computing, bilinear pairings
Ç
1 INTRODUCTION
OVER the last years, cloud computing has become animportant theme in the computer field. Essentially, ittakes the information processing as a service, such asstorage, computing. It relieves of the burden for storagemanagement, universal data access with independentgeographical locations. At the same time, it avoids of capital expenditure on hardware, software, and personnelmaintenances, etc. Thus, cloud computing attracts moreintention from the enterprise.
The foundations of cloud computing lie in the outsourc-ing of computing tasks to the third party. It entails thesecurity risks in terms of confidentiality, integrity andavailability of data and service. The issue to convince thecloud clients that their data are kept intact is especially vitalsince the clients do not store these data locally. Remote dataintegrity checking is a primitive to address this issue. Forthe general case, when the client stores his data onmulticloud servers, the distributed storage and integritychecking are indispensable. On the other hand, theintegrity checking protocol must be efficient in order tomake it suitable for capacity-limited end devices. Thus,
based on distributed computation, we will study distrib-uted remote data integrity checking model and present thecorresponding concrete protocol in multicloud storage.
1.1 Motivation
We consider an ocean information service corporationCor in the cloud computing environment. Cor canprovide the following services: ocean measurementdata, ocean environment monitoring data, hydrologicaldata, marine biological data, GIS information, etc. Besidesof the above services, Cor has also some privateinformation and some public information, such as the
corporation’s advertisement. Cor
will store these differentocean data on multiple cloud servers. Different cloudservice providers have different reputation and chargingstandard. Of course, these cloud service providers needdifferent charges according to the different security-levels. Usually, more secure and more expensive. Thus,Cor will select different cloud service providers to storeits different data. For some sensitive ocean data, it willcopy these data many times and store these copies ondifferent cloud servers. For the private data, it will storethem on the private cloud server. For the publicadvertisement data, it will store them on the cheappublic cloud server. At last, Cor stores its whole data on
the different cloud servers according to their importanceand sensitivity. Of course, the storage selection will takeaccount into the Cor’s profits and losses. Thus, thedistributed cloud storage is indispensable. In multicloudenvironment, distributed provable data possession is animportant element to secure the remote data.
In PKI (public key infrastructure), provable data pos-session protocol needs public key certificate distributionand management. It will incur considerable overheadssince the verifier will check the certificate when it checksthe remote data integrity. In addition to the heavycertificate verification, the system also suffers from the
other complicated certificates management such as certifi-cates generation, delivery, revocation, renewals, etc. Incloud computing, most verifiers only have low computa-tion capacity. Identity-based public key cryptography can
. The author is with the School of Information Engineering, Dalian OceanUniversity, Dalian 116023, China, and also with the State Key Laboratoryof Information Security, Institute of Information Engineering, Chinese
Academy of Sciences, 100049, China. E-mail: [email protected].
Manuscript received 6 Jan. 2013; revised 5 Apr. 2013; accepted 16 Nov. 2013.Date of publication 10 Mar. 2014; date of current version 10 Apr. 2015.For information on obtaining reprints of this article, please send e-mail to:[email protected], and reference the Digital Object Identifier below.Digital Object Identifier no. 10.1109/TSC.2014.1
1939-1374 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
IEEE TRANSACTIONS ON SERVICES COMPUTING, VOL. 8, NO. 2, MARCH-APRIL 2015328
-
8/20/2019 CSE Identity-Based Distributed Provable Data
2/13
eliminate the complicated certificate management. In orderto increase the efficiency, identity-based provable datapossession is more attractive. Thus, it will be verymeaningful to study the ID-DPDP.
1.2 Related Work
In cloud computing, remote data integrity checking is animportant security problem. The clients’ massive data isoutside his control. The malicious cloud server maycorrupt the clients’ data in order to gain more benefits.Many researchers proposed the corresponding systemmodel and security model. In 2007, provable data posses-sion (PDP) paradigm was proposed by Ateniese et al. [1].In the PDP model, the verifier can check remote dataintegrity with a high probability. Based on the RSA, theydesigned two provably secure PDP schemes. After that,Ateniese et al. proposed dynamic PDP model and concretescheme [2] although it does not support insert operation.In order to support the insert operation, in 2009, Erway et al.proposed a full-dynamic PDP scheme based on the authen-ticated flip table [3]. The similar work has also been done byF. Sebé et al. [4]. PDP allows a verifier to verify the remotedata integrity without retrieving or downloading the wholedata. It is a probabilistic proof of possession by samplingrandom set of blocks from the server, which drasticallyreduces I/O costs. The verifier only maintains smallmetadata to perform the integrity checking. PDP is aninteresting remote data integrity checking model. In 2012,
Wang proposed the security model and concrete scheme of proxy PDP in public clouds [5]. At the same time, Zhu et al.proposed the cooperative PDP in the multicloud storage [6].
Following Ateniese et al.’s pioneering work, manyremote data integrity checking models and protocolshave been proposed [7], [8], [9], [10], [11], [12]. In 2008,Shacham presented the first proof of retrievability (POR)scheme with provable security [13]. In POR, the verifier cancheck the remote data integrity and retrieve the remotedata at any time. The state of the art can be found in [14],[15], [16], [17]. On some cases, the client may delegate theremote data integrity checking task to the third party. It
results in the third party auditing in cloud computing [18],[19], [20], [21]. One of benefits of cloud storage is to enableuniversal data access with independent geographicallocations. This implies that the end devices may be mobile
and limited in computation and storage. Efficient integritychecking protocols are more suitable for cloud clientsequipped with mobile end devices.
1.3 Contributions
In identity-based public key cryptography, this paperfocuses on distributed provable data possession in multi-cloud storage. The protocol can be made efficient by
eliminating the certificate management. We propose thenew remote data integrity checking model: ID-DPDP. Thesystem model and security model are formally proposed.Then, based on the bilinear pairings, the concrete ID-DPDPprotocol is designed. In the random oracle model, our ID-DPDP protocol is provably secure. On the other hand, ourprotocol is more flexible besides the high efficiency. Basedon the client’s authorization, the proposed ID-DPDPprotocol can realize private verification, delegated verifi-cation and public verification.
1.4 Paper Organization
The rest of the paper is organized as follows. Section 2formalizes the ID-DPDP model. Section 3 presents our ID-DPDP protocol with a detailed performance analysis.Section 4 evaluates the security of the proposed ID-DPDPprotocol. Finally, Section 5 concludes the paper.
2 SYSTEM MODEL AND SECURITY MODELOF ID-DPDP
The ID-DPDP system model and security definition arepresented in this section. An ID-DPDP protocol comprisesfour different entities which are illustrated in Fig. 1. Wedescribe them below:
1. Client: an entity, which has massive data to bestored on the multicloud for maintenance andcomputation, can be either individual consumer orcorporation.
2. CS (Cloud Server): an entity, which is managed bycloud service provider, has significant storage spaceand computation resource to maintain the clients’data.
3. Combiner: an entity, which receives the storagerequest and distributes the block-tag pairs to thecorresponding cloud servers. When receiving thechallenge, it splits the challenge and distributesthem to the different cloud servers. When receivingthe responses from the cloud servers, it combinesthem and sends the combined response to theverifier.
4. PKG (Private Key Generator): an entity, whenreceiving the identity, it outputs the correspondingprivate key.
First, we give the definition of interactive proof system.It will be used in the definition of ID-DPDP. Then, wepresent the definition and security model of ID-DPDPprotocol.
Definition 1 (Interactive Proof System) [22]. Let c; s :N ! R be functions satisfying cðnÞ 9 sðnÞ þ ð 1 pðnÞÞ for some polynomial pðÞ. An interactive pair ðP; V Þ is called a
Fig. 1. System model of ID-DPDP.
WANG: IDENTITY-BASED DISTRIBUTED PROVABLE DATA POSSESSION IN MULTICLOUD STORAGE 329
-
8/20/2019 CSE Identity-Based Distributed Provable Data
3/13
interactive proof system for the language L, with complete-ness bound cðÞ and soundness bound sðÞ, if
1. Completeness: for every x 2 L, Pr½hP; V iðxÞ ¼ 1 cðjxjÞ.
2. Soundness: for every x 62 L and every interactivemachine B , Pr½hB; V iðxÞ ¼ 1 sðjxjÞ.
Interactive proof system is used in the definition of ID-DPDP, i.e., Definition 2.
Definition 2 (ID-DPDP). An ID-DPDP protocol is a collectionof three algorithms ðSetup; Extract; TagGenÞ and an inter-active proof system ðProofÞ. They are described in detailbelow.
1. Setupð1kÞ: Input the security parameter k, it outputs thesystem public parameters params, the master public keympk and the master secret key msk.
2. Extractð1k; params; mpk; msk; IDÞ: Input the public parameters params, the master public key mpk, the
master secret key msk
, and the identity ID
of a client, itoutputs the private key skID that corresponds to the clientwith the identity ID.
3. TagGenðskID; F i; PÞ: Input the private key skID, theblock F i and a set of CS P ¼ fCS jg, it outputs the tuplefi; ðF i; T iÞg, where i denotes the i-th record of metadata, ðF i; T iÞ denotes the i-th block-tag pair. Denoteall the metadata fig as .
4. ProofðP ; C ðCombinerÞ; V ðVerifierÞÞ: i s a p ro to co lamong P , C and V . At the end of the interactive protocol,V outputs a bit f0j1g denoting false or true.
Besides of the high efficiency based on the communica-
tion and computation overheads, a practical ID-DPDPprotocol must satisfy the following security requirements:
1. The verifier can perform the ID-DPDP protocolwithout the local copy of the file(s) to be checked.
2. If some challenged block-tag pairs are modified orlost, the response can not pass the ID-DPDP protocoleven if P and C collude.
To capture the above security requirements, we definethe security of an ID-DPDP protocol as follows.
Definition 3 (Unforgeability). An ID-DPDP protocol is
unforgeable if for any (probabilistic polynomial) adversary A (malicious CS and combiner) the probability that A wins theID-DPDP game on a set of file blocks is negligible. The ID-DPDP game between the adversary A and the challenger C can be described as follows:
1. Setup: The challenger C runs Setupð1kÞ and getsð params; mpk; mskÞ. It sends the public parametersand master public key ð params; mpkÞ to A while it keepsconfidential the master secret key msk.
2. First-Phase Queries: The adversary A adaptively makesExtract; Hash; TagGen queries to the challenger C as follows:
. Extract queries. The adversary A queries the private key of the identity ID. By runningExtractð params; mpk; msk; IDÞ, the challenger C
gets the private key skID and forwards it to A . LetS 1 denote the extracted identity set in the first- phase.
. Hash queries. The adversary A queries hash function adaptively. C responds the hash valuesto A .
. TagGen queries. The adversary A makes block-tag pair queries adaptively. For a block tag query F i, the
challenger calculates the tag T i and sends it back tothe adversary. Let ðF i; T iÞ be the queried block-tag pair for index i 2 I1, where I1 is a set of indices thatthe corresponding block tags have been queried in the first-phase.
3. Challenge: C generates a challenge chal which defines aordered collection fID; i1; i2; . . . ; icg, where ID
62 S 1,fi1; i2; . . . ; icg 6 I1, and c is a positive integer. Theadversary is required to provide the data possession proof for the blocks F i1 ; . . . ; F ic .
4. Second-Phase Queries: Similar to the First-PhaseQueries. Let the Extract query identity set be S
2 and
the TagGen query index set be I2. The restriction is thatfi1; i2; . . . ; icg 6 ðI1 [ I2Þ and ID
62 ðS 1 [ S 2Þ.5. Forge: The adversary A responses for the challenge
chal.
We say that the adversary A wins the ID-DPDP game if the response can pass C ’s verification.
Definition 3. States that, for the challenged blocks, themalicious CS or C cannot produce a proof of data possessionif the blocks have been modified or deleted. On the other hand,the definition does not state clearly the status of the blocks
that are not challenged. In practice, a secure ID-DPDP protocol also needs to convince the client that all of hisoutsourced data is kept intact with a high probability. We givethe following security definition.
Definition 4 (ð; Þ Security). An ID-DPDP protocol isð; Þ secure if the CS corrupted fraction of the whole blocks,the probability that the corrupted blocks are detected is atleast .
The notations throughout this paper are listed in Table 1.
3 THE PROPOSED ID-DPDP PROTOCOLIn this section, we present an efficient ID-DPDP protocol. Itis built from bilinear pairings which will be brieflyreviewed below.
3.1 Bilinear Pairings
Let G 1 and G 2 be two cyclic multiplicative groups with thesame prime order q . Let e : G 1 G 1 ! G 2 be a bilinear map[25] which satisfies the following properties:
1. Bilinearity: 8g 1; g 2; g 3 2 G 1 and a; b 2 Z q ,
eðg 1; g 2g 3Þ ¼ eðg 2g 3; g 1Þ ¼ eðg 2; g 1Þeðg 3; g 1Þe g 1
a; g 2b
¼ eðg 1; g 2Þ
ab:
2. Non-degeneracy: 9g 4; g 5 2 G 1 such that eðg 4; g 5Þ 6¼ 1G 2 .
IEEE TRANSACTIONS ON SERVICES COMPUTING, VOL. 8, NO. 2, MARCH-APRIL 2015330
-
8/20/2019 CSE Identity-Based Distributed Provable Data
4/13
3. Computability: 8g 6; g 7 2 G 1, there is an efficientalgorithm to calculate eðg 6; g 7Þ.
Such a bilinear map e can be constructed by the modifiedWeil [23] or Tate pairings [24] on elliptic curves. Our ID-
DPDP scheme relies on the hardness of CDH (Computa-tional Diffie-Hellman) problem and the easiness of DDH(Decisional Diffie-Hellman) problem. They are defined below.
Definition 5 (CDH Problem on G 1). Let g be the generatorof G 1. Given g; g
a; g b 2 G 1 for randomly chosen a; b 2 Z q ,calculate g ab 2 G 1.
Definition 6 (DDH Problem on G 1). Let g be the generator of G 1. Given ðg; g
a; g b; ĝ Þ 2 G 41 for randomly chosen a; b 2 Z q ,
decide whether g ab ¼? ĝ .
In the paper, the chosen group G 1 satisfies that CDHproblem is difficult but DDH problem is easy. The DDH
problem can be solved by making use of the bilinearpairings. Thus, ðG 1; G 2Þ are also defined as GDH (Gap
Diffie-Hellman) groups.
3.2 The Concrete ID-DPDP Protocol
This protocol comprises four procedures: Setup, Extract,TagGen, and Proof. Its architecture can be depicted inFig. 2. The figure can be described as follows:
1. In the phase Extract, PKG creates the private key forthe client.
2. The client creates the block-tag pair and uploads itto combiner. The combiner distributes the block-tagpairs to the different cloud servers according to the
storage metadata.3. The verifier sends the challenge to combiner and the
combiner distributes the challenge query to thecorresponding cloud servers according to the stor-age metadata.
4. The cloud servers respond the challenge and thecombiner aggregates these responses from the cloudservers.
The combiner sends the aggregated response to theverifier. Finally, the verifier checks whether the aggregatedresponse is valid.
The concrete ID-DPDP construction mainly comes from
the signature, provable data possession and distributedcomputing. The signature relates the client’s identity withhis private key. Distributed computing is used to store theclient’s data on multicloud servers. At the same time,distributed computing is also used to combine the multi-cloud servers’ responses to respond the verifier’s chal-lenge. Based on the provable data possession protocol [13],the ID-DPDP protocol is constructed by making use of thesignature and distributed computing.
Without loss of generality, let the number of stored blocks be n. For different block F i, the corresponding tupleðN i; CS li ; iÞ is also different. F i denotes the i-th block.
Denote N i as the name of F i. F i is stored in CS li where li isthe index of the corresponding CS. ðN i; CS li ; iÞ will be usedto generate the tag for the block F i. The algorithms can bedescribed in detail below.
TABLE 1Notations and Descriptions
Fig. 2. Architecture of our ID-DPDP protocol.
WANG: IDENTITY-BASED DISTRIBUTED PROVABLE DATA POSSESSION IN MULTICLOUD STORAGE 331
-
8/20/2019 CSE Identity-Based Distributed Provable Data
5/13
. Setup: Let g be a generator of the group G 1 with theorder q . Define the following cryptographic hashfunctions:
H : f0; 1g ! Z q
h : f0; 1g Z q ! G 1
h1 : f0; 1g ! Z q :
Let f be a pseudo-random function and be apseudo-random permutation. They can be describedin detail below
f : Z q f1; 2; . . . ; ng ! Z q
: Z q f1; 2; . . . ; ng ! f1; 2; . . . ; ng:
PKG picks a random number x 2 Z q and calculatesY ¼ g x. The parameters fG 1; G 2;e;q;g;Y;H;h;h1; f ; gare made public. PKG keeps the master secret key xconfidential.
. Extract: Input the identity ID, PKG picks r 2 Z q and
calculates
R ¼ g r; ¼ r þ xH ðID;RÞ mod q:
PKG sends the private key skID ¼ ðR; Þ to the client by the secure channel. The client can verify thecorrectness of the received private key by checkingwhether the following equation holds
g ¼?
RY H ðID;RÞ: (1)
If the formula (1) holds, the client ID accepts theprivate key; otherwise, the client ID rejects it.
. TagGenðskID; F; PÞ: Split the whole file F into n blocks, i.e., F ¼ ðF 1; F 2; . . . ; F nÞ. The client preparesto store the block F i in the cloud server CSli . Then,for 1 i n, each block F i is split into s sectors, i.e.,F i ¼ f ~F i1; ~F i2; . . . ; ~F isg. Thus, the client gets n ssectors f ~F ijgi2½1;n;j2½1;s. Picks s random u1; u2; . . . ;us 2 G 1. Denote u ¼ fu1; u2; . . . ; usg. For F i, the clientperforms the procedures below:
1. The client calculates F ij ¼ h1ð ~F ijÞ for everysector ~F ij; 1 j s.
2. The client calculates
T i ¼ h N i; CS li ; ið ÞYs j¼1
uF ij j !
:
3. The client adds the record i ¼ ði;u;N i; CS li Þ tothe table T cl. It stores T cl locally.
4. The client sends the metadata table T cl to thecombiner. The combiner adds the records of T clto its own metadata table T o.
5. The client outputs T i and stores ðF i; T iÞ in CSli .
. ProofðP ; C; VÞ: This is a 5-move protocol amongP ¼ fCS igi2½1;n̂), C , and V with the input ( public parameters, T cl). If the client delegates the verification
task to some verifier, it sends the metadata table T clto the verifier. Of course, the verifier may be thethird auditor or the client’s proxy. The interactionprotocol can be given in detail below.
1. Challenge 1 ðC VÞ: the verifier picks thechallenge chal ¼ ðc; k1; k2Þ where 1 c n; k1;k2 2 Z
q and sends chal to the combiner C ;
2. Challenge 2 ðP CÞ: the combiner calculatesvi ¼ k1ðiÞ; 1 i c and looks up the table T o toget the records that correspond to fv1; v2; . . . ;vcg ¼ M1
SM1
S
SMn̂. Mi denotes the in-
dex set where the corresponding block-tag pair
is stored in CSi. Then, C sends ðMi; k2Þ toCSi 2 P .
3. Response1 ðP ! CÞ: For CSi 2 P , it performs theprocedures below:
. For vl 2 Mi, CSi splits F vl into s sectors F vl ¼f ~F vl1; ~F vl2; . . . ; ~F vl sg and calculates F vl j ¼h1ð ~F vl jÞ for 1 j s. (Note: F vl j ¼ h1ð ~F vl jÞcan also be pre-computed and stored by CS).
. CSi calculates al ¼ f k2 ðlÞ; vl 2 Mi and
T ðiÞ ¼
Yvl2MiT vl
al :
. For 1 j s, CSi calculates
F̂ ðiÞ
j ¼X
vl2Mi
alF vl j:
Denote F̂ ðiÞ
¼ ð F̂ ðiÞ
1 ; . . . ; F̂
ðiÞ
s Þ.. CSi sends i ¼ ðF̂
ðiÞ; T ðiÞÞ to C .
4. Response2 ðC ! VÞ: After receiving all theresponses from CSi 2 P , the combiner aggre-gates figCS i2P into the final response as
T ¼YCS i T
ðiÞ
; ^F l ¼
XCS i
^F
ðiÞ
l :
Denote F̂ ¼ ðF̂ 1; F̂ 2; . . . ; F̂ sÞ. The combiner sends ¼ ð F̂ ; T Þ to the verifier V .
5. After receiving the response ¼ ðF̂ ; T Þ, theverifier calculates
vi ¼ k1ðiÞ
hi ¼ h N vi ; CS lvi ; vi
ai ¼ f k2 ðiÞ:
Then, it verifies whether the following formulaholds.
eðT; g Þ ¼?
eYci¼1
haiiYs j¼1
uF̂ j
j ; RY H ðID;RÞ
!: (2)
If the formula (2) holds, then the verifier outputs‘‘success’’. Otherwise, the verifier outputs‘‘failure’’.
3.2.1 Notes
In ProofðP ; C; VÞ, the verifier picks the challenge chal ¼
ðc; k1; k2Þ where k1; k2 are randomly chosen from Z q . Based
on the challenge chal, the combiner determines the chal-lenged block index set as fv1; v2; . . . ; vcg where vi ¼ k1 ðiÞ;1 i c. Since k1 ðÞ is a pseudo-random permutation
IEEE TRANSACTIONS ON SERVICES COMPUTING, VOL. 8, NO. 2, MARCH-APRIL 2015332
-
8/20/2019 CSE Identity-Based Distributed Provable Data
6/13
determined by the random k1, the challenged c block-tagpairs come from the random selection of the n stored block-tag pairs.
3.2.2 Correctness
An ID-DPDP protocol must be workable and correct. Thatis, if the PKG, C , V and P are honest and follow thespecified procedures, the response can pass V ’s checking.The correctness follows from below:
eðT; g Þ ¼ eYCS i
T ðiÞ; g !
¼ eYCS i
Yvl2Mi
T f k2 ðlÞvl ; g
!
¼ eYCS i
Yvl2Mi
hall
Ys j¼1
ualF vl j
j
!; g
!
¼ eYci¼1
haii
!Ys j¼1
uF̂ j
j ; RY H ðID;RÞ
!:
3.3 Performance Analysis
First, we analyze the performance of our proposed ID-DPDP protocol from the computation and communicationoverhead. We compare our ID-DPDP protocol with theother up-to-date PDP protocols. On the other hand, ourprotocol does not suffer from resource-consuming certifi-cate management which is require by the other existingprotocols. Second, we analyze our proposed ID-DPDPprotocol’s properties of flexibility and verification. Third,we give the prototypal implementation of the proposed ID-DPDP protocol.
3.3.1 Computation
Suppose there are n message blocks which will be stored in n̂cloud servers. The block’s sector number is s. The challenged block number is c. We will consider the computationoverhead in the different phases. On the group G 1, bilinearpairings, exponentiation, multiplication, and the hash func-tion h1 (the input may be large data, such as, 1 G byte)contribute most computation cost. Compared with them, thehash function h, the operations on Z q and G 2 are faster, thehash function H can be done once for all. Thus, we do notconsider the hash functions h and H , the operations on Z q and G 2. On the client, the computation cost mainly comesfrom the procedures of TagGen and Verification (i.e., the
phase 5 in the protocol ProofðP ; C; VÞ). In the phase TagGen,the client performs nðs þ 1Þ exponentiation, ns multiplica-tion on G 1, ns hash function h1 and n hash function h. At thesame time, for every file, the corresponding record i is
stored by the client and C . These stored metadata is small. Inthe phase Proof , in order to respond the challengechal ¼ ðc; k1; k2Þ and generate the response , P and thecombiner C perform c exponentiation, c 1 multiplicationon the group G 1 and cs hash function h1. In the verification of the response , V performs c þ s þ 1 exponentiation, 2pairings, c þ s multiplication on the group G 1 and c hashfunction h. The exponentiation and multiplication opera-tions are more efficient than pairing [26]. Based on the testpresented in [27], [28], [29], a Tate pairing operation
consumes about 10 ms on a platform with PIII 3.0 GHz,30 ms on a platform with Pentium D 3.0 GHz and 170 ms onan iMote2 sensor working at 416 MHz. On the other hand, in2012, Zhu et al. proposed the cooperative provable datapossession for integrity in multicloud storage [6]. Almost atthe same time, Zhu et al. proposed the dynamic auditservices for outsourced storages in clouds [21]. In 2012,Barsoum et al. proposed a provable multicopy data posses-sion protocol [30]. These three PDP protocols are designed inthe PKI. Thus, the certification verification is necessary whenthese three PDP protocols are performed. Compared withthem, our proposed ID-DPDP scheme is more efficient in the
computation cost. The computation comparison can besummarized in Table 2. In Table 2, C exp denotes the timecost of exponentiation on the group G 1; C mul denotes thetime cost of multiplication on the group G 1; C e denotes thetime cost of bilinear pairing; C h1 denotes the time cost of the hash function h1. In other schemes, the sector must be inZ q . Our scheme only requires the hash function h1’s valuelies in Z q . Thus, the hash function h1 can be used to generateless block-tag pairs for the same file. Less block-tag pairsonly incur less computation cost. It shows that our protocolcan be implemented in mobile devices which have limitedcomputation power.
3.3.2 Communication National Bureau of Standards and ANSI X9 have deter-mined the shortest key length requirements: RSA and DSAis 1024 bits, ECC is 160 bits [31]. Based on the requirements,we give our ID-DPDP protocol’s communication overhead.In the phase Proof, the communication overhead mainlycomes from the challenge chal and response. The block-tagpairs are uploaded once and for all. After that, the phaseProof will be performed periodically. Thus, the communi-cation overheads mainly come from the ID-DPDP queriesand responses. Suppose there are n message blocks arestored in the CS. G 1 and G 2 have the same order q . In ID-
DPDP query, the verifier sends the challenge chal ¼ ðc;k1; k2Þ to combiner, i.e., the communication overhead islog2 n þ 2log2 q . In the response, the combiner responds 1element in G 1 and s elements in Z
q to the verifier. On the other
TABLE 2Comparison of Computation Cost
WANG: IDENTITY-BASED DISTRIBUTED PROVABLE DATA POSSESSION IN MULTICLOUD STORAGE 333
-
8/20/2019 CSE Identity-Based Distributed Provable Data
7/13
hand, Zhu et al. [6], Zhu et al. [21], and Barsoum et al. [30]proposed three different provable data possession scheme.Compared with these three schemes, our ID-DPDP scheme ismore efficient in the communication cost. The communicationcomparison can be summarized in Table 3. In Table 3, 1G 1denotes one element of G 1 and 1G 2 denotes one element of G 2.
3.3.3 Flexibility
In the ID-DPDP protocol, one block is split into s sectors.Then, the hash function h1 will be used on these sectors.
These hash values are shorter than the order q of G 1. Denotek ¼ log2 q . Let h1 be h1 : f0; 1g
~k ! f0; 1gk. This propertymakes our ID-DPDP protocol more flexible betweenstorage overhead and computation overhead. For example,for the file F , the client divides it into n blocks which will besplit into s sectors. The hash function h1 will be used onthese sectors. Thus, ~ks ¼ jF j=n. In order to save the storagespace, the client will divide F into less blocks because every block corresponds to one block-tag pair. Thus, s ¼ jF j=n~kwill increase when n decreases. The picked public para-meters u j; 1 j s will become more. These parameterswill incur more computation cost with less storage cost.And vice versa. When s decreases, n increases. Thecomputation cost will decrease along with the storagecost increases. Thus, our ID-DPDP protocol has theflexibility to adjust the computation cost and storage cost.
3.3.4 Private Verification, Delegated Verification, and
Public Verification
Our proposed ID-DPDP protocol satisfies the privateverification and public verification. In the verificationprocedure, the metadata in the table T cl and R areindispensable. Thus, it can only be verified by the clientwho has T cl and R i.e., it has the property of privateverification. On some cases, the client has no ability to
check its remote data integrity, for example, he takes part inthe battle in the war. Thus, it will delegate the third party toperform the ID-DPDP protocol. The third party may be thethird auditor or the proxy or other entities. The client willsend T cl and R to the consignee. The consignee can performthe ID-DPDP protocol. Thus, it has the property of delegated verification. On the other hand, if the clientmakes T cl and R public, every entity can perform the ID-DPDP protocol by himself. Thus, it has also the property of public verification.
3.3.5 Notes
In the verification, the verifier must have T cl and R. T clstores the metadata which has no any information on theprivate key ðR; Þ. On the other hand, even if R is madepublic, the private key still keeps secret. The private key
extraction process Extract is actually a modified ElGamalsignature scheme which is existentially unforgeable. Forthe identity ID, the corresponding private key ðR; Þ is asignature on ID. Since it is existentially unforgeable, theprivate key still keeps secret even if R is made public.Thus, the client can generate the tags for different files withthe same private key even if it makes T cl and R public.
3.3.6 Simulation
In order to study its prototypal implementation, we havesimulated the proposed ID-DPDP protocol by using C
programming language with GMP Library (GMP-5.0.5)[33], Miracl Library [34] and PBC Library (pbc-0.5.13) [35].In the simulation, CS works on an ASUS S56C Laptop withthe following settings:
. CPU: Intel Core [email protected] GHz;
. Physical Memory: 4 GB DDR3 1600 MHz;
. OS: Ubuntu 13.04 Linux 3.8.0-19-generic SMP i686.
The client works on an PC Laptop with the followingsettings:
. CPU: CPU I PDC E6700 3.2 GHz;
.
Physical Memory: DDR3 2G;. OS: Ubuntu 11.10 over VMware-workstation-
full-8.0.0.
In the simulation, we choose an elliptic curve with 160-bitgroup order, which provides a competitive security levelwith 1024-bit RSA. The certification verification schemecomes from the IEEE P1363 (Hybrid Schemes) [36].
In order to describe our ID-DPDP protocol’s computa-tion performance, we compare our ID-DPDP with Zhu’sscheme [6]. Fig. 3 depicts the comparison on computationcost for TagGen based on the sector number s ¼ 20. X-labeldenotes the block number n and Y-label denotes the time
cost (second) in order to generate n tags. It is easilyobserved that our ID-DPDP protocol becomes moreefficient along with the block number n increases. Fig. 4depicts the comparison on computation cost for Verify
TABLE 3Comparison of Communication Cost (Bits)
Fig. 3. Comparison on computation cost for TagGen based on s ¼ 20.
IEEE TRANSACTIONS ON SERVICES COMPUTING, VOL. 8, NO. 2, MARCH-APRIL 2015334
-
8/20/2019 CSE Identity-Based Distributed Provable Data
8/13
based on the same sector number s ¼ 20. X-label denotesthe challenged block number c and Y-label denotes the timecost (second) in order to verify the response. Fig. 5 depictsthe comparison on CS computation cost for response basedon the sector number s = 20 and the CS number be n̂ ¼ 10,i.e., jPj ¼ n̂. X-label denotes the challenged block numberc and Y-label denotes the time cost (second) in order tocreate the response. They also show that our ID-DPDPscheme is more efficient than Zhu’s scheme [6]. Then, wedescribe our ID-DPDP protocol’s communication perfor-mance by comparing our ID-DPDP with Zhu’s scheme [6].Fig. 6 depicts the comparison on communication cost for
challenge based on the challenged block number c ¼ 10.X-label denotes the whole block number n and Y-labeldenotes the communication cost (bit) of challenge. Fig. 7depicts the comparison on communication cost for re-sponse. X-label denotes the sector number s and Y-labeldenotes the communication cost (bit) of response from thecombiner. From Figs. 6 and 7, our ID-DPDP has shortercommunication length.
In order to show the feasibility of our proposedscheme in concrete terms, we assume that the clientwill store 50 Tbyte data to CS. The order q of G 1 and G 2 is160 bit. We take use of the hash function h which comesfrom the [37]. Let the hash functions h1 be SHA 1 whichmaps 1 G byte to 160 bit. Let the hash function H be alsoSHA 1. Let the sector number s be 1. Thus, 50 Tbyte datacan be split into 50 1024 blocks. The elliptic curve G 1 is theType A whose order is 160 bit and point size is 128 byte[35]. In the phase TagGen, the whole computation cost is501024 ð2C exp þ C mul þ C h þ C h1Þ. Based on the 100 sim-ulation, the average computation cost is 17.5535 hours.
Now, there are 50 1024 ¼ 101200 blocks are stored in CS.Suppose there are 1000 block-tag pairs are corrupted. If 230 block-tag pairs are challenged, the corrupted block-tagpairs can be detected with the probability at least 90 percent(Theorem 2). Thus, the challenge is chal ¼ ð230; k1; k2Þ. Inthe response, CS will perform c ¼ 230 exponentiation andc 1 ¼ 229 multiplication on the group G 1 (We omit the
Fig. 4. Comparison on computation cost for verify based on s ¼ 20.
Fig. 5. Comparison on computation cost for CS’s response based ons ¼ 20, n̂ ¼ 10.
Fig. 6. Comparison on communication cost for Chal based on c ¼ 10.
Fig. 7. Comparison on communication cost for response.
WANG: IDENTITY-BASED DISTRIBUTED PROVABLE DATA POSSESSION IN MULTICLOUD STORAGE 335
-
8/20/2019 CSE Identity-Based Distributed Provable Data
9/13
hash function h1 since it can be pre-computed). The wholetime cost is 0.5 s. In the verification, the client will perform 2pairings, c þ 1 ¼ 231 exponentiation, c þ 1 ¼ 231 additionon the group G 1 and 1 hash function H . The whole time costis 1.08 s. On the other hand, we consider the communicationcost. For 50 Tbyte data, the whole block-tag pairs are50 10244 þ 50 1024 128 byte. The redundancy rate is1:2 107. In the local area network with the bandwidth
1000 Mb/s, in order to upload these block-tag pairs, the timecost is 116.509 hours. The challenge size is log 2ð101200Þþ160 þ 160 ¼ 337 bit and the response size is ð1G þ 128Þ 8 ¼8:6 109 bit. The whole time cost is 8.2 s. From thecomputation technology and communication technology,our proposed ID-DPDP scheme is feasible.
4 SECURITY ANALYSIS
The security of our ID-DPDP protocol mainly consists of the correctness and unforgeability. The property of correctness has been shown in the Section 3.2. On the other
hand, we study the universal unforgeability. It means thatthe attacker includes all the cloud servers P and thecombiner C . If our ID-DPDP protocol is universallyunforgeable, it can also prevent the collusion of maliciouscloud servers P and C . Our proof comprises two parts:single block-tag pair is universally unforgeable; theresponse is universally unforgeable.
Definition 7. A forger A ðt;;QH ; Qh; Qh1 ; QE ; QT Þ-breaks asingle tag scheme if it runs in time at most t ; A makes at mostQH queries to the hash function H , at most Qh queries to thehash function h, at most Qh1 queries to the hash function h1,at most QE queries to the extraction oracle Extract, at mostQT queries to the tag oracle TagGen ; and the probability thatA forges a valid block-tag pair is as least . A single tagscheme is ðt;;QH ; Qh; Qh1 ; QE ; QT Þ-existentially unforge-able under an adaptive chosen-message attack if no forgerðt;;QH ; Qh; Qh1 ; QE ; QT Þ-breaks it.
The following Lemma 1 shows that the single tagscheme is secure.
Lemma 1. Let ðG 1; G 2Þ be a ðt0; 0Þ-GDH group pair of order q .T he n t he t ag s ch em e o n ðG 1; G 2Þ i s ðt;;QH ; Qh;Qh
1
; QE ; QT Þ-secure against existential forgery under anadaptive chosen-block attack (in the random oraclemodel) for all t and satisfying 0 1=9 and t0 23ðQH þQhÞðtþOðQH ÞþOðQE ÞþOðQhÞþOðQh1 ÞþOðQT ÞÞ
ð1ÞQE ð1 ÞQT , where 0 G ;
G 1. Based on the difficulty of the CDH problem, our singletag scheme is secure.
Proof. Let the stored index-block pair set be fð1; F 1Þ; ð2; F 2Þ;. . . ; ðn; F nÞg. For 1 i n, each block F i is split into ssectors, i.e., F i ¼ f ~F i1; ~F i2; . . . ; ~F isg. Then, the hashfunction h1 is used on these sectors, i.e., F ij ¼ h1ð ~F ijÞ.Then, we get the hash value set fF ijg. Let the selected
identity set be I ¼ fID1; ID2; . . . ; ID~ng. Weshow how toconstruct a t0-algorithm B that solves CDH problem onG 1 with probability at least
0. This will contradict thefact that ðG 1; G 2Þ are GDH group pair.
Algorithm B is given ðg; g a; g bÞ 2 G 31. Its goal is tooutput g ab 2 G 1. Algorithm B simulates the challengerand interacts with forger A as follows.
Setup. Algorithm B starts by setting the master publickey as Y ¼ g a while a keeps unknown. It initializes thethree tables Tab1, Tab2, and Tab3. Then, it picks the twoparameters and from the interval (0, 1).
H-Oracle. At any time algorithm A can query the
random oracle H . To respond to these queries, algorithm B maintains a list of tuples ðID j; R j; H jÞ as explained below.We refer to this list asTab1. When A queries the oracle H atthe pair ðID j; R jÞ, algorithm B responds as follows:
1. If ðID j; R j; Þ 2 Tab1, then algorithm B retrieves thetuple ðID j; R j; H jÞ and responds with H j, i .e.,H ðID j; R jÞ ¼ H j.
2. Otherwise, B picks a random H j 2 Z q . Then, it addsthe tuple ðID j; R j; H jÞ to Tab1 and responds to A bysetting H ðID j; R jÞ ¼ H j.
Extract-Oracle. At any time algorithm A can query the
oracle Extract. To respond to these queries, algorithm B maintains a list of tuples ðci; IDi; Ri; H i; iÞ as explained below. We refer to this list as Tab2. When A queries theoracleExtract attheidentityIDi 2 I ,algorithm B picksabitci 2 f0; 1g according to the bivariate distribution function:Pr½ci ¼ 0 ¼ , Pr½ci ¼ 1 ¼ 1 . Based on ci, B respondsas follows:
1. If ci ¼ 1, B independently picks the random i; H i 2 Z q and calculates Ri ¼ g
i Y H i .
. If ðIDi; Ri; Þ 62 Tab1, then algorithm B adds thetuple ðIDi; Ri; H iÞ to Tab1 and the tuple ðci; IDi;
Ri; H i; iÞ to Tab2.. If ðIDi; Ri; Þ 2 Tab1, then algorithm B picks the
other random i; H i 2 Z q and calculates Ri ¼g i Y H i until ðIDi; Ri; Þ 62Tab1. Then, algorithmB adds the tuple ðIDi; Ri; H iÞ to Tab1 and thetuple ðci; IDi; Ri; H i; iÞ to Tab2.
. Finally, algorithm B responds with ðRi; iÞ.
2. If ci ¼ 0, algorithm B independently picks a randomRi 2 G 1 and calculates H i ¼ H ðIDi; RiÞ which satis-fies ðIDi; Ri; Þ 62 Tab1 (otherwise, picks another Riand calculates H ðIDi; RiÞ). Then, B adds the tupleðID
i; R
i; H
iÞ
to Tab
1 and the tuple ðc
i; ID
i; R
i; H
i;
iÞ
to Tab2, where i ¼?. B fails.
h-Oracle. At any time algorithm A can query therandom oracle h. To respond to these queries, algorithmB maintains a list of tuples ðN i; CS li ; i ;z i; bi; d iÞ as explained below. We refer to this list as Tab3. When A queries theoracle h at the tuple ðN i; CS li ; iÞ, B responds as follows:
1. If ðN i; CS li ; i ; z i; bi; d iÞ 2 Tab3 and 1 i n, thenalgorithm B responds with z i
Qs j¼1 u
F ij j , i . e .,
hðN i; CS li ; iÞ ¼ z iQs
j¼1 uF ij
j .2. Otherwise, algorithm B picks a random bi 2 Z
q and
a random coin d i 2 f0; 1g according to the bivariate
distribution Pr½d i ¼ 0 ¼ , Pr½d i ¼ 1 ¼ 1 .
. If d i ¼ 1, algorithm B calculates z i ¼ g bi . If d i ¼ 0,
algorithm B calculates z i ¼ ðg bÞ
bi .
IEEE TRANSACTIONS ON SERVICES COMPUTING, VOL. 8, NO. 2, MARCH-APRIL 2015336
-
8/20/2019 CSE Identity-Based Distributed Provable Data
10/13
. Algorithm B adds the tuple ðN i; CS li ; i ; z i; bi; d iÞto Tab3 and responds with z i
Qs j¼1 u
F ij j , i.e.,
hðN i; CS li ; iÞ ¼ z iQs
j¼1 uF ij
j .
h1-Oracle. Upon receiving the query F̂ , B takes arandom value from Z q and responds it to A . It is a realhash function.
TagGen-Oracle. Let ðN i; CS li ; i ;F iÞ be a tag query issued
by A . Algorithm B responds to this query as follows:
1. Algorithm B runs the above algorithm for respond-ing to h-queries to obtain hðN i; CS li ; iÞ. Let ðN i; CS li ;i; z i; bi; ciÞ be the corresponding tuple in Tab3. If d i ¼0, then algorithm B reports failure and terminates.
2. If d i ¼ 1, define i ¼ ðg aÞbi . Observe that T i ¼ ðg aÞ
bi ¼ðg bi Þ
a¼ ðhðN i; CS li ; iÞ
Qs j¼1 u
F ij j Þ
aand therefore T i is a
valid tag on F i under the public key ðg a; u1; ; usÞ.Algorithm B gives T i to algorithm A .
Output. Eventually, on behalf of the client ID whoseprivate keyis ðR; Þ, algorithmA producesa block-tag pair
ðF w; T wÞ such that no tag query was issued for ðN w; CS lw ;w; F wÞ. If ðN w; CS lw ; w; Þ 62 Tab3, B issues a query itself forhðN w; CS lw ; wÞ to ensure that such a tuple exists. Weassume T w is a valid tag on F w under the given public key;if it is not, B reports failure and terminates. Next, B findsthe tuple ðN w; CS lw ;j ;z w; bw; d wÞ in Tab3. If d w ¼ 1, B reports failure and terminates. Otherwise, d w ¼ 0, ðF w; T wÞsatisfies the following verification equation
ðT w; g Þ ¼ e h N w; CS lw ; wð ÞYs j¼1
uh1ðF wjÞ
j ; RY H ðID;RÞ
!:
This completes the description of algorithm B . It remainsto show that B solves the given instance of the CDHproblem with a high probability.
Breaking CDH: Based on the oracle-replay technique[32], B can get another block-tag pair ðF w; T̂ wÞ on the same block F w by using different hash functions ĥ and Ĥ . Then, wecan get eðT̂ w; g Þ ¼eðĥðN w; CS lw ; wÞ
Qs j¼1 u
h1ðF wjÞ j ;RY
H ðID;RÞÞ.A c co r di n g t o t h e s i mu l at i on , B k no w s t h e
corresponding bw; b̂w that satisfy
h N w; CS lw ; wð Þ ¼ ðg bÞ
bwYs j¼1
uh1ðF wjÞ
j
ĥ N w; CS lw ; wð Þ ¼ ðg bÞ^bwYs j¼1
uh1ðF wjÞ j :
Thus, we can get
eðT w; g Þ ¼ e ðg bÞ
bw; RY H ðID;RÞ
eðT̂ w; g Þ ¼ e ðg bÞ
b̂w; RY
Ĥ ðID;RÞ
e T w T̂ wbwb̂w ; g
¼ e ðg bÞ
bw; Y H ðID;RÞ
Ĥ ðID;RÞ
:
Finally, we get
g ab ¼ T w T̂ wbwb̂w
1bw H ðID;RÞ Ĥ ðID;RÞð Þ
:
Probability Analysis. To evaluate the success probabil-ity for algorithm B , we analyze the four events neededfor B to succeed:
. E 1: B does not abort as a result of any of A ’s Extractqueries.
. E 2: B does not abort as a result of any of A ’s tagqueries.
. E 3: A generates a valid block-tag forgery ðF w; T wÞ.
. E 4: Event E 3, cw ¼ 0 for the tuple containing F w inTab2 and d w ¼ 0 for the tuple containing F w in Tab3.
B succeeds if all of these events happen. Theprobability Pr½E 1 ^ E 2 ^ E 3 ^ E 4 decomposes as
Pr½E 1 ^ E 2 ^ E 3 ^ E 4
¼ Pr½E 1 Pr½E 2jE 1 Pr½E 3jE 1; E 2 Pr½E 4jE 1; E 2; E 3
¼ ð1 ÞQE ð1 ÞQT :
Thus, in B ’s simulation environment, A can forge avalid block-tag pair with probability
̂ P 1 ¼ ð1 ÞQE ð1 ÞQT :
Algorithm B ’s running time is the same as A ’s runningtime plus the time which is taken to respond to QH H queries, Qh h queries, Qh1 h1 queries, QE Extract queriesand QT tag queries. Hence, the total running time is at mostt̂ t þ OðQH Þ þ OðQE Þ þ OðQhÞ þ OðQh1Þ þ OðQT Þ. B ytaking use of the oracle replay technique [32], A can gettwo different tags on the same block and randomness withthe probability0 1=9withinthetime t0 23ðQH þQhÞt̂̂
1,
i.e., t0 23ðQH þQhÞðtþOðQH ÞþOðQE ÞþOðQhÞþOðQh1 ÞþOðQT ÞÞ
ð1ÞQE ð1 ÞQT . After
obtaining the two different tags on the same block andrandomness, algorithm B can break the CDH problem. Itcontradicts the assumption that ðG 1; G 2Þ is a GDH grouppair. This completes the proof. Ì
Lemma 1 states that the untrusted CS has no ability toforge the single tag. But, can the untrusted CS forge theaggregated block-tag pair to cheat the verifier? First, when allthe stored block-tag pairs are challenged, we study whetherthe untrusted CS can aggregate the fake block-tag pairs tocheat the verifier. It corresponds to Lemma 2. Second, whensome stored block-tag pairs are not challenged, we study
whether the untrusted CS can substitute the valid unchal-lenged block-tag pairs for the modified block-tag pairs tocheat the verifier. It corresponds to Lemma 3.
Lemma 2. If all the stored block-tag pairs are challengedand some block-tag pairs are modified, the malicious CSaggregates the fake block-tag pairs which are different fromthe challenged block-tag pairs. The combined block-tag pairðF̂ ; T Þ (i.e., response) only can pass the verification withnegligible probability.
Proof. We can use proof by contradiction to prove Lemma 2.For the challenge chal ¼ ðc; k1; k2Þ, the following para-
meters can be calculated
vi ¼ k1 ðiÞ; hi ¼ h N vi ; CS lvi ; vi
; ai ¼ f k2 ðiÞ:
WANG: IDENTITY-BASED DISTRIBUTED PROVABLE DATA POSSESSION IN MULTICLOUD STORAGE 337
-
8/20/2019 CSE Identity-Based Distributed Provable Data
11/13
Assume the combined block-tag pair ð F̂ ; T Þ is forgedand it canpass theverification, where F̂ ¼ ð F̂ 1; . . . ; F̂ sÞ, i.e.,
eðT; g Þ ¼ eYci¼1
haiiYs j¼1
uF̂ j
j ; RY H ðID;RÞ
!
¼ e
Yc
i¼1
haii Ys
j¼1
u
Pci¼1
ai F̂ vi j j ; RY
H ðID;RÞ
!:
Then,
Yci¼1
e T vi ; g ð Þai ¼
Yci¼1
e hiYs j¼1
uF̂ vi j
j ; RY H ðID;RÞ
!ai:
Let d be a generator of G 2. There exist xi; yi 2 Z q withthe following properties:
e T vi ; g ð Þ ¼ d xi ; e hi
Ys j¼1
uF̂ vi j
j ; RY H ðID;RÞ
!¼ d yi :
We can get
d Pc
i¼1 aixi ¼ d
Pci¼1
ai yi
Xci¼1
aiðxi yiÞ ¼ 0: (3)
Since the single tag is existentially unforgeable (Lemma1), there exist at least two different indices i such thatxi 6¼ yi. Suppose there exist s c index pairs ðxi; yiÞ withthe property xi 6¼ yi. Then, there exist q
s1 tuplesða1; a2; . . . ; acÞ which satisfy the above equation (3). Sinceða1; a2; . . . ; acÞ is a random vector, the equation (3) holdsonly with the probability less than q s1=q c q c1=q c ¼ q 1.It is negligible. Thus, if some fake block-tag pairs areaggregated, the combined block-tag pair ð F̂ ; T Þ only canpass the verification with negligible probability. Ì
When some modified block-tag pairs are challenged andsome valid block-tag pairs are unchallenged, the maliciousCS still can not cheat the verifier.
Lemma 3. If some challenged block-tag pairs are modified, themalicious CS substitutes the other valid block-tag pairs (whichare not challenged) for the modified block-tag pairs (which arechallenged). The combined block-tag pair ð F̂ ; T Þ (i.e., response)
only can pass the verification with negligible probability.
Proof. Define the modified block-tag pair index set as Mand the valid block-tag index set as M. Let the verifier’sc ha ll eng e b e chal ¼ ðc; k1; k2Þ and S ¼ fk1 ð1Þ; . . . ;k1ðcÞg. For the CS set P , suppose some challenged block-tag pairs fðF l; T lÞ; l 2 S 1 Sg are modified, i.e.,S 1 ¼ S
TM 6¼ F . F denot es t he empt y set. T he
corresponding CSs (whose modified block-tag pairsare challenged) substitute the other valid block-tag pairsfðF ̂l; T ̂lÞ; l̂ 2 S 2
Mg (which are not challenged) forfðF l; T lÞ; l 2 S 1g where jS 1j ¼ jS 2j. By making use of theforged i ¼ ðF̂
ðiÞ; T ðiÞÞ, the combined response ¼ ðF̂ ; T Þ
only can pass verification with negligible probability.
If the challenged block-tag pairs ðF l; T lÞ; l 2 S 1 aremodified, P substitutes the other valid block-tag pairs
ðF ̂l; T ̂lÞ for them. For 1 i c, the following parametersare calculated
ai ¼ f k2 ðiÞ; vi ¼ k1 ðiÞ; hi ¼ h N vi ; CS lvi ; vi
:
Based on the forgery process, we know the forgedresponse ¼ ðF̂ ; T Þ satisfies the formulas below
T ¼ Yvi2S T
M
T aivi Yvi2S 2
T aivi
F̂ j ¼X
vi2S T
M
aiF vi j þX
vi2S 2
aiF vi j; 1 j s
where F̂ ¼ ð F̂ 1; F̂ 2; . . . ; F̂ sÞ.If the forged response can pass the verification, then
eðT; g Þ ¼ eYci¼1
haiiYs j¼1
uF̂ j
j ; RY H ðID;RÞ
!i.e.,
eY
vi2S T
MT aivi
Yvi2S 2
T aivi ; g 0B@ 1CA
¼ eY
vi2S T
M
haiiYs j¼1
u
Pvi2S T
MaiF vi j
j
0B@
1CA
0B@
Y
vi 2S 1
haiiYs j¼1
u
Pvi 2S 2
ai F vi j
j
!; RY H ðID;RÞ
!:
(4)
For the index in the set S T M, we can get
eY
vi2S T
M
T aivi ; g 0B@ 1CA
¼ eY
vi2S T
M
haiiYs j¼1
u
Pvi 2S
T M
aiF vi j
j ; RY H ðID;RÞ
0B@
1CA:(5)
According to the formulas (4), (5), we can get theformula below
e Yvi2S 2
T aivi
; g !¼ e Yvi2S 1
haii Ys
j¼1
uPvi 2S 2 aiF vi j j ; RY H ðID;RÞ !:(6)
On the other hand, we can get the following formula by making use of the tag scheme
eY
vi2S 2
T aivi ;g
!¼ e
Yvi2S 2
haiiYs j¼1
u
Pvi 2S 2
aiF vi j
j ; RY H ðID;RÞ
!: (7)
From the formulas (6), (7), we can get
Yvi2S 1 haii ¼ Yvi 2S 2 h
aii : (8)
Since h is a cryptographic hash function which iscollision-free, the probability that the equation (8) holds
IEEE TRANSACTIONS ON SERVICES COMPUTING, VOL. 8, NO. 2, MARCH-APRIL 2015338
-
8/20/2019 CSE Identity-Based Distributed Provable Data
12/13
is 1=q , i.e., it is negligible. The combined block-tag pairð F̂ ; T Þ (i.e., response) only can pass the verification with
negligible probability. Ì
Thus, from the above three lemmas (Lemma 1, Lemma 2,Lemma 3), we have the following claim.
Theorem 1. The proposed ID-DPDP protocol is existentiallyunforgeable in the random oracle model if the CDH problem on G 1 is hard.
Proof. Let the stored index-block pair set be fð1; F 1Þ; ð2; F 2Þ;. . . ; ðn; F nÞg and the CS set be P . For 1 i n, each block F i is split into s sectors, i.e., F i ¼ f ~F i1; ~F i2; . . . ;~F isg. Then, the hash function h1 is used on these
sectors, i.e., F ij ¼ h1ð~
F ijÞ. Then, we get the hash value setfF ijg. Let the selected identity set be I ¼ fID1; ID2;. . . ; ID~ng. Define the challenger as B and the adversary asA . Algorithm B is given ðg; g a; g bÞ 2 G 1. Its goal is tooutput g ab 2 G 1.
The Setup; ExtractOracle; HOracle; hOracle, h1Oracle; TagGen Oracle procedures are the same as thecorresponding procedures in the proof of Lemma 1.
Let the challenge be chal ¼ ðc; k1; k2Þ and the forgedresponse be ¼ ðF̂ ; T Þ where F̂ ¼ ð F̂ 1; F̂ 2; ; F̂ sÞ. As-sume that can pass the verification, i.e.,
eðT; g Þ ¼ eYci¼1 h
ai
iYs j¼1 u
F̂ j
j ; RY H ðID;RÞ !
(9)
where hi ¼ hðN k1 ðiÞ; CS lk1 ðiÞ; k1 ðiÞÞ, ai ¼ f k2ðiÞ, 1 i c.
According to Lemma 1 and Lemma 2, if somechallenged block-tag pairs are modified, the verificationformula (9) only holds with negligible probability. Sincethe verifier does not store block-tag pairs, A cansubstitute the other valid block-tag pairs for themodified block-tag pairs. According to Lemma 3, thesubstituted response only can pass the verification withnegligible probability, i.e., the formula (9) only holdswith negligible probability.
Thus, in the random oracle, except with negligibleprobability no adversary can cause the verifier to acceptthe challenged response if some challenged block-tagpairs are modified. Theorem 1 is proved. Ì
Theorem 2. Suppose that n block-tag pairs are stored, d block-tag pairs are modified and c block-tag pairs are challenged.T h e n , o u r p r o p o s e d I D - D P D P p r o t o c o l i sðd=n; 1 ððn d Þ=nÞcÞ-secure, i.e.,
1 n d
n
c P X 1
n c þ 1 d
n c þ 1
c
where P X denotes the probability of detecting themodification.
Proof. Let the challenged block-tag pair set be S and themodified block-tag pair set be M. Let the randomvariable X be X ¼ jS
TMj. According to the defini-
tion of P X , we can get
P X ¼ P fX 1g
¼ 1 P fX ¼ 0g
¼ 1 n d
n
n 1 d
n 1
n c þ 1 d
n c þ 1 :
Thus, we can get
1 n d
n
c P X 1
n c þ 1 d
n c þ 1
c:
This completes the proof of the theorem. Ì
Fig. 8 shows the probability curve of P X . From the picture,we know that our protocol has high integrity checkingprobability.
5 CONCLUSION
In multicloud storage, this paper formalizes the ID-DPDPsystem model and security model. At the same time, wepropose the first ID-DPDP protocol which is provablysecure under the assumption that the CDH problem ishard. Besides of the elimination of certificate management,our ID-DPDP protocol has also flexibility and highefficiency. At the same time, the proposed ID-DPDPprotocol can realize private verification, delegated verifi-cation and public verification based on the client’sauthorization.
ACKNOWLEDGMENT
The author sincerely thanks the Editor for allocatingqualified and valuable referees. The author sincerely thanksthe anonymous referees for their very valuable comments.This work was partly supported by the National NaturalScience Foundation of China (No. 61272522) and by by theOpen Foundation of State Key Laboratory of InformationSecurity of China (No. 2013-3-4).
REFERENCES
[1] G. Ateniese, R. Burns, R. Curtmola, J. Herring, L. Kissner,
Z. Peterson, and D. Song, ‘‘Provable Data Possession atUntrusted Stores,’’ in Proc. CCS, 2007, pp. 598-609.[2] G. Ateniese, R. DiPietro, L.V. Mancini, and G. Tsudik, ‘‘Scalable
and Efficient Provable Data Possession,’’ in Proc. SecureComm,2008, pp. 1-10.
Fig. 8. Probability curve P X of detecting the corruption.
WANG: IDENTITY-BASED DISTRIBUTED PROVABLE DATA POSSESSION IN MULTICLOUD STORAGE 339
-
8/20/2019 CSE Identity-Based Distributed Provable Data
13/13
[3] C.C. Erway, A. Kupcu, C. Papamanthou, and R. Tamassia,‘‘Dynamic Provable Data Possession,’’ in Proc. CCS, 2009,pp. 213-222.
[4] F. Sebé, J. Domingo-Ferrer, A. Martı́nez-Ballesté, Y. Deswarte,and J. Quisquater, ‘‘Efficient Remote Data Integrity Checkingin Critical Information Infrastructures,’’ IEEE Trans. Knowl.Data Eng., vol. 20, no. 8, pp. 1034-1038, Aug. 2008.
[5] H.Q. Wang. (2013, Oct./Dec.). Proxy Provable Data Possession inPublic Clouds. IEEE Trans. Serv. Comput. [Online]. 6(4), pp. 551-559. Available: http://doi.ieeecomputersociety.org/10.1109/
TSC.2012.35.[6] Y. Zhu, H. Hu, G.J. Ahn, and M. Yu, ‘‘Cooperative Provable DataPossession for Integrity Verification in Multicloud Storage,’’IEEE Trans. Parallel Distrib. Syst., vol. 23, no. 12, pp. 2231-2244,Dec. 2012.
[7] Y. Zhu, H. Wang, Z. Hu, G.J. Ahn, H. Hu, and S.S. Yau, ‘‘EfficientProvable Data Possession for Hybrid Clouds,’’ in Proc. CCS, 2010,pp. 756-758.
[8] R. Curtmola, O. Khan, R. Burns, and G. Ateniese, ‘‘MR-PDP:Multiple-Replica Provable Data Possession,’’ in Proc. ICDCS,2008, pp. 411-420.
[9] A.F. Barsoum and M.A. Hasan, ‘‘Provable possession andreplication of data over cloud servers,’’ Centre Appl. Cryptogr.Res., Univ. Waterloo, Waterloo, ON, Canada, Rep. 2010/32.[Online]. Available: http://www.cacr.math.uwaterloo.ca/techreports/2010/cacr2010-32.pdf .
[10] Z. Hao and N. Yu, ‘‘A Multiple-Replica Remote Data PossessionChecking Protocol with Public Verifiability,’’ in Proc. 2nd Int.Symp. Data, Privacy, E-Comm., 2010, pp. 84-89.
[11] A.F. Barsoum and M.A. Hasan, ‘‘On verifying dynamic multipledata copies over cloud servers,’’ Int. Assoc. Cryptol. Res., NewYork, NY, USA, IACR eprint Rep. 447, 2011. [Online]. Available:http://eprint.iacr.org/2011/447.pdf .
[12] A. Juels and B.S. Kaliski, Jr., ‘‘PORs: Proofs of Retrievability forLarge Files,’’ in Proc. CCS, 2007, pp. 584-597.
[13] H. Shacham and B. Waters, ‘‘Compact Proofs of Retrievability,’’in Proc. ASIACRYPT , vol. 5350, LNCS, 2008, pp. 90-107.
[14] K.D. Bowers, A. Juels, and A. Oprea, ‘‘Proofs of Retrievability:Theory and Implementation,’’ in Proc. CCSW , 2009, pp. 43-54.
[15] Q. Zheng and S. Xu, ‘‘Fair and Dynamic Proofs of Retrievability,’’in Proc. CODASPY , 2011, pp. 237-248.
[16] Y. Dodis, S. Vadhan, and D. Wichs, ‘‘Proofs of Retrievability viaHardness Amplification,’’ in Proc. TCC, vol. 5444, LNCS, 2009,pp. 109-127.
[17] Y. Zhu, H. Wang, Z. Hu, G.J. Ahn, and H. Hu, ‘‘Zero-KnowledgeProofs of Retrievability,’’ Sci. Chin. Inf. Sci., vol. 54, no. 8,pp. 1608-1617, Aug. 2011.
[18] C. Wang, Q. Wang, K. Ren, and W. Lou, ‘‘Privacy-PreservingPublic Auditing for Data Storage Security in Cloud Computing,’’in Proc. IEEE INFOCOM, Mar. 2010.
[19] Q. Wang, C. Wang, K. Ren, W. Lou, and J. Li, ‘‘Enabling PublicAuditability and Data Dynamics for Storage Security in CloudComputing,’’ IEEE Trans. Parallel Distrib. Syst., vol. 22, no. 5,pp. 847-859, May 2011.
[20] C. Wang, Q. Wang, K. Ren, N. Cao, and W. Lou, ‘‘Toward Secureand Dependable Storage Services in Cloud Computing,’’ IEEETrans. Serv. Comput., vol. 5, no. 2, pp. 220-232, Apr./June 2012.
[21] Y. Zhu, G.J. Ahn, H. Hu, S.S. Yau, H.G. An, and S. Chen. (2013,Apr./June). Dynamic Audit Services for Outsourced Storages inClouds. IEEE Trans. Serv. Comput. [Online]. 6(2), pp. 227-238.Available: http://doi.ieeecomputersociety.org/10.1109/TSC.2011.51.
[22] O. Goldreich, Foundations of Cryptography: Basic Tools . Beijing,China: Publishing House of Electronics Industry, 2003,pp. 194-195.
[23] D. Boneh and M. Franklin, ‘‘Identity-Based Encryption from theWeil Pairing,’’ in Proc. CRYPTO, vol. 2139, LNCS, 2001,pp. 213-229.
[24] A. Miyaji, M. Nakabayashi, and S. Takano, ‘‘New ExplicitConditions of Elliptic Curve Traces for FR-Reduction,’’ IEICETrans. Fundam., vol. E84A, no. 5, pp. 1234-1243, May 2001.
[25] D. Boneh, B. Lynn, and H. Shacham, ‘‘Short Signatures from theWeil Pairing,’’ in Proc. ASIACRYPT , vol. 2248, LNCS, 2001,pp. 514-532.
[26] H.W. Lim, ‘‘On the Application of Identity-Based Cryptography
in Grid Security,’’ Ph.D. dissertation, Univ. London, London,U.K., 2006.[27] S. Yu, K. Ren, and W. Lou, ‘‘FDAC: Toward Fine-Grained
Distributed Data Access Control in Wireless Sensor Networks,’’IEEE Trans. Parallel Distrib. Syst., vol. 22, no. 4, pp. 673-686,Apr. 2011.
[28] S. Yu, K. Ren, and W. Lou, ‘‘Attribute-Based on-DemandMulticast Group Setup with Membership Anonymity,’’ Calc.Netw., vol. 54, no. 3, pp. 377-386, Feb. 2010.
[29] P.S.L.M. Barreto, B. Lynn, and M. Scott, ‘‘Efficient Implementa-tion of Pairing-Based Cryptosystems,’’ J. Cryptol., vol. 17, no. 4,pp. 321-334, Sept. 2004.
[30] A.F. Barsoum and M.A. Hasan, ‘‘Integrity Verification of Multiple Data Copies over Untrusted Cloud Servers,’’ in Proc.12th IEEE/ACM Int. Symp. CCGRID, 2012, pp. 829-834.
[31] C. Research ‘‘SEC 2: Recommended Elliptic Curve Domain
Parameters.’’ [Online]. Available: http://www.secg.org/collateral/sec2_final.pdf .
[32] D. Pointcheval and J. Stern, ‘‘Security Arguments for DigitalSignatures and Blind Signatures,’’ J. Cryptol., vol. 13, no. 3,pp. 361-396, July 2000.
[33] The GNU Multiple Precision Arithmetic Library (GMP). [On-line]. Available: http://gmplib.org/.
[34] Multiprecision Integer and Rational Arithmetic C/C++ Library(MIRACL). [ Online]. Available: http://certivox.com/.
[35] The Pairing-Based Cryptography Library (PBC). [Online]. Avail-able: http://crypto.stanford.edu/pbc/howto.html.
[36] IEEE P1363: Hybrid Schemes. [Online]. Available: http://grouper.ieee.org/groups/1363/StudyGroup/Hybrid.html.
[37] B. Lynn, ‘‘On the implementation of pairing-based cryptosys-tems,’’ Ph.D. dissertation, Stanford Univ., Stanford, CA, USA,2008. [Online]. Available: http://crypto.stanford.edu/pbc/thesis.pdf .
Huaqun Wang received the BS degree inmathematics education from the ShandongNormal University, Jinan, China, in 1997, theMS degree in applied mathematics from the EastChina Normal University, Shanghai, China, in2000, and the PhD degree in cryptography fromNanjing University of Posts and Telecommuni-cations, Nanjing, China, in 2006. Since then, hehas been with Dalian Ocean University, Dalian,China, as an Associate Professor. His researchinterests include applied cryptography, network
security, and cloud computing security. Dr. Wang has published morethan 40 papers. He has served in the program committee of severalinternational conferences and the editor board of international journals.
. For more information on this or any other computing topic,please visit our Digital Library at www.computer.org/publications/dlib.
IEEE TRANSACTIONS ON SERVICES COMPUTING, VOL. 8, NO. 2, MARCH-APRIL 2015340