International Journal of Computer Engineering and Applications, Volume XIII, Issue II, Feb. 19, www.ijcea.com ISSN 2321-3469
Geeta C M, Mithila Lakshmi G, Shreyas Raju R G, Raghavendra S, Rajkumar Buyya, Venugopal K R,
S S Iyengar, and L M Patnaik 1
SADGUR: SECURE AUDITING, DEDUPLICATION AND GROUP
USER REVOCATION OF SHARED DATA IN CLOUD
Geeta C M1, Mithila Lakshmi G1, Shreyas Raju R G1, Raghavendra S2, Rajkumar Buyya3,
Venugopal K R4, S S Iyengar5, and L M Patnaik6
1Department of Computer Science and Engineering, University Visvesvaraya College of
Engineering, Bangalore University, Contact:[email protected]
2Department of Computer Science and Engineering, Vivekananda College of Engineering and
Technology, Puttur, India
3Cloud Computing and Distributed Systems (CLOUDS) Lab, School of Computing and Information
Systems, The University of Melbourne, Australia
4Bangalore University, Bengaluru, India
5Department of Computer Science and Engineering, Florida International University, USA
6INSA, National Institute of Advanced Studies, Indian Institute of Science Campus, Bengaluru, India
ABSTRACT:
With the storage and sharing services provided by the cloud, users can easily modify and share data as a group. To allow the integrity of shared data to be verified publicly, users in the group need to compute signatures on all the blocks in the shared data. Because of data modifications performed by different users, different blocks in the shared data are signed by different users. For security reasons, when a user is revoked from the group, the blocks previously signed by the revoked user must be re-signed by an existing user. This is inefficient owing to the large volume of shared data in the cloud. By exploiting proxy re-signatures, the cloud can instead be authorized to re-sign blocks on behalf of existing users during user revocation. Furthermore, when different users upload identical data to cloud storage, the repository holds duplicate copies; deduplication is therefore commonly used to reduce the space and bandwidth requirements of the service by removing redundant data and keeping only a single copy. Aiming to achieve both data integrity and deduplication in the cloud, we introduce a novel scheme, Secure Auditing, Deduplication and Group User Revocation of Shared Data in Cloud (SADGUR). Our scheme is collusion resistant and supports efficient user revocation, with the CSP efficiently re-signing the revoked user's blocks. Performance analysis shows that our mechanism achieves secure file-level and block-level deduplication, reduces the time cost of tag generation, supports efficient batch auditing, and decreases the average auditing time compared with the existing mechanism.
Keywords: Cloud Computing, Deduplication, Public Auditing, Proof of Retrievability, Proof of
Ownership, User Revocation.
[1] INTRODUCTION
Cloud storage is a model of networked enterprise storage in which data is kept in virtualized pools of storage that are generally hosted by third parties. Cloud storage gives users benefits ranging from cost reduction and simplified maintenance to portability and scalable services. These attractive features encourage clients to adopt the cloud and store their documents in it. Although cloud storage has been widely adopted, it fails to provide some important emerging requirements, such as the ability to verify the integrity of cloud data and to detect duplicated files across cloud servers.
With the data storage and sharing services (e.g., Dropbox and Google Drive) managed by the CSP, clients can easily work together as a group by sharing data with one another. Once a user creates shared data in the cloud, every user in the group can not only access and modify the shared data but also share the latest version of it with all the users of the group. Although cloud providers promise a secure and reliable environment, the integrity of data in the cloud may still be compromised owing to hardware/software failures and human errors [1]. To protect the integrity of cloud data, a signature is attached to each block of the file, and the integrity of the data depends on the correctness of all the signatures. A public verifier can then periodically check the integrity of the data in the cloud without retrieving the entire data, a process known as public auditing.
Most existing schemes [2], [3] focus on verifying the integrity of personal data, but none of them considers efficient user revocation when auditing the integrity of shared data in the cloud. With shared data, when a user modifies a block, he also computes a new signature for the modified block. Because of the modifications made by different users, different blocks are signed by different users. For security reasons, when a user misbehaves, he must be revoked from the group, after which he can no longer access or modify the shared data. Consequently, even though the content of the shared data is unchanged by the revocation, the blocks previously signed by the revoked user must be re-signed by the Cloud Service Provider (CSP). By exploiting proxy re-signatures [4], the CSP is authorized to re-sign blocks on behalf of existing users during user revocation. The integrity of the entire data can then be verified using the public keys of the users in the group.
As cloud services are used globally, the volume of data stored at remote cloud servers keeps growing, and much of it is identical: according to a study by EMC [5], 75 percent of today's digital data consists of duplicate copies. This observation motivates a technique called deduplication, in which the cloud servers keep only one copy of each file (or block) and create a link to that file (or block) for every user who owns or asks to store an identical copy. We propose the Secure Auditing, Deduplication and Group User Revocation of Shared Data in Cloud (SADGUR) scheme, in which the CSP performs efficient deduplication on the data uploaded by the group manager as well as on the blocks of existing users. Furthermore, the public verifier efficiently checks the integrity of the shared data and also supports batch auditing.
1.1 Motivation
Users and organizations are increasingly drawn to the storage and data-sharing services offered by the cloud. When a group of users shares data, one of them may act maliciously; the group manager must detect such a user and revoke him from the group. In addition, a basic challenge of today's cloud storage services is managing the ever-increasing volume of data. Instead of keeping unlimited duplicate copies, deduplication removes redundant data by storing only one genuine copy and pointing the other copies to it. This paper focuses on shared-data auditing with efficient user revocation; it also performs efficient integrity auditing and deduplication of the data uploaded by the group manager as well as of the blocks of existing users. The scheme supports batch auditing and reduces the time cost of tag generation.
1.2 Contribution
In this paper, we introduce a new scheme, Secure Auditing, Deduplication and Group User Revocation of Shared Data in Cloud (SADGUR), that supports secure file-level and block-level deduplication, secure integrity auditing of revoked users' blocks, and integrity verification of shared data by a public verifier. Our contributions can be outlined as follows:
- We present a novel Secure Auditing, Deduplication and Group User Revocation of Shared Data in Cloud (SADGUR) scheme.
- The scheme is collusion resistant and supports efficient user revocation; the CSP efficiently audits and re-signs the revoked user's blocks.
- The scheme supports secure file-level and block-level deduplication and reduces the time cost of tag generation.
- The scheme supports efficient batch auditing and decreases the average auditing time compared with the existing mechanism.
- Performance analysis demonstrates the efficiency and effectiveness of SADGUR.
1.3 Organization
The rest of the paper is organized as follows: Section 2 outlines related work, including the pros and cons of existing integrity auditing and deduplication schemes. Section 3 discusses earlier models and their drawbacks, and Section 4 presents several preliminaries. Section 5 describes the problem statement and system model. Section 6 gives the details of the Secure Auditing, Deduplication and Group User Revocation of Shared Data in Cloud scheme. Section 7 presents the security analysis, Section 8 analyzes the experimental results, and Section 9 concludes the paper.
[2] RELATED WORKS
In this section we review both integrity verification and secure deduplication, and provide a table comparing recent schemes in the two areas.
2.1 Integrity Auditing
The Provable Data Possession (PDP) scheme [6] assures clients that the cloud servers keep the target files, without the need to fetch the complete data. Ateniese et al. [7] introduced an efficient PDP mechanism that, however, does not support the insert operation. Zhu et al. [8] designed cooperative PDP for multi-cloud storage. The Proof of Retrievability (PoR) mechanism [2] also supports integrity verification; in contrast to PDP, PoR guarantees not only that the cloud servers keep the target files but also that they can be fully recovered. In [2], clients employ erasure codes and create a proof for each block to achieve provability and recoverability. To support efficient data dynamics, Wang et al. [3] enhanced the PoR model by employing a Merkle Hash Tree for block tag authentication. The Most Significant Index Generation Technique (MSIGT) [9] enables secure and efficient tag generation; its advantage is that it decreases the cost to the data owner.
A secure multi-owner data sharing scheme for dynamic groups in the cloud [10] is built on RSA and the Chinese Remainder Theorem (RSA-CRT). Its advantages are low storage and processing cost; its disadvantage is that it does not support multimedia files. Jiang et al. [11] introduced a dynamic public integrity auditing mechanism with secure group user revocation. Their scheme supports group data encryption and decryption during the data repair process and achieves efficient and secure user revocation; its limitation is a high computation cost in the setup phase. Simulations are carried out in C++ [12].
Fu et al. [13] analyzed a Novel Privacy-aware Public (NPP) auditing scheme for shared cloud data with multiple group owners. Its advantages are that it supports identity privacy, traceability, non-frameability and group user revocation; its limitation is a higher communication cost. Wang et al. [14] proposed an Identity-Based Data Outsourcing (IBDO) framework for a multi-client setting. The scheme supports identity-based outsourcing and comprehensive auditing; its drawback is the high time cost of the auditor.
2.2 Secure Deduplication
Data deduplication is a specialized data compression technique for eliminating duplicate copies of repeated data in storage. It is employed to improve storage utilization and reduce bandwidth consumption. Halevi et al. [21] presented the Proof of Ownership (PoW) protocol, which lets a client efficiently prove to a server that it actually possesses a file. Ng et al. [22] first studied private data deduplication as a counterpart of the public data deduplication protocol of Halevi et al. [21].
Venugopal et al. [23] applied soft computing techniques to data mining applications for storage. Hur et al. [18] studied a server-side deduplication method that allows the cloud server to control access to stored data even as ownership changes dynamically; this prevents data leakage to revoked users and to the cloud storage server, but it incurs extra computational overhead. Zheng et al. [19] proposed a reliable deduplication technique that supports secure deduplication with dynamic video storage against malicious users and a dishonest cloud; its drawback is the high computation cost on the cloud servers.
Considering the security of sensitive data, Raghavendra et al. [24] introduced a region- and dimension-restricted multi-keyword search technique. The method minimizes tag storage space and preserves the privacy of keywords, but it needs more search time on image data sets. Xu et al. [16] presented a dynamic integrity verification scheme for remote storage of version files. Its advantages are that it efficiently extends the coverage of verified files and achieves fast and comprehensive verification for distinct versions of data; however, it does not support fast recovery of an arbitrary version file at any time. Jiang et al. [17] introduced the μR-MLE2 (μRandomized Message-Locked Encryption 2) scheme, in which the server decreases the time complexity of the deduplication similarity test and achieves better performance in the data equality test; its disadvantage is that μR-MLE2 requires more deduplication testing time. Table I compares the integrity auditing and deduplication schemes of recent works.
[3] BACKGROUND WORK
Li et al. [20] designed two secure schemes, SecCloud and SecCloud+, that achieve both data integrity and deduplication in the cloud. SecCloud uses a MapReduce cloud that creates tags for the data and also verifies the integrity of the data stored in the cloud; in SecCloud, the computation cost of the client is significantly reduced. SecCloud+ supports integrity auditing and secure deduplication on encrypted data. Wang et al. [4] presented a public auditing framework for the integrity of outsourced data with efficient user revocation. By adopting proxy re-signatures, the CSP is authorized to re-sign blocks for existing users during user revocation. The public verifier can regularly check the integrity of the shared data without fetching the entire data from the cloud, and the scheme also supports batch auditing. Its limitation is that it is not collusion resistant.
[4] PRELIMINARIES
The preliminaries form the foundations of Secure Auditing, Deduplication and Group User
Revocation of Shared Data in cloud scheme and are discussed below.
4.1 Bilinear Map: Let G and GT be two cyclic multiplicative groups of large prime order p. A bilinear pairing is a map e: G × G → GT with the following properties [25]:
- Bilinear: e(g1^c, g2^d) = e(g1, g2)^(cd) for all g1, g2 ∈ G and c, d ∈R Zp;
- Non-degenerate: there exist g1, g2 ∈ G such that e(g1, g2) ≠ 1;
- Computable: there is an efficient algorithm to compute e(g1, g2) for all g1, g2 ∈ G.
Computational Diffie-Hellman (CDH) Problem: given g, g^k, g^l ∈ G for unknown k, l ∈ Zp, compute g^(kl).
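To make the bilinearity property concrete, the following is a deliberately insecure toy model, purely for intuition: group elements of G are represented additively as integers mod p, and the "pairing" is e(a, b) = a·b mod p. Real schemes use pairings on elliptic curves; this toy only mirrors the algebra e(c·g1, d·g2) = c·d·e(g1, g2). All parameter values here are illustrative assumptions.

```python
# Toy (insecure) bilinear map, for intuition only: represent elements of G
# additively as integers mod p, and define e(a, b) = a*b mod p.
p = 1_000_003  # an illustrative prime "group order"

def e(a: int, b: int) -> int:
    """Toy pairing e: G x G -> GT satisfying e(c*g1, d*g2) = c*d*e(g1, g2)."""
    return (a * b) % p

g1, g2 = 12345, 67890   # two arbitrary nonzero "generators"
c, d = 17, 23           # scalars in Z_p

# Bilinearity (written additively in GT): e(c*g1, d*g2) == c*d*e(g1, g2)
lhs = e((c * g1) % p, (d * g2) % p)
rhs = (c * d * e(g1, g2)) % p
assert lhs == rhs

# Non-degeneracy: e(g1, g2) is not the identity (0 in this additive toy GT)
assert e(g1, g2) != 0
```

The toy collapses immediately as cryptography (the map is invertible), but it shows exactly which equalities the auditing equations later in the paper rely on.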
4.2 Convergent Encryption
Convergent encryption [26], [27] provides data confidentiality in deduplication. A user (or the original user) derives a convergent key from the content of the data and encrypts the file with that key. The user also derives a tag for the file, which is used to detect duplicate copies.
TABLE I. COMPARISON OF INTEGRITY AUDITING AND DEDUPLICATION SCHEMES IN CLOUD

Wang et al., 2017 [14]
  Concept: IBDO scheme that allows a user and her authorized delegates to securely outsource files to a remote cloud server and facilitates comprehensive auditing.
  Performance: Achieves a high detection probability of corruption by several auditors.
  Advantages: Supports identity-based features and comprehensive auditing.
  Disadvantages: Time cost of the auditor is high.

Fu et al., 2017 [13]
  Concept: Novel Privacy-Preserving (NPP) public verification mechanism for multiple group owners in shared cloud storage.
  Performance: NPP has the lowest computation cost compared with Knox and PDM.
  Advantages: Supports identity privacy, traceability, non-frameability and group user revocation.
  Disadvantages: Higher communication cost.

Jiang et al., 2016 [11]
  Concept: Public integrity verification for shared dynamic cloud data with group user revocation.
  Performance: The Verify algorithm has a high computation overhead.
  Advantages: Supports public auditing and efficient user revocation with confidentiality, efficiency, countability and traceability.
  Disadvantages: Expensive computational effort in the Setup phase.

Zhang et al., 2016 [15]
  Concept: Efficient chameleon-hashing-based privacy-preserving auditing.
  Performance: The auditor has a low computation cost.
  Advantages: Identity privacy preserved; low computation cost.
  Disadvantages: The cloud server has a large computation cost.

Wang et al., 2015 [4]
  Concept: Public verification for shared data with efficient user revocation.
  Performance: No communication overhead for existing users during user revocation; the cloud has a reduced computation cost.
  Advantages: Secure user revocation; public auditing.
  Disadvantages: Collusion of the revoked user and the cloud.

Xu et al., 2018 [16]
  Concept: Efficient integrity verification scheme for remote storage of version files.
  Performance: Effectively expands the coverage of verified files and improves the efficiency of file verification.
  Advantages: Achieves fast and comprehensive verification for distinct versions of data.
  Disadvantages: Does not support fast recovery of an arbitrary version file at any time.

Jiang et al., 2017 [17]
  Concept: μR-MLE2 (μRandomized Message-Locked Encryption 2) reduces the linear pairing-comparison time of R-MLE2 to nearly logarithmic time.
  Performance: The server decreases the time complexity of the deduplication similarity test.
  Advantages: Achieves better performance in the data equality test even when the number of data elements is very large.
  Disadvantages: μR-MLE2 requires more deduplication testing time.

Hur et al., 2016 [18]
  Concept: Secure data deduplication with dynamic ownership management in cloud storage.
  Performance: Low communication overhead; guaranteed tag consistency.
  Advantages: Prevents data leakage; assures data integrity.
  Disadvantages: Incurs additional computational overhead.

Zheng et al., 2016 [19]
  Concept: Encrypted cloud media center with secure deduplication.
  Performance: Incurs little storage overhead, but more computation cost.
  Advantages: Guards against brute-force attacks.
  Disadvantages: High computation overhead.

Li and Jin, 2015 [20]
  Concept: Secure verification and deduplication of data in the cloud.
  Performance: Higher auditing time cost.
  Advantages: Secure deduplication on encrypted data.
  Disadvantages: Increased response time cost.

SADGUR (proposed)
  Concept: Secure Auditing, Deduplication and Group User Revocation of Shared Data in Cloud.
  Performance: Reduces the time cost of tag generation; reduced batch auditing time cost.
  Advantages: Collusion resistant; supports efficient user revocation; the CSP efficiently audits and re-signs the revoked user's blocks; provides secure file-level and block-level deduplication.
  Disadvantages: Does not support sector-level auditing.
We assume that the tag correctness property [27] holds: if two files are identical, then their tags are also identical. Conventionally, a convergent encryption scheme has four basic algorithms:
- KeyGen(F): the key generation function takes the file content F as input and outputs the convergent key ckF of F;
- Encrypt(ckF, F): the encryption function takes the convergent key ckF and the file content F as input and outputs the ciphertext ctF;
- Decrypt(ckF, ctF): the decryption function takes the convergent key ckF and the ciphertext ctF as input and outputs the plaintext file F;
- TagGen(F): the tag generation function takes the file content F as input and outputs the tag tagF of F.
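The four algorithms above can be sketched in a few lines. This is an illustrative sketch, not the paper's construction: real deployments derive the key with SHA-256 but encrypt with a block cipher such as AES; here a SHA-256 counter-mode keystream stands in for the cipher so the example stays standard-library only.

```python
import hashlib

def _keystream(key: bytes, n: int) -> bytes:
    """Illustrative stream cipher: SHA-256 in counter mode (stands in for AES)."""
    out = b""
    ctr = 0
    while len(out) < n:
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:n]

def key_gen(f: bytes) -> bytes:
    """KeyGen(F): convergent key derived deterministically from the content."""
    return hashlib.sha256(f).digest()

def encrypt(ck: bytes, f: bytes) -> bytes:
    """Encrypt(ck_F, F): XOR the plaintext with the content-derived keystream."""
    return bytes(a ^ b for a, b in zip(f, _keystream(ck, len(f))))

def decrypt(ck: bytes, ct: bytes) -> bytes:
    """Decrypt(ck_F, ct_F): the XOR stream cipher is its own inverse."""
    return encrypt(ck, ct)

def tag_gen(f: bytes) -> bytes:
    """TagGen(F): tag used for duplicate detection (here, a double hash)."""
    return hashlib.sha256(hashlib.sha256(f).digest()).digest()

f = b"shared block contents"
ck = key_gen(f)
ct = encrypt(ck, f)
assert decrypt(ck, ct) == f
# Tag correctness: identical files yield identical tags, enabling deduplication.
assert tag_gen(f) == tag_gen(b"shared block contents")
```

Because both the key and the tag are functions of the content alone, two users who upload the same file independently produce the same ciphertext and the same tag, which is exactly what lets the CSP deduplicate without seeing plaintext.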
4.3 Homomorphic Authenticators
Homomorphic authenticators [6] allow a public verifier to check the integrity of data stored in the cloud without fetching the entire data. Let (pk, sk) be the signer's public/secret key pair, η1 the signature on block κ1 ∈ Zp, and η2 the signature on block κ2 ∈ Zp. A homomorphic verifiable signature scheme has the following properties:
- Blockless verifiability: given η1 and η2, two random values β1, β2 ∈ Zp and a block κ′ = β1κ1 + β2κ2 ∈ Zp, a verifier is able to check the correctness of block κ′ without knowing κ1 and κ2.
- Non-malleability: given κ1 and κ2, η1 and η2, two random values β1, β2 ∈ Zp and a block κ′ = β1κ1 + β2κ2 ∈ Zp, a user who does not hold the secret key sk is not able to generate a valid signature η′ on block κ′ by combining η1 and η2.
Blockless verifiability allows a verifier to audit the integrity of cloud data given only a linear combination of the blocks, via a challenge-and-response protocol, so the entire data need not be downloaded to the verifier. Non-malleability means that parties who do not possess the proper secret keys cannot produce valid signatures on combined blocks by combining existing signatures.
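Blockless verifiability can be illustrated with a secret-key homomorphic MAC, a simplification of the paper's publicly verifiable, pairing-based authenticators: tag each block as σ_i = α·κ_i + PRF(k, i) mod p. Given challenge coefficients β_i, the aggregated tag Σβ_i·σ_i verifies against the aggregated block Σβ_i·κ_i alone, so the verifier never sees the individual blocks. The modulus and PRF below are illustrative assumptions.

```python
import hashlib
import secrets

p = (1 << 61) - 1  # illustrative prime modulus (Mersenne prime M61)

def prf(key: bytes, idx: int) -> int:
    """Pseudorandom function keyed per block index."""
    d = hashlib.sha256(key + idx.to_bytes(8, "big")).digest()
    return int.from_bytes(d, "big") % p

alpha = secrets.randbelow(p)      # signer's secret scalar
mac_key = secrets.token_bytes(16)  # signer's secret PRF key

def sign(idx: int, block: int) -> int:
    """Homomorphic tag on one block: sigma_i = alpha*block + PRF(k, i) mod p."""
    return (alpha * block + prf(mac_key, idx)) % p

k1, k2 = 42, 77                    # two data blocks
s1, s2 = sign(1, k1), sign(2, k2)
b1, b2 = 5, 9                      # challenge coefficients beta_1, beta_2

agg_block = (b1 * k1 + b2 * k2) % p      # the prover returns only this...
agg_sig = (b1 * s1 + b2 * s2) % p        # ...and this aggregated tag

# Blockless verification: check the combination without seeing k1, k2.
expected = (alpha * agg_block + b1 * prf(mac_key, 1) + b2 * prf(mac_key, 2)) % p
assert agg_sig == expected
```

The scheme in this paper achieves the same linearity with BLS-style signatures under pairings, which additionally makes verification public; the algebraic identity being checked is the same.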
4.4 Proxy Re-Signatures
Blaze et al., [28] suggested the agent re-signatures that will empower a semi-trusted
delegator to perform as interpreter of signatures among two customers. The delegator is unable to
determine any private keys of the two customers. In this paper, the semi-trusted Cloud Service
SADGUR: SECURE AUDITING, DEDUPLICATION AND GROUP USER REVOCATION OF SHARED DATA IN CLOUD
Geeta C M, Mithila Lakshmi G, Shreyas Raju R G, Raghavendra S, Rajkumar Buyya, Venugopal K R,
S S Iyengar, and L M Patnaik 8
Provider (CSP) is permitted to perform as a delegator and transform signatures for customers while
customer repudiation. Conventional proxy re-signature schemes [28], [29] does not satisfy blockless
verifiability ie., if we utilize these schemes, the auditor has to retrieve entire data to audit the
sincerity, that automatically decreases the adeptness of verification. Therefore, we utilize
Homomorphic Authenticable Proxy Re-signature mechanism (HAPS) [4] that satisfies blockless
provability and non-modifiability. In our paper, the CSP checks the honesty of the repudiated
customer chunks and signs these chunks utilizing the re-signing key.
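The core of proxy re-signing can be shown with a classic discrete-log toy (not the HAPS construction itself, whose re-signing key is generated interactively in [4]): the revoked user signs H(m)^a, the proxy holds rk = b/a mod q, and exponentiating by rk turns the signature into H(m)^b under the manager's key, with the proxy never holding a or b individually. The tiny safe-prime parameters are illustrative assumptions.

```python
import hashlib
import secrets

P = 2039  # toy safe prime, P = 2q + 1 (illustrative, far too small for real use)
q = 1019  # prime order of the quadratic-residue subgroup of Z_P*

def h(msg: bytes) -> int:
    """Hash into the order-q subgroup by squaring mod P."""
    return pow(int.from_bytes(hashlib.sha256(msg).digest(), "big") % P, 2, P)

a = secrets.randbelow(q - 1) + 1  # revoked user's secret key
b = secrets.randbelow(q - 1) + 1  # group manager's secret key

msg = b"block kappa_k"
sigma_a = pow(h(msg), a, P)       # revoked user's original signature H(m)^a

rk = (b * pow(a, -1, q)) % q      # re-signing key b/a mod q, held by the proxy
sigma_b = pow(sigma_a, rk, P)     # proxy re-signs: (H(m)^a)^(b/a) = H(m)^b

# The re-signed value verifies under the manager's key, as if signed fresh.
assert sigma_b == pow(h(msg), b, P)
```

The quotient rk reveals only the ratio of the two keys, which is why the proxy is called semi-trusted: it can translate signatures but cannot forge under either key alone.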
[5] PROBLEM DEFINITION AND SYSTEM MODEL
5.1 Problem Definition
Given that the group manager encrypts and uploads a file to the cloud server and a group of users shares this file, the main objectives are:
- The scheme is collusion resistant and supports efficient user revocation; the CSP efficiently re-signs the revoked user's blocks.
- The scheme supports secure file-level and block-level deduplication and reduces the time cost of tag generation.
- The scheme supports efficient batch auditing and decreases the average auditing time compared with the existing mechanism.
5.2 System Model
Aiming at auditable and deduplicated shared data storage, we propose Secure Auditing, Deduplication and Group User Revocation of Shared Data in Cloud (SADGUR). The cloud storage framework (shown in Fig. 1) contains three entities:
- Group manager and group of users: The group consists of a group manager and a number of users. The group manager is the original owner of the data; he creates and shares data with the existing users in the group via the cloud. Both the group manager and the group users can fetch, download and modify the shared data, which is divided into a number of blocks. A user in the group can modify a block of the shared data by performing an insert, delete or update operation on it.
- Cloud servers: The cloud servers virtualize their resources according to the users' requirements and present them as storage pools. Cloud users can purchase or lease storage space from the cloud servers and keep their data in the leased space for later use.
- Auditor: The public auditor, e.g., a user who wants to use cloud data for particular purposes (search, computation, data mining, etc.) or a Third-Party Auditor (TPA) offering auditing services for data integrity, checks the integrity of the shared data via a challenge-and-response protocol with the cloud.
Fig. 1. Cloud Storage Model
The group manager encrypts the file with the convergent key and outsources it to the cloud server. The CSP performs deduplication: if the file already exists in its storage, the CSP informs the group manager and runs the PoW protocol, after which the group manager is granted access to the file; if the file is not a duplicate, the CSP stores it. The group of users then shares the data uploaded by the group manager.
The shared data is divided into blocks; the existing users make their modifications, sign the affected blocks with their respective secret keys τk, and upload them to the CSP. The CSP again performs deduplication: if a block has been modified, the CSP stores it; otherwise it executes the PoW protocol with the respective user. Throughout this process, the group manager monitors the activity of the existing users. If the group manager finds that an existing user is acting maliciously, or that his membership in the group has expired, the manager immediately revokes him from the group, withdraws all his credentials, and informs the CSP.
In the proposed scheme, the CSP performs deduplication and integrity verification on the revoked user's blocks [4]. After revoking the user, the group manager asks the CSP to verify the revoked user's blocks; the CSP checks their integrity, performs deduplication, and re-signs them with rk_{e→f}. During re-signing, the CSP converts the signatures of the revoked user into signatures of the group manager. Afterwards, the group manager removes the user's id from the User List (UL) and signs the new UL. Auditing of the shared data's integrity is carried out via a challenge-and-response protocol between the CSP and the TPA: the cloud generates a proof of possession of the shared data in ProofGen in response to a challenge from the TPA, and in ProofVerify the TPA checks the correctness of the proof returned by the cloud.
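The file-level deduplication flow just described can be sketched as a tag-indexed store. This is a structural sketch only: the stored object would in practice be the convergent ciphertext, and the real PoW protocol of [21] is Merkle-tree based, whereas here a toy byte-offset challenge stands in for it. All names below are hypothetical.

```python
import hashlib
import os

storage = {}  # tag -> single stored copy of the (nominally encrypted) file
owners = {}   # tag -> set of user ids holding a link to that copy

def tag(data: bytes) -> str:
    """Deduplication tag: a hash over the uploaded content."""
    return hashlib.sha256(data).hexdigest()

def pow_response(data: bytes, offsets) -> list:
    """Toy Proof-of-Ownership: return the bytes at the challenged offsets."""
    return [data[o % len(data)] for o in offsets]

def upload(user: str, data: bytes) -> str:
    t = tag(data)
    if t in storage:
        # Duplicate: run PoW instead of re-storing, then just add a link.
        challenge = [int.from_bytes(os.urandom(2), "big") for _ in range(4)]
        assert pow_response(data, challenge) == pow_response(storage[t], challenge)
        owners[t].add(user)
        return "link created (deduplicated)"
    storage[t] = data
    owners[t] = {user}
    return "stored"

assert upload("manager", b"report-v1") == "stored"
assert upload("user-2", b"report-v1") == "link created (deduplicated)"
assert len(storage) == 1  # only one physical copy, two owners
```

The same shape applies at block level in Phase 2, with block tags in place of file tags.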
[6] THE ALGORITHM
In this section we describe three protocols: the file uploading protocol, the integrity auditing protocol and the proof-of-ownership protocol. Before the detailed description, we first present the system setup phase, which initializes the public and secret parameters of the scheme.
6.1 System setup
Let G1, G2 be two groups of prime order p, g a generator of G1, e: G1 × G1 → G2 a bilinear map, and w another generator of G1. The global parameters are (e, p, G1, G2, g, w, H), where H: {0,1}* → G1 is a hash function. The total number of blocks in the shared data is n, and the shared data is denoted S = (κ1, κ2, ..., κn). The total number of users in the group is u.
6.2 File Uploading Protocol
Function: KeyGeneration
1) Generates the system public and secret parameters.
2) Input: u, u1, global parameters (g, Zp*)
3) Output: pki, ski
4) for each i up to u
5)     Generate a random number δi from Zp*
6)     Assign private key ski = δi
7)     Compute public key pki = g^δi
8) u1 creates the UL, which comprises the ids of all users in the group.
9) The UL is public and signed by u1.
10) End
User u1 is designated the group manager. The group manager constructs the private key ski and public key pki for every existing user in the group, as shown in Function KeyGeneration. The group manager also generates the User List (UL), which contains the ids of all the existing users of the group, and publishes it.
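Function KeyGeneration can be made concrete with modular exponentiation. The parameters below (a small prime modulus and generator) are illustrative assumptions; the scheme itself works in the pairing group G1.

```python
import secrets

p = 2039  # illustrative prime modulus (the scheme uses a pairing group G1)
g = 7     # illustrative generator
u = 5     # number of users in the group

def key_generation(u: int):
    """Per-user keys as in Function KeyGeneration: sk_i = delta_i, pk_i = g^delta_i."""
    keys = []
    for _ in range(u):
        delta = secrets.randbelow(p - 2) + 1  # random delta_i in Z_p*
        keys.append((delta, pow(g, delta, p)))
    return keys

keys = key_generation(u)
assert len(keys) == u
# Each public key is the generator raised to that user's secret exponent.
assert all(pk == pow(g, sk, p) for sk, pk in keys)
```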
Algorithm 1: SADGUR: Secure Auditing, Deduplication and Group User Revocation of Shared Data in Cloud
Input: F1 = (κ1, κ2, ..., κn), u, u1, δe, κk ∈ Zp, idk where k ∈ [1, n], pke, ηk, ρk, rk_{e→f}, auditing message.
Output: ρk, rk_{e→f}, auditing proof
(1) Phase 1: File Level Deduplication
(2) For each file F1 uploaded by u1, the following tasks are performed:
(3) The CSP checks for deduplication of the file. If it is a new file, go to step 4; if the file is already present, the PoW protocol is executed between the CSP and u1.
(4) After confirming that there is no duplicate copy of the file, u1 divides the file into blocks F1 = (κ1, κ2, ..., κn), encrypts the whole shared data and uploads it to the CSP.
(5) The CSP creates a tag for each block, generated dynamically using Pairing-Based Cryptography; the tags are represented in the form κ(x, y), where κ is the block and (x, y) is a vector.
(6) Once the tags are generated for the respective blocks, the keys of each block are sent to u1.
(7) Phase 2: Block Level Deduplication
(8) An existing customer retrieves his respective chunks from the cloud server, performs
modifications, signs them with his private key δ and sends them to the CSP.
(9) for each κk with idk
(10) Compute ηk = (H(idk) · w^κk)^δe
(11) end for
(12) This process follows the following steps:
(13) CSP checks for the deduplication of the respective existing customer blocks. If it is a new
block then it moves to step 14. If the block is present then PoW protocol is executed
between CSP and the respective existing customer.
(14) If the block does not exist in the cloud then the existing customer uploads the modified
block to cloud.
(15) Phase 3: Integrity Auditing for Revoked Customer Chunks by CSP
(16) u1 generates the re-signing key rk_e→f = pkf^ske, sends it to the CSP and instructs the CSP to
perform auditing and re-sign the revoked customer's chunks.
(17) The CSP first verifies e(ηk, g) = e(H(idk) · w^κk, pke). If the auditing result is 0, the CSP
outputs ⊥; otherwise it re-signs with rk_e→f and outputs ρk = (H(idk) · w^κk)^δf
(18) After re-signing, u1 removes the customer ue’s id from UL and signs the new UL.
(19) Phase 4: Integrity Auditing for Shared Information by Third Party Auditor
(20) GenProof
(21) TPA arbitrarily chooses a v-element subset Q of chunks for auditing, where Q ⊂ S.
(22) For each element l ϵ Q, TPA generates an arbitrary ϑl ϵ Zq* where q < p.
(23) Public verifier sends the verification message V = {(l, ϑl ) where l ϵ Q } to the server.
(24) The CSP divides the challenged block set Q into u subsets, i.e., Q = {Q1, Q2, ..., Qu},
where Qi is the subset of chunks modified and signed by existing customer ui in the
cluster.
(25) Let the number of elements in Qi be vi. Then v = Σ_{i=1}^{u} vi and Q = Q1 ∪ Q2 ∪ ... ∪ Qu.
(26) For each existing customer ui's modified and signed chunk set Qi, the CSP estimates the linear
combination of sampled chunks specified in V: ψi = Σ_{l∈Qi} ϑl κl ∈ Zp, and also computes the
aggregated signature Θ = Π_{i=1}^{u} Θi, where Θi = Π_{l∈Qi} ηl^{ϑl} ∈ G1.
(27) The CSP sends the verification proof {ψi, Θ, {idl}_{l∈Q}} to the TPA.
(28) VerifyProof
(29) With the response from the CSP, the public verifier approves the response by examining
the auditing equation
(30) e(Θ, g) = Π_{i=1}^{u} e(Π_{l∈Qi} H(idl)^{ϑl} · w^{ψi}, pki)
(31) If the output is 1, the TPA considers that the integrity of all the chunks in the distributed information S is correct; otherwise the TPA outputs 0.
manager executes the deduplication test by transmitting the hash value of the document F1 to the CSP. If
there is a duplicate, the cloud customer executes the proof of ownership convention with the CSP. If it is
passed, the customer is permitted to access this cached document without uploading the
document, as shown in Algorithm 1 (Phase 1). Otherwise the cluster manager segregates the document F1
into chunks, encrypts them and uploads them to the CSP. The CSP produces a label for each chunk, generated
dynamically using Pairing Based Cryptography, where the tags are represented in the form b(x, y),
where b is the block and (x, y) is a vector. The tag information is then sent to the cluster manager u1.
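Phase 1 above reduces to a hash lookup on the CSP side. A minimal sketch, assuming an in-memory index keyed by the file hash; the index structure and function names are illustrative, not part of the paper's protocol.

```python
import hashlib

# Hypothetical in-memory CSP index: hash(F) -> stored data.
csp_index = {}

def upload_file(data: bytes) -> str:
    """File-level deduplication check, as in Phase 1 of Algorithm 1."""
    digest = hashlib.sha256(data).hexdigest()  # u1 sends h(F1) to the CSP
    if digest in csp_index:
        return "duplicate: run PoW protocol"   # PoW between CSP and claimant
    csp_index[digest] = data                   # the real scheme encrypts first
    return "stored new file"

print(upload_file(b"shared data"))   # stored new file
print(upload_file(b"shared data"))   # duplicate: run PoW protocol
```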
Existing customers retrieve their respective chunks, perform modifications, sign them with
their respective secret keys and then outsource them to the distributed server. The CSP performs
deduplication of the chunk for the respective customer. If it is a modified chunk, the CSP allows
the upload; otherwise the CSP runs the proof of ownership convention, and if it is a duplicate, the CSP allows
the respective customer to retrieve the chunk, as shown in SADGUR (Phase 2).
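The per-block signature ηk = (H(idk) · w^κk)^δ computed in Phase 2 can be sketched in a toy group; the modulus, the generator w and the hash construction below are stand-ins for the BLS-style signature over G1 and are not the paper's parameters.

```python
import hashlib
import secrets

# Toy group: Z_p* with a 64-bit prime standing in for the pairing group G1.
P = 18446744073709551557
W = 7  # assumed second generator w

def H(block_id: str) -> int:
    """Hash into the group, standing in for H: {0,1}* -> G1."""
    return int.from_bytes(hashlib.sha256(block_id.encode()).digest(), "big") % P

def sign_block(block_id: str, kappa: int, delta: int) -> int:
    """eta_k = (H(id_k) * w^kappa_k)^delta in the toy group."""
    return pow(H(block_id) * pow(W, kappa, P) % P, delta, P)

delta = secrets.randbelow(P - 2) + 1   # the customer's secret key
eta = sign_block("id_1", 42, delta)
```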
6.3 Integrity Auditing Protocol
In the proposed scheme, integrity auditing occurs in two levels:
(I) Integrity auditing for the revoked customer chunks by the CSP and
(II) Integrity auditing for shared information by Third Party Auditor.
I) Integrity auditing for the revoked customer chunks by the CSP: The cluster manager (u1)
instructs the CSP to perform auditing and re-sign the revoked customer's chunks. The cluster
manager generates the re-signing key [4], rk_e→f = pkf^ske, and sends it to the CSP. The CSP performs
deduplication of the revoked customer's chunks, checks the integrity of these blocks and re-signs them using the
re-signing key rk_e→f, as illustrated in Algorithm 1 (Phase 3).
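The re-signing step above can be sketched in a toy prime-order subgroup. Note the re-signing key here is instantiated Panda-style [4] as rk = δf · δe⁻¹ mod q so the algebra goes through without a pairing; this concrete form, and all parameters, are assumptions of the sketch rather than the paper's exact formula pkf^ske.

```python
import secrets

# Toy parameters: P = 2Q + 1 is a safe prime and G generates the order-Q subgroup.
P, Q, G = 2039, 1019, 4

def H(bid: str) -> int:
    """Toy hash into the subgroup (process-local; for illustration only)."""
    return pow(G, (hash(bid) % Q) or 1, P)

delta_e = secrets.randbelow(Q - 1) + 1   # revoked customer's key
delta_f = secrets.randbelow(Q - 1) + 1   # existing customer's key

msg = H("id_7") * pow(G, 3, P) % P       # stands in for H(id_k) * w^kappa_k
eta_e = pow(msg, delta_e, P)             # signature under the revoked key

rk = delta_f * pow(delta_e, -1, Q) % Q   # re-signing key e -> f
eta_f = pow(eta_e, rk, P)                # CSP re-signs using only rk

assert eta_f == pow(msg, delta_f, P)     # now valid under the existing key
```

The exponent arithmetic works because the order of every subgroup element divides Q, so δe · rk ≡ δf (mod Q) carries through the exponent.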
II) Integrity Auditing for Shared Information by Third Party Auditor: The analysis of
information sincerity is carried out via a challenge-and-response convention between the CSP and a
Third Party Auditor (TPA). The CSP creates a proof of possession {ψ, Θ, {idl}_{l∈Q}} of the shared
information in GenProof under the challenge V = (l, ϑl) of the TPA. In VerifyProof, the public
verifier audits the accuracy of the proof acknowledged by the CSP, as illustrated in SADGUR (Phase
4). A summary of the notations used in Algorithm 1 is shown in Table II.
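The GenProof/VerifyProof exchange can be sketched as follows. A real TPA checks the pairing equation e(Θ, g) = Π e(Π H(idl)^ϑl · w^ψi, pki); with no pairing library here, the sketch verifies the same algebraic identity directly in the exponent, and every parameter is a toy value chosen for illustration.

```python
import secrets

# Toy parameters: order-Q subgroup of Z_p*; W = G^2 is a second generator.
P, Q, G, W = 2039, 1019, 4, 16

def H(bid: str) -> int:
    return pow(G, (hash(bid) % Q) or 1, P)

delta = secrets.randbelow(Q - 1) + 1     # one customer's secret key, for brevity
blocks = {f"id_{k}": secrets.randbelow(Q) for k in range(4)}   # kappa_k values
sigs = {b: pow(H(b) * pow(W, k, P) % P, delta, P) for b, k in blocks.items()}

# TPA's challenge: random coefficients theta_l for a sampled subset of blocks
chal = {b: secrets.randbelow(Q - 1) + 1 for b in list(blocks)[:3]}

# GenProof (CSP side): linear combination psi and aggregated signature Theta
psi = sum(t * blocks[b] for b, t in chal.items()) % Q
theta = 1
for b, t in chal.items():
    theta = theta * pow(sigs[b], t, P) % P

# VerifyProof: the pairing check, emulated here by comparing exponents directly
rhs = 1
for b, t in chal.items():
    rhs = rhs * pow(H(b), t, P) % P
rhs = rhs * pow(W, psi, P) % P
assert theta == pow(rhs, delta, P)       # integrity of the sampled chunks holds
```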
6.4 Proof of Ownership Protocol
The objective of the PoW convention is to grant safe deduplication at the distributed server. The
distributed server arbitrarily chooses a set of chunk identifiers as a challenge. Upon receiving the
challenge set, the cluster manager searches the tag information sent by the CSP, as illustrated in
SADGUR (Phase 1), for the corresponding tags of the blocks. If the respective tags are retrieved, the
cluster manager sends the tags as a response to the cloud to prove his ownership.
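A minimal sketch of this PoW exchange: the server challenges random chunk identifiers and the claimant answers with the matching tags. Plain SHA-256 digests stand in for the paper's PBC-generated tags, and all names are illustrative.

```python
import hashlib
import secrets

def tag(chunk: bytes) -> str:
    """Per-chunk tag; a plain hash stands in for the PBC tag here."""
    return hashlib.sha256(chunk).hexdigest()

chunks = [b"chunk-%d" % i for i in range(10)]
owner_tags = {i: tag(c) for i, c in enumerate(chunks)}     # held by the cluster manager
csp_tags = dict(owner_tags)                                # held by the CSP

challenge = secrets.SystemRandom().sample(range(10), 3)    # random chunk identifiers
response = {i: owner_tags[i] for i in challenge}           # claimant's response
assert all(csp_tags[i] == t for i, t in response.items())  # ownership proved
```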
TABLE II. SUMMARY OF THE NOTATIONS USED IN THE ALGORITHM
Notation    Description
G1, G2      Groups of order p
g, w        Generators of G1
H           Hash function with H: {0,1}* → G1
tagF        Tag of file F
pk          Public key
sk          Private key
ηk          Signature on block k
n           Overall number of chunks in distributed information
S           Distributed information
u           Total number of customers in the cluster
u1          Cluster manager
UL          Customer list
κk          kth block
idk         kth block identifier
Q           Subset of v random blocks
rk_e→f      Re-signing key used by the CSP
V           Challenge message
ψi          Linear combination of sampled blocks
Θ           Aggregated signature
7. SECURITY ANALYSIS
We carry out the security analysis by considering two games, one for integrity auditing and one for secure
deduplication. The adversary and the challenger are the two players in these games. The
adversary aims to reach the goal condition stated in the game.
Integrity Auditing for Shared Data: Consider the case where the CSP is the adversary;
since the CSP is a semi-trusted party, it might forge the tag for a file F. The TPA acts as the
challenger, generates the secret key sk and public key pk, and forwards pk to the attacker (CSP). The
attacker is then permitted to query the document-upload oracle for any document F, whereupon the
document F with accurate labels is created and uploaded to the distributed server. The
challenger (TPA) can publicly verify these tags with respect to pk. The challenger runs
the integrity verification protocol with the adversary. The adversary forges the file tag
as tag′ and sends it to the challenger. After verification, the challenger successfully detects the
forgery and proves that the adversary has not provided the original file but has forged the uploaded file.
Integrity auditing for the revoked customer chunks by the CSP: The CSP verifies the revoked
customer's chunks and re-signs them with the re-signing key. It sends the integrity
verification report to the cluster manager. The cluster manager receives the verification
report and checks it against the metadata of the revoked customer's chunks. If the
tags of the chunks match, it is proved that the revoked customer's chunks are correct.
Secure File-level Deduplication: Assume a mischievous customer claims to possess a
challenge document F, conspiring with existing customers although it does not
possess this document. A challenge document F is randomly picked and sent to the
challenger. The challenger executes the summary algorithm and generates the summary of
the document F. The attacker colludes with other clients and provokes them to communicate
with the distributed server to try to confirm ownership of document F. Here the
distributed server acts as the honest examiner, and the proof of ownership convention is
executed. The adversary outputs a challenge for this document F to the distributed server. If
the distributed server accepts the document F, we say the adversary succeeds. But by
executing the proof of ownership convention, the distributed server securely verifies that the
challenger for this file F is an unauthorised person, and hence the proposed mechanism
satisfies secure file-level deduplication.
Secure Block-level Deduplication: Assume an adversary tries to upload his chunks
of the file F to the server by colluding with the existing clients in the cluster. He sends these
chunks as a challenge to the CSP. After receiving these chunks, the CSP runs the proof of
ownership protocol, identifies that the challenger is an attacker and
informs the cluster manager. Thus the CSP securely performs chunk-level deduplication
and efficiently protects the shared data from adversaries.
8. PERFORMANCE EVALUATION
In this section, we present an experimental evaluation of the proposed scheme. We use
the Pairing Based Cryptography (PBC) Library [30] to implement the cryptographic operations in our
scheme on an Intel(R) Core(TM) i5-5200U CPU @ 2.20 GHz with 8 GB RAM. To
accomplish λ = 80-bit security, the prime order p of the bilinear groups G and GT is 160 bits and
the length of an element of G is 512 bits. We set the chunk size to 4 KB.
Fig. 2. Tag generation
Fig. 2 shows the time cost for generating file tags. The time cost clearly grows
with the size of the file, because the more blocks a file has, the more homomorphic signatures
need to be computed for file uploading. Compared to the MapReduce algorithm of SecCloud [20],
the time cost of tag generation using AES and an MD5 hash function is reduced. In our
implementation, part of the data in the file is selected, a key is computed using AES, and the
AES output is fed to MD5; the MD5 output is the final tag generated for each file.
MapReduce, in contrast, is a lengthy process involving complicated multiplications on the slave nodes.
Hence the time taken to generate a tag in SADGUR is reduced compared to that of MapReduce
(SecCloud); we have reduced this complexity by replacing MapReduce with AES and MD5 for
generating tags.
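The tag pipeline described above (sample part of the file, apply AES, feed the result to MD5) can be sketched as follows. Python's standard library has no AES, so a keyed SHA-256 stands in for the AES stage here; the key, sample size and function name are illustrative assumptions, not values from the paper.

```python
import hashlib

KEY = b"demo-key"  # illustrative key; a real deployment derives this securely

def file_tag(data: bytes, sample_size: int = 4096) -> str:
    """Tag = MD5 of a keyed transform over a sample of the file."""
    sample = data[:sample_size]                      # part of the file is selected
    keyed = hashlib.sha256(KEY + sample).digest()    # stand-in for the AES stage
    return hashlib.md5(keyed).hexdigest()            # MD5 of that output is the tag

t1 = file_tag(b"some shared file contents")
t2 = file_tag(b"some shared file contents")
assert t1 == t2 and len(t1) == 32                    # deterministic 128-bit tag
```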
Fig. 3. Impact of τ on average auditing time (ms) per task, where u = 10.
The significant advantage of cluster auditing is that the overall number of pairing operations
is reduced, as these operations consume a lot of time during auditing. From Figs.
3 and 4 we see that, with cluster verification, the average verification time spent on each
verifying job can be minimized effectively. Fig. 3 shows the impact of τ on the average verification
time (ms) per job where u = 10. If the overall number of verifying tasks accepted in a short
duration is τ, and the cluster capacity for every job is uj for j ∈ [1, τ], batch auditing performed by the
TPA reduces the overall number of pairing operations for τ verification jobs to τu + 1, while
examining these τ verification tasks independently needs τu + τ pairing operations. Hence, from
Fig. 3, the average verification time per task with cluster auditing for τ = 40 is around 283 ms (Panda) and 217
ms (SADGUR), while the average verification time per task with independent auditing is 315 ms.
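The pairing-operation counts quoted above can be checked directly: for τ tasks and cluster size u, independent auditing costs τu + τ pairings, batch auditing τu + 1, and single-cluster batch auditing only u + 1 (discussed with Fig. 4). The function name is illustrative.

```python
# Pairing-operation counts for tau auditing tasks over a cluster of size u,
# as derived in the batch-auditing discussion.
def pairing_costs(tau, u):
    return {"independent": tau * u + tau,      # verify each task separately
            "batch": tau * u + 1,              # batch auditing by the TPA
            "single_cluster_batch": u + 1}     # all tasks from one cluster

print(pairing_costs(40, 10))
# -> {'independent': 440, 'batch': 401, 'single_cluster_batch': 11}
```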
Fig. 4. Impact of τ on average auditing time (ms) per task, where u = 10.
Fig. 4 shows the impact of τ on the average auditing time (ms) per task where u = 10. If all the
verifying jobs are from the same cluster, the capacity of the cluster is u and the existing
customers' public keys for the cluster are (pk1, ..., pku); then cluster auditing on τ auditing tasks
improves further, as the overall number of pairing operations during cluster auditing
decreases to only u + 1. As illustrated in Fig. 4, if the collective jobs are all from the
same cluster, the performance of cluster auditing is notably enhanced by saving pairing
operations because the users belong to the same group. At τ = 40, the average auditing time
for SADGUR is 137 ms, the Panda scheme takes 219 ms, and independent auditing takes 350 ms.
9. CONCLUSIONS
Aiming to realize both information sincerity and deduplication in the cloud, we presented the
Secure Auditing, Deduplication and Group User Revocation of Shared Data in Cloud (SADGUR)
mechanism. The proposed scheme is collusion resistant: the CSP performs secure deduplication,
generates data tags for the revoked user's blocks and efficiently audits the integrity of the revoked
user's blocks stored in the cloud. The public verifier effectively audits the shared information
without fetching the complete information from the cloud. Performance analysis shows that
the time cost of tag generation is reduced, efficient batch auditing is supported and the average
auditing time is decreased compared to existing mechanisms. Further, researchers need to
design effective auditing schemes that provide sector-level auditing.
REFERENCES
[1] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. Katz, A. Konwinski, G. Lee, D.
Patterson, A. Rabkin, I. Stoica et al., “A View of Cloud Computing,” Communications of
the ACM, vol. 53, no. 4, pp. 50–58, 2010.
[2] H. Shacham and B. Waters, “Compact Proofs of Retrievability,” Journal of Cryptology,
vol. 26, no. 3, pp. 442–483, 2013.
[3] Q. Wang, C. Wang, J. Li, K. Ren, and W. Lou, “Enabling Public Verifiability and Data
Dynamics for Storage Security in Cloud Computing,” Computer Security–ESORICS, pp.
355–370, 2009.
[4] B. Wang, B. Li, and H. Li, “Panda: Public Auditing for Shared Data with Efficient User
Revocation in the Cloud,” IEEE Transactions on Services Computing, vol. 8, no. 1, pp. 92-
106, 2015.
[5] J. Yuan and S. Yu, “Secure and Constant Cost Public Cloud Storage Auditing with
Deduplication,” in IEEE Conference on Communications and Network Security (CNS).
IEEE, pp. 145–153, 2013.
[6] G. Ateniese, R. Burns, R. Curtmola, J. Herring, L. Kissner, Z. Peterson, and D. Song,
“Provable Data Possession at Untrusted Stores,” in Proceedings of the 14th ACM
Conference on Computer and Communications Security, pp. 598–609, 2007.
[7] G. Ateniese, R. Di Pietro, L. V. Mancini, and G. Tsudik, “Scalable and Efficient Provable
Data Possession,” in Proceedings of the 4th International Conference on Security and
Privacy in Communication Networks, pp. 1–9, ACM, 2008.
[8] Y. Zhu, H. Hu, G.-J. Ahn, and M. Yu, “Cooperative Provable Data Possession for Integrity
Verification in Multicloud Storage,” IEEE Transactions on Parallel and Distributed
Systems, vol. 23, no. 12, pp. 2231–2244, 2012.
[9] S. Raghavendra, C. M. Geeta, R. Buyya, K. R. Venugopal, S. S. Iyengar, and L.M. Patnaik,
“MSIGT: Most Significant Index Generation Technique for Cloud Environment,” in
Proceedings of the Annual IEEE India Conference (INDICON), pp. 1–6, 2015.
[10] S. Raghavendra, P. A. Doddabasappa, C. M. Geeta, R. Buyya, K. R. Venugopal, S. S.
Iyengar, and L. M. Patnaik, “Secure Multi-Keyword Search and Multi-User Access Control
over an Encrypted Cloud Data,” International Journal of Information Processing, vol. 10,
no. 2, pp. 51–61, 2016.
[11] T. Jiang, X. Chen, and J. Ma, “Public Integrity Auditing for Shared Dynamic Cloud Data
with Group User Revocation,” IEEE Transactions on Computers, vol. 65, no. 8, pp. 2363–
2373, 2016.
[12] K. R. Venugopal and R. Buyya, “Mastering C++,” McGraw-Hill Education, 2013.
[13] A. Fu, S. Yu, Y. Zhang, H. Wang, and C. Huang, “NPP: A New Privacy- Aware Public
Auditing Scheme for Cloud Data Sharing with Group Users,” IEEE Transactions on Big
Data, 2017.
[14] Y. Wang, Q. Wu, B. Qin, W. Shi, R. H. Deng, and J. Hu, “Identity-Based Data Outsourcing
with Comprehensive Auditing in Clouds,” IEEE Transactions on Information Forensics and
Security, vol. 12, no. 4, pp. 940–952, 2017.
[15] J. Zhang and X. Zhao, “Efficient Chameleon Hashing-Based Privacy-Preserving Auditing in
Cloud Storage,” Cluster Computing, vol. 19, no. 1, pp. 47–56, 2016.
[16] G. Xu, M. Lai, J. Li, L. Sun, and X. Shi, “A Generic Integrity Verification Algorithm of
Version Files for Cloud Deduplication Data Storage,” EURASIP Journal on Information
Security, vol. 2018, no. 1, p. 12, 2018.
[17] T. Jiang, X. Chen, Q. Wu, J. Ma, W. Susilo, and W. Lou, “Secure and Efficient Cloud Data
Deduplication with Randomized Tag,” IEEE Transactions on Information Forensics and
Security, vol. 12, no. 3, pp. 532–543, 2017.
[18] J. Hur, D. Koo, Y. Shin, and K. Kang, “Secure Data Deduplication with Dynamic Ownership
Management in Cloud Storage,” IEEE Transactions on Knowledge and Data Engineering,
vol. 28, no. 11, pp. 3113–3125, 2016.
[19] Y. Zheng, X. Yuan, X. Wang, J. Jiang, C. Wang, and X. Gui, “Towards Encrypted Cloud
Media Centre with Secure Deduplication,” IEEE Transactions on Multimedia, pp. 1–16,
2016.
[20] J. Li, J. Li, D. Xie, and Z. Cai, “Secure Auditing and Deduplicating Data in Cloud,” IEEE
Transactions on Computers, vol. 65, no. 8, pp. 2386–2396, 2016.
[21] S. Halevi, D. Harnik, B. Pinkas, and A. Shulman-Peleg, “Proofs of Ownership in Remote
Storage Systems,” in Proceedings of the 18th ACM Conference on Computer and
Communications Security. ACM, pp. 491–500, 2011.
[22] W. K. Ng, Y. Wen, and H. Zhu, “Private Data Deduplication Protocols in Cloud Storage,” in
Proceedings of the 27th Annual ACM Symposium on Applied Computing, pp. 441–446, 2012.
[23] K. R. Venugopal, K. G. Srinivasa, and L. M. Patnaik, “Soft Computing for Data Mining
Applications,” Springer, 2009.
[24] S. Raghavendra, C. M. Geeta, R. Buyya, K. R. Venugopal, S. S. Iyengar, and L. M. Patnaik,
“DRSMS: Domain and Range Specific Multi-Keyword Search over Encrypted Cloud Data,”
International Journal of Computer Science and Information Security, vol. 14, no. 5, pp. 69–
78, 2016.
[25] D. Boneh, B. Lynn, and H. Shacham, “Short Signatures from the Weil Pairing,” Journal of
Cryptology, vol. 17, no. 4, pp. 297–319, 2004.
[26] M. Abadi, D. Boneh, I. Mironov, A. Raghunathan, and G. Segev, “Message-Locked
Encryption for Lock-Dependent Messages,” in Advances in Cryptology–CRYPTO. Springer,
pp. 374–391, 2013.
[27] M. Bellare, S. Keelveedhi, and T. Ristenpart, “Message-Locked Encryption and Secure
Deduplication,” in Annual International Conference on the Theory and Applications of
Cryptographic Techniques. Springer, pp. 296–312, 2013.
[28] M. Blaze, G. Bleumer, and M. Strauss, “Divertible Protocols and Atomic Proxy
Cryptography,” in International Conference on the Theory and Applications of Cryptographic
Techniques. Springer, pp. 127– 144, 1998.
[29] G. Ateniese and S. Hohenberger, “Proxy Re-signatures: New Definitions, Algorithms, and
Applications,” in Proceedings of the 12th ACM Conference on Computer and
Communications Security. ACM, pp. 310–319, 2005.
[30] “Pairing Based Cryptography (PBC) Library.” [Online]. Available:
http://crypto.stanford.edu/pbc/, 2014.