PROJECT REPORT: SECURE AND DEPENDABLE STORAGE
DESCRIPTION
A complete solution for secure and dependable cloud storage
CHAPTER 1
INTRODUCTION
1.1 OVERVIEW OF THE PROJECT
Cloud computing has been envisioned as the next generation architecture of the
IT enterprise due to its long list of unprecedented advantages in IT: on demand
self-service, ubiquitous network access, location-independent resource pooling,
rapid resource elasticity, usage-based pricing, and transference of risk. One
fundamental aspect of this new computing model is that data is being centralized or
outsourced into the cloud. From the data owners’ perspective, including both
individuals and IT enterprises, storing data remotely in a cloud in a flexible on-
demand manner brings appealing benefits: relief of the burden of storage
management, universal data access with independent geographical locations, and
avoidance of capital expenditure on hardware, software, personnel maintenance,
and so on. While cloud computing makes these advantages more appealing than
ever, it also brings new and challenging security threats to the outsourced data.
Since cloud service providers (CSP) are separate administrative entities, data
outsourcing actually relinquishes the owner’s ultimate control over the fate of their
data. As a result, the correctness of the data in the cloud is put at risk due to the
following reasons. First of all, although the infrastructures under the cloud are
much more powerful and reliable than personal computing devices, they still face a
broad range of both internal and external threats to data integrity. Outages and
security breaches of noteworthy cloud services appear from time to time: Amazon
S3's downtime, Gmail's mass email deletion incident, and Apple MobileMe's
post-launch outage are all such examples. Second, for benefits of their own, there
are various motivations for CSPs to behave unfaithfully toward cloud customers
regarding the status of their outsourced data. Examples include CSPs, for
monetary reasons, reclaiming storage by discarding data that has not been or is
rarely accessed, or even hiding data loss incidents to maintain a reputation. In
short, although outsourcing data into the cloud is economically attractive given
the cost and complexity of long-term, large-scale data storage, it does not offer
any guarantee of data integrity and availability. This problem, if not properly
addressed, may impede successful deployment of the cloud architecture.
1.2 SYSTEM ANALYSIS & FEASIBILITY STUDY
System analysis is the detailed study of the various operations performed by
the system and their relationships within and outside the system. Analysis is the
process of breaking something into its parts so that the whole may be understood.
System analysis is concerned with becoming aware of the problem, identifying the
relevant decision variables, analyzing and synthesizing the various factors, and
determining an optimal, or at least a satisfactory, solution. During this stage a
problem is identified, alternative system solutions are studied, and
recommendations are made about committing the resources required by the system.
1.2.2 FEASIBILITY STUDY
A feasibility analysis usually involves a thorough assessment of the
operational (need), financial, and technical aspects of a proposal. A feasibility
study is a test made to identify whether the user's needs can be satisfied using
current software and hardware technologies, whether the system will be
cost-effective from a business point of view, and whether it can be developed
within the given budgetary constraints. A feasibility study should be relatively
cheap and done at the earliest possible time. Depending on the study, the decision
is made whether to go ahead with a more detailed analysis.
When a new project is proposed, it normally goes through a feasibility
assessment. The feasibility study is carried out to determine whether the
proposed system can be developed with the available resources and what the cost
considerations should be. The factors considered in the feasibility analysis were:
Technical Feasibility
Economic Feasibility
Behavioral Feasibility
1.2.2.1 Technical Feasibility
Technical feasibility examines whether the technology required for
development is available in the market. The assessment of technical feasibility
must be based on an outline design of system requirements in terms of input,
output, files, programs, and procedures. This can be quantified in terms of
volumes of data, trends, frequency of updating, cycles of activity, and so on, in
order to give an introduction to the technical system. Our project is technically
feasible: the required Java, JSP, and database technologies are readily available
and well supported.
1.2.2.2 Economic Feasibility
This feasibility study presents the tangible and intangible benefits of the
project by comparing its development and operational costs. The technique of
cost-benefit analysis is often used as a basis for assessing economic feasibility.
This system needs somewhat more initial investment than the existing system, but
it is justifiable because it will improve the quality of service.
Thus the feasibility study should center on the following points:
Improvement over the existing method in terms of accuracy and timeliness.
Cost comparison.
Estimate of the life expectancy of the hardware.
Overall objective.
Our project is economically feasible; it does not require much cost for the
overall process. The overall objective is to ease the processes of secure data
storage and auditing.
1.2.2.3 Behavioral / Operational Feasibility
This analysis considers how the system will work when it is installed, and
assesses the political and managerial environment in which it is implemented.
People are inherently resistant to change, and computers have been known to
facilitate change. The proposed system is very useful to its users and will
therefore be accepted by a broad audience.
1.3 SYSTEM DESIGN
The most creative and challenging phase of system development is system
design. It provides the understanding and procedural details necessary for the
logical and physical stages of development. In designing a new system, the
designer must have a clear understanding of the objectives the design is aiming
to fulfill. The first step is to determine how the output is to be designed to
meet the requirements of the proposed system. The operational phases are handled
through program construction and testing.

Design of a system can be defined as the process of applying various
techniques and principles for the purpose of defining a device, a process, or a
system in sufficient detail to permit its physical realization. System design is
thus the "how to" approach to the creation of a new system. This important phase
provides the understanding and the procedural details necessary for implementing
the system recommended in the feasibility study. The design step produces a data
design, an architectural design, and a procedural design.
1.4 OVERVIEW OF LANGUAGE USED
1.4.1 JAVA
Java is a small, simple, safe, object-oriented, interpreted (or dynamically
optimized), byte-coded, architecture-neutral, garbage-collected, multithreaded
programming language with a strongly typed exception-handling mechanism for
writing distributed and dynamically extensible programs.

Java is a high-level, third-generation language like C, FORTRAN, Smalltalk,
Perl, and many others. You can use Java to write computer applications that
crunch numbers, process words, play games, store data, or do any of the thousands
of other things computer software can do. Special programs called applets can be
downloaded from the Internet and run safely within a web browser. Java supports
these applications, and the following features make it one of the best
programming languages:
It is simple and object oriented.
It helps to create user-friendly interfaces.
It is very dynamic.
It supports multithreading.
It is highly secure and robust.
It supports Internet programming.
Java is a programming language originally developed by Sun Microsystems and
released in 1995 as a core component of Sun's Java platform. The language derives
much of its syntax from C and C++ but has a simpler object model and fewer
low-level facilities. Java applications are typically compiled to byte code,
which can run on any Java virtual machine (JVM) regardless of computer
architecture.
The original and reference implementations of the Java compiler, virtual
machine, and class libraries were developed by Sun from 1995. As of May 2007, in
compliance with the specifications of the Java Community Process, Sun had made
available most of its Java technologies as free software under the GNU General
Public License. Others have also developed alternative implementations of these
Sun technologies, such as the GNU Compiler for Java and GNU Classpath.
The Java platform is the name for a bundle of related programs from Sun
which allow for developing and running programs written in the Java programming
language. The platform is not specific to any one processor or operating system;
rather, it provides an execution engine (called a virtual machine) and a compiler
with a set of standard libraries which are implemented for various hardware and
operating systems so that Java programs can run identically on all of them.
Different “editions” of the platform are available, including:
Java ME (Micro Edition): Specifies several different sets of libraries (known
as profiles) for devices which are sufficiently limited that supplying the full
set of Java libraries would take up unacceptably large amounts of storage.
Java SE (Standard Edition): For general purpose use on desktop PCs, servers
and similar devices.
Java EE (Enterprise Edition): Java SE plus various APIs useful for multi-tier
client-server enterprise applications.
Java began as a client-side, platform-independent programming language that
enabled stand-alone Java applications and applets. The numerous benefits of Java
resulted in an explosion in the usage of Java in back-end, server-side enterprise
systems. The Java Development Kit (JDK), which was the original standard
platform defined by Sun, was soon supplemented by a collection of enterprise
APIs. The proliferation of enterprise APIs, often developed by several different
groups, resulted in divergence of APIs and caused concern among the Java
developer community.
Java byte code can execute on the server instead of or in addition to the
client, enabling you to build traditional client/server applications and modern thin
client Web applications. Two key server side Java technologies are servlets and
Java Server Pages. Servlets are protocol and platform independent server side
components which extend the functionality of a Web server. Java Server Pages
(JSPs) extend the functionality of servlets by allowing Java servlet code to be
embedded in an HTML file.
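To make the idea of server-side Java concrete, here is a minimal sketch of a program generating a dynamic page. It uses the JDK's built-in com.sun.net.httpserver package (available since Java 6) rather than the Servlet API, so it runs without an external container; the class name and page content are invented for illustration:

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;

public class DynamicPageServer {
    // The dynamic HTML a servlet-style component would emit for one request.
    static String renderPage(String visitor) {
        return "<html><body><h1>Hello, " + visitor + "</h1></body></html>";
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            // No arguments: just show the generated markup and exit.
            System.out.println(renderPage("cloud user"));
            return;
        }
        // With an argument: serve the page over HTTP, servlet-style.
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/", new HttpHandler() {
            public void handle(HttpExchange exchange) throws java.io.IOException {
                byte[] body = renderPage("cloud user").getBytes("UTF-8");
                exchange.getResponseHeaders().set("Content-Type", "text/html");
                exchange.sendResponseHeaders(200, body.length);
                OutputStream out = exchange.getResponseBody();
                out.write(body);
                out.close();
            }
        });
        server.start(); // runs until the process is stopped
    }
}
```

A real servlet would instead extend HttpServlet and be deployed in a container such as Tomcat, but the request-handler shape is the same.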
Features of Java
Platform Independence
The Write-Once-Run-Anywhere ideal has not been fully achieved (tuning for
different platforms is usually required), but Java comes closer than other
languages.
Object Oriented
Object oriented throughout: no coding outside of class definitions, including
main(). An extensive class library is available in the core language packages.
Compiler/Interpreter Combo
Code is compiled to byte codes that are interpreted by the Java virtual
machine (JVM).
Robust
Exception handling is built in, type checking is strong (all data must be
declared with an explicit type), and local variables must be initialized.
Several dangerous features of C & C++ eliminated
No memory pointers
No preprocessor
Automatic Memory Management
Automatic garbage collection - memory management handled by JVM.
Security
No memory pointers.
Programs run inside the virtual machine sandbox.
Array index limit checking.
Code pathologies are reduced by the byte code verifier, which checks classes
after loading.
Dynamic Binding
The linking of data and methods to where they are located is done at
run-time. New classes can be loaded while a program is running; linking is done
on the fly.
Threading
Lightweight processes, called threads, can easily be spun off to perform
multiprocessing.
Great multimedia displays.
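As a small sketch of the threading feature above, the following class spins off two threads to sum the two halves of an array and joins them before combining the results (the class and method names are illustrative):

```java
public class ThreadDemo {
    // Splits the summation across two threads, then combines their partial results.
    static long parallelSum(final int[] data) throws InterruptedException {
        final long[] partial = new long[2];
        Thread lower = new Thread(new Runnable() {
            public void run() {
                for (int i = 0; i < data.length / 2; i++) partial[0] += data[i];
            }
        });
        Thread upper = new Thread(new Runnable() {
            public void run() {
                for (int i = data.length / 2; i < data.length; i++) partial[1] += data[i];
            }
        });
        lower.start();
        upper.start();
        lower.join();   // wait for both lightweight processes to finish
        upper.join();
        return partial[0] + partial[1];
    }

    public static void main(String[] args) throws InterruptedException {
        int[] data = new int[1000];
        for (int i = 0; i < data.length; i++) data[i] = i + 1;
        System.out.println(parallelSum(data)); // prints 500500
    }
}
```

Each thread touches a disjoint half of the array and its own slot of the partial array, so no synchronization beyond join() is needed.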
1.4.2 TECHNOLOGY SPECIFICATION
JDK1.5
The Java 2 Platform Standard Edition Development Kit (JDK) is a
development environment for building applications, applets, and components using
the Java programming language. The JDK includes tools useful for developing and
testing programs written in the Java programming language and running on the
Java platform. These tools are designed to be used from the command line. Except
for the applet viewer, these tools do not provide a graphical user interface.
JDK DOCUMENTATION
The online Java 2 Platform Standard Edition documentation contains API
specifications, feature descriptions, developer guides, reference pages for JDK
tools and utilities, demos, and links to related information. This documentation
is also available as a download bundle which you can install on your machine.
CONTENTS OF THE JDK
The following table contains a general summary of the files and
directories in the JDK.

Development Tools: Tools and utilities (in the bin subdirectory) that will help
you develop, execute, debug, and document programs written in the Java
programming language. For further information, see the tool documentation.

Runtime Environment: An implementation of the J2SE runtime environment (in the
jre subdirectory) for use by the JDK. The runtime environment includes a Java
virtual machine, class libraries, and other files that support the execution of
programs written in the Java programming language.

Additional Libraries: Additional class libraries (in the lib subdirectory) and
support files required by the development tools.

C Header Files: Header files (in the include subdirectory) that support
native-code programming using the Java Native Interface, the JVM Tool Interface,
and other functionality of the Java 2 Platform.

Table 1.1: Contents of the JDK
Source Code: (In src.zip.) Java programming language source files for all
classes that make up the Java 2 core API (that is, source files for the java.*,
javax.*, and some org.* packages, but not for com.sun.* packages). This source
code is provided for informational purposes only, to help developers learn and
use the Java programming language. These files do not include platform-specific
implementation code and cannot be used to rebuild the class libraries. To
extract these files, use any common zip utility, or use the jar utility in the
JDK's bin directory: jar xvf src.zip. The Java programming language is a
general-purpose, concurrent, strongly typed, class-based object-oriented
language. It is normally compiled to the byte code instruction set and binary
format defined in the Java Virtual Machine Specification.
ENHANCEMENTS IN JDK 5
Generics - This long-awaited enhancement to the type system allows a type
or method to operate on objects of various types while providing compile-time
type safety. It adds compile-time type safety to the Collections Framework and
eliminates the drudgery of casting.
Enhanced for Loop - This new language construct eliminates the drudgery
and error-proneness of iterators and index variables when iterating over collections
and arrays.
Autoboxing/Unboxing - This facility eliminates the drudgery of manual
conversion between primitive types (such as int) and wrapper types (such as
Integer).
Typesafe Enums - This flexible object-oriented enumerated type facility
allows you to create enumerated types with arbitrary methods and fields. It
provides all the benefits of the Typesafe Enums pattern.
Varargs - This facility eliminates the need for manually boxing up argument
lists into an array when invoking methods that accept variable-length argument
lists.
Static Import - This facility lets you avoid qualifying static members with
class names, without the shortcomings of the "Constant Interface" antipattern.
Annotations (Metadata) - This language feature lets you avoid writing
boilerplate code under many circumstances by enabling tools to generate it from
annotations in the source code. This leads to a "declarative" programming style
where the programmer says what should be done and tools emit the code to do it.
Also it eliminates the need for maintaining "side files" that must be kept up to date
with changes in source files. Instead the information can be maintained in the
source file.
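The JDK 5 enhancements listed above can be seen together in one short sketch (the class, enum, and values are invented for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import static java.lang.Math.max;   // static import: no Math. qualifier needed

public class Jdk5Demo {
    // Typesafe enum: an enumerated type with its own field and method.
    enum Severity {
        LOW(1), HIGH(3);
        private final int weight;
        Severity(int weight) { this.weight = weight; }
        int weight() { return weight; }
    }

    // Varargs: callers pass any number of ints without building an array.
    static int highest(int... values) {
        // Generics + autoboxing: ints are boxed into the typed list automatically.
        List<Integer> boxed = new ArrayList<Integer>();
        for (int v : values) {   // enhanced for loop, no index variable
            boxed.add(v);
        }
        int best = Integer.MIN_VALUE;
        for (int v : boxed) {    // unboxing back to int
            best = max(best, v);
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(highest(4, 17, 9));      // prints 17
        System.out.println(Severity.HIGH.weight()); // prints 3
    }
}
```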
JAVA SERVER PAGES
Java Server Pages (JSP) is a technology based on the Java language that
enables the development of dynamic web sites. JSP was developed by Sun
Microsystems to allow server-side development. JSP files are HTML files with
special tags containing Java source code that provides the dynamic content. In a
typical deployment, different clients connect via the Internet to a web server;
in this example, the web server is running on UNIX and is the very popular
Apache web server.
At first, static web pages were displayed. Typically these were people's
first experience with making web pages, so they consisted of "My Home Page"
sites and company marketing information. Afterwards, Perl and C were used on the
web server to provide dynamic content. Soon most languages, including Visual
Basic, Delphi, C, and Java, could be used to write applications that provided
dynamic content using data from text files or database requests; these were
known as CGI server-side applications. ASP was developed by Microsoft to allow
HTML developers to easily provide dynamic content, supported as standard by
Microsoft's free web server, Internet Information Server (IIS). JSP is the
equivalent from Sun Microsystems; a comparison of ASP and JSP will be presented
in the following section.
JSP source code runs on the web server in the JSP Servlet Engine. The JSP
Servlet engine dynamically generates the HTML and sends the HTML output to
the client’s web browser.
MAIN REASONS TO USE JSP
Multi-platform
Component reuse by using JavaBeans and EJB.
Advantages of Java.
You can take one JSP file and move it to another platform, web server, or
JSP servlet engine. This means you are never locked into one vendor or
platform. HTML and graphics displayed on the web browser are classed as the
presentation layer. The Java code (JSP) on the server is classed as the
implementation. By having a separation of presentation and implementation, web
designers work only on the presentation and Java developers concentrate on
implementing the application.
JSP ARCHITECTURE
JSPs are built on top of Sun Microsystems' servlet technology. A JSP is
essentially an HTML page with special JSP tags embedded; these tags can contain
Java code. The JSP file extension is .jsp rather than .htm or .html. The JSP
engine parses the .jsp file and creates a Java servlet source file. It then
compiles the source file into a class file; this is done the first time the page
is requested, which is why a JSP is probably slower on first access. On every
subsequent request the already-compiled servlet is executed, and it therefore
returns faster.
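A minimal sketch of such a page (the file name and request parameter are invented for illustration) shows the mix of static HTML and embedded Java:

```jsp
<%-- hello.jsp: compiled into a servlet the first time it is requested --%>
<html>
  <body>
    <% String visitor = request.getParameter("name"); %>
    <h1>Welcome<%= (visitor == null) ? "" : ", " + visitor %></h1>
    <p>Server time: <%= new java.util.Date() %></p>
  </body>
</html>
```

Requesting hello.jsp?name=Alice would produce a page greeting Alice; the request object is one of the implicit objects the generated servlet exposes to JSP code.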
APACHE TOMCAT
Apache Tomcat (or simply Tomcat, formerly also Jakarta Tomcat) is an
open source web server and Servlet container developed by the Apache Software
Foundation (ASF). Tomcat implements the Java Servlet and the Java Server Pages
(JSP) specifications from Oracle Corporation, and provides a "pure Java" HTTP
web server environment for Java code to run.
Tomcat should not be confused with the Apache web server, which is a C
implementation of an HTTP web server; these two web servers are not bundled
together, although they are frequently used together as part of a server
application stack. Apache Tomcat includes tools for configuration and
management, but can also be configured by editing XML configuration files.
Components
Tomcat 5.x was released with Catalina (a servlet container), Coyote (an HTTP
connector), and Jasper (a JSP engine).
Catalina
Catalina is Tomcat's Servlet container. Catalina implements Sun
Microsystems' specifications for Servlet and Java Server Pages (JSP). In Tomcat,
a Realm element represents a "database" of usernames, passwords, and roles
(similar to Unix groups) assigned to those users. Different implementations of
Realm allow Catalina to be integrated into environments where such authentication
information is already being created and maintained, and then use that information
to implement Container Managed Security as described in the Servlet
Specification.
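For illustration, here is a sketch of the simplest Realm wiring: a MemoryRealm reading Tomcat's conf/tomcat-users.xml file. The role name, username, and password below are placeholders, not values from this project:

```xml
<!-- In server.xml: select the in-memory Realm implementation -->
<Realm className="org.apache.catalina.realm.MemoryRealm" />

<!-- In conf/tomcat-users.xml: the "database" of usernames, passwords, and roles -->
<tomcat-users>
  <role rolename="auditor"/>
  <user username="alice" password="changeit" roles="auditor"/>
</tomcat-users>
```

A web application can then restrict URLs to the auditor role in its deployment descriptor, and Catalina enforces the constraint as Container Managed Security.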
Coyote
Coyote is Tomcat's HTTP Connector component that supports the
HTTP 1.1 protocol for the web server or application container. Coyote listens for
incoming connections on a specific TCP port on the server and forwards the
request to the Tomcat Engine to process the request and send back a response to
the requesting client.
Jasper
Jasper is Tomcat's JSP Engine. Tomcat 5.x uses Jasper 2, which is an
implementation of the Sun Microsystems's JavaServerPages 2.0 specification.
Jasper parses JSP files to compile them into Java code as Servlets (that can be
handled by Catalina). At runtime, Jasper detects changes to JSP files and
recompiles them.
Jasper 2
From Jasper to Jasper 2, important features were added:
JSP Tag library pooling - Each tag markup in a JSP file is handled by a tag
handler class. Tag handler class objects can be pooled and reused in the
whole JSP servlet.
Background JSP compilation - While recompiling modified JSP Java code,
the older version is still available for server requests. The older JSP Servlet
is deleted once the new JSP Servlet has finished being recompiled.
Recompile JSP when included page changes - Pages can be inserted and
included into a JSP at runtime. The JSP will not only be recompiled with JSP file
changes but also with included page changes.
CASCADING STYLE SHEETS
Cascading Style Sheets (CSS) is a style sheet language used for describing
the presentation semantics (the look and formatting) of a document written in a
markup language. Its most common application is to style web pages written in
HTML and XHTML, but the language can also be applied to any kind of XML
document, including plain XML, SVG and XUL.
CSS is designed primarily to enable the separation of document content
(written in HTML or a similar markup language) from document presentation,
including elements such as the layout, colors, and fonts. This separation can
improve content accessibility, provide more flexibility and control in the
specification of presentation characteristics, enable multiple pages to share
formatting, and reduce complexity and repetition in the structural content (such
as by allowing for tableless web design). CSS can also allow the same markup
page to be presented in different styles for different rendering methods, such
as on-screen, in print, by voice (when read out by a speech-based browser or
screen reader), and on Braille-based tactile devices. It can also be used to
allow the web page to display differently depending on the screen size or device
on which it is being viewed. While the author of a document typically links that
document to a CSS style sheet, readers can use a different style sheet, perhaps
one on their own computer, to override the one the author has specified. CSS
specifies a priority scheme to determine which style rules apply if more than
one rule matches against a particular element. In this so-called cascade,
priorities or weights are calculated and assigned to rules, so that the results
are predictable.
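A two-rule sketch of that cascade (selectors invented for illustration): both rules match a paragraph carrying class="warning", and the declaration from the more specific selector wins:

```css
p { color: black; }        /* matches every paragraph */
p.warning { color: red; }  /* higher specificity: wins for <p class="warning"> */
```

Ordinary paragraphs stay black; warning paragraphs render red, because the class selector carries more weight in the priority calculation.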
The CSS specifications are maintained by the World Wide Web Consortium
(W3C). Internet media type (MIME type) text/css is registered for use with CSS by
RFC 2318 (March 1998).
XML
Extensible Markup Language (XML) is a markup language that defines a set
of rules for encoding documents in a format that is both human-readable and
machine-readable. It is defined in the XML 1.0 Specification produced by the
W3C, and several other related specifications, all gratis open standards.
The design goals of XML emphasize simplicity, generality, and usability over the
Internet. It is a textual data format with strong support via Unicode for the
languages of the world. Although the design of XML focuses on documents, it is
widely used for the representation of arbitrary data structures, for example in web
services.
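For illustration, a hypothetical audit record (element names invented for this sketch) shows how XML encodes a data structure in a form both humans and machines can read:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<auditRecord id="a-101">
  <user>alice</user>
  <action>UPDATE</action>
  <block>17</block>
</auditRecord>
```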
1.5 SYSTEM SPECIFICATION
HARDWARE SPECIFICATION
Processor: Pentium IV
Speed: 2.5 GHz
RAM: 128 MB
Hard Disk: 40 GB
Table 1.2: Hardware Specification
SOFTWARE SPECIFICATION
Operating System: Windows XP
Tool: MyEclipse Enterprise Workbench 5.1
Programming Package: JDK 1.5.0
Database: MySQL
Server: Tomcat 5.5
Browser: Google Chrome
Table 1.3: Software Specification
1.6 FUTURE ENHANCEMENT
We have described some suggested requirements for public auditing services
and the state of the art that fulfills them. However, this is still not enough for a
publicly auditable secure cloud data storage system, and further challenging issues
remain to be supported and resolved.
The three macro-economic trends that are seen as fuelling the growth
of this industry are:
Shorter employment tenures
Shrinking labor pools
Need for technology workers
In the wake of these new and related trends, it is imperative to upgrade a
company's software or web applications frequently, to make it easier for clients
and employees to address new business needs.
1.7 ORGANISATION OF THE REPORT
Chapter 2 Literature overview
Chapter 3 Development environment
Chapter 4 Design Architecture
Chapter 5 Implementation
Chapter 6 Testing
Chapter 7 Conclusion and future work
Table1.4: Organization of Report
CHAPTER 2
LITERATURE REVIEW
2.1 EXISTING SYSTEM
Cloud computing has been envisioned as the next-generation architecture of
the IT enterprise due to its long list of unprecedented advantages: on-demand
self-service, ubiquitous network access, location-independent resource pooling,
rapid resource elasticity, usage-based pricing, and transference of risk.
Preserving user information is important for the company as well as for the
users. In the existing system, user information is maintained only by the admin;
in a cloud environment, however, user information needs stronger protection than
elsewhere, yet user data is not monitored by anyone.
LIMITATIONS OF THE PRESENT SYSTEM
Byzantine error occurrence.
Integration of the database across many servers.
Pre-requisites for client manipulation.
Replication-based file distribution system.
2.2 PROPOSED SYSTEM
We propose to secure user information in the cloud environment by auditing
user logs and user behavior in the cloud. A third party attempting to crack the
site does so through a user account, so we trace what each and every user does
on the site; by monitoring users keenly, we can easily catch a cracker who
misbehaves. We place an auditor in common between the users and the company
whose information must be secured. The auditor monitors the information in the
cloud and traces each and every transaction, update, insertion, and deletion of
data.
ADVANTAGES
The servers should be updated frequently.
An algorithm to find out misbehaving servers.
A prior request should be made to the CSP for manipulation.
Allocating an agent for supervising.
Providing a unique token for each client within the cloud.
Handling unknown users under supervision.
2.3 COLLECTED INFORMATION
2.3.1 MODULES WITH MODULE DESCRIPTION:
Modules:
1. Desirable Properties for Public Auditing
2. Support Batch Auditing
3. Utilizing Homomorphic Authenticators
4. Handling Multiple Concurrent Tasks
Problem Definition:
In this project we faced some serious problems: it is not easy to trace user
behavior while keeping the auditor common to all parties. The auditor must be
trustworthy for both the owners and the users, and securing the auditor's
authentication is challenging.
Desirable Properties for Public Auditing:
Our goal is to enable public auditing for cloud data storage to become a
reality. Thus, the whole service architecture design should not only be
cryptographically strong, but, more important, be practical from a systematic point
of view. We briefly elaborate below a set of suggested desirable properties that
satisfy such a design principle. The in-depth analysis is discussed in the next
section. Note that these requirements are ideal goals; they are not necessarily
complete yet, or even fully achievable at the current stage.
Figure 2.1: Public Auditing
Support Batch Auditing:
The prevalence of large-scale cloud storage service further demands auditing
efficiency. When receiving multiple auditing tasks from different owners’
delegations, a TPA should still be able to handle them in a fast yet cost-effective
fashion. This property could essentially enable the scalability of a public auditing
service even under a storage cloud with a large number of data owners.
Figure 2.2: Support Batch Auditing
Utilizing Homomorphic Authenticators:
To significantly reduce the arbitrarily large communication overhead for
public auditability without introducing any online burden on the data owner, we
resort to the Homomorphic authenticator technique. Homomorphic authenticators
are unforgivable metadata generated from individual data blocks, which can be
securely aggregated in such a way to assure a verifier that a linear combination of
data blocks is correctly computed by verifying only the aggregated authenticator.
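The aggregation idea can be sketched with a deliberately tiny, insecure discrete-log example: each block m_i gets a tag sigma_i = g^(m_i) mod p, and a single check g^mu = product(sigma_i^(nu_i)) mod p verifies the whole linear combination mu = sum(nu_i * m_i). This is only a toy illustration of the homomorphic property, with invented parameters and no secret key; the schemes in the literature use large bilinear groups and signed tags:

```java
import java.math.BigInteger;

public class HomomorphicTagDemo {
    // Toy public parameters (a real scheme uses large groups and a signing key).
    static final BigInteger P = new BigInteger("1019"); // small prime, demo only
    static final BigInteger G = BigInteger.valueOf(2);

    // Owner: one tag per data block, sigma_i = g^(m_i) mod p.
    static BigInteger[] tag(long[] blocks) {
        BigInteger[] sigma = new BigInteger[blocks.length];
        for (int i = 0; i < blocks.length; i++)
            sigma[i] = G.modPow(BigInteger.valueOf(blocks[i]), P);
        return sigma;
    }

    // Server: aggregate response to challenge coefficients nu_i.
    // mu = sum(nu_i * m_i); aggregated tag = product(sigma_i^(nu_i)) mod p.
    static BigInteger[] respond(long[] blocks, long[] nu, BigInteger[] sigma) {
        BigInteger mu = BigInteger.ZERO;
        BigInteger agg = BigInteger.ONE;
        for (int i = 0; i < blocks.length; i++) {
            mu = mu.add(BigInteger.valueOf(nu[i] * blocks[i]));
            agg = agg.multiply(sigma[i].modPow(BigInteger.valueOf(nu[i]), P)).mod(P);
        }
        return new BigInteger[] { mu, agg };
    }

    // Verifier: one equation covers the whole linear combination of blocks.
    static boolean verify(BigInteger mu, BigInteger aggSigma) {
        return G.modPow(mu, P).equals(aggSigma);
    }

    public static void main(String[] args) {
        long[] blocks = { 11, 23, 42 };
        long[] nu = { 3, 5, 7 };
        BigInteger[] sigma = tag(blocks);
        BigInteger[] proof = respond(blocks, nu, sigma);
        System.out.println(verify(proof[0], proof[1])); // true
        blocks[1] = 99; // the server silently corrupts a block
        BigInteger[] bad = respond(blocks, nu, sigma);
        System.out.println(verify(bad[0], bad[1]));     // false
    }
}
```

Because the tags were computed from the original blocks, any corruption changes mu without matching the aggregated tag, so the single verification equation fails.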
Figure 2.3: Homomorphic authenticator technique
Handling Multiple Concurrent Tasks:
Keeping this natural demand in mind, we note that two previous works can
be directly extended to provide batch auditing functionality by exploring the
technique of bilinear aggregate signature. Such a technique supports the
aggregation of multiple signatures by distinct signers on distinct messages into a
single signature and thus allows efficient verification for the authenticity of all
messages. Basically, with batch auditing the K verification equations (for K
auditing tasks) corresponding to K responses {μ, σ} from a cloud server can now
be aggregated into a single one such that a considerable amount of auditing time is
expected to be saved. A very recent work gives the first study of batch auditing
and presents mathematical details as well as security reasoning.
2.3.2 G. Ateniese et al., "Provable Data Possession at Untrusted Stores,"
Proc. ACM CCS '07, Oct. 2007, pp. 598–609.
Verifying the authenticity of data has emerged as a critical issue in storing
data on untrusted servers. It arises in peer-to-peer storage systems, network file
systems, long-term archives, web-service object stores, and database systems. Such
systems prevent storage servers from misrepresenting or modifying data by
providing authenticity checks when accessing data. However, archival storage
requires guarantees about the authenticity of data on storage, namely that
storage servers possess the data. It is insufficient to detect that data have
been modified or
deleted when accessing the data, because it may be too late to recover lost or
damaged data. Archival storage servers retain tremendous amounts of data, little
of which is accessed. They also hold data for long periods of time, during which
there may be exposure to data loss from administration errors as the physical
implementation of storage evolves, e.g., backup and restore, data migration to
new systems, and changing memberships in peer-to-peer systems. Archival network
storage presents unique performance demands. Given that file data are large and
are stored at remote sites, accessing an entire file is expensive in I/O costs
to the storage server and in transmitting the file across a network. Reading an
entire
archive, even periodically, greatly limits the scalability of network stores. (The
growth in storage capacity has far outstripped the growth in storage access times
and bandwidth). Furthermore, I/O incurred to establish data possession interferes
with on-demand bandwidth to store and retrieve data. We conclude that clients
need to be able to verify that a server has retained file data without retrieving the
data from the server and without having the server access the entire file. Previous
solutions do not meet these requirements for proving data possession. Some
schemes provide a weaker guarantee by enforcing storage complexity: The server
has to store an amount of data at least as large as the client’s data, but not
necessarily the same exact data. Moreover, all previous techniques require the
server to access the entire file, which is not feasible when dealing with large
amounts of data.
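The requirement just stated — verifying that a server retains file data without reading the entire file — can be illustrated with a small spot-checking sketch in Python. It is a deliberate simplification: real PDP schemes use homomorphically combinable tags so the server never returns raw blocks, whereas this toy samples and returns a few blocks; the function names and parameters are illustrative, not from the cited scheme.

```python
# Spot-checking sketch of the idea behind PDP: the client keeps only a key
# and per-block MACs, then challenges a few random blocks instead of
# reading the whole archive, bounding audit I/O on both sides.
import hashlib
import hmac
import random

def tag_blocks(key, blocks):
    """Per-block MACs bound to the block index, computed before outsourcing."""
    return [hmac.new(key, i.to_bytes(8, "big") + b, hashlib.sha256).digest()
            for i, b in enumerate(blocks)]

def challenge(n_blocks, sample):
    """Pick a small random subset of block indices to audit."""
    return random.sample(range(n_blocks), sample)

def verify(key, indices, returned_blocks, tags):
    """Recompute each challenged block's MAC and compare with the stored tag."""
    return all(
        hmac.compare_digest(
            tags[i],
            hmac.new(key, i.to_bytes(8, "big") + blk, hashlib.sha256).digest())
        for i, blk in zip(indices, returned_blocks))

key = b"client-secret"
file_blocks = [bytes([i]) * 32 for i in range(100)]
tags = tag_blocks(key, file_blocks)      # client keeps key + tags only
idx = challenge(len(file_blocks), 5)     # the audit touches 5 of 100 blocks
proof = [file_blocks[i] for i in idx]    # honest server's response
print(verify(key, idx, proof, tags))     # True for an intact file
```

Because a server that has discarded a fraction of the blocks is caught by each sampled index independently, a handful of challenges already detects loss with high probability, which is why sampling scales where full reads do not.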
2.3.3 G. Ateniese et al., “Scalable and Efficient Provable Data Possession,”
Proc. SecureComm ’08, Sept. 2008.
In recent years, the concept of third-party data warehousing and, more
generally, data outsourcing has become quite popular. Outsourcing of data
essentially means that the data owner (client) moves its data to a third-party
provider (server) which is supposed to – presumably for a fee – faithfully store the
data and make it available to the owner (and perhaps others) on demand.
Appealing features of outsourcing include reduced costs from savings in storage,
maintenance and personnel, as well as increased availability and transparent
upkeep of data. A number of security-related research issues in data outsourcing have
been studied in the past decade. Early work concentrated on data authentication
and integrity, i.e., how to efficiently and securely ensure that the server returns
correct and complete results in response to its clients’ queries. Later research
focused on outsourcing encrypted data (placing even less trust in the server) and
associated difficult problems, mainly having to do with efficient querying over the
encrypted domain. More recently, however, the problem of Provable Data
Possession (PDP) – sometimes also referred to as Proof of Retrievability
(POR) – has appeared in the research literature. The central goal in PDP is to
allow a client to efficiently, frequently and securely verify that a server – who
purportedly stores client’s potentially very large amount of data – is not cheating
the client. In this context, cheating means that the server might delete some of the
data or it might not store all data in fast storage, e.g., place it on CDs or other
tertiary off-line media. It is important to note that a storage server might not be
malicious; instead, it might be simply unreliable and lose or inadvertently corrupt
hosted data. An effective PDP technique must be equally applicable to malicious
and unreliable servers. The problem is further complicated by the fact that the
client might be a small device (e.g., a PDA or a cell-phone) with limited CPU,
battery power and communication facilities. Hence, the need to minimize
bandwidth and local computation overhead for the client in performing each
verification. Two recent results, PDP and POR, have highlighted the importance of
the problem and suggested two very different approaches. The first is a public-key-
based technique allowing any verifier (not just the client) to query the server and
obtain an interactive proof of data possession. This property is called public
verifiability. The interaction can be repeated any number of times, each time
resulting in a fresh proof. The POR scheme uses special blocks (called sentinels)
hidden among other blocks in the data. During the verification phase, the client
asks for randomly picked sentinels and checks whether they are intact. If the server
modifies or deletes parts of the data, then sentinels would also be affected with a
certain probability. However, sentinels should be indistinguishable from other
regular blocks; this implies that blocks must be encrypted. Thus, unlike the PDP
scheme, POR cannot be used for public databases, such as libraries, repositories,
or archives. In other words, its use is limited to confidential data. In addition, the
number of queries is limited and fixed a priori. This is because sentinels, and their
position within the database, must be revealed to the server at each query – a
revealed sentinel cannot be reused.
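The sentinel mechanism described above can be sketched as follows. This Python toy is a simplification under stated assumptions: the block values, sizes, and function names are illustrative, and a real POR additionally encrypts and error-corrects the file so that sentinels are indistinguishable from data blocks.

```python
# Toy of the POR sentinel idea: random sentinel blocks are hidden among
# the (notionally encrypted) data blocks; an audit reveals a few sentinel
# positions and checks they are intact. Revealed positions are consumed,
# which is why the number of audits must be fixed in advance.
import os
import random

def encode(data_blocks, n_sentinels):
    """Insert n_sentinels random blocks at random positions."""
    sentinels = {}
    blocks = list(data_blocks)
    for _ in range(n_sentinels):
        pos = random.randrange(len(blocks) + 1)
        val = os.urandom(16)
        blocks.insert(pos, val)
        # shift previously recorded positions at/after the insertion point
        sentinels = {(p + 1 if p >= pos else p): v for p, v in sentinels.items()}
        sentinels[pos] = val
    return blocks, sentinels  # blocks go to the server; the map stays local

def audit(server_blocks, sentinels, n_checks):
    """Spend n_checks unused sentinel positions; True if all are intact."""
    picked = random.sample(list(sentinels), min(n_checks, len(sentinels)))
    ok = all(server_blocks[p] == sentinels[p] for p in picked)
    for p in picked:          # a revealed sentinel cannot be reused
        del sentinels[p]
    return ok

data = [os.urandom(16) for _ in range(20)]
stored, smap = encode(data, n_sentinels=5)
print(audit(stored, smap, 2))   # True while the server keeps everything
```

A server that deletes or modifies a fraction of the blocks disturbs each hidden sentinel with the same probability, so a few checks suffice; but once `smap` is exhausted, no further audits are possible, matching the a-priori query limit noted above.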
2.3.4 H. Shacham and B. Waters, “Compact Proofs of Retrievability,” Proc.
Asiacrypt ’08, LNCS, vol. 5350, Dec. 2008, pp. 90–107.
In this paper, we give proof-of-retrievability schemes with full proofs of
security against arbitrary adversaries in the Juels-Kaliski model. Our first scheme has
the shortest query and response of any proof-of-retrievability scheme with public
verifiability and is secure in the random oracle model. Our second scheme has the
shortest response of any proof-of-retrievability scheme with private verifiability (but
a longer query), and is secure in the standard model. Proofs of storage: recent
visions of “cloud computing” and “software as a service” call for data, both
personal and business, to be stored by third parties, but deployment has lagged.
Users of outsourced storage are at the mercy of their storage providers for the
continued availability of their data. Even Amazon's S3, the best-known storage
service, has recently experienced significant downtime. In an attempt to aid the
deployment of outsourced storage, cryptographers have designed systems that
would allow users to verify that their data is still available and ready for retrieval if
needed: Deswarte, Quisquater, and Saïdane; Gazzoni Filho and Barreto; and Schwarz
and Miller. In these systems, the client and server engage in a protocol; the client
seeks to be convinced by the protocol interaction that his data is being stored. Such
a capability can be important to storage providers as well. Users may be reluctant
to entrust their data to an unknown startup; an auditing mechanism can reassure
them that their data is indeed still available.
2.3.5 K. D. Bowers, A. Juels, and A. Oprea, “HAIL: A High-Availability and
Integrity Layer for Cloud Storage,” Proc. ACM CCS ’09, Nov. 2009, pp.
187–98.
Cloud storage denotes a family of increasingly popular on-line services for
archiving, backup, and even primary storage of files. Amazon S3 is a well known
example. Cloud-storage providers offer users clean and simple file-system
interfaces, abstracting away the complexities of direct hardware management. At
the same time, though, such services eliminate the direct oversight of component
reliability and security that enterprises and other users with high service-level
requirements have traditionally expected. To restore security assurances eroded by cloud
environments, researchers have proposed two basic approaches to client
verification of file availability and integrity.
CHAPTER 3
DEVELOPMENT ENVIRONMENT
3.1 MYECLIPSE:
MyEclipse is a commercially available Enterprise Java and AJAX IDE
created and maintained by the company Genuitec, a founding member of the
Eclipse Foundation. MyEclipse is built upon the Eclipse platform and integrates
both proprietary and open source solutions into the development environment.
MyEclipse has two primary versions: a professional and a standard edition.
The standard edition adds database tools, a visual web designer, persistence tools,
Spring tools, Struts and JSF tooling, and a number of other features to the basic
Eclipse Java Developer profile. It competes with the Web Tools Project, which is
part of Eclipse itself, but MyEclipse is an entirely separate project and offers a
different feature set. Most recently, MyEclipse has been made available via Pulse,
a provisioning tool that maintains Eclipse software profiles, including those that
use MyEclipse.
3.2 MySQL:
MySQL was developed by a consulting firm in Sweden called TcX. They
were in need of a database system that was extremely fast and flexible.
Unfortunately, they could not find anything on the market that could do what they
wanted, so they created MySQL, which is loosely based on another database
management system called mSQL. The product they created was fast, reliable, and
extremely flexible. It is used in many places throughout the world. Lately,
however, it has begun to permeate the business world as a reliable and fast
database system. MySQL is often confused with SQL, the structured query
language developed by IBM. It is not a form of this language but a database system
that uses SQL to manipulate, create, and show data. MySQL is a program that
manages databases, much like Microsoft Excel manages spreadsheets. SQL is a
programming language that is used by MySQL to accomplish tasks within a
database, just as Excel uses VBA (Visual Basic for Applications) to handle tasks
with spreadsheets and workbooks.
A database is a series of structured files on a computer that are organized in
a highly efficient manner. These files can store large amounts of information that can
be manipulated and called on when needed. A database is organized in a
hierarchical manner, from the top down: you start with a database that contains a
number of tables, and each table is made up of a series of columns and rows in
which data is stored.
3.3 J2EE
Today, more and more developers want to write distributed, transactional
applications for the enterprise and leverage the speed, security, and reliability of
server-side technology. J2EE is a platform-independent, Java-centric environment
from Sun for developing, building, and deploying web-based enterprise applications
online. The J2EE platform consists of a set of services, APIs, and protocols that
provide functionality for developing multi-tiered, web-based applications.
At the client tier, J2EE supports pure HTML as well as Java applets or
applications. It relies on JSP and servlet code to create HTML or other formatted
data for the client. EJBs provide another layer, where the platform's logic is stored.
To reduce costs and fast-track enterprise application design and development, the
Java 2 Platform, Enterprise Edition (J2EE) technology provides a component-based
approach to design.
CHAPTER 4
DESIGN ARCHITECTURE
This project, “Towards Secure and Dependable Storage Services in Cloud
Computing,” implemented here as “Online Banking” software, is an online web
application that provides an additional level of security for users'
stored data. Security is enforced by generating a homomorphic
token ID which can be used only once, after which it expires.
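The use-once behaviour of the token ID can be sketched as follows. This Python toy (the project itself is a J2EE application) models only the single-use expiry, not the homomorphic construction; the function names and the in-memory store are hypothetical, not taken from the project code.

```python
# Sketch of a one-time token: the auditor is issued a token that is valid
# for a single login and expires the moment it is redeemed.
import secrets

_issued = {}   # token -> user it was issued to (illustrative in-memory store)

def issue_token(user):
    """Issue an unguessable 128-bit token bound to a user."""
    token = secrets.token_hex(16)
    _issued[token] = user
    return token

def redeem_token(user, token):
    """Valid exactly once: the token is removed as soon as it is checked."""
    return _issued.pop(token, None) == user

t = issue_token("auditor")
print(redeem_token("auditor", t))   # True: first use succeeds
print(redeem_token("auditor", t))   # False: the token has already expired
```

Removing the token atomically at redemption is what makes replaying a captured token useless, which is the security property the paragraph above relies on.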
Modules in this project
Administrator as TPA
Client
Banker
4.1 Administrator
Here the administrator himself serves as the TPA. He has the sole privilege to
modify or make any change in the database. He can access the client data as well
as the employee details. He is the one who authenticates all the processes being
carried out in the application.
4.2 Client
Anyone who uses the services in the application is referred to here as a client.
The client can log in as either a current account user or a savings account user.
There are facilities for transferring funds from one account to another and for
checking the status of the account, which includes balance enquiry, user login
details, and so on.
A client can also apply for a loan online through this application, provided he
has not been blacklisted for malpractice or any other reason.
CHAPTER 5
IMPLEMENTATIONS
5.1 DATA FLOW DIAGRAM
Level 1:
Figure 5.1.1: Client Services
The diagram explains how a client accesses the services: after user authentication,
the client's account offers education loan, vehicle loan, and money transfer
services. Updates to user information in the cloud database are closely monitored
by the auditor until logout.
Level 2:
Figure 5.1.2: Third Party Auditing
The diagram shows the workflow of the TPA: after authentication of the auditing
view, a reference number is generated for each attempt; the auditor chooses a user
to retrieve the monitored data, views the result, and logs out.
5.2 UML DIAGRAMS
5.2.1 Use-case Diagram
Figure 5.2.1.1: Client Login
The above diagram represents the client login, with the administrator and the
client as actors.
Figure 5.2.1.2: Client Services
This diagram explains the client services – login, validation, request, response,
and monitoring – and how the client interacts with the cloud database system.
5.2.2 ACTIVITY DIAGRAM
Figure 5.2.2: Activity Diagram
The above diagram shows the admin and client activities in the cloud storage: from
the home page, a client login leads to the client page with money transfer,
education loan, and vehicle loan activities, while an admin login leads to the admin
page for monitoring client data; both paths update the cloud database system
before the flow ends.
5.2.3 CLASS DIAGRAM
Figure 5.2.3: Class Diagram
The class diagram contains Login as the main class and the remaining classes as sub-classes, and represents how each class inherits from the main class.
5.2.4 SEQUENCE DIAGRAM
Figure 5.2.4: Sequence Diagram
This diagram explains the sequential flow of instructions and how each request and response is processed.
5.3 ARCHITECTURE DIAGRAM
Figure 5.3: Architecture Diagram
The above diagram shows the cloud storage architecture: the owner delegates data
auditing to the third party auditor, issues file access credentials to users, and
stores files on the cloud server; users access files on the server, and the TPA
performs public data auditing on it.
CHAPTER 6
SYSTEM TESTING
Testing is vital to the success of the system. Testing is usually carried out to
check the reliability of the system and to ensure that the system testing goals are
achieved. The aim of testing is to create a bug-free, reliable, and secure system.
Inadequate testing or no testing leads to errors that may not appear until months
later.
The objective of testing is to discover the errors in the system. The
main stages of testing are:
Module Testing
Integration Testing
Validation Testing
Unit Testing
Black box Testing
White box Testing
6.1 MODULE TESTING
Each individual program module is tested for any possible errors. The modules
were also tested against their specifications, i.e., to see whether they work as
intended and behave correctly under various conditions.
6.2 INTEGRATION TESTING
Testing a collection of modules is known as integration testing. A module is a
collection of dependent classes, such as object classes, abstract data types, or
some looser collection of procedures and functions. A module encapsulates related
components and can therefore be tested without the other system modules.
In this phase, data can be lost across interfaces, one module can have an
inadvertent, adverse effect on another, and sub-functions, when combined, may not
produce the desired major function.
6.2.1 TOP DOWN INTEGRATION TESTING
It is an incremental approach for the construction of a program structure.
Here, modules are integrated by moving downwards through the control hierarchy,
beginning with the main control module.
6.2.2 BOTTOM UP INTEGRATION TESTING
It begins with the construction and testing of atomic modules. Because
components are integrated from the bottom up, the processing required for
components subordinate to a given level is always available, and the need for stubs
is eliminated.
6.3 VALIDATION TESTING
Validation is the process of checking whether something satisfies a certain
criterion. Examples include checking whether a statement is true, an appliance
works as intended, a system is secure, or computer data are compliant with
an open standard. Validation implies that one is able to document that a solution
or process is correct or is suited for its intended use.
6.4 UNIT TESTING
It deals with testing the individual units or programs in the system, checking whether the main functions are available and working.
To test whether all hyperlinks are working properly.
To test the administrator facilities, such as authentication checks, change of password, insertion and modification of administrative details, and database manipulation from the client-side administrative login.
6.5 BLACK BOX TESTING
It is a testing technique where the internal architecture of the item being tested is not necessarily known to the tester.
The tester never examines the program code.
This type of testing should not be performed by the developer of the program but by a separate tester.
The tester should know the inputs and the expected outcomes.
6.6 WHITE BOX TESTING
It is also known as glass box, structural, or open box testing.
It uses knowledge of the programming code to examine outputs.
The tester should read and use the source code of the application being tested.
The tester must apply various inputs in order to exercise branches, conditions, loops, and the logical sequence of the statements being executed.
CHAPTER 7
CONCLUSION
7.1 CONCLUSION
In this paper, we investigate the problem of data security in cloud data
storage, which is essentially a distributed storage system. To achieve the
assurances of cloud data integrity and availability and enforce the quality of
dependable cloud storage service for users, we propose an effective and flexible
distributed scheme with explicit dynamic data support, including block update,
delete, and append.
We rely on erasure-correcting code in the file distribution preparation to
provide redundancy parity vectors and guarantee data dependability. By
utilizing the homomorphic token with distributed verification of erasure-coded
data, our scheme achieves the integration of storage correctness assurance and
data error localization, i.e., whenever data corruption is detected during the
storage correctness verification across the distributed servers, we can almost
guarantee the simultaneous identification of the misbehaving server(s).
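The redundancy principle behind the erasure-correcting dispersal above can be reduced to a single XOR parity fragment: one extra server allows any one lost fragment to be rebuilt. The scheme summarized here uses a stronger (m, k) erasure code across many servers; this Python sketch, with illustrative names, shows only the underlying idea.

```python
# Minimal XOR-parity illustration of erasure-coded dispersal: three data
# fragments plus one parity fragment survive the loss of any one fragment.
from functools import reduce

def disperse(fragments):
    """Append one parity fragment (bytewise XOR of all data fragments)."""
    parity = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*fragments))
    return fragments + [parity]

def recover(stored):
    """Rebuild the single missing fragment (None) from the survivors."""
    missing = stored.index(None)
    survivors = [f for f in stored if f is not None]
    rebuilt = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*survivors))
    stored[missing] = rebuilt
    return stored[:-1]            # drop parity, return the data fragments

data = [b"abcd", b"efgh", b"ijkl"]
servers = disperse(data)          # 3 data fragments + 1 parity fragment
servers[1] = None                 # one server loses its fragment
print(recover(servers)[1])        # b'efgh' is rebuilt from the others
```

In the full scheme the parity is computed per server and checked against homomorphic tokens, so a mismatch on one server's verification also localizes the error to that server.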
Considering the time, computation resources, and even the related online
burden of users, we also provide the extension of the proposed main scheme to
support third-party auditing, where users can safely delegate the integrity checking
tasks to third-party auditors and be worry-free to use the cloud storage services.
Through detailed security analysis and extensive experimental results, we show that our
scheme is highly efficient and resilient to Byzantine failure, malicious data
modification attack, and even server colluding attacks.
APPENDIX 1
SCREENSHOTS
HOME PAGE
ADMIN LOGIN PAGE
AUDIT LOGIN-1
TOKEN ID GENERATION
AUDIT LOGIN WITH TOKEN ID
AUTOMATIC RESET OF TOKEN ID AFTER AUDIT LOGIN
ADMIN HOME
AUDIT RECORD
AUDIT DETAILS
USER DETAILS
EMPLOYEE DETAILS
CLIENT SERVICES
ONLINE USER REGISTRATION FORM
USER LOGIN PAGE-1
USER LOGIN PAGE-2
USER ACCOUNT DETAILS
ONLINE TRANSACTION PAGE
MONEY TRANSFER
LOAN PAGE
VEHICLE LOAN PAGE
ONLINE LOAN APPLICATION
LOAN STATUS CHECK
EDUCATION LOAN DETAILS
CONTACT US
APPENDIX 2
REFERENCES:
1. Amazon.com, “Amazon S3 Availability Event: July 20, 2008,” July 2008;
http://status.aws.amazon.com/s3-20080720.html.
2. M. Arrington, “Gmail Disaster: Reports of Mass Email Deletions,” Dec.
2006; http://www.techcrunch.com/2006/12/28/gmail-disaster-reports-of-
mass-email-deletions/.
3. M. Armbrust et al., “Above the Clouds: A Berkeley View of Cloud
Computing,” Univ. California, Berkeley, Tech. Rep. UCB/EECS-2009-28,
Feb. 2009.
4. A. Juels and B. S. Kaliski Jr., “PORs: Proofs of Retrievability for Large
Files,” Proc. ACM CCS ’07, Oct. 2007, pp. 584–97.
5. M. Krigsman, “Apple’s MobileMe Experiences Post-Launch Pain,” July
2008; http://blogs.zdnet.com/projectfailures/?p=908.
6. P. Mell and T. Grance, “Draft NIST Working Definition of Cloud
Computing,” 2009;
http://csrc.nist.gov/groups/SNS/cloud-computing/index.html.