Software Requirement Specification

for

Fuzzy Keyword Search Over Encrypted Data In A

Cloud

Submitted by:

Anju N S EPAMECS009
Chaythanya S K EPAMECS022
Hari K EPAMECS030
Rekha N EPAMECS045

under the guidance of

Mr. Irshad M
Asst. Professor, Dept. of Computer Science and Engineering

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
GOVERNMENT ENGINEERING COLLEGE SREEKRISHNAPURAM
PALAKKAD

August 10, 2015

Contents

List of Figures

1 Introduction
  1.1 Purpose
  1.2 Document Conventions
  1.3 Intended Audience and Reading Suggestions
  1.4 Product Scope
  1.5 References

2 Overall Description
  2.1 Product Perspective
  2.2 Product Functions
  2.3 User Classes and Characteristics
  2.4 Operating Environment
    2.4.1 Software Requirements
    2.4.2 Hardware Requirements
  2.5 Design and Implementation Constraints
  2.6 User Documentation
  2.7 Assumptions and Dependencies

3 External Interface Requirements
  3.1 User Interfaces
  3.2 Hardware Interfaces
  3.3 Software Interfaces
  3.4 Communication Interfaces

4 System Features
  4.1 Upload encrypted files
    4.1.1 Description
    4.1.2 Stimulus Response Sequence
    4.1.3 Functional Requirements
  4.2 Fuzzy Keyword Search
    4.2.1 Description
    4.2.2 Stimulus Response
    4.2.3 Functional Requirements
  4.3 Third Party Auditor
    4.3.1 Description
    4.3.2 Stimulus Response
    4.3.3 Functional Requirements
  4.4 Cloud Admin/CSP
    4.4.1 Description
    4.4.2 Stimulus Response
    4.4.3 Functional Requirements

5 Non-Functional Requirements
  5.1 Safety Requirements
  5.2 Security Requirements

Appendices

Appendix A Glossary

Appendix B Analysis Models

Appendix C To Be Decided List

List of Figures

1 Basic System Representation
2 Three-node networking architecture
3 Data Owner Uploading a File
4 Data User Searching
5 Third Party Auditing
6 Cloud Administrator/Service Provider

1 Introduction

1.1 Purpose

The purpose of this document is to describe and analyse the software requirements for developing the first version of a secure cloud that provides storage as a service, with fuzzy keyword search and third-party auditing for the encrypted files it stores.

1.2 Document Conventions

No particular conventions are followed in this document.

1.3 Intended Audience and Reading Suggestions

This document is intended for developers, marketing staff, documentation writers and users. The rest of the document consists of an overall description of the project and its software features, along with the functional and non-functional requirements. While the document is best read starting from the overview section, it is enough for users to read the documentation section alone. Marketing staff may read only the features and user documentation sections.

1.4 Product Scope

The proposed project allows users to search efficiently in a cloud environment while keeping their data private and secure. Traditional searchable encryption schemes allow users to search securely over encrypted data through keywords and selectively retrieve files of interest, but these techniques support only exact keyword search. That is, they do not tolerate minor typing errors and format inconsistencies, which are typical of user search behaviour and occur very frequently. This significant drawback makes existing techniques unsuitable for a cloud, as it greatly reduces system usability, making the search experience frustrating and system efficacy very low.

Here, we propose to develop a system that solves the problem of effective fuzzy keyword search over encrypted cloud data while maintaining keyword privacy. Fuzzy keyword search greatly enhances system usability by returning the matching files even when users' search inputs don't exactly match the predefined keywords.

The primary disadvantage of cloud computing is security. The proposed system introduces a third-party auditor to audit users' data whenever required. Any user (not just the data owner) can challenge the cloud server for the correctness of the stored data via the third party. The third-party auditor keeps no private information while auditing, so user data remains safe and secure.

1.5 References

• http://docs.openstack.org/

• https://maas.ubuntu.com/docs/


2 Overall Description

2.1 Product Perspective

The product to be developed can be classified as an alternative to existing solutions, but at the same time it is a self-contained project. As mentioned in the Product Scope, existing systems do not implement a fuzzy keyword search combined with third-party auditing. Given below is a simple diagrammatic representation of the product.

Figure 1: Basic System Representation

2.2 Product Functions

• Search and download public files

• Upload files

• Delete uploaded files

• Audit the cloud server using a Third Party Auditor

2.3 User Classes and Characteristics

Based on the product functions mentioned above there are 4 classes of users:

1. Data User: This is the general class of all registered users. Users in this class can upload and search files as well as invoke the auditor. They may not download, edit or delete a file without the permission of its owner.

2. Data Owner: The owner is the most important class of user, as most of the functionality is developed with respect to him. Data Owners are a subset of Data Users. A Data Owner can upload, edit, delete and download his files, and may invoke the auditor as well. Only the Data Owner can view the actual contents of a file.

3. Cloud Service Provider: The CSP is the one who provides the storage service. He acts as the admin and hence can view the user profiles and encrypted files. In the proposed product we assume the CSP to be semi-honest and curious.

4. Third Party Auditor: An entity which has expertise and capabilities that clients do not have, and is trusted to assess and expose the risk of cloud storage services on behalf of the clients upon request. Any registered user can request the TPA to check the validity of a file. The TPA is trusted and hence will not itself make user data vulnerable to threat.
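The four user classes above can be summarised as a small permission table. The following is only an illustrative sketch in Python: the names `Role`, `Action`, `PERMISSIONS` and `is_allowed` are hypothetical, and the `owner_granted` flag is an assumed way of modelling "with the permission of its owner".

```python
from enum import Enum, auto


class Role(Enum):
    DATA_USER = auto()
    DATA_OWNER = auto()
    CSP = auto()
    TPA = auto()


class Action(Enum):
    UPLOAD = auto()
    SEARCH = auto()
    REQUEST_AUDIT = auto()
    DOWNLOAD = auto()
    EDIT = auto()
    DELETE = auto()
    VIEW_PLAINTEXT = auto()
    VIEW_ENCRYPTED = auto()
    VIEW_PROFILES = auto()
    CHALLENGE_CSP = auto()


# Actions each role may perform unconditionally (Data Owner: on his own files).
PERMISSIONS = {
    Role.DATA_USER: {Action.UPLOAD, Action.SEARCH, Action.REQUEST_AUDIT,
                     Action.VIEW_ENCRYPTED},
    Role.DATA_OWNER: {Action.UPLOAD, Action.SEARCH, Action.REQUEST_AUDIT,
                      Action.DOWNLOAD, Action.EDIT, Action.DELETE,
                      Action.VIEW_PLAINTEXT, Action.VIEW_ENCRYPTED},
    Role.CSP: {Action.VIEW_PROFILES, Action.VIEW_ENCRYPTED},
    Role.TPA: {Action.CHALLENGE_CSP, Action.VIEW_ENCRYPTED},
}

# Actions a Data User may perform only when the file's owner has agreed.
NEEDS_OWNER_PERMISSION = {Action.DOWNLOAD, Action.EDIT, Action.DELETE}


def is_allowed(role: Role, action: Action, owner_granted: bool = False) -> bool:
    """Return True if `role` may perform `action` on a file."""
    if action in PERMISSIONS[role]:
        return True
    if role is Role.DATA_USER and action in NEEDS_OWNER_PERMISSION:
        return owner_granted
    return False
```

In this sketch only the Data Owner ever holds `VIEW_PLAINTEXT`, which mirrors the statement that only the owner can view the actual contents of a file.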

2.4 Operating Environment

The system requires the following hardware and software environments for development and to function properly.

2.4.1 Software Requirements

• Operating System: Ubuntu 14.04 Server Edition

• Software Platform: OpenStack Kilo (the 11th OpenStack release)

• Hypervisor: Kernel-based Virtual Machine

2.4.2 Hardware Requirements

• 1 node with 1 processor, 2 GB memory, and 5 GB storage

• 1 node with 1 processor, 512 MB memory, and 5 GB storage

• 1 node with 1 processor, 2 GB memory, and 10 GB storage

2.5 Design and Implementation Constraints

As mentioned in the previous section, the cloud is built and managed using OpenStack. The OpenStack Networking (neutron) architecture requires three nodes, two of which require more than one network interface. Since none of the PCs currently available has more than one Ethernet interface, this requirement is expected to be fulfilled either by a wireless LAN interface in the PCs or by running virtual machines, where network interfaces are configurable.

2.6 User Documentation

Users can expect a complete user manual with the final report on the completion of the project.

2.7 Assumptions and Dependencies

The project is expected to be built on the free software mentioned in the software requirements, especially the hypervisor. If this software turns proprietary in the future, it could affect the development of the product. Apart from software, it is assumed that the amount of hardware dedicated to the project is feasible. Uninterrupted power and network connectivity are also assumed.


3 External Interface Requirements

3.1 User Interfaces

Users can access the Secure Cloud using a web interface. The web page will have the standard forms for User Register and Sign In. Once signed in, the Data User/Owner will be provided with buttons for uploading files and viewing user-owned files, and a search bar to search for public files.

Files owned by the user will be listed in a table with delete and download buttons. There will also be an option to request TPA services. Every page will have the Settings and Logout links.

3.2 Hardware Interfaces

To set up a basic cloud environment, the three-node networking (neutron) architecture is followed:

• The basic controller node runs the management portions of Compute and Networking, the Networking plug-in, and the dashboard. It also includes supporting services such as a database, message broker, and Network Time Protocol (NTP).

• The network node runs the Networking plug-in. It has agents in Layers 2 and 3 that provision and operate tenant networks. Layer 2 services include provisioning of virtual networks and tunnels. Layer 3 services include routing, Network Address Translation (NAT), and Dynamic Host Configuration Protocol (DHCP). This node also handles external (internet) connectivity for tenant virtual machines or instances.

• The compute node runs the hypervisor portion of Compute, which operates tenant virtual machines or instances. By default Compute uses KVM as the hypervisor. The compute node also runs the Networking plug-in and layer 2 agent, which operate tenant networks and implement security groups.


Figure 2: Three-node networking architecture

3.3 Software Interfaces

The private Secure Cloud is set up using the Ubuntu OS and the OpenStack cloud software platform. OpenStack provides many tools for managing a private cloud and providing IaaS. Since the proposed project provides Storage as a Service, OpenStack was considered the best choice. Nova, Cinder, Keystone, Swift, Glance and Horizon are some of the software interfaces provided by OpenStack to manage various features.

The OpenStack deployment uses MaaS (Metal as a Service), which turns bare metal (servers and nodes) into an elastic, cloud-like resource. This means that the cloud administrator just needs to tell MaaS about the machines it should manage, and it will boot them, check that the hardware is okay, and keep them waiting until they are needed. The administrator can then pull nodes up, tear them down and redeploy them at will, just as with virtual machines in the cloud.

When a particular service is ready to be deployed, MaaS gives Juju the nodes it needs to power the service. Juju is used to manage how systems and services are deployed automatically in the cloud.

3.4 Communication Interfaces

The user interacts with the system either to search for a file or to upload/edit his file. Both these interactions are handled through web pages. The back-end programming handles further processing of the request. The users need no internet connectivity since the search is implemented within a private network. File transfer between users and the server is carried out by FTP. OpenStack uses the Network Time Protocol (NTP) for clock synchronization between computer systems over the network.


4 System Features

4.1 Upload encrypted files

4.1.1 Description

The first step in providing a secure cloud is to ensure that the CSP (or any unauthorized user) cannot view or edit the contents of a file. Hence files are encrypted and can only be accessed with the permission of the owner. Neither the CSP nor the TPA can view the file contents; they can only view the encrypted contents.

4.1.2 Stimulus Response Sequence

The data owner logs into his account and browses his local machine for a file he wants to upload. Once he has selected the file, he can either keep the file private or make it available to the search engine by making it public. If the file is made public, he has to specify the type of the file (pdf, ppt, doc, etc.). Before finishing the upload operation he has to provide keywords for his file. The file is then encrypted and sent to the Cloud Administrator for further processing.

Figure 3: Data Owner Uploading a File
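The upload sequence above implies a few checks before a file is accepted. The sketch below shows one way they might be expressed; `UploadRequest` and its fields are hypothetical names, and it assumes, following 4.1.2, that keywords are required for every upload while the file type is required only for public files.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UploadRequest:
    """What the data owner supplies before the file is encrypted and sent."""
    filename: str
    is_public: bool
    file_type: Optional[str] = None          # e.g. "pdf", "ppt", "doc"
    keywords: List[str] = field(default_factory=list)

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the upload may proceed."""
        errors = []
        if not self.filename:
            errors.append("no file selected")
        if not [k for k in self.keywords if k.strip()]:
            errors.append("at least one keyword is required")
        if self.is_public and not self.file_type:
            errors.append("public files must declare a file type")
        return errors
```

A private file with keywords passes without a file type, whereas a public file without one is rejected before encryption and upload.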

4.1.3 Functional Requirements

Data Owner

Requirement 1. File Browser: Data Owners who have files to be outsourced can select the files of their choice using the Browse/Open File option.

Requirement 2. Public/Private: Once a file is selected, the owner can decide whether it should be Public or Private. Only a Public file will be returned during a search. Private files will be visible only to the Data Owner.

Requirement 3. Encrypt the file: Using this option, the user can encrypt the file. Once it is encrypted, the user can check it by viewing the file, which will now show encrypted text.

Requirement 4. Upload: Finally, the user can upload the file to the cloud using the upload option. The user must go through each of the previous options for a successful upload.

CSP/TPA/Data User

Requirement 1. View File: This enables all of these users to view the file content, but only in its encrypted form.

4.2 Fuzzy Keyword Search

4.2.1 Description

Fuzzy keyword search greatly enhances system usability by returning the matching files when users' search inputs exactly match the predefined keywords, or the closest possible matching files based on keyword similarity semantics when an exact match fails.
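The document does not fix a particular similarity measure. A common choice for "keyword similarity" is the edit (Levenshtein) distance, the minimum number of single-character insertions, deletions and substitutions needed to turn one word into another. The sketch below assumes that measure; the function name `edit_distance` is illustrative only.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, computed row by row."""
    # previous[j] holds the distance between the first i-1 chars of a
    # and the first j chars of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]
```

Under this measure a query such as "casle" is at distance 1 from the stored keyword "castle", so a search tolerating one typing error would still return the file.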

When the search button is clicked, a fuzzy keyword search is performed. This can be done in three ways: Wildcard, Gram-based and Tree-traverse search schemes.
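As an illustration of the wildcard scheme, the sketch below builds the fuzzy set of a keyword for an edit distance of 1, using `*` to stand for a single edit at a given position. It works on plaintext and ignores encryption; the function names are hypothetical, and restricting the distance to 1 is an assumption made to keep the example short.

```python
from typing import Set


def wildcard_fuzzy_set(keyword: str) -> Set[str]:
    """Wildcard-based fuzzy set of `keyword` for edit distance 1."""
    variants = {keyword}
    for i in range(len(keyword) + 1):
        # '*' placed between characters covers any one-character insertion.
        variants.add(keyword[:i] + "*" + keyword[i:])
    for i in range(len(keyword)):
        # '*' replacing a character covers a substitution or deletion there.
        variants.add(keyword[:i] + "*" + keyword[i + 1:])
    return variants


def fuzzy_match(query: str, keyword: str) -> bool:
    """True if the two fuzzy sets share at least one element."""
    return not wildcard_fuzzy_set(query).isdisjoint(wildcard_fuzzy_set(keyword))
```

For a keyword of length n this set has 2n + 2 elements (14 for "castle"), far fewer than enumerating every concrete misspelling. The query "casle" matches "castle" because both sets contain "cas*le".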

4.2.2 Stimulus Response

The user logs into his account and searches the cloud for a file. The search returns the matching files, sorted by file type (pdf, ppt, doc, etc.). The user can challenge the integrity of the cloud server via the TPA.

Figure 4: Data User Searching

4.2.3 Functional Requirements

Data Owner

Requirement 1 Keywords: Whenever the user uploads a public file, he/she will be asked to enter a few keywords that best describe the file contents. Keywords are collected because the file, its name and the keywords themselves are all encrypted, so the search has to work on the keywords supplied by the owner.

Requirement 2 File type: If the privacy of the file is set to public, the data owner is asked to specify the type of file being uploaded. Setting the file type (e.g. .mp3/.txt/.pdf/.pptx) would help the developer provide a more efficient and classified search, thus keeping the search results minimal and relevant.

Data User

Requirement 1 Search: The CSP will perform the fuzzy keyword search and return a list of files tagged with the keyword. The user can only download these files.

Requirement 2 Download: The search results in a list of files which the user can download. The downloaded file will be decrypted.
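Since the CSP performs the search but must not learn the keywords, one possible arrangement is for the owner to send keyed hashes (trapdoors) of each keyword variant, and for the user to send trapdoors of the query's variants. The sketch below assumes HMAC-SHA256 trapdoors, a secret key shared by owner and authorised users, and wildcard variants for edit distance 1; none of these choices is fixed by this document, and all names are hypothetical.

```python
import hashlib
import hmac
from collections import defaultdict
from typing import Dict, Set


def variants(word: str) -> Set[str]:
    """Wildcard variants of `word` for edit distance 1."""
    out = {word}
    for i in range(len(word) + 1):
        out.add(word[:i] + "*" + word[i:])
    for i in range(len(word)):
        out.add(word[:i] + "*" + word[i + 1:])
    return out


def trapdoors_for(key: bytes, word: str) -> Set[bytes]:
    """Keyed hashes of every variant; the CSP never sees the plaintext word."""
    return {hmac.new(key, v.encode("utf-8"), hashlib.sha256).digest()
            for v in variants(word)}


class SecureIndex:
    """Index held by the CSP: trapdoor -> identifiers of encrypted files."""

    def __init__(self) -> None:
        self._table: Dict[bytes, Set[str]] = defaultdict(set)

    def add(self, trapdoors: Set[bytes], file_id: str) -> None:
        for t in trapdoors:
            self._table[t].add(file_id)

    def search(self, trapdoors: Set[bytes]) -> Set[str]:
        matches: Set[str] = set()
        for t in trapdoors:
            matches |= self._table.get(t, set())
        return matches
```

The CSP only compares opaque byte strings, so it can return the files tagged "castle" for the misspelt query "casle" without seeing either word. Trapdoors computed under a different key match nothing.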

4.3 Third Party Auditor

4.3.1 Description

The proposed system introduces a third-party auditor to audit users' data whenever required. Any user (not just the data owner) can challenge the cloud server for the correctness of the stored data via the third party. The third-party auditor keeps no private information while auditing, so user data remains safe and secure.

4.3.2 Stimulus Response

The data owner or the user requests the TPA to verify the integrity of the server/CSP. The TPA logs into his account, scans through the pending requests and verifies them. Once the requests are verified, the TPA challenges/queries the CSP (the cloud server) for the validity of the file.

Figure 5: Third Party Auditing

4.3.3 Functional Requirements

Data Owner/User

Requirement 1 Verify file request: This invokes a call to the TPA to verify the required file. After verification, users may download the file. Any user with the public key to the file can place the request.


TPA

Requirement 1 TPA Log in: The third-party auditor will have his own user ID and password with which he can log in and manage the requests.

Requirement 2 Approve request: This allows the TPA to verify the specified file by sending a challenge to the CSP and verifying its result.
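The auditing algorithm is still to be decided (see Appendix C). Purely as an illustration of a challenge and response in which the TPA holds no private data, the sketch below uses a Merkle hash tree over the encrypted file blocks: the TPA keeps only the public root hash, challenges the CSP for a block index, and checks the returned block against the root. The function names and the block layout are assumptions.

```python
import hashlib
from typing import List, Tuple


def _leaf(block: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + block).digest()


def _node(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_levels(blocks: List[bytes]) -> List[List[bytes]]:
    """Hash tree over encrypted blocks; levels[-1][0] is the root."""
    if not blocks:
        raise ValueError("at least one block is required")
    level = [_leaf(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:                 # pad odd levels with the last hash
            level = level + [level[-1]]
            levels[-1] = level
        level = [_node(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels: List[List[bytes]], index: int) -> List[Tuple[bytes, bool]]:
    """CSP side: sibling hashes for block `index`, each flagged if on the left."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(root: bytes, block: bytes, proof: List[Tuple[bytes, bool]]) -> bool:
    """TPA side: recompute the root from the returned block and proof."""
    current = _leaf(block)
    for sibling, sibling_is_left in proof:
        current = _node(sibling, current) if sibling_is_left else _node(current, sibling)
    return current == root
```

In use, the TPA would pick a random block index (for example with `secrets.randbelow`), the CSP would answer with the encrypted block and `prove(levels, index)`, and the TPA would accept only if `verify` returns True. A production scheme would need more care, for instance around how odd levels are padded.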

4.4 Cloud Admin/CSP

4.4.1 Description

The cloud admin is a person who manages and stores the files uploaded by users. He services various requests from data owners and users, and can view user profiles and encrypted files.

4.4.2 Stimulus Response

The cloud admin logs into his account, and can view all the uploaded files and user details. He can also view the pending TPA requests. The admin responds to the challenge raised by the TPA.

Figure 6: Cloud Administrator/Service Provider

4.4.3 Functional Requirements

Requirement 1 Admin Login: The Admin can log in with his username and password. On the Admin Page, the list of all uploaded (encrypted) files, the user details and the approved/pending user requests can be viewed. He also verifies, manages and responds to TPA challenge queries.

Requirement 2 View User Requests: The Admin has a list of user requests which have to be approved before the user can upload files.

Requirement 3 View Uploaded Files: The Admin can view the entire list of uploaded files along with the user details.

Requirement 4 View TPA Requests: The Admin can view a list of pending TPA requests and service them one by one.


5 Non-Functional Requirements

5.1 Safety Requirements

The database is set up on the controller node. Any damage to the controller node poses a serious threat to data.

5.2 Security Requirements

Since the project is intended to address the security concerns in cloud environments, it implements the usual security techniques. All users of the cloud maintain an account with the CSP/Cloud admin. The TPA, as mentioned above, is a separate entity that, unlike the user or the data owner, has special privileges.

It is ensured that the TPA doesn't keep any private information while auditing, so user data remains safe and secure.

Appendix A Glossary

Third Party Auditor (TPA) An entity which has expertise and capabilities that clients do not have, and is trusted to assess and expose the risk of cloud storage services on behalf of the clients upon request. Any registered user can request the TPA to check the validity of a file. The TPA is trusted and hence will not itself make user data vulnerable to threat.

Cloud Service Provider (CSP) A service provider that offers customers storage or software services available via a private network (private cloud) or a public network (cloud). Usually, it means the storage and software are available for access via the Internet.

Infrastructure as a Service (IaaS) Infrastructure as a service (IaaS) is a type of cloud computing in which a third-party provider hosts virtualized computing resources over the Internet.

Storage as a service (STaaS) is an architecture model in which a provider provides digital storage on their own infrastructure. Storage as a service can be implemented as a business model in which a large service provider rents space in their storage infrastructure on a subscription basis.

Network Time Protocol (NTP) is a networking protocol for clock synchronization between computer systems over packet-switched, variable-latency data networks. In operation since before 1985, NTP is one of the oldest Internet protocols in current use.

File Transfer Protocol (FTP) is a standard network protocol used to transfer computer files from one host to another host over a TCP-based network, such as the Internet. FTP is built on a client-server architecture and uses separate control and data connections between the client and the server.


Appendix B Analysis Models

Appendix C To Be Decided List

Other design features, such as the algorithms used by the third-party auditor, the back-end programming language and the procedure for setting up the cloud, are yet to be determined.
