SECURE DATA REPLICATION BETWEEN TWO
DATABASE SERVERS USING TWO FISH
ENCRYPTION ALGORITHM
SITI MAZIDAH BINTI MOHAMAD
BACHELOR OF COMPUTER SCIENCE
(NETWORK SECURITY)
UNIVERSITI SULTAN ZAINAL ABIDIN
2017
SECURE DATA REPLICATION BETWEEN TWO DATABASE SERVERS
USING TWO FISH ENCRYPTION ALGORITHM
SITI MAZIDAH BINTI MOHAMAD
Bachelor of Computer Science (Network Security)
Faculty of Informatics and Computing
Universiti Sultan Zainal Abidin, Terengganu, Malaysia
MAY 2017
DECLARATION
I hereby declare that this report is based on my original work except for quotations
and citations, which have been duly acknowledged. I also declare that it has not been
previously or concurrently submitted for any other degree at Universiti Sultan Zainal
Abidin or other institutions.
________________________________
Name : ..................................................
Date : ..................................................
CONFIRMATION
This is to confirm that:
The research conducted and the writing of this report were carried out under my supervision.
________________________________
Name : ..................................................
Date : ..................................................
DEDICATION
First and foremost, praise be to Allah, the Most Gracious and the Most Merciful,
for blessing me and giving me the opportunity to undertake and complete this final year
project.
I would like to take this opportunity to express my heartfelt gratitude to my
supervisor, Dr. Zarina binti Mohamad, for her teaching, kindness, patience, and ideas
throughout this project.
Finally, I would like to thank the Faculty of Informatics and Computing for giving
students the chance to explore and gain exposure through this project. I would also like
to thank all the lecturers in the Faculty of Informatics and Computing for their great
support in completing this final year project.
ABSTRACT
With the rapid development of multimedia technologies, research on safety and security
is becoming more important. With this development, the number of users accessing web
servers is increasing, and the data they hold changes constantly. Replication is
therefore needed to keep the data available. Data replication is an important technique
in peer-to-peer networks, data grid architectures, clustering, and distributed systems.
It increases data availability, improves data access and reliability, and minimizes the
cost of data transmission. A web server cluster consists of a set of nodes connected
through a network. Each node contains a copy of the entire database, and queries can be
submitted to any node in the cluster. Some of these queries may cause problems. For
example, if two concurrent transactions attempt to modify the same row of data, both
updates cannot be applied at the same time. In addition, data replicated from one
database server to another is not replicated securely. This report proposes secure data
replication using an encryption algorithm. In this method, data from a database is
encrypted first and then replicated to another server. Data encryption translates data
into another form or code, so that only people with access to a secret key or password
can read it. The primary purpose of applying encryption is to ensure that data is
replicated securely over the network.
ABSTRAK
With the rapid development of multimedia technologies, research on security is
becoming more important. With this development, the number of users using web servers
is increasing, and data is continually stored in or added to the database. Replication
is therefore needed for data availability. Data replication is an important technique
in peer-to-peer networks, data grid architectures, clusters, and distributed systems.
It increases data availability, improves data access and reliability, and reduces the
cost of data transmission. A web server consists of a set of nodes connected through a
network. Each node contains a copy of the entire database. Queries can be submitted to
any node in the cluster. Many of these queries may cause problems. For example, if two
concurrent transactions attempt to modify the same row of data, both updates cannot be
applied at the same time. In addition, data copied from one database server to another
is not replicated securely. I therefore propose data replication using an encryption
algorithm. In this method, data from the database is encrypted first and then
replicated to another server. Data encryption translates data into another form or
code, so that only people with access to a secret key or password can read it. The main
purpose of using data encryption is to keep the data securely encrypted as it travels
over the network.
CONTENTS
DECLARATION
CONFIRMATION
DEDICATION
ABSTRACT
ABSTRAK
CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
LIST OF APPENDIX
CHAPTER 1 INTRODUCTION
1.1 Background
1.2 Problem Statement
1.3 Objectives
1.4 Project Scope and Limitation
CHAPTER 2 LITERATURE REVIEW
2.1 Introduction
2.2 Database Server
2.3 Web Server
2.4 Data Replication
2.5 Technique of Encryption
CHAPTER 3 METHODOLOGY
3.1 Introduction
3.2 System Design
3.2.1 Framework
3.2.2 Algorithm
3.3 Conclusion
REFERENCES
APPENDIX
LIST OF TABLES
TABLE TITLE
3.2.2 Twofish Encryption Algorithm
3.2.3 Detail Algorithm
LIST OF FIGURES
FIGURE TITLE
3.2.1 Framework Project
3.2.2 Detail of Framework Project
3.2.3 Framework of Data Encryption
LIST OF ABBREVIATIONS / TERMS / SYMBOLS
SMS Short Message Service
AES Advanced Encryption Standard
DES Data Encryption Standard
LIST OF APPENDIX
APPENDIX TITLE
A Gantt chart
CHAPTER 1
INTRODUCTION
1.1 Background
The rise of the internet and the World Wide Web sparked a revolution not only
in network communications but also in application design and development. With this
revolution, the number of users who surf the internet and access web-based systems has
increased. These systems are developed as web server systems and web server
applications.
The World Wide Web (WWW) is implemented by means of an interconnection
of networks of computer systems. This interconnection of computer systems provides
information and services to users of the web. Computer systems in this interconnection
of networks that provide services and information to users are called web servers.
Computer systems that request services and information use software called a web
browser.
A web server is a computer where web content is stored. Web servers are mainly
used to host websites, but other kinds of servers also exist, such as gaming, storage,
FTP, and email servers. A web server responds to a client request for the resource
associated with the requested URL. It generates the response by invoking a script and
communicating with a database. All files and data requested by the client are
stored automatically in the database.
Data replication is a process that copies and maintains database objects, such as
tables, in multiple databases. A change to the main database is forwarded to and
applied at each of the replica servers, which might be in remote locations. Data
replication can be driven by programs that transport data to another location, where it
is then loaded. Data may be filtered and transformed during replication. Replication
must not interfere with existing applications and should have minimal impact on
production systems. The replication processes need to be managed and monitored. Data
replication improves data access time and transaction time and provides fault tolerance
by maintaining and managing multiple copies of data at different locations. The primary
purpose of replicating data is to increase availability and shorten server response
time. Moreover, data needs to be replicated to another database server in a secure way.
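As a rough illustration of the idea described above, the following Python sketch forwards each change made on a main database to two replicas. The table name `users`, the change tuples, and the helper functions are hypothetical and chosen only for this example. The in-memory SQLite databases stand in for separate database servers, and no security is applied at this stage.

```python
import sqlite3


def make_db():
    """Create an in-memory database that stands in for one server."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    return db


def apply_change(db, change):
    """Apply one (operation, id, name) change to a database."""
    op, row_id, name = change
    if op == "upsert":
        db.execute(
            "INSERT OR REPLACE INTO users (id, name) VALUES (?, ?)",
            (row_id, name),
        )
    elif op == "delete":
        db.execute("DELETE FROM users WHERE id = ?", (row_id,))
    db.commit()


def replicate(main_db, replicas, change):
    """Apply the change on the main database, then forward it to every replica."""
    apply_change(main_db, change)
    for replica in replicas:
        apply_change(replica, change)


def rows(db):
    return db.execute("SELECT id, name FROM users ORDER BY id").fetchall()


main_db = make_db()
replicas = [make_db(), make_db()]
for change in [("upsert", 1, "alice"), ("upsert", 2, "bob"), ("delete", 1, None)]:
    replicate(main_db, replicas, change)
```

After the three changes, every replica holds the same rows as the main database, which is the property replication is meant to maintain.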
With the rapid development of multimedia technologies, research on safety and security
is becoming more important. The delivery and storage of data via electronic media
therefore require a process that ensures the security and integrity of the data. One
way to secure data and information is the encryption process. Encryption is performed
before the data is sent. This process transforms the original data into confidential
data that cannot be read. Decryption is performed by the data receiver: the received
confidential data is converted back to the original data using a key.
Data encryption is used pervasively in today’s connected society. As modern
society becomes more connected and more information becomes available, there is a need
for safeguards that provide data integrity and data secrecy. In addition, authenticating
the source of information gives the recipient certainty that the information came from
the original source and that it has not been altered from its original state.
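One common way to give the recipient this kind of assurance is a message authentication code. The sketch below uses HMAC-SHA256 from the Python standard library. It is only an illustration and not part of the proposed system; the shared key and the message are hypothetical values.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """Compute an authentication tag that only holders of the key can produce."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Return True only if the message is unaltered and was signed with the key."""
    return hmac.compare_digest(sign(key, message), tag)


shared_key = b"hypothetical-shared-key"
message = b"UPDATE accounts SET balance = 100 WHERE id = 7"
tag = sign(shared_key, message)

ok = verify(shared_key, message, tag)
tampered_ok = verify(shared_key, message.replace(b"100", b"900"), tag)
```

A message that was changed in transit, or a tag produced with a different key, fails verification.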
Data encryption is the translation of data into a secret code. Encryption is one of the
most effective ways to achieve data security. To read an encrypted file, you must have
access to a secret key or password that enables you to decrypt it. Unencrypted data is
called plaintext; encrypted data is referred to as ciphertext.
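The relationship between plaintext, key, and ciphertext can be shown with a very small example. The sketch below uses a one-time pad (XOR with a random key as long as the message), chosen only because it fits in a few lines. It is not the algorithm used in this project, and the message is a hypothetical value.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte (one-time pad)."""
    if len(key) != len(data):
        raise ValueError("a one-time pad key must be as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


plaintext = b"replicate row 42"
key = secrets.token_bytes(len(plaintext))  # secret shared by sender and receiver

ciphertext = xor_bytes(plaintext, key)   # unreadable without the key
recovered = xor_bytes(ciphertext, key)   # the same key reverses the operation
```

The same key is used in both directions, which is what makes the scheme symmetric.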
1.2 Problem Statement
Data replication is one of the methods used to manage large volumes of data, as it
enhances reliability and data access. Replication is the process of copying and
maintaining database objects in multiple databases. It increases the availability and
reliability of the data that make up a distributed database system, because alternative
data access options exist. However, the security of data while replication takes place
is not yet satisfactory.
1.3 Objectives
The objectives of this project are:
To propose a technique to ensure the security of data replication in database servers.
To apply an encryption algorithm in data replication.
To test the encryption algorithm in securing data replication in database servers.
1.4 Project Scope and Limitation
The scope of this project is securing data replication between database servers
using an encryption algorithm. The technique will be simulated in a virtualized
environment. Three virtual servers will be used: one master server and two slave
servers.
The limitation of this project is that it will be simulated on virtual servers only, as
a prototype. Implementing it on physical servers would require more cost and time.
CHAPTER 2
LITERATURE REVIEW
2.1 Introduction
There are several published studies related to database servers, data
replication, and encryption techniques. This chapter discusses the ideas of previous
research and related articles. It also highlights the implementation techniques used in
previous research. In addition, these studies serve as evidence that the technique is
feasible.
2.2 Database server
A database server is a computer in a network that is dedicated to database storage
and retrieval. It also holds the database management system and the databases. It
searches the database for selected records and passes back the results when there is
a request from a client.
2.3 Web Server
A web server is a computer where web content is stored. Web servers are mainly
used to host websites. The security requirements of a web server on such a network are
more extensive than those of a single multi-user computer system or a stand-alone
local area network, since there is two-way communication. Web security is a set of
procedures, practices, and technologies for protecting Web servers, Web users and their
surrounding organizations (Garfinkel and Spafford, 1997).
2.4 Data Replication
Data replication is one of the methods used to manage large volumes of data, as it
enhances reliability and data access (Noraziah, Azila, Fauzi, Mat & Mohd, 2011).
Replication is a software feature that synchronizes data to a remote system within the
same site or at a different location. Replicating data helps to provide data redundancy
and safeguards against failure of the storage system or database server at the main
production site (December 2016).
Synchronous and asynchronous replication technologies have been available for
a long time. Synchronous replication has the advantage of no data loss, but
because of latency it is limited by distance and bandwidth.
Asynchronous replication, on the other hand, has no distance limitation, but can lose
data in proportion to the replication lag. A method implemented within EMC
RecoverPoint allows the system to move dynamically between
these replication options without any disruption to the I/O path. As latency grows, the
system moves from synchronous replication to semi-synchronous replication and
then to snapshot shipping. It returns to synchronous replication when more bandwidth is
available and latency allows (Assaf Natanzon, 2013).
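The trade-off between the two modes can be made concrete with a small simulation. The Python sketch below is a simplified model under assumed names (`Primary`, `Replica`, `ship`), not the RecoverPoint method. A synchronous primary copies every write to the replica before acknowledging it, while an asynchronous primary queues writes and ships them later, so a failure loses whatever is still queued.

```python
from collections import deque


class Replica:
    def __init__(self):
        self.rows = []


class Primary:
    def __init__(self, replica, synchronous):
        self.rows = []
        self.replica = replica
        self.synchronous = synchronous
        self.pending = deque()  # writes acknowledged but not yet shipped

    def write(self, row):
        self.rows.append(row)
        if self.synchronous:
            self.replica.rows.append(row)  # replica is updated before the ack
        else:
            self.pending.append(row)       # ack now, replicate later

    def ship(self, count):
        """Send up to `count` queued writes to the replica."""
        for _ in range(min(count, len(self.pending))):
            self.replica.rows.append(self.pending.popleft())

    def unreplicated(self):
        """Number of writes that would be lost if the primary failed now."""
        return len(self.rows) - len(self.replica.rows)


sync_primary = Primary(Replica(), synchronous=True)
async_primary = Primary(Replica(), synchronous=False)
for row in range(5):
    sync_primary.write(row)
    async_primary.write(row)
async_primary.ship(3)  # the replica has caught up with only 3 of 5 writes

lost_if_sync_primary_fails = sync_primary.unreplicated()
lost_if_async_primary_fails = async_primary.unreplicated()
```

In this model the synchronous pair never has unreplicated writes, while the asynchronous pair loses exactly the writes still waiting in the queue.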
Additionally, Mattias Holmgren (2015) applied multi-master database
replication to an e-learning system. This research investigated the possibility of
combining multi-master database replication technologies with a LEMP stack on
a tiny server to increase the availability of e-learning services in remote areas. The
aim was to evaluate the combination of SymmetricDS, for multi-master database
replication and conflict detection and resolution, with an e-learning system.
Fragmentation is a technique designed to divide a single relation or class of a
database into two or more partitions such that the combination of the partitions yields
the original database without any loss of information (Nurul Syafiqah Hani, 2016).
Fragmentation of data can be horizontal, vertical, or hybrid. Nurul Syafiqah Hani
(2016) applied the horizontal fragmentation technique in the data replication process.
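Horizontal fragmentation can be sketched as splitting the rows of a table by a predicate and recovering the original table as the union of the fragments. The following Python example is a minimal sketch under assumed data: the `students` rows, the `site` column, and the fragment names are hypothetical and are not taken from the cited work.

```python
def fragment_horizontally(rows, predicates):
    """Assign each row to the first fragment whose predicate accepts it."""
    fragments = {name: [] for name in predicates}
    for row in rows:
        for name, accepts in predicates.items():
            if accepts(row):
                fragments[name].append(row)
                break
        else:
            raise ValueError(f"no fragment accepts row {row!r}")
    return fragments


def reconstruct(fragments):
    """Union of all fragments, ordered by primary key."""
    merged = [row for part in fragments.values() for row in part]
    return sorted(merged, key=lambda row: row["id"])


students = [
    {"id": 1, "name": "A", "site": "north"},
    {"id": 2, "name": "B", "site": "south"},
    {"id": 3, "name": "C", "site": "north"},
]
predicates = {
    "north": lambda row: row["site"] == "north",
    "other": lambda row: row["site"] != "north",
}
fragments = fragment_horizontally(students, predicates)
restored = reconstruct(fragments)
```

Because every row lands in exactly one fragment, combining the fragments returns the original relation without loss of information.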
2.5 Technique of Encryption
Encryption is the technique or process of transforming plaintext data into
ciphertext in order to conceal its meaning and prevent any unauthorized recipient from
retrieving the original information or data (August 2016). Encryption techniques are
used to secure information while it is stored within a network node or while it is in
transit or being replicated between database servers and their own nodes (Amit Dhir,
2000).
Short Message Service (SMS) is the oldest application for exchanging messages
between communicating parties in cellular networks used by mobile phones. These
messages are encrypted over the air with the A5/1 algorithm and stored as clear text at
the network operator. Unfortunately, recent developments have shown that this algorithm
is no longer secure. Blerim Rexha (2014) presented an efficient solution for encrypting
SMS between communicating parties using the Advanced Encryption Standard
(AES) algorithm in the Android environment. To assure privacy, the
SMS application must provide end-to-end encryption.
Encryption is a practical means of achieving information secrecy. In this respect,
Data Encryption Standard (DES) cryptography and its variant Triple DES have, over the
last three decades, played a major role in securing data in this sector of the economy
and within other governmental and private sector agencies (Kefa Rabah, 2005).
Next, Ariel M. Sison (2012) proposed the implementation of an improved Data
Encryption Standard (DES) algorithm for securing smart card data. Although smart cards
already provide a secure portable storage device, security is still a major concern
for electronic data systems, which must be protected against accidental or unlawful
destruction during transmission or while in storage. One way of ensuring security is
through encryption, so that only the intended parties are able to read and access the
confidential information inside the smart card. The improvement, which is the inclusion
of Odd-Even substitution in DES, ensures that even if the data is intercepted by other
networks or redirected to other destinations, its integrity and confidentiality will not
be compromised.
CHAPTER 3
METHODOLOGY
3.1 Introduction
A methodology is defined as a particular procedure or set of procedures, while a
software development methodology is a framework that is used to structure, plan,
and control the process of developing a system. This chapter gives a detailed
explanation of the methodology used in this project, which is how to secure data
replication using an encryption algorithm. The technique applied in this
project is explained concisely in this chapter.
Data replication is the process of storing separate copies of a database at two
or more sites. It is a simple process that can be used to increase the availability
of data. However, this project aims to secure data replication between two
databases, which is why an encryption algorithm is applied in the data replication
process.
3.2 System Design
3.2.1 Framework
A framework is often a layered structure indicating what kinds of programs can
or should be built and how they would interrelate. It may be a set of functions within a
system and the way they interrelate. A framework is generally more comprehensive than a
protocol and more prescriptive than a structure.
The proposed project architecture in Figure 3.2.1 starts with data
collection from the web server, and the data is stored in the database server. The data
is collected when users access the web server and add or delete data or information
about themselves. Since the data might be in unstructured or semi-structured form, it
needs to be sorted and stored in a database table in the database server.
Figure 3.2.1
In the first step, the text file stored in the database that is to be encrypted and
replicated to another database server is selected. The selected text file is then
encrypted to ensure the security of the data while replication takes place. The
remaining steps in the encryption of data are explained based on Figure 3.2.2.
Then, the encrypted text file is replicated to the backup server. Replicating data
from the database server to the backup server can increase availability and shorten
server response time. It also provides fault tolerance by maintaining and managing
multiple copies of data at different locations.
After the replication process succeeds, the text file needs to be decrypted
before it is stored in the database, because the data stored in the database must be
readable. Finally, the decrypted data is stored in the backup server. The figure
below shows the encryption of data in detail in diagrammatic form.
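The flow described in this section (select, encrypt, replicate, decrypt, store) can be sketched end to end in Python. This is a minimal sketch under several assumptions: the two in-memory SQLite databases stand in for the database server and the backup server, the `files` table and the function names are hypothetical, and the cipher is a simple SHA-256 keystream used only as a placeholder. It is not Twofish and is not meant for real use.

```python
import hashlib
import sqlite3


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Placeholder symmetric cipher (NOT Twofish): XOR data with a hash-based keystream.

    Applying the function twice with the same key returns the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def make_db():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE files (name TEXT PRIMARY KEY, content BLOB)")
    return db


def replicate_securely(main_db, backup_db, name, key):
    """Select a file, encrypt it, 'send' it, then decrypt and store it on the backup."""
    (content,) = main_db.execute(
        "SELECT content FROM files WHERE name = ?", (name,)
    ).fetchone()
    ciphertext = keystream_xor(key, content)   # encrypt on the database server
    received = ciphertext                      # stands in for the network transfer
    plaintext = keystream_xor(key, received)   # decrypt on the backup server
    backup_db.execute("INSERT OR REPLACE INTO files VALUES (?, ?)", (name, plaintext))
    backup_db.commit()
    return ciphertext


key = b"hypothetical-secret-key"
original = b"student record: id=7, name=A"
main_db, backup_db = make_db(), make_db()
main_db.execute("INSERT INTO files VALUES (?, ?)", ("notes.txt", original))
main_db.commit()

sent = replicate_securely(main_db, backup_db, "notes.txt", key)
stored = backup_db.execute(
    "SELECT content FROM files WHERE name = ?", ("notes.txt",)
).fetchone()[0]
```

Only the ciphertext crosses the simulated network, while the backup server ends up with a readable copy of the original data.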
3.2.2 Algorithm
Figure 3.2.2
Figure 3.2.3
Figure 3.2.2 focuses on how the data is encrypted. The
data from the database table in the database server is encrypted and sent to the backup
server. The encrypted data is decrypted after it has been securely replicated to the
backup server.
The algorithm used to encrypt the data is the Twofish encryption algorithm. Twofish
is a symmetric block cipher: a single key is used for both encryption and decryption.
Twofish has a block size of 128 bits and accepts a key of any length up to
256 bits. Twofish is fast on both 32-bit and 8-bit CPUs and in hardware. It is also
flexible enough to be used in network applications where keys are changed frequently
and in applications where there is little or no RAM and ROM available. The steps
used to encrypt the data are listed in Table 3.2.2.
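Before the step-by-step listing, the properties named in this paragraph (a 128-bit block, a single key of up to 256 bits for both directions) can be illustrated with a small Feistel-style sketch in Python. The round shape follows the published Twofish design: whitening, 16 rounds, a PHT-style combination of two 32-bit words with round sub-keys, and 1-bit rotations. However, the function `g` and the sub-key derivation are hash-based stand-ins assumed for this example, not the real key-dependent S-boxes, MDS matrix, or key schedule. The result is therefore not Twofish, is not interoperable with it, and should not be used to protect real data.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF
ROUNDS = 16


def rol(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK


def ror(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK


def subkeys(key):
    """Derive 40 sub-key words (8 whitening + 2 per round). Stand-in, not Twofish's schedule."""
    if not 0 < len(key) <= 32:
        raise ValueError("key must be 1 to 32 bytes (up to 256 bits)")
    stream = b"".join(hashlib.sha256(key + bytes([i])).digest() for i in range(5))
    return struct.unpack("<40I", stream)


def g(word, key):
    """Stand-in for the key-dependent S-boxes and MDS matrix."""
    digest = hashlib.sha256(key + struct.pack("<I", word)).digest()
    return struct.unpack("<I", digest[:4])[0]


def f(r0, r1, k_a, k_b, key):
    """Combine two g outputs PHT-style and add two round sub-keys."""
    t0, t1 = g(r0, key), g(rol(r1, 8), key)
    return (t0 + t1 + k_a) & MASK, (t0 + 2 * t1 + k_b) & MASK


def encrypt_block(block, key):
    k = subkeys(key)
    r = [w ^ k[i] for i, w in enumerate(struct.unpack("<4I", block))]  # input whitening
    for rnd in range(ROUNDS):
        f0, f1 = f(r[0], r[1], k[8 + 2 * rnd], k[9 + 2 * rnd], key)
        r = [ror(r[2] ^ f0, 1), rol(r[3], 1) ^ f1, r[0], r[1]]  # update right half, swap
    out = [r[2] ^ k[4], r[3] ^ k[5], r[0] ^ k[6], r[1] ^ k[7]]  # undo swap, whiten output
    return struct.pack("<4I", *out)


def decrypt_block(block, key):
    k = subkeys(key)
    c = struct.unpack("<4I", block)
    r = [c[2] ^ k[6], c[3] ^ k[7], c[0] ^ k[4], c[1] ^ k[5]]
    for rnd in reversed(range(ROUNDS)):
        r0, r1 = r[2], r[3]
        f0, f1 = f(r0, r1, k[8 + 2 * rnd], k[9 + 2 * rnd], key)
        r = [r0, r1, rol(r[0], 1) ^ f0, ror(r[1] ^ f1, 1)]  # invert one round
    return struct.pack("<4I", *(w ^ k[i] for i, w in enumerate(r)))


key = b"hypothetical key"
block = b"16-byte message!"          # exactly 128 bits
ciphertext = encrypt_block(block, key)
recovered = decrypt_block(ciphertext, key)
```

The same key drives both directions, and decryption simply runs the rounds in reverse order, which is the defining property of a Feistel network.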
1. Start
2. Select text file from database server
3. Encrypt text file
4. Divide the input bits into 4 parts
5. 4 bytes are sent through 4 different key-dependent S-boxes
6. The 4 output bytes are combined using the MDS matrix
into a 32-bit word
7. The 32-bit words are combined using the PHT and added to 2 round
sub-keys
8. Perform an XOR operation between the input bits and a key
9. The result is XORed with the right half of the text
10. 1-bit rotations
11. Additional sub-keys are XORed into the text
12. Process the input bits 16 times
13. Repeat lines 4 and 8 for the next round
14. Decrypt text file
15. Store text file in backup server
16. Stop
Table 3.2.2
Table 3.2.2 shows the algorithm used in this project, which is the Twofish encryption
algorithm. Lines 4 to 14 are the encryption steps of the Twofish algorithm.
In the Twofish algorithm, there are three main steps: dividing the input bits into 4
parts, performing an XOR operation between the input bits and a key, and processing the
input bits 16 times.
Table 3.2.3 lists each line of the algorithm with a description that explains it in
more detail.
No. of line   Description
2             Select data from the database, referred to as the text file.
3             To ensure the security of the data during replication, the data needs
              to be encrypted.
4             This is the first step in encrypting data using the Twofish algorithm.
5             The key-dependent S-boxes are designed to be resistant against the two
              big attacks of the early 1990s (differential cryptanalysis and linear
              cryptanalysis) and against whatever unknown attacks come next. They
              are not selected randomly; they follow carefully designed S-box
              construction rules and were tested with all possible 128-bit keys.
6             The MDS matrix is used as the main diffusion mechanism for the four
              bytes output by the four S-boxes. The matrix was chosen to retain its
              MDS property even after the 1-bit rotation and to be fast in both
              hardware and software, which meant searching through all possible
              matrices to find the one that best met these requirements.
7             The Pseudo-Hadamard Transform (PHT) and key addition provide diffusion
              between the sub-blocks and the key. By using the LEA instruction on
              the Pentium, all four additions can be done in just two operations.
8             The input is XORed with a key.
9             The result is added to two round sub-keys, then XORed with the right
              half of the text.
10            The 1-bit rotation is designed to break up the byte structure; without
              it, everything operates on bytes. This operation exists to frustrate
              cryptanalysis, and it frustrated attempts at cryptanalyzing the
              Twofish algorithm.
11            A key is needed to encrypt and decrypt data.
12            The Twofish algorithm processes the input through 16 rounds of a
              Feistel network.
13            Each round requires the steps in lines 7 and 11 to complete the
              Twofish encryption algorithm.
14            Before the data is stored in the backup server, the text file needs to
              be decrypted.
15            Store the decrypted text file in the backup database.
Table 3.2.3
3.3 Conclusion
The replication process involves more than one server. For this reason, this project
uses Oracle VM VirtualBox to create two virtual servers. One acts as the main database
server and the other acts as the backup server. The main database is located on the main
database server. The backup server receives the replicated data and stores it. The
data is replicated securely using the Twofish encryption algorithm.
REFERENCES
Beg, A. H., Noraziah, A., Abdalla, A. N., & Rabbi, K. F. (2013). Framework of
Persistence Layer Synchronous Replication to Improve Data Availability into a
Heterogeneous System. International Journal of Computer Theory and
Engineering, 5(4), 611.
Dhir, A. (2000). Data Encryption using DES/Triple-DES Functionality in
Spartan-II FPGAs. White Paper: Spartan-II FPGAs, WP115 (v1. 0) March, 9.
Holmgren, M. (2015). Multi-Master Database Replication and e-Learning–
Theoretical and Practical Evaluation.
Rabah, K. (2005). Theory and implementation of data encryption standard: A
review. Information Technology Journal, 4(4), 307-325.
Sison, A. M., Tanguilig III, B. T., Gerardo, B. D., & Byun, Y. C. (2012).
Implementation of Improved DES Algorithm in Securing Smart Card Data.
In Computer Applications for Software Engineering, Disaster Recovery, and
Business Continuity (pp. 252-263). Springer Berlin Heidelberg.
APPENDIX
Gantt chart