framework for sql modification and analysis

114
SUBMITTED IN PARTIAL FULLFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE IN SOFTWARE SYSTEMS ENGINEERING THE SIRINDHORN INTERNATIONAL THAI-GERMAN GRADUATE SCHOOL OF ENGINEERING KING MONGKUT‘S UNIVERSITY OF TECHNOLOGY NORTH BANGKOK ACADEMIC YEAR 2012 COPYRIGHT OF KING MONGKUT‘S UNIVERSITY OF TECHNOLOGY NORTH BANGKOK MASTER THESIS Mr. Pinaet Phoonsarakun Framework for SQL Modification and Analysis

Upload: -

Post on 28-Apr-2015

47 views

Category:

Documents


0 download

DESCRIPTION

This is my master's thesis. The work presented in this thesis has developed a framework for a network proxy to extend functionality to analyze and modify SQL statements in network traffic and to adapt to different database protocols, such as TNS for Oracle database, TDS for Microsoft SQL Server, and DRDA for IBM DB2.

TRANSCRIPT

Page 1: Framework for SQL Modification and Analysis

SUBMITTED IN PARTIAL FULLFILMENT OF THE REQUIREMENTS

FOR THE DEGREE OF MASTER OF SCIENCE IN SOFTWARE SYSTEMS ENGINEERING

THE SIRINDHORN INTERNATIONAL THAI-GERMAN GRADUATE SCHOOL OF ENGINEERING

KING MONGKUT‘S UNIVERSITY OF TECHNOLOGY NORTH BANGKOK

ACADEMIC YEAR 2012

COPYRIGHT OF KING MONGKUT‘S UNIVERSITY OF TECHNOLOGY NORTH BANGKOK

MASTER THESIS

Mr. Pinaet Phoonsarakun

Framework for SQL Modification and Analysis

Page 2: Framework for SQL Modification and Analysis

i

Author

Name Mr. Pinaet Phoonsarakun

Title Framework for SQL Modification and Analysis

Major Field Software Systems Engineering

Advisor Professor Dr. -Inf. Wolfgang Benn

Dr. Kamol Limtanyakul

Advisory Support Dip. –Inf. Alexander Adam

Academic Year 2012

Page 3: Framework for SQL Modification and Analysis

ii

Master Thesis: Registration

Page 4: Framework for SQL Modification and Analysis

iii

Master Thesis: Evaluation

Page 5: Framework for SQL Modification and Analysis

iv

Master Thesis: Certificate

Page 6: Framework for SQL Modification and Analysis

v

Acknowledgement

I always cannot thank my mom and my dad enough for their helps, support, and love

while I having a hard time.

I would like to thank dimensio informatics GmbH in providing me infrastructure,

facilities, and financial support while I was staying in Germany for my study and research.

I was fortunate to have Prof. Dr. Wolfgang Benn and Dr. Kamol Limtanyakul, who

agreed to act as my supervisors. They have provided guidance, support, understanding and

professional and personal assistance of the most valuable kind. For this, I am immensely

grateful.

I also would like to express my thank to Mr. Alexander Adam, who agreed to act as

my mentor, for his helps, support, advice, humour, and professional experience while I was

doing internship and research at dimensio informatics GmbH.

To Dr. Tony Anwar, Dr. Christoph Quix, Mr. Mirko Caspar, Prof. Dr. Wolfram

Hardt, Mr. Sebastian Leuoth, I owe you a considerable debt for your helps, support,

consultation, advice, experience, and motivation while I was having challenging time.

I wish also to acknowledge my gratitude to the department of Software Systems

Engineering at the Sirindhorn International Thai-German Graduate School of Engineering for

the patience, courtesy, and support I have unfailingly encountered in the long course of

completing my study.

This thesis would not have been possible without all of them.

Pinaet Phoonsarakun

Page 7: Framework for SQL Modification and Analysis

vi

Abstract

dimensio informatics GmbH is developing tools to accelerate databases. One of these

tools is a network proxy. This proxy is one possibility to be integrated in existing IT

infrastructure. It can improve responding speed of databases by understanding the database

protocol used between a client and the server and then analyzing and modifying database

queries. The proxy is currently specialized for the Oracle database protocol and has limited

functionality.

This goal of this thesis is to study, design, and develop a framework for the proxy to

easily adapt to other database protocols like TDS and DRDA, and develop a demonstration

for SQL modification and analysis.

Keyword: databases, proxy, database client and server, database protocol, SQL, TNS,

TDS, DRDA

Page 8: Framework for SQL Modification and Analysis

vii

Table of content

Author .................................................................................................................................. i

Master Thesis: Registration ....................................................................................................... ii

Master Thesis: Evaluation........................................................................................................ iii

Master Thesis: Certificate ......................................................................................................... iv

Acknowledgement ..................................................................................................................... v

Abstract ................................................................................................................................ vi

Table of content ....................................................................................................................... vii

List of Acronyms and Abbreviations ......................................................................................... x

List of figure ............................................................................................................................ xii

List of table .............................................................................................................................. xv

Chapter 1 Introduction ............................................................................................................. 1

1.1 Problem Description ...................................................................................... 1

1.2 Thesis Goal and Contribution ........................................................................ 1

1.3 Thesis Organization ....................................................................................... 2

Chapter 2 Background and Related work ................................................................................ 3

2.1 Network Proxy Server ................................................................................... 3

2.2 Network traffic .............................................................................................. 5

2.3 SQL ................................................................................................................ 6

2.3.1 SQL History ....................................................................................... 7

2.3.2 SQL Process ....................................................................................... 7

2.3.3 SQL Component ................................................................................ 8

2.3.4 SQL Command ................................................................................ 10

2.3.5 Stored Procedures and Prepared Statements .................................... 11

2.4 Database Protocols ...................................................................................... 12

Page 9: Framework for SQL Modification and Analysis

viii

2.4.1 TNS .................................................................................................. 13

2.4.2 TDS .................................................................................................. 17

2.4.3 DRDA .............................................................................................. 20

2.4.4 Summary .......................................................................................... 24

2.5 Database drivers .......................................................................................... 24

2.6 Related work ................................................................................................ 26

2.6.1 TDS Protocol Analyzer .................................................................... 26

2.6.2 GreenSQL ........................................................................................ 27

2.6.3 Security Testing Approach and Framework .................................... 28

2.6.4 The Current Proxy Structure ............................................................ 28

2.6.5 dimensio server ................................................................................ 29

Chapter 3 Concept ................................................................................................................. 30

3.1 The Framework ........................................................................................... 30

3.2 Packet Assembling and Disassembling ....................................................... 31

3.3 SQL Modification ........................................................................................ 32

3.4 SQL Timing Analysis .................................................................................. 34

Chapter 4 Implementation ..................................................................................................... 38

4.1 The Framework ........................................................................................... 38

4.2 The ProxyPlugin Abstraction Layer ............................................................ 46

4.3 The DBProtocolPlugin Abstraction Layer .................................................. 52

Chapter 5 Results ................................................................................................................... 62

5.1 Framework for SQL Analysis and Modification ......................................... 63

5.2 Demonstration for SQL modification .......................................................... 64

5.2.1 The SQL modification with Oracle.................................................. 64

5.2.2 The SQL modification with Microsoft SQL Server ........................ 67

5.2.3 The SQL modification with IBM DB2 ............................................ 68

5.2.4 The SQL modification for Oracle with dimensio server ................. 70

5.3 Demonstration for SQL analysis ................................................................. 75

5.3.1 The SQL Analysis with Microsoft SQL Server ............................... 76

Page 10: Framework for SQL Modification and Analysis

ix

5.3.2 The SQL Analysis with Oracle database ......................................... 77

Chapter 6 Outlook.................................................................................................................. 80

6.1 Future work ................................................................................................. 81

Bibliography ............................................................................................................................ 82

Biography 85

Appendix 87

Appendix A: The Result of the SQL Timing Analysis ......................................... 87

Appendix B: Struct Variables ............................................................................... 98

Page 11: Framework for SQL Modification and Analysis

x

List of Acronyms and Abbreviations

Abbreviation Full Form

3VL Three-Valued Logic

ACCRDB Access RDB

ACCRDBRM Access RDB Reply Message

AFS Apple File Share

ANSI American National Standard Institute

API Application Programming Interface

AR Application Requester

AS Application Server

CDRA Character Data Representation Architecture

CPU Central Processing Unit

DBMS Database Management Systems

DCL Data Control Language

DDL Data Definition Language

DDM Distributed Data Management Architecture

DML Data Manipulation Language

DQL Data Query Language

DRDA Distributed Relational Database Architecture

DS Database Server

DSS Data Stream Structure

ENDQRYRM End of Query Reply Message

EOM End Of Message

EXCSAT Exchange Server Attributes

EXCSATRD Server Attributes Reply Data

FD:OCA Formatted Data Object Content Architecture

ISO the International Organization for Standards

IP Internet Protocol

IT Information Technology

JDBC The Java Database Connectivity

MS Microsoft

MSA Management Services Architecture

NP Named Pipes

OCCI The Oracle C++ Call Interface

ODBC The Open Database Connectivity

OLE-DB The Object Linking and Embedding Database

OOP Object-Oriented Programming

OPNQRY Open Query

OPNQRYRM Open Query Complete

PacketID Packet Identification

PRPSQLSTT Prepare SQL Statement

QRYDSC Query Answer Set Description

Page 12: Framework for SQL Modification and Analysis

xi

QRYDTA Query Answer Set Data

RDB Relational Database

RDBMS Relational Database Management Systems

RFC Request For Comments

RPC Remote Procedure Call

SDP Sockets Direct Protocol

SMB Server Message Block

SNA Systems Network Architecture

SPID Process Identification

SQL Structured Query Language

SQLATTR SQL Statement Attributes

SQLCARD SQL Communication Area Reply Data

SQLDTA SQL Program Variable Data

SQLSTT SQL Statement

SSPI Security Support Provider Interface

TCP Transport Control Protocol

TCP/IP Transmission Control Protocol/Internet Protocol

TDS Tabular Data Stream

TNS Transparent Network Substrate

Page 13: Framework for SQL Modification and Analysis

xii

List of figure

Figure 2.1 Two computers communicate through a proxy server (shown in red) ..................... 4

Figure 2.2 Simple diagram of architecture for processing SQL ................................................ 8

Figure 2.3 SQL language anatomy ............................................................................................ 9

Figure 2.4 Stored procedure and prepared statement concept ................................................. 11

Figure 2.5 A typical prepared statement .................................................................................. 12

Figure 2.6 The Oracle network architecture [TG00] ............................................................... 14

Figure 2.7 TNS packet structure .............................................................................................. 15

Figure 2.8 The Microsoft SQL Server architecture overview ................................................. 18

Figure 2.9 TDS packet structure .............................................................................................. 18

Figure 2.10 DRDA protocol architecture (after [Drda11]) ...................................................... 21

Figure 2.11 DRDA packet structure ........................................................................................ 22

Figure 2.12 Common flow of communication of database protocols ...................................... 24

Figure 2.13 The database driver on the client converts requests and replies [GS09] .............. 25

Figure 2.14 Different database drivers connecting with the same database server ................. 26

Figure 2.15 The problematic structure of the proxy ................................................................ 28

Figure 3.1 The overview of the framework developed from a problematic structure ............. 30

Figure 3.2 Overview of the concept of packet assembling and disassembling ....................... 32

Figure 3.3 SQL modification concept ...................................................................................... 33

Figure 3.4 SQL timing analysis concept .................................................................................. 34

Figure 3.5 Overview of gaining SQL timing information with the framework ....................... 35

Figure 4.1 The overview of the framework developed from a problematic structure ............. 38

Figure 4.2 Three database protocol plugins inherit the ProxyPlugin abstract class ................ 39

Figure 4.3 Creating the abstract class and its derived class in C++ ......................................... 40

Figure 4.4 The instance of PPluginOracle is declared in main function using C++ ................ 41

Page 14: Framework for SQL Modification and Analysis

xiii

Figure 4.5 The relationship between the class DBProtocolPlugin and the class Customer .... 42

Figure 4.6 Sample code for creating the class DBProtocolPlugin and the class Customer ..... 43

Figure 4.7 Declare a Customer instance and call its interface in DB Protocol code ............... 43

Figure 4.8 Change the access specifier of the functions inherited from base class ................. 44

Figure 4.9 Access the private interface of a derived class via its base class ........................... 44

Figure 4.10 The relationship and interaction between layers in the framework ...................... 45

Figure 4.11 The three DB protocol classes develop their functionalities on the standardized

interfaces of their base class .................................................................................................... 46

Figure 4.12 Sample code of how to declare ProxyPlugin object and use it............................. 47

Figure 4.13 The flow chart associated with onConnection interface....................................... 48

Figure 4.14 The pure virtual function of onConnection interface ........................................... 48

Figure 4.15 The overview of the processes for the interface onReceiveClient ....................... 49

Figure 4.16 The overview of tasks of the DB protocol object ................................................. 50

Figure 4.17 The pure virtual function of onReceiveClient interface ....................................... 51

Figure 4.18 The pure virtual function of onReceiveServer interface ...................................... 51

Figure 4.19 The flow chart associated with onDisonnection interface .................................... 52

Figure 4.20 The pure virtual function of onReceiveClient interface ....................................... 52

Figure 4.21 The class DBProtocolPlugin is inherited by the Customer class ......................... 54

Figure 4.22 Sample code of how to declare DBProtocolPlugin object and use it ................... 54

Figure 4.23 The work flow for onConnectToDimensio interface ........................................... 55

Figure 4.24 The pure virtual function of onConnectToDimensio interface ............................ 55

Figure 4.25 The flow chart for onChangeSql interface ........................................................... 56

Figure 4.26 The pure virtual function of onChangeSql interface ............................................ 57

Figure 4.27 The flow chart for onClientSqlTime interface and onServerSqlTime interface .. 58

Figure 4.28 The pure virtual function of onClientSqlTime interface ...................................... 58

Figure 4.29 The pure virtual function of onServerSqlTime interface ..................................... 59

Figure 5.1 The class diagram of the final design of the framework ........................................ 63

Figure 5.2 The result of the SQL modification testing by the specification in Table 5.2........ 65

Figure 5.3 The testing case with DB client Oracle SQL Developer 3.0.03 ............................. 66

Page 15: Framework for SQL Modification and Analysis

xiv

Figure 5.4 The testing case with Toad for Oracle Xpert Trial Version 11.0.0.116 ................. 67

Figure 5.5 The result of the SQL modification testing by the specification in Table 5.3........ 68

Figure 5.6 The result of the SQL modification testing by the specification in Table 5.4........ 70

Figure 5.7 The result of the connection to Oracle database via the proxy and dimensio server

by the specification in Table 5.5 .............................................................................................. 74

Figure 5.8 The result of direct connection to Oracle database by the specification in Table 5.5

.................................................................................................................................................. 75

Figure 5.9 The SQL timing analysis information when working with MS SQL Server ......... 77

Figure 5.10 The SQL timing analysis information when working with Oracle database ........ 78

Figure 5.11 The two arrangement forms of SQL timing analysis information ....................... 78

Figure 5.12 The sample result of SQL timing analysis with Oracle database ......................... 79

Page 16: Framework for SQL Modification and Analysis

xv

List of table

Table 5.1 EMP table ................................................................................................................ 64

Table 5.2 The specification for testing the SQL modification with Oracle ............................. 65

Table 5.3 The specification for testing the SQL modification with MS SQL Server .............. 67

Table 5.4 The specification for testing the SQL modification with IBM DB2 ....................... 69

Table 5.5 The specification for testing the SQL modification with Oracle and dimensio ...... 70

Table 5.6 The specification for testing the SQL analysis with MS SQL Server ..................... 76

Table 5.7 The specification for testing the SQL analysis with Oracle database ...................... 77

Page 17: Framework for SQL Modification and Analysis

1

Chapter 1 Introduction

1.1 Problem Description

dimensio informatics GmbH is developing tools to make databases respond faster.

These tools have various possibilities to be integrated in existing IT infrastructures. One of

these so called integration points is a network proxy. This proxy has to understand the

database protocol that is used between a database client and the corresponding server.

To work as an integration point, the before mentioned proxy has to modify and

analyze the traffic between a client and the database. After an internship [Pho11] the current

proxy is able to provide the SQL modification, but the further results of this analysis are

currently discarded, i.e. everything that does not directly concern the execution of SQL

statements. Moreover, the proxy is currently specialized for the Oracle database protocol. To

extend the proxy functionality and support further database protocols can help the company

reach more potential customers. Therefore, the proxy‘s structure has to be revised; its

functionality has to be extended and it should support further database protocols for more

potential customers.

1.2 Thesis Goal and Contribution

The goals of this master thesis are to revise the current proxy structure and develop it

to be easily adapted to other database protocols like TDS and DRDA. Furthermore, the goals

are to develop the proxy functionality to be capable of modifying and analyzing SQL

statements in network traffic. Finally, the thesis work is to demonstrate two prototypes. The

first prototype is a prototype of the SQL modification which covers changing, altering, or

replacing SQL statements in the traffic. The second prototype is a prototype for the SQL

analysis which covers counting statement fetches and measures the time for the calls that the

clients send to the database (i.e. prepare, bind, execute and fetch).

Page 18: Framework for SQL Modification and Analysis

2

1.3 Thesis Organization

This master thesis is divided into six chapters. Initially, Chapter 1, introduction,

begins with the brief overview of thesis work and gives problem statements for this master

thesis. Then Chapter 2, background and related work, covers the theoretical and practical

information as well as the background necessary and related to the master thesis. Next,

Chapter 3, concept, explains the concepts behind the thesis work. After that, Chapter 4,

implementation, explains the methods of the implementation. It starts with the overview of

thesis work which describes the way to implement in short. The following chapter, results,

explains the introduction of the testing examples related to the thesis work. In the end,

Chapter 6, outlook, briefly sums up the main points in this thesis and suggests further

research and improvements.

Page 19: Framework for SQL Modification and Analysis

3

Chapter 2 Background and Related work

The central topic of this master thesis is to develop a proxy structure that is to be

easily adapted to other database protocols like TDS and DRDA and develop its functionality

to be able to modify and analyze SQL statements in network traffic. Therefore a basic

knowledge about network proxy server, network traffic, SQL, database protocols, and

database drivers are needed to understand the development in this thesis. Furthermore, each

section in this chapter observes and briefly describes related works to this master thesis.

Moreover the last section explains which point from the related works has been applied to

this master thesis.

2.1 Network Proxy Server

First of all this thesis work needs a tool that is capable of observing and modifying

data streams in network traffic between a client and the corresponding database server. The

tool that the work uses to achieve this task is a network proxy server.

Network proxy servers are servers which act as a middleman between a database

client and the corresponding database server. With the proxy, the client and the database

server will not connect to each other directly. The client will connect and send requests to the

database server via the proxy instead of a direct connection. The proxy will do on behalf of

the client to get some resources or services the client seeks from the database server. The

resources or services can be a connection, a file, an image file, a data record, a result set, a

database view or a table, or another resource available.

The proxy server evaluates and operates requests from a database client according to

its rules or its functionalities. For instance, it may filter SQL queries sent by a client if the

query consists of maliciously code against database applications [Ope12]. If the query is valid

Page 20: Framework for SQL Modification and Analysis

4

for the filtering rules, the proxy will provide the resource by connecting to the relevant

database application and requesting the service on behalf of the client. Otherwise the query is

dropped. Furthermore, a proxy server may optionally alter SQL queries from a database

client or modify database result sets from a database server especially for better performance.

Moreover, a proxy server may serve the SQL query without contacting the corresponding

database server. In this case, it caches replies from the database server, substitutes them, and

returns subsequent SQL queries for the same content directly. For example, that similar usage

idea, web proxies are commonly used to cache web pages from a web server [Tho06] and

used to generate a substitute web page [HG04] to speed up access to resources.

Back to the history, the proxy concept was invented in the early days of distributed

computers as a way to simplify and control their complexity [Sha86].

Figure 2.1 simplifies and illustrates how a proxy works.

Figure 2.1 Two computers communicate through a proxy server (shown in red)

In Figure 2.1, the server in the middle shown in red acts as a proxy server. It receives

a request from Charles which acts as a client. Then the proxy forwards the request to Jonas

which acts as a destination server. Jonas spends some time to process an answer for the

request. After that, Jonas replies with an answer to the proxy, and then the proxy forwards the

answer from Jonas to Charles.

Page 21: Framework for SQL Modification and Analysis

5

In general, a proxy server has two possibilities to be implemented, i.e. non-transparent

proxy and transparent proxy. RFC2616 offers a standard definition for non-transparent proxy

as:

"A proxy that modifies the request or response in order to provide some added service

to the user agent, such as group annotation services, media type transformation, protocol

reduction, or anonymity filtering".

and defines transparent proxy as:

"A proxy that does not modify the request or response beyond what is required for

proxy authentication and identification".

In more detail, the transparent proxy intercepts communications between clients and

remote servers without requiring any special client configuration. Also clients do not have to

be aware of the existence of the proxy. Furthermore the transparent proxy can be enabled to

cache web objects transparently by using standard non-transparent proxy caches [CRS99].

Finally, there are a large diversity of potential purposes for the proxy to be used.

Those aforementioned usages of the proxy are some of the purposes that this master thesis

pays most attention to. Also the proxy that this thesis uses is a non-transparent proxy only.

2.2 Network traffic

Before the proxy server evaluates and operates requests from a database client or

replies from the corresponding database server, it has to deal with the network traffic between

the client and the server.

The term network traffic is defined as:

“Computer network communications that are carried over wired or wireless networks

between hosts.” [KCZGD06]

In more detail, the hosts encapsulate their data in network traffic or packets and send

them through a network for communications. The data sent through the network are

Page 22: Framework for SQL Modification and Analysis

6

formatted based on the protocol both a client and the relevant server use. The protocol will be

explained in great detail in the section database protocols.

Monitoring and capturing network traffic require specialized knowledge and network

traffic capture tools, known as packet analyzer or sniffer, to process. Also they can provide

useful information and contents of the communications, such as, a username, a file, an image,

or even an SQL statement. Moreover this can help developers to analyze and fix problematic

network traffic or packets.

Existing open source sniffers can be used for basic tasks in simple cases but lack the

functionality of commercial sniffer that are specifically designed to process network traffic

[Cas04]. Unfortunately those sniffers still lack the functionality that this master thesis wants

to achieve.

Since this master thesis concentrates on developing a framework for SQL

modification and analysis, it uses network traffic as a source of SQL statements.

2.3 SQL

Modifying SQL statements requires basic knowledge of SQL. Therefore, this section

has the objective to describe what SQL is and what its standard commands are in order to get

a general idea of it and apply the knowledge to the thesis work.

SQL stands for Structured Query Language. It is the language that clients use to

communicate with database servers for storing, manipulating, and retrieving data stored in

relational databases. Relational database management systems (RDBMS), for instance;

Oracle [Abb08], IBM DB2 [Drda11], Microsoft SQL Server [Tds11], Microsoft Access

[CV10], MySQL [SZT12], Informix [Fla00], Sybase [GAS96], and PostgreSQL [OHR11],

use SQL as a language and data inside them is written in table form, i.e. rows/records and

fields/columns [Abb08]. Also they are developing and using their own dialect of SQL which

adds functionality based on SQL industrial standard defined by the American National

Standard Institute (ANSI), but they all are more than 90% alike [Abb08]. Different versions

Page 23: Framework for SQL Modification and Analysis

7

of SQL, for example, are PL/SQL developed and used by Oracle, SQL PL developed and

used by IBM DB2, and T-SQL developed and used by MS SQL Server etc.

SQL plays an important role for making many things possible for users to manage

data stored in relational database management systems. For example, users can use SQL to

access data in relational database management systems; to describe the data; to define the

data in databases and manipulate that data; to create and drop databases and tables; to set

permissions on tables, procedures, and views; and to create views, stored procedures or

functions in a database [Abb08].

2.3.1 SQL History

The chronological events of SQL began in the year 1970 when Dr. Edgar Frank Codd

introduced a relational model for databases [Cod70]. During the 1970s IBM included Codd‘s

ideas into its product named System R [CB74]. Later in the 1970s SEQUEL (structured

English query language) which was the database language for that product was renamed SQL

[Opp04]. In the late 1970s a first SQL-based relational database management systems was

released by Relational Software and that later becoming Oracle [Ora77]. In 1983 IBM began

developing DB2 which based on the System R prototype after testing SQL at customer sites

to determine usefulness and practicality of the system. [IBM83] By 1986 SQL became a

standard of the American National Standards Institute (ANSI) [ANSI12], and by 1987 it

became a standard of the International Organization for Standards (ISO) [ISO12].

2.3.2 SQL Process

An SQL statement can be very simple or very complex, for example SELECT is the

most complex statement in SQL, with optional keywords and clauses, such as FROM,

WHERE, GROUP BY, HAVING, ORDER BY etc. Processing SQL is composed of various

components. Those components are shown in Figure 2.2.

Page 24: Framework for SQL Modification and Analysis

8

Figure 2.2 Simple diagram of architecture for processing SQL

In Figure 2.2, the Query Language Processor translates the SQL Query and optimizes

it for the DBMS Engine. The DBMS Engine then receives the compiled query and

manipulates files and data stored in the database according to the query.

SQL processing relates to SQL performance where the industry needs to reduce

response time of SQL queries. Reducing the response time can be achieved by several

methods. These methods can be roughly grouped into three groups, which are database

internal tuning, hardware tuning, and application tuning. An example of database internal

tuning is to create indexes, optimize buffer sizes, and set database protocol packet size

[GS09]. Using indexes can speed up the information search process. Mostly the indexes are

used in WHERE clause [Abb08]. Next, some examples of hardware tuning are to buy a faster

central processing unit (CPU), buy faster storage systems, or add more main memory. Lastly,

an example of application tuning is to optimize SQL statements, known as SQL tuning,

through monitoring and measuring SQL performance [GS09].

2.3.3 SQL Component

Understanding the SQL component is also useful for this thesis work. SQL consists of

several language components, including: clauses, expressions, predicates, statements, queries,

and whitespaces. These components are shown in Figure 2.3.

Page 25: Framework for SQL Modification and Analysis

9

Figure 2.3 SQL language anatomy

As depicted in Figure 2.3, those SQL language components can be described as

follows:

Clauses are component parts of statements and queries. (In some cases, these are

optional.) [IS99]

Expressions can be evaluated and will return either scalar values or tables consisting

of columns and rows of data.

Predicates specify conditions to limit the effects of statements and queries, or to

change the program flow. There are three possibilities of result values that is true, false, or

unknown value. In other words, these values are Boolean truth values or SQL three-valued

logic (3VL) [Rub07], [Dat05].

Statements may have a persistent effect on schemata and data, or may control

transactions, program flow, connections, sessions, or diagnostics. SQL statements also

include the semicolon statement terminator (";"). Though not required on every platform, it is

defined as a standard part of the SQL grammar.

Queries are the most important elements of SQL. They retrieve the data based on

specific criteria.

Whitespaces have no semantics in SQL statements and queries. Therefore they are

used to format SQL code for readability.

Page 26: Framework for SQL Modification and Analysis

10

2.3.4 SQL Command

The standard SQL commands to communicate with relational databases are CREATE,

ALTER, DROP, INSERT, UPDATE, DELETE, SELECT, GRANT, REVOKE, and

TRUNCATE [ISO08]. SQL commands can be classified into several groups based on their

functionality:

Data Definition Language (DDL):

CREATE Creates a new table, a view of a table, or other object in database

ALTER Modifies an existing database object, such as a table.

DROP Deletes an entire table, a view or other objects in the database.

TRUNCATE Removes all rows from a table without logging the individual deletions

Data Manipulation Language (DML):

INSERT Creates a record

UPDATE Modifies records

DELETE Deletes records

Data Query Language (DQL):

SELECT Retrieves certain records from one or more tables

Data Control Language (DCL):

GRANT Gives a privilege to user

REVOKE Takes back privileges granted from user

In the end, this master thesis concentrates on the standard SQL commands, especially

the SELECT command, because the command is used to retrieve information from databases.

Also, generally, the greater the amount of data records it has to deal with, the longer the

response time will be. Furthermore the main theme of the products of dimensio informatics

GmbH is to speed up databases. Therefore reducing response time to SQL queries is an issue

related directly to that theme.

Page 27: Framework for SQL Modification and Analysis

11

2.3.5 Stored Procedures and Prepared Statements

Stored procedures and prepared statements are used to improve SQL performance

when SQL queries have to run a few thousands to millions times a day. Technically, every

SQL query has to go through a processing cycle to produce the desired result. It goes from

parsing, optimizing, compiling, executing, and finally returning the result. The general idea,

which stored procedures and prepared statements come from, is that once a query is written

and ran correctly for the first time then the processes of parsing, optimization, and compiling

are unnecessarily required since it has already been parsed, optimized, and compiled, as

illustrated in Figure 2.4.

In more detail, a stored procedure is a precompiled executable object that contains one

or more SQL statements. In other words, complex SQL statements can be replaced by a

single stored procedure. Since stored procedures are precompiled objects, they execute faster

at the database server. For the consecutive run, it will run from the compiled stage and hence

boost performance.

Figure 2.4 Stored procedure and prepared statement concept

Page 28: Framework for SQL Modification and Analysis

12

Next, a prepared statement is the ability to set up a statement once, and then execute

many times with different parameters. Prepared statements are designed to replace building

specific and customized query strings, and do so in a more secure and efficient manner. A

typical prepared statement looks something like Figure 2.5.

Figure 2.5 A typical prepared statement

The question mark ‗?‘ in the prepared statement is called a placeholder. When the

statement is executed, it requires the parameter to supply for it, which would replace its

placeholder.

However the thesis work uses the proxy server to modify and analyze SQL commands

or statements, it has to monitor and capture network traffic between a database client and the

relevant database server. But SQL statements are encrypted and formatted into network

packets differently according to different database protocol.

2.4 Database Protocols

This section briefly describes basic principle of database protocols, especially

database protocol packet and packet flows, which is sufficient to be applied to the

development of thesis work. The goal of the thesis work is to develop the framework for SQL

analysis and modification. So that the framework should be able to be easily adapted to

different database protocols.

The term protocol is defined by Kurose [KK10] as:

“A protocol defines the format and the order of messages exchanged between two or

more communicating entities, as well as the actions taken on the transmission and/or receipt

of a message or other event.”

Building upon that definition, understanding formatted data in network, it is required

to understand the protocol that a client and the server are using. Like one would like to know

Page 29: Framework for SQL Modification and Analysis

13

what two people are talking about, the one have to know the language the people are

speaking for the communications.

In computer network, there are a large variety of protocols in use. For example, TNS

protocol is used for Oracle Database, TDS protocol is used by Microsoft SQL Server, and

DRDA protocol is used by IBM DB2 etc. Furthermore different protocols are used to

accomplish different communication tasks. As this section implies, it concentrates on

database client and server communication tasks.

There are a large variety of database protocols, but the database protocols this master

thesis chooses to study and applies to the thesis work are TNS, DRDA, and TDS. These

database protocols are chosen because not only of the requirement of dimensio informatics

GmbH, as these are the three big commercial database systems on the market [Hes12], but

also of proving that the framework being able to be adapted to different database protocols.

The subsections that follow describe those three database protocols respectively.

2.4.1 TNS

Transparent Network Substrate (TNS) is the database protocol that Oracle

Corporation relies on to provide generic network connectivity between local Oracle database

clients and the related Oracle database instance or between two different Oracle database

instances, regardless of underlying protocols. TNS protocol is utilized by Oracle and its

protocol specification is not available for public access. TNS uses transport control protocol

(TCP) on top of internet protocol (IP) to delivery data and its default port is 1521.

Note that much of this subsection stems from the internship [Pho11].

In the Oracle network architecture (shown in Figure 2.6), TNS is a foundation

technology that is built into the Oracle Net foundation layer and enables the tasks of the layer

to be able to establish and maintain the connection between Oracle database clients and the

database server, as well as exchange messages between them.

Page 30: Framework for SQL Modification and Analysis

14

Figure 2.6 The Oracle network architecture [TG00]

TNS also enables peer-to-peer application connectivity, where two or more computers

can communicate with each other directly, without intermediary devices can occur. It

provides a user-transparent layer that enables a heterogeneous network consisting of different

protocols to function as a homogeneous network. It forms a transparent layer to which

different network protocols can connect.

TNS supports all common physical network protocols, for instance TCP/IP, named

pipes (NP), and sockets direct protocol (SDP), because it is a highly generic logical protocol.

Also TNS is based on individual logic packets which are mapped transparently to physical

packets.

2.4.1.1 TNS Packet Header Structure

Every TNS packet has an 8-byte wide global header. The 8 identical bytes of the

global header are composed of total packet size (2 bytes), packet checksum (2 bytes), TNS

Page 31: Framework for SQL Modification and Analysis

15

packet type (1 byte), flags or reserved (1 byte), and global header checksum (2 bytes). This is

illustrated in Figure 2.7.

Figure 2.7 TNS packet structure

The abbreviations in Figure 2.7 are described below:

HDLEN Two-byte Total packet size

PKTSM Two-byte Packet checksum

TYP One-byte TNS packet type

RSVD One-byte Flags or reserved

HDSM Two-byte Global header checksum

DATA TNS packet body

The packet size field is in big-endian byte order. The packet checksum and global

header checksum are either the ones complement of the sum of the packet header or whole

packet (like an IP checksum) or, in practice, zero. The flag field is usually zero.

The TNS packet type is determined by the type field in the header that makes the

packet body (DATA) different accordingly. Currently there are 12 different types of packets

in use.

The 12 different types of TNS packet can be roughly divided into 3 groups as

following:

Connection:

Connect packet

Accept packet

Ack packet

Refuse packet

Redirect packet

Page 32: Framework for SQL Modification and Analysis

16

Data Transfer:

Data packet

NULL packet

Control Flow:

Abort packet

Resend packet

Marker packet

Attention packet

Control packet

2.4.1.2 TNS Overall Flows

This is how each TNS packet type works. Firstly, a local Oracle database client sends

a Connect packet to specify the service name while connecting to a Oracle database instance.

Then the database listener provided matches its information with the client connection

request. If the information matches, it grants the request with an Accept packet. Next the

client tries authentication. All packets sent for authentication are data packets. If the Listener

does not know the service, then it sends the client a Refuse packet.

After the client and the server authenticated, if everything is fine, the connection

between the client and the destination server is established. Now the client sends requests and

the server sends replies. All request and reply packets are type Data. While sending request

and reply packets, a Marker packet may have been sent for interrupting. For instance, the

server will send the client a Marker packet if it wants to stop the client sending data. Client

requests and server replies can continue until the connection is terminated.

However, data packets will be the majority of packets during the interaction between

the client and database server. These packets would contain queries and their results, stored

procedure calls, and information or messages which make up the great deal of the session

with the database.

This knowledge of how and when each type of TNS packet has been sent is applied to

the development of a framework for SQL modification and analysis, especially TNS data

packets because SQL statements can be located in these packets.

Page 33: Framework for SQL Modification and Analysis

17

2.4.2 TDS

Tabular Data Stream (TDS) is an application level protocol that Microsoft relies on

for communication between database clients and the database server. With TDS, it is possible

that a database client‘s request contains multiple commands and the server‘s response returns

one or many result sets. In general, TDS uses TCP/IP to deliver application-layer messages

across network and its default port is 1433.

TDS was initially designed and developed by Sybase Inc. for their Sybase SQL Server

relational database engine, and later by Microsoft in Microsoft SQL Server. Moreover,

Microsoft publishes the TDS protocol specification [Tds11] available for free and public

access.

TDS relies on a connection oriented transport service. Session, presentation, and

application service elements are provided by TDS. TDS does not require any specific

transport provider. It can be implemented over multiple transport protocols if they provide

connection oriented service, such as TCP protocol.

TDS provides support for login capability negotiation, authentication services, and

support for both database specific and generic client commands. Responses to client

commands are returned using a self-describing, table oriented protocol. Column name and

data type information is returned to the client before the actual data.

In Microsoft SQL Server architecture, as shown in Figure 2.8, SQL Server clients

send SQL statements to SQL Server by using TDS protocol. Next the client messages are

built into TDS packets by database driver SQL Server ODBC and passed to Server-Side Net-

Libraries. Then the Server-Side Net-Libraries encapsulates these packets into network

protocol packets and sends them through network to the destination server by using network

protocol TCP/IP. On the server side, the Server-Side Net-Libraries receives the arrival

packets from network, extracts the TDS packets from them and passes to the database driver

above. Next the TDS packets are extracted to get the actual client messages for the SQL

Server relational database. This can be accomplished if the SQL Server is capable of

accepting the version of TDS used by the client, such as TDS 4.2, TDS 7.0, and TDS 8.0 etc.

Page 34: Framework for SQL Modification and Analysis

18

Then the SQL Server relational database processes the SQL statements and generates result

sets in return. After this the processes is reversed until the client gets the result sets.

Figure 2.8 The Microsoft SQL Server architecture overview

2.4.2.1 TDS Packet Header Structure

Every TDS packet starts with an 8-byte wide global header. The 8 identical bytes of

the global header are composed of TDS packet type (1 byte), packet status (1 byte), length of

the entire packet (2 bytes), process identification (2 bytes), packet identification (1 byte), and

window (1 byte). This is illustrated in Figure 2.9.

Figure 2.9 TDS packet structure

The abbreviations in Figure 2.9 are described as following:

TYP One-byte TDS packet type

STA One-byte TDS packet status

LEN Two-byte length of the entire packet (big-endian)

SID Two-byte process identification (big-endian)

PID One-byte packet identification

WIN One-byte window

Page 35: Framework for SQL Modification and Analysis

19

The TDS packet type is determined by the type field in the header that makes the

packet body different accordingly. There are more than 10 TDS packet types that are, for

example, SQL batch, Pre-TDS7 login, Remote procedure call (RPC), Tabular result,

Attention signal, Bulk load data, Transaction manager request, TDS7 login, Security Support

Provider Interface (SSPI) message, Pre-login message, and unused etc.

The packet status field is set to indicate ―more‖ or ―normal‖ packets when the length

of the entire packet is greater than 512 bytes and the packet will be split on the 512-byte

block, otherwise it is set to indicate ―last packet‖ or ―end of message (EOM)‖. So that the full

TDS packet is reassembled from its component 512-byte packets with the 8-byte headers

stripped out. The block size 512 could be set to a different value and found in the login

packet. The default value of block size also depends on TDS protocol version, for instance

TDS version 7.0+ uses a default of 4096 as the block size.

The length of the entire packet field tells the total length of packet and the bytes in it

are in network byte order (big-endian).

The process identification (SPID) is the process identification on the server,

corresponding to the current connection.

The packet identification (PacketID) is used for numbering message packets that

contain data in addition to the packet header.

The window is currently not used, set to be 0x00, and should be ignored.

2.4.2.2 TDS Overall Flows

Initially, the client requests a transport connection to the server. Next the server

responds with an acknowledgement token. Then the client sends a login packet including a

record to establish a dialog. The login record contains capability and authentication

information. The server responds with a completion token indicating that it has accepted the

dialog request. Now that a dialog has been established between the client and the server, the

client sends a query packet, which contains an SQL query, to the server and then waits for the

Page 36: Framework for SQL Modification and Analysis

20

server to respond. The server executes the query and returns the results to the client. Client

requests and server responses continue until the connection terminated.

For the overall view of TDS packet type query, when a database client sends a request

INSERT SQL command, SQL Server responses result set done. When the client sends a

request SELECT SQL command, the server responses a reply including column names,

column info, row result, and result set done. If the client sends request call stored procedure,

the server responses a reply including column names, column info, row result, done inside

process, return status, and process done.

To be more precise, SQL statements and stored procedure calls can be found in TDS

packet type SQL batch and RPC.

2.4.3 DRDA

DRDA stands for Distributed Relational Database Architecture. It is the database

protocol in an application layer protocol that IBM DB2 relies on for the communication

between database client and database server in a distributed environment. DRDA is specific

to relational database management system and responsible for encapsulating SQL commands

and responses for distributed transaction. Furthermore its default port is 50000.

DRDA was originally developed by IBM and first used in DB2 release version 2.3.

Later it was adapted to the Open Group standard for database interoperability. Moreover, the

Open Group publishes DRDA protocol as an open standard [Drda11]. The latest version

released of DRDA, at the time writing this thesis, is version 5.

The DRDA protocol specification is designed for distributed data as its name implies.

It clearly defines the rules for accessing the distributed data and when the flows should occur,

but it does not describe an API for distributed database management system processing.

In Figure 2.10, the DRDA architecture comprises mainly three functions and two

protocols. The three functions are application requester (AR), application server (AS), and

database server (DS) and the two protocols are Application Support Protocol and Database

Support Protocol.

Page 37: Framework for SQL Modification and Analysis

21

The AR is the function that handles a database client of a distributed connection. It

performs function accepting SQL commands from the client and sending them to the

appropriate database servers for processing. Using this function, database clients can access

remote data.

The AS is the function that handles the database server of the connection. It performs

function receiving SQL commands from AR and processing them. It performs upon the

portions that can be processed and forwards the remainder to database servers for subsequent

processing. The AR and the AS use the Application Support Protocol which handles data

representation conversion for the communication among themselves.

The server that an application requester connects to is an application server, but any

other server further downstream is called a DS as it does not communicate with the

application requester directly. The DS performs function receiving requests from AS or other

DS servers. It supports distributed requests and will forward parts of the request to

collaborating DS in order to fulfill the request. The AS and the DS use the Database Support

Protocol to communicate between them.

Figure 2.10 DRDA protocol architecture (after [Drda11])

To implement the connections between DRDA server database management systems

and database clients, DRDA protocol uses the following architectures:

Character Data Representation Architecture (CDRA)

Distributed Data Management (DDM) Architecture

Formatted Data Object Content Architecture (FD:OCA)

Systems Network Architecture (SNA)

Page 38: Framework for SQL Modification and Analysis

22

SNA Management Services Architecture (MSA)

Transmission Control Protocol/Internet Protocol (TCP/IP).

2.4.3.1 DRDA Packet Header Structure

DRDA can be divided into two architectures. The first one is the distributed data

management architecture which provides the command and reply structure. The last one is

the formatted data object content architecture which describes the data definition. The

information referred to DRDA packet decoding can be located in DDM specification

[Ddm11]. On the other hand, the information referred to binding variable value decoding in

DRDA packet can be located in FD:OCA specification [Fdoca11].

DRDA packet is structured in the form of chain of objects which is known as data

stream structure (DSS) chaining. In other words, multiple objects can be sent in a single

DRDA packet. Every object or data stream structure requires a six byte header. First two

bytes are designated for length of data with continuation flag set. The third byte is

hexadecimal number ‗D0‘ to indicate DDM data. The fourth byte is the DDM format byte.

The last two bytes are the request correlation identifier. The bytes for the length of data and

the request correlation identifier are in big-endian byte order. This is illustrated in Figure

2.11.

Figure 2.11 DRDA packet structure

The abbreviations in Figure 2.11 are described as following:

cl Two-byte length field with continuation flag set ON

nl Two-byte length field with continuation flag set OFF

C One-byte DDM identifier (Hex ‗D0‘ for all DDM data)

f One-byte qualifier of the C-byte (used as a format identifier by DDM)

Page 39: Framework for SQL Modification and Analysis

23

rc Two-byte request correlation identifier

In the header, the information that represents whether the DRDA packet has one more

object to chain or not can be located in the fourth byte, the DDM format identifier.

2.4.3.2 DRDA Overall Flows

This subsection describes overall flows of interest that are initialization flows, SQL

statement execution and commit flows, and termination flows. All following commands and

objects are defined in DDM specification [Ddm11].

To establish a connection, first a database client sends a connection request to the

database server by using the exchange server attributes (EXCSAT) command. The server

receives the request and returns the server attributes reply data (EXCSATRD) included

DRDA environment and DDM manager support information. Then the client sends the

database name it wishes to access and data type definition information by using the access

relational database (ACCRDB) command to the server. The server returns the access RDB

reply message (ACCRDBRM) which tells the client that it supports and will use the data type

definition the client specified. Now the connection between the client and the server has been

established.

To execute and commit SQL statement, the client sends multiple commands in a

single packet to the server. The packet may include some or all of the following commands: a

prepared SQL statement (PRPSQLSTT), followed by attributes (SQLATTR), the SQL

statement itself (SQLSTT), the SQL execution (OPNQRY), and finally, followed by binding

variable values (SQLDTA). The server, then, executes the commands and returns multiple

result objects in a single reply. The reply may include some or all of the following result

objects: the execution completed (OPNQRYRM), followed by result set description

(QRYDSC), result set data (QRYDTA), end of query (ENDQRYRM), and conditions

detected information (SQLCARD). Now the SQL statement has been executed and

committed.

Page 40: Framework for SQL Modification and Analysis

24

To terminate the connection, under normal circumstances, the client initiates the

termination of a socket associated with the connection. In error situations, the server will

initiate the termination of the socket.

2.4.4 Summary

Figure 2.12 Common flow of communication of database protocols

In summary, this thesis uses TCP/IP protocol as the network protocol used for the

communication between a database client and the database server.

The packet structure of each database protocol is different to each other as shown in

its section.

The common flows of communication of the three database protocols can be

simplified as Figure 2.12.

2.5 Database drivers

Page 41: Framework for SQL Modification and Analysis

25

Basically, before database queries from a client are sent to a destination database

server, they are encoded into packets or in a format that the destination server can understand.

After finished the packet encoding, the client sends them to the destination server. Then the

destination server receives packets from the client and processes replies. After that the replies

are encoded into packets and sent back to the client. Finally the client receives the replies and

decodes them in a format that it can understand. The software routine that encodes and

decodes the queries and replies is the database driver.

In other words, database drivers are client-side adapters (installed on the client

machine, not on the server) that convert requests from application programs to a protocol that

the DBMS can understand (see Figure 2.13).

Figure 2.13 The database driver on the client converts requests and replies [GS09]

To develop a database client to connect with individual databases by using application

programming interface, this requires database drivers for each database. For instances, the

java database connectivity (JDBC) API requires the database driver that follows TNS

protocol, namely JDBC driver, to connect with the Oracle database, the object linking and

embedding database (OLE-DB) API requires the database driver that follows TDS protocol to

connect with the Microsoft SQL Server database, and OLE-DB API requires the database

driver that follows DRDA protocol to connect with IBM DB2 database.

Page 42: Framework for SQL Modification and Analysis

26

Figure 2.14 Different database drivers connecting with the same database server

Furthermore using different database drivers to connect with the same database server

makes physical packets on the network look differently, such as using different database

drivers: JDBC, the open database connectivity (ODBC), and the Oracle C++ Call Interface

(OCCI) to connect with Oracle database server as shown in Figure 2.14.

In roughly conclusion, database drivers are the places where to implement the

aforementioned database protocols i.e. TNS, TDS, and DRDA.

2.6 Related work

There are tools and approaches dealing with database protocol(s) as following

subsections

2.6.1 TDS Protocol Analyzer

The TDS protocol analyzer specifically based on TDS database protocol [GW09] has

developed several features which include packet capture, packet analysis, packet storage,

traffic statistics, and vulnerability warning.

The analyzer is a useful tool in the further testing and development for the detection

of the database protocol vulnerability. Unfortunately, this thesis work aims mainly to modify

and analyze SQL statement and develop a software framework that is easily adapted to many

Page 43: Framework for SQL Modification and Analysis

27

database protocols in order to accomplish these processes. So the analyzer could not fulfill

the required tasks of the thesis completely because of the lack of SQL modification and

analysis functionalities.

Although the lack of the SQL functionalities, the thesis has gained useful information

from the analyzer on how to reassemble and disassemble TDS packets.

2.6.2 GreenSQL

GreenSQL [Ope12] is used as a firewall filtering SQL commands with malicious

intent against the database server, such as SQL injection vulnerabilities.

Although GreenSQL has features of SQL functionalities, it could not fulfill the entire

of the required tasks of the thesis. The reason why is that the main purpose between

GreenSQL and the thesis work are different, which is the purpose of GreenSQL is used

mainly for database security, such as, to filter malicious SQL commands, while the main

purpose of the thesis work is to modify and analyze SQL commands for dimensio server

[dim12] to accelerate databases.

The commercial version of GreenSQL [Gre12] has more advanced features for

databases performance as well. But it is not designed and implemented to work especially

with the dimensio server as the proxy of dimensio informatics GmbH. For example, the proxy

fetches an SQL statement from the network traffic and forwards the statement to the dimensio

server. Once received the reply back from the server, the proxy may modify the SQL

statement in the network traffic for the better performance and sends it to the destination

database. Furthermore the GreenSQL open source version is compatible only with MySQL

and PostgreSQL database servers, while its commercial version is compatible one more

database server, Microsoft SQL Server. In other words, GreenSQL is not yet compatible with

Oracle and IBM DB2.

Technically speaking, using the proxy server has the advantage that it can be

integrated to the systems without modifying application or database code, which requires

minimal setup complexity.

Page 44: Framework for SQL Modification and Analysis

28

2.6.3 Security Testing Approach and Framework

Security Testing Approach and Framework [ADM09] tests DRDA database protocol

for IBM DB2. It captures packets, and after mutating the packets, injects them back into the

network, in order to test the implementation‘s ability to handle erroneous cases. The

mutations are based on the semantics of the protocol.

The approach and framework are able to adapt to different protocols from similarly

modest in size, like server message block (SMB) and Apple file share (AFS), to the more

complex protocol, like DRDA. Unfortunately, their purpose is for security testing of network

application, but not for database optimization as the main aim of the framework of this thesis.

2.6.4 The Current Proxy Structure

After the internship [Pho11], the proxy of dimensio informatics GmbH is currently is

able to provide only the SQL modification, but not the analysis, and specialized only for the

Oracle database protocol. Yet its current structure is not suitable for developers to specialize

the proxy for further database protocol, like TDS and DRDA. The structure is illustrated in

Figure 2.15.

Figure 2.15 The problematic structure of the proxy

As shown in Figure 2.15, the structure is problematic because it merges the DB

Analyst code and Proxy code into one place, which is difficult to maintain and reuse the code

for further development for other database protocols.

Page 45: Framework for SQL Modification and Analysis

29

2.6.5 dimensio server

dimensio server [dim12] is one of products of dimensio informatics GmbH. It is an

innovative technology that can accelerate the analysis of large data volumes and highly

complex database queries by a factor of up to 1,000. It is also based on indexing method that

is innovative and suitable for data in any database, information retrieval system, or flat files.

The aforementioned proxy works with dimensio server by using the interfaces

designed and implemented by this thesis, in order to modify SQL statement to accelerate

databases.

In the following chapter, the thesis issues what it has to do to solve the problematic

structure of the proxy and provide SQL functionalities for the SQL modification and analysis.

Page 46: Framework for SQL Modification and Analysis

30

Chapter 3 Concept

This chapter describes four concepts of thesis work as follows. The framework

concept explains the structure of current proxy of dimensio informatics GmbH and what the

target structure or framework the development in this thesis wants to meet. Next the packet

assembling and disassembling concept explains how this concept relates to the interface

between the proxy code and the proxy plug-in code. Then the SQL modification concept

describes how this concept relates to the interface between the proxy plugin and the above

layer in order to change SQL statement to accelerate databases. Lastly the concept of SQL

timing analysis explains concepts concerning the analysis, such as what calls to be measured,

and the interface for the analysis.

3.1 The Framework

The following picture shows the overview of the Framework for SQL Modification

and Analysis, which is arranged from a problematic structure, the thesis work has to deal with

to the target structure of the thesis.

Figure 3.1 The overview of the framework developed from a problematic structure

Page 47: Framework for SQL Modification and Analysis

31

In Figure 3.1, the leftmost one, the merged form, shows the current structure of the

proxy, which merges the DB Analyst code and the proxy code together, which is problematic

to be maintained and reused for further development to support other database protocols

beyond Oracle database protocol. The middle one, the modular form, is the structure after

revised the previous structure and separated the DB Analyst code from the Proxy code. With

this form, it makes the code more maintainable and reusable and makes the DB Analyst code

considered as module or Proxy Plugin, which is easy to develop functionalities based on

different database protocols and integrate to support the Proxy code. The rightmost one, the

framework, is the goal form of this thesis. This form goes further to develop interfaces of the

Proxy Plugin in supporting calls from Customer Code which is considered as the plugin of

proxy plugin.

In order to develop the interfaces between database (DB) Plug-in and Customer Code,

this thesis has to evaluate different database protocols (namely TNS, TDS, and DRDA) and

find the common characteristics among them. The interfaces focus on providing services for

packet assembling and disassembling, SQL modification, and SQL timing information for the

analysis, which are described in the following sections of this chapter.

3.2 Packet Assembling and Disassembling

As described in section 2.4, different database protocols have different rules to build

their own network packets. Technically speaking, before the proxy modifies and analyzes a

network packet sent from and to client and database server, it has to assemble the packet. By

doing this, the proxy has to find the database protocol being used by the client and the

database server. According to the aforementioned framework, the proxy just takes the right

proxy plug-in (which is the DB Protocol colored in red) and forwards the data streams from

the client or the server to the plug-in via the interface. After that, the plug-in will do the rest

on behalf of the proxy, which it will assemble the data streams into the completed network

packet and disassemble it to the proxy in return. Figure 3.2 simplified this concept.

Page 48: Framework for SQL Modification and Analysis

32

Figure 3.2 Overview of the concept of packet assembling and disassembling

Moreover, before the plug-in disassembles the packet and returns it to the proxy

through the interface, it can modify and analyze the packet for particular purposes. The

following section describes the concept of SQL modification of this thesis.

3.3 SQL Modification

The purpose of SQL modification is to accelerate databases processing. By doing this,

firstly a database client and the corresponding database use the same database protocol for

their communication, for instance, to connect, query, access, and retrieve data. Next the

proxy, that works in the middle, where the client and the server connect and send data

through, takes the proxy plug-in (which is the DB Protocol in red box in the previous Figure)

that understands the database protocol as well.

Back to the previous section, to modify SQL, after the plug-in finishing assembling

the data streams and having the completed packet, it finds SQL in the packet. In case the

plug-in did not find SQL statement, it does nothing but passes the packet for the next step.

Otherwise it extracts the SQL statement and the related information from the packet and

sends that information to the above layer named Customer Code. Then, in this case, the

Customer Code connects and forwards that information to the dimensio server. After that the

dimensio server gives the new SQL statement and its related information back to the

Customer Code and it forwards them again back to the plug-in to replace the original SQL.

Page 49: Framework for SQL Modification and Analysis

33

Next the plug-in compares the original information with the new information from the above

layer. If both information are different, the plug-in modifies the original SQL statement in the

packet with the new SQL statement. If both information are the same, the plug-in does

nothing for the modification but passes the packet to the next step. Figure 3.3 illustrates the

SQL modification concept.

Figure 3.3 SQL modification concept

Note that the proxy and the dimensio server are tools provided by dimensio

informatics GmbH. Another point is that the proxy plug-in, the DB Protocol in red, does not

generate the better SQL statement by itself, but it lets the task to be of the above layers,

Customer Code and the dimensio server, and takes the statement from them.

With the new SQL statement and its related information, such as binding variables,

the database server can perform the answer or result set processing for the query faster than

the original one, because not only the new SQL is simpler and faster to process but also has

the same result set output.

Page 50: Framework for SQL Modification and Analysis

34

For the further step before the plug-in disassembles the modified packet into many

tiny packets or data streams, it can analyze the packet and gain information useful for the

further SQL timing analysis which is explained in more detail in the following section.

3.4 SQL Timing Analysis

The purpose of SQL timing analysis is to estimate cost of each SQL statement. Yet

this can provide useful information to find the least cost SQL to replace the costly SQL that

both have the same results.

But this thesis does not work directly on SQL timing analysis. It just provides

information for the analysis instead, such as SQL statement, timestamp, call type, and

binding variables etc.

To provide timing information of each SQL statement for the analysis, this thesis

roughly classifies timing types or client calls to time into five groups, which is prep, bind,

exec, fetch, and free, as shown in Figure 3.4.

Figure 3.4 SQL timing analysis concept

The abbreviations and arrows in the above figure can be described as follows.

Firstly, prep is a prepared statement and happens only once.

Next, bind supplies parameters for the prepared statement and exec executes the

statement. bind and exec can both happen many times and equal, that is the same statement

with different parameters. In the figure, it uses n times representing the number of

occurrences of them as equal.

Page 51: Framework for SQL Modification and Analysis

35

Then fetch receives resultsets from database. One exec can cause many occurrences

of fetch.

free closes or cancels the handle of the statement and happens only once.

The arrows represent sending to database server or client. To the arrow to the right is

to database server and to the arrow to the left is to database client.

However, different database protocols have different styles of calls concerning to

SQL queries. One single query can consist of many calls: prep, bind, and exec altogether, for

example.

Back to the previous section, whether or not the packet is modified according to the

new SQL statement, the next step is to analyze the packet and gain information useful for the

SQL timing analysis. The information to gain from the packet sent from database client and

provide for the analysis are client port, call type, statement ID, timestamp, SQL statement,

and parameters. After that the packet will be sent to the step of disassembling packet, which

will be disassembled into many fragmented packets. Then these packets will be sent back to

the layer Proxy and it will forward them to the destination database server.

Figure 3.5 Overview of gaining SQL timing information with the framework

Page 52: Framework for SQL Modification and Analysis

36

In gaining the information provided for the analysis, the plug-in has to cooperate

information with both sides, client and server. This is shown in Figure 3.5. Moreover, the

information that gain from reply packet sent from the database server are client port, call

type, statement ID, timestamp, number of columns in table, and number of records in table.

Now the thesis has described what it has to do and its related concepts. In the next

chapter, the thesis will show how to implement these concepts in more details.

Page 53: Framework for SQL Modification and Analysis

37

Page 54: Framework for SQL Modification and Analysis

38

Chapter 4 Implementation

This chapter explains techniques and practical information on how to implement the

four concepts in the chapter 3, which is arranged from the framework to the SQL timing

analysis as the following sections.

4.1 The Framework

This section explains the method which the thesis uses to solve the problematic

structure and builds the target structure, the framework.

Figure 4.1 The overview of the framework developed from a problematic structure

In Figure 4.1, the thesis solves the problematic structure (merged form) into the

modular form by using polymorphism technique, inheritance technique, and abstraction in

object-oriented programming (OOP). Polymorphism technically means that the same name of

method (or function) has ability to perform many behaviors (or different functionalities).

Inheritance is a programming technique that enables a class to inherit some or all attributes

and behaviors of another class, and abstraction is a way for a super class (base class) to

require a subclass (derived class) to define a method. Furthermore, this thesis uses

Page 55: Framework for SQL Modification and Analysis

39

programming language C++ to develop the framework. In C++, the term superclass is called

base class and subclass is called derived class [KG04].

To solve the problematic structure, the thesis first creates a new class, which is

abstract class, and names it ―ProxyPlugin‖. Then create more three classes inheriting the

ProxyPlugin as their single base class. These three classes are named ―PPluginOracle‖,

―PPluginSQLServer‖, and ―PPluginDB2‖ respectively. Figure 4.2 demonstrates the

relationship of these classes.

Figure 4.2 Three database protocol plugins inherit the ProxyPlugin abstract class

The class ―ProxyPlugin‖ is called abstract class because it is defined to contain virtual

function, it is not complete class, and is more a guideline for creating actual class. The

sample code in C++ for creating the abstract class and the class that inherits it is shown here

in Figure 4.3.

Page 56: Framework for SQL Modification and Analysis

40

Figure 4.3 Creating the abstract class and its derived class in C++

The reason why this thesis defines the ProxyPlugin abstract class and defines the three

database plugin classes to inherit the abstract class because this requires all classes inheriting

the abstract class to have the standard interfaces and prevent these inherited classes from

forgetting to define their own functionality for the standard interfaces , since each database

protocol has different style to perform in order to get the same thing, such as encoding data

within its packet. As shown in Figure 4.3, ProxyPlugin declares the abstract function

‗standardInterface‘, which is indicated with the special ‗=0‘ postfix and has no function body

or functionality. The class PPluginOracle that inherits the class ProxyPlugin is required to

define the abstraction function, otherwise this will cause compiler errors.

After finished the class creation, this thesis removes all the DB Analyst code from the

Proxy code and put the removed code into the right database plugin class. For example, as

mentioned that the current proxy was specialized on Oracle database, after removed the DB

Analyst code, then the thesis put the code into ―PPluginOracle‖ class.

After finished the class creation and the code rearrangement, the problematic structure

(merged form) now becomes the modular form. With this modular form, each DB protocol

plugin class is independent to each other, easier to maintain, modify existing code, and be

reused for further DB protocol development, and has a clearly defined interfaces with the

Page 57: Framework for SQL Modification and Analysis

41

Proxy. The design and implementation of the interfaces between the proxy and DB protocol

plugin are explained in the following section.

Figure 4.4 shows a sample code, that an instance of each DB protocol plugin is

selected to be declared by the value of variable ‗selectDB‘ in the main function. (The main

function is the entry point into the program.) In this case, the thesis assumes that this main

function is the proxy body code where it declares an instance of DB protocol plugin and calls

the plugin‘s function. After declared the instance of PPluginOracle, the instance is called to

perform the standardInterface function which is specialized for Oracle database.

Figure 4.4 The instance of PPluginOracle is declared in main function using C++

Next is to transform the modular form into the framework as shown in Figure 4.1. In

the framework has topmost layer named Customer Code, which is the layer of tools and

solutions for database performance tuning. In doing this, they have to associate with the

middle layer named DB Protocol. To solve issues on the association between the layer

Customer Code and the layer DB Protocol, the thesis creates an abstract class between them.

This abstract class is the collection of different interfaces that the layer DB Protocol will call,

provide parameters to, and may take return parameter back from, to perform its services for

the above layer. The customer code, which is the above layer, then has to create its own class

Page 58: Framework for SQL Modification and Analysis

42

which inherits the abstract class, and develop its own functionalities to fulfil the service

interfaces in the class. The service interfaces, which this thesis provides for the layer

Customer Code, are the SQL modifying interface and the SQL timing analysis information

interface, which will be explained in more details in the following sections in this chapter.

In action, this thesis creates the abstract class named ―DBProtocolPlugin‖ and creates

a derived class named ―Customer‖ which inherits the abstract class as its base class. The

relationship between these two classes is demonstrated in Figure 4.5.

Figure 4.5 The relationship between the class DBProtocolPlugin and the class Customer

The Customer class is an example class that inherits DBProtocolPlugin class in order

to use services or interfaces supported by proxy plugin and then considering what to do with

the services, such as developing functionalities for analysing and optimising SQL queries.

This can also mean that other class can inherit the DBProtocolPlugin class and develop

different functionalities than the Customer class.

Sample code, that creates the abstract class DBProtocolPlugin and the derived class

Customer, is shown in Figure 4.6.

Page 59: Framework for SQL Modification and Analysis

43

Figure 4.6 Sample code for creating the class DBProtocolPlugin and the class Customer

To declare an instance of the class Customer in DBProtocol code and call its

interface, looks something like Figure 4.7.

Figure 4.7 Declare a Customer instance and call its interface in DB Protocol code

In case the derived class Customer does not want its interfaces to be accessed from

DB protocol code, it is possible to change the access specifier of virtual functions that it

inherited from its base class to be protected or private, as shown in Figure 4.8.

Page 60: Framework for SQL Modification and Analysis

44

Figure 4.8 Change the access specifier of the functions inherited from base class

However, although the access specifier of the interfaces in the derived class have

changed to be protected or private, DB protocol code can still access the interfaces in the

class Customer via the base class of the class Customer as the sample code in Figure 4.9.

Figure 4.9 Access the private interface of a derived class via its base class

After finished the class creation between the layer DB Protocol and the layer

Customer, the modular form now becomes the targeted structure of this thesis, the

framework. With this framework, dimensio informatics GmbH is able to use it as a basic tool

for other tools and solutions, such as with dimensio server, to the performance tuning of

databases and applications. This can be done via the abstraction layers named ―ProxyPlugin‖

and ―DBProtocolPlugin‖. The abstraction layer ―ProxyPlugin‖ makes each of database

Page 61: Framework for SQL Modification and Analysis

45

protocol modular in the framework and the abstraction layer ―DBProtocolPlugin‖ makes the

performance applications from dimensio informatics GmbH or from their partners modular

for database protocol, like Oracle, IBM DB2, and SQLServer. The modularity makes

components and layers in the framework more manageable and helps them connect, interact,

and exchange resources (data) in some way, by adhering to standardized interfaces.

Figure 4.10 The relationship and interaction between layers in the framework

The framework is reviewed as Figure 4.10. The layer Proxy (colored in white)

connects, interacts, and exchanges data with the DBProtocol layer (colored in red) via a

standardized interface in the ProxyPlugin abstraction layer (colored in yellow). Also, the

DBProtocol layer connects, interacts, and exchanges data with the Customer layer (colored in

red) via a standardized interface in the DBProtocolPlugin abstraction layer (colored in

yellow). The design and development of standardized interfaces for the ProxyPlugin

abstraction layer are explained in the section The ProxyPlugin Abstraction Layer, and

standardized interfaces of the DBProtocolPlugin abstraction layer are explained in section

The DBProtocolPlugin Abstraction Layer respectively in this chapter.

Page 62: Framework for SQL Modification and Analysis

46

4.2 The ProxyPlugin Abstraction Layer

This section explains the design and development of standardized interfaces of the

ProxyPlugin abstraction layer in Figure 4.10. These interfaces have to enable the Proxy layer

and the DBProtocol layer to connect, interact, and exchange data with each other in order to

achieve their tasks. The tasks of the Proxy are to open a connection with database client and

server, receive data streams from database client, receive data streams from database server,

and close the connection. The tasks of DBProtocol are to gain information of the connection,

assemble and disassemble data streams from database client, assemble and disassemble data

streams from database server, and return reserved resources when the connection is closed.

Therefore the standardized interfaces have to be provided for the Proxy and DBProtocol to

enable them to work together accordingly.

The interfaces to be developed and provided for them are onConnection,

onDisconnection, onReceiveClient, and onReceiveServer. Figure 4.11 shows the class

diagram that the base class ProxyPlugin standardizes the interfaces and the three derived

classes inherit it to develop functionalities for the standardized interfaces.

Figure 4.11 The three DB protocol classes develop their functionalities on the

standardized interfaces of their base class

Page 63: Framework for SQL Modification and Analysis

47

Figure 4.12 Sample code of how to declare ProxyPlugin object and use it

The Proxy will instantiate an object of the ProxyPlugin class, define its type, and

invoke each of its interfaces which is relevant to the event occurs in the program (see sample

code in Figure 4.12). The description of each of the interfaces and its relevant event is

described as follows.

onConnection interface will be called when a database client connects to the proxy

and the proxy establishes the connection with the destination database server. When this

interface is called, the proxy will give necessary information concerning the connection (such

as listening port, database host name, database service port, dimensio address, dimensio port)

to the interface. After that, the DB protocol object will perform its functionality to initialize

its variables and its initializing functions. These processes are simplified in Figure 4.13.

Page 64: Framework for SQL Modification and Analysis

48

Figure 4.13 The flow chart associated with onConnection interface

The abstract interface of onConnection is shown in Figure 4.14.

Figure 4.14 The pure virtual function of onConnection interface

onReceiveClient interface is called when the proxy receives data streams from the

database client. Technically speaking, these data streams are disassembled from one packet

by the database client. After the proxy received data streams, it calls this interface to forward

these data streams to the DB protocol object. The proxy will repeat these processes until the

DB client has no more data streams to send for the packet. Next the proxy will wait until it

gets returned data streams back from the DB protocol object. Finally, the proxy forwards the

returned data streams to the destination database server. Figure 4.15 shows overview of this

process.

Page 65: Framework for SQL Modification and Analysis

49

Figure 4.15 The overview of the processes for the interface onReceiveClient

Resume to the step that the proxy receiving data streams from the DB client and

forwarding them to the interface onReceiveClient. Meanwhile the DB protocol object will be

collecting and organizing these data streams until it is able to assemble them into one

complete packet. While the proxy is waiting for returned data streams from the DB protocol

object, the DB protocol object is performing its functionality to serve interfaces of an object

of the above layer, Customer, which will be described in the following section. After the DB

protocol object exchanged information with the Customer object and served services for the

object, it disassembles the packet into many data streams and returns these data streams back

to the proxy.

Page 66: Framework for SQL Modification and Analysis

50

Figure 4.16 The overview of tasks of the DB protocol object

To collect, organize, and assemble data streams into one complete packet again, each

DB protocol has different rule to perform these processes. Although they are different, they

have some procedures in common when working with the proxy via the interface

onReceiveClient (see Figure 4.16).

The abstract interface of onReceiveClient is shown in Figure 4.17.

Page 67: Framework for SQL Modification and Analysis

51

Figure 4.17 The pure virtual function of onReceiveClient interface

onReceiveServer interface will be called when the database server sends data streams

to the proxy. The overview tasks of the proxy and the DB protocol object with this interface

are similar to the interface onReceiveClient. The differences that this interface has, to the

interface onReceiveClient, are that it changes the direction of sending data streams from the

database client to the database server in another direction. The further changes are in the step

of performing functionalities for the Customer object, which will be described in more details

in the next section.

The abstract interface of onReceiveServer is shown in Figure 4.18.

Figure 4.18 The pure virtual function of onReceiveServer interface

onDisconnection interface will be called when the database client or the database

server closes or terminates the connection. When this occurs, the proxy will call this interface

to let the DB protocol object knows. After that, the DB protocol object will return resources

to the systems and so does the proxy (see Figure 4.19).

Page 68: Framework for SQL Modification and Analysis

52

Figure 4.19 The flow chart associated with onDisonnection interface

The abstract interface of onDisonnection is shown in Figure 4.20.

Figure 4.20 The pure virtual function of onReceiveClient interface

In the following section, this thesis describes how to design and implement the

interfaces for SQL modification and analysis and explains what the DB protocol object does

after it finished the step of assembling packet.

4.3 The DBProtocolPlugin Abstraction Layer

This section explains how to implement the SQL modification concept and the SQL

timing analysis concept in the chapter 3, by designing and developing standardized interfaces

of the DBProtocolPlugin abstraction layer in Figure 4.10. These interfaces will be called to

enable the DBProtocol layer and the Customer layer to connect, interact, and exchange data

with each other in order to achieve their tasks. Some of the tasks of the DBProtocol are

Page 69: Framework for SQL Modification and Analysis

53

already explained in the previous section. The further tasks of the DBProtocol are to forward

information for connecting to dimensio server, fetch and modify an SQL statement in the

packet, gain and inform useful information for SQL timing analysis in the packet sent by the

database client, gain and inform useful information for SQL timing analysis in the packet sent

by the database server, and inform the connection is terminated to the above layer, Customer.

The tasks of the Customer are to work accordingly with the DB protocol layer, which are to

connect with dimensio server, modify an SQL statement, analyze SQL timing information

from database client side, analyze SQL timing information from database server side, and

disconnect from dimensio server.

The interfaces to be developed and provided for the DBProtocol layer and the

Customer layer are onConnectToDimensio, onChangeSql, onClientSqlTime,

onServerSqlTime, and onDisconnectFromDimensio. Figure 4.21 illustrates the class diagram

that the base class DBProtocolPlugin standardizes the interfaces and the derived class

Customer inherits it to develop functionalities for the standardized interfaces. In the figure,

the bold and italic depict abstract class, the only italic depicts abstract function, the only bold

depicts concrete class, and the normal depicts concrete function.

Page 70: Framework for SQL Modification and Analysis

54

Figure 4.21 The class DBProtocolPlugin is inherited by the Customer class

The DBProtocol will instantiate an object of the DBProtocolPlugin class, define its

concrete class, and invoke each of its interfaces which is relevant to the event occurs in the

program (see sample code in Figure 4.22).

Figure 4.22 Sample code of how to declare DBProtocolPlugin object and use it

The description of each of the interfaces and its relevant event is described as follows.

onConnectToDimensio interface will be called right after the DBProtocol object

received information for connecting dimensio server from the proxy. The DBprotocol object

will call this interface and give the information as the parameter. After that the Customer

object will get the information and initialize its variables. In case the Customer object wants

to connect to dimensio server for database performance tuning, it will do so by using the

information. These steps are simplified in Figure 4.23.

Page 71: Framework for SQL Modification and Analysis

55

Figure 4.23 The work flow for onConnectToDimensio interface

The abstract interface of onConnectToDimensio is shown in Figure 4.24.

Figure 4.24 The pure virtual function of onConnectToDimensio interface

onChangeSql interface is used to exchange SQL statement between the DBProtocol

object and the Customer object for changing SQL statement in the packet. This interface is

concerned directly to the concept of SQL modification in the chapter 3. To change SQL

statement in the packet, firstly the DBProtocol object has to finish assembling the packet as

mentioned in the previous section. Secondly the object analyzes and finds SQL statement in

the packet. If it found SQL statement, it call onChangeSql interface to give the found SQL

statement to the Customer object. After that the Customer object will modify the SQL

statement by working with dimensio server. Then the Customer object will return the new

Page 72: Framework for SQL Modification and Analysis

56

SQL statement to the DBProtocol object. Finally, the DBprotocol object modifies the SQL

statement in the packet with the new one. See Figure 4.25 for the flow chart of these steps.

The abstract interface of onChangeSql is shown in Figure 4.26.

In the argument list of onChangeSql interface in Figure 4.26, there is a struct variable

type named CBindVar (see Appendix B). The DBProtocol object will find binding parameter

and provide it to supply CBindVar in the argument list of the interface if the SQL statement

type in the packet is stored procedure or prepared statement.

Figure 4.25 The flow chart for onChangeSql interface

Page 73: Framework for SQL Modification and Analysis

57

Figure 4.26 The pure virtual function of onChangeSql interface

onClientSqlTime interface and onServerSqlTime interface are used for providing SQL

timing information for the Customer to perform its functionality for SQL timing analysis.

These two interfaces are concerned directly to the concept of SQL timing analysis in the

chapter 3. The processes related to SQL timing analysis are explained in step-by-step. First

the DB Protocol object assembles the packet. Second, the object analyses the packet to gain

useful information for the SQL timing analysis. If the packet is a request from database

client, the information to gain from it are client port, call type, statement ID, timestamp, SQL

statement, and parameters. If the packet is a reply from database server, the information to

gain from it are client port, call type, statement ID, timestamp, number of columns in table,

and number of records in table. After that, the DBProtocol calls onClientSqlTime interface

and onServerSqltime interface, which depends on where the packet is sent from. If the packet

is sent from database client, the DBProtocol object calls onClientSqlTime interface.

Otherwise the object calls onServerSqlTime interface to provide the information for the

Customer object. Finally, the Customer object uses the information to analyze time usage of

SQL. These steps for the SQL timing analysis are shown in Figure 4.27.

Page 74: Framework for SQL Modification and Analysis

58

Figure 4.27 The flow chart for onClientSqlTime interface and onServerSqlTime

interface

The abstract interface of onClientSqlTime is shown in Figure 4.28 and the abstract

interface of onServerSqlTime is shown in Figure 4.29.

Figure 4.28 The pure virtual function of onClientSqlTime interface

Page 75: Framework for SQL Modification and Analysis

59

Figure 4.29 The pure virtual function of onServerSqlTime interface

To provide SQL timing information for the analysis, the DBProtocol object has to

organize information it gained from both sides, database client and server, and find the

relationship of the information between them. All information that the DBProtocol object

gains from request packet and reply packet, and provides for the Customer object via the two

interfaces, are explained as follows.

client port tells which client is communicating with the database server.

call type defines that the packet is a request from client or a reply from

database and tells that which calls the packet consists of.

statement ID tells which ID number is used to refer to the SQL statement.

time stamp tells at what time the packet is received.

SQL is an SQL statement.

parameter tells parameter values supplying to the SQL statement. The values

will be supplied if database client executes its query by using stored procedure or

prepare statement.

#col is the number of columns in result table.

#row is the number of rows in result table.

The information which are used to refer and find relationship of the query and reply

between database client and server are client port, statement ID, and call type.

In more details, client port tells which client the database server is currently

communicating with, statement ID tells the communications between the database client and

Page 76: Framework for SQL Modification and Analysis

60

server are to fulfill which query. call type helps to predict and expect what the next call

should be.

Next chapter will show the result of using these interfaces in the framework for

modifying SQL statement in request packet and analyzing SQL timing information.

Page 77: Framework for SQL Modification and Analysis

61

Page 78: Framework for SQL Modification and Analysis

62

Chapter 5 Results

This chapter shows the output of the framework design illustrated as class diagram.

Furthermore, this chapter demonstrates testing cases by using the framework of this thesis for

SQL modification and SQL timing analysis. There are four testing cases of SQL

modification. The four cases are the demonstration of SQL modification for Oracle, the

demonstration of SQL modification for DB2, the demonstration of SQL modification for

Microsoft SQL Server, and the demonstration of SQL modification for Oracle with dimensio

server. Additionally, two testing cases are for SQL analysis.

Page 79: Framework for SQL Modification and Analysis

63

5.1 Framework for SQL Analysis and Modification

Figure 5.1 The class diagram of the final design of the framework

This section shows the final design of the framework for SQL analysis and

modification for the three database protocols, that is TNS for Oracle, TDS for Microsoft

(MS) SQL Server, and DRDA for IBM DB2. The final design of the framework is

demonstrated in class diagram as Figure 5.1. The Proxy exchanges information with an

instance of the class PPluginOracle, PPluginSQLServer, or PPluginDB2 via the standardized

interface of the abstract class ProxyPlugin. An instance of the class PPluginOracle,

Page 80: Framework for SQL Modification and Analysis

64

PPluginSQLServer, or PPluginDB2 exchanges information with an instance of Customer via

the standardized interface of the abstract class DBProtocolPlugin.

After this thesis developed functionalities into the interfaces of the four derived

classes, that are PPluginOracle, PPluginSQLServer, PPluginDB2, and Customer, the

framework is now ready to be tested for the SQL modification and analysis, as shown in the

following sections.

5.2 Demonstration for SQL modification

This section demonstrates the results of using the framework for SQL modification

for Oracle database, MS SQL Server, IBM DB2, and Oracle database with dimensio server in

the following subsections respectively. Additionally, each subsection describes the

specification of its test.

The table to be used for all testing cases in this chapter is EMP table as shown in

Table 5.1.

Table 5.1 EMP table

EMPNO ENAME HIREDATE TAX

901 Martin 01-FEB-06 70.5

902 Andreas 01-MAR-06 70.5

903 Alexander 01-JUN-07 75.5

904 Dirk 01-DEC-08 70.5

905 Hartmut 01-MAY-09 50.5

906 xxxxxx 01-APR-00 1000.5

5.2.1 The SQL modification with Oracle

The specification of this testing case is detailed in Table 5.2.

To perform this testing case, first, run the database server. Second, set the the the

proxy to use PPluginOracle. Third, set the original SQL statement and the new SQL

Page 81: Framework for SQL Modification and Analysis

65

statement in the proxy. Then, run the proxy to connect with the running database server. After

that run the database client to connect the database server via the proxy. Later, make the

database client sending SQL command same as the original SQL statement to the database

server. The result of this testing case, which is detailed in Table 5.2, is demonstrated in

Figure 5.2.

Table 5.2 The specification for testing the SQL modification with Oracle

Title Description

Database client SQL*Plus: Release 11.2.0.2.0

Database server Oracle Database 11g Enterprise Edition Release 11.2.0.1.0-64bit

Original SQL statement SELECT * FROM EMP WHERE ename = 'Alexander';

New SQL statement SELECT * FROM EMP;

Figure 5.2 The result of the SQL modification testing by the specification in Table 5.2

Page 82: Framework for SQL Modification and Analysis

66

With same specification in Table 5.2, the thesis changes the database client from

―SQL*Plus: Release 11.2.0.2.0‖ to ―Oracle SQL Developer (3.0.03)‖ and runs this testing

case. The result is demonstrated in Figure 5.3.

Figure 5.3 The testing case with DB client Oracle SQL Developer 3.0.03

With same specification in Table 5.2, the thesis changes the database client from

―SQL*Plus: Release 11.2.0.2.0‖ to ―Toad for Oracle Xpert Trial Version 11.0.0.116‖ and

runs this testing case. The result is demonstrated in Figure 5.4.

Page 83: Framework for SQL Modification and Analysis

67

Figure 5.4 The testing case with Toad for Oracle Xpert Trial Version 11.0.0.116

5.2.2 The SQL modification with Microsoft SQL Server

The specification of this testing case is detailed in Table 5.3.

Table 5.3 The specification for testing the SQL modification with MS SQL Server

Title Description

Database client A DB client developed by using Microsoft Visual Basic 2010

Express, working with the SqlClient data provider for SQL

Server.

Database server Microsoft SQL Server 2008 (RTM) - 10.0.1600.22 (Intel X86)

Jul 9 2008 14:43:34 Copyright (c) 1988-2008 Microsoft

Corporation Express Edition with Advanced Services on

Windows NT 5.1 <X86> (Build 2600: Service Pack 3)

Page 84: Framework for SQL Modification and Analysis

68

Original SQL statement SELECT * FROM EMP WHERE ename = 'Alexander';

New SQL statement SELECT * FROM EMP;

Figure 5.5 The result of the SQL modification testing by the specification in Table 5.3

To perform this testing case, first, run the database server. Second, set the the the

proxy to use PPluginSQLServer. Third, set the original SQL statement and the new SQL

statement in the proxy. Then, run the proxy to connect with the running database server. After

that run the database client to connect the database server via the proxy. Later, make the

database client sending SQL command same as the original SQL statement to the database

server. The result of this testing case, which is detailed in Table 5.3, is demonstrated in

Figure 5.5.

5.2.3 The SQL modification with IBM DB2

The specification of this testing case is detailed in Table 5.4.

Page 85: Framework for SQL Modification and Analysis

69

Table 5.4 The specification for testing the SQL modification with IBM DB2

Title Description

Database client A DB client developed by using Microsoft Visual Basic 2010

Express, working with the OLE-DB data provider for IBM

DB2.

Database server IBM DB2 v10.1.0.872

Original SQL statement SELECT * FROM EMP WHERE ename = 'Alexander';

New SQL statement SELECT * FROM EMP;

To perform this testing case, first, run the database server. Second, set the the the

proxy to use PPluginDB2. Third, set the original SQL statement and the new SQL statement

in the proxy. Then, run the proxy to connect with the running database server. After that run

the database client to connect the database server via the proxy. Later, make the database

client sending SQL command same as the original SQL statement to the database server. The

result of this testing case, which is detailed in Table 5.4, is demonstrated in Figure 5.6.

Page 86: Framework for SQL Modification and Analysis

70

Figure 5.6 The result of the SQL modification testing by the specification in Table 5.4

5.2.4 The SQL modification for Oracle with dimensio server

This section demonstrates the SQL modification for Oracle with dimensio server. The

specification is detailed in Table 5.5.

Table 5.5 The specification for testing the SQL modification with Oracle and dimensio

Title Description

Database client Oracle SQL Developer Version 3.0.04 Build MAIN-04.34

Database server Oracle Database 11g Express Edition 11.2.0.2.0 64bit

Production

Computer Specification CPU: Intel(R) Core(TM) i5 CPU 660 @ 3.33 GHz

Page 87: Framework for SQL Modification and Analysis

71

RAM: 4 GB DDR3 Speed 1333 MHz

HD: 500 GB

OS: Linux Debian Squeeze

Database Size 10 GB

Tables and number of

records

h_orders: 15,000,000 records

h_customer: 1,500,000 records

h_nation: 25 records

h_region: 5 records

h_lineitem: 59,986,052 records

h_part: 2,000,000 records

h_supplier: 100,000 records

Original SQL statement SELECT

l.l_key,

o.o_orderstatus,

o.o_orderpriority,

c.c_mktsegment,

l.l_shipinstruct,

l.l_shipmode,

l.l_returnflag,

l.l_discount,

l.l_quantity,

r.r_name,

r2.r_name,

s.s_acctbal

FROM

h_orders o,

h_customer c,

h_nation n,

h_region r,

h_lineitem l,

h_part p,

Page 88: Framework for SQL Modification and Analysis

72

h_supplier s,

h_nation n2,

h_region r2

WHERE

o.o_custkey = c.c_custkey

AND c.c_nationkey = n.n_nationkey

AND n.n_regionkey = r.r_regionkey

AND o.o_orderkey = l.l_orderkey

AND l.l_partkey = p.p_partkey

AND l.l_suppkey = s.s_suppkey

AND s.s_nationkey = n2.n_nationkey

AND n2.n_regionkey = r2.r_regionkey

AND o.o_orderstatus IN ('F', 'P')

AND o.o_orderpriority IN ('2-HIGH')

AND c.c_mktsegment IN ('BUILDING')

AND l.l_shipinstruct IN ('COLLECT COD')

AND l.l_shipmode IN ('MAIL')

AND l.l_returnflag IN ('R', 'N')

AND l.l_discount BETWEEN 0 AND 0.1

AND l.l_quantity BETWEEN 6 AND 47

AND r.r_name IN ('AFRICA')

AND r2.r_name IN ('AMERICA')

AND s.s_acctbal BETWEEN 2115.2 AND 2263.02

ORDER BY L_KEY

;

New SQL statement with

indexes received from the

dimensio server

SELECT

l.l_key,

o.o_orderstatus,

o.o_orderpriority,

c.c_mktsegment,

l.l_shipinstruct,

l.l_shipmode,

l.l_returnflag,

l.l_discount,

l.l_quantity,

r.r_name,

r2.r_name,

s.s_acctbal

FROM

h_orders o,

h_customer c,

h_nation n,

h_region r,

h_lineitem l,

h_part p,

h_supplier s,

h_nation n2,

h_region r2

Page 89: Framework for SQL Modification and Analysis

73

WHERE (L_KEY IN (18232998, 32061977, 43501819,

42104195, 36778379, 36698067, 35345872, 30416162,

26899799, 19619936, 16728562, 16648956, 13748343,

13672428, 38038313))

AND

o.o_custkey = c.c_custkey

AND c.c_nationkey = n.n_nationkey

AND n.n_regionkey = r.r_regionkey

AND o.o_orderkey = l.l_orderkey

AND l.l_partkey = p.p_partkey

AND l.l_suppkey = s.s_suppkey

AND s.s_nationkey = n2.n_nationkey

AND n2.n_regionkey = r2.r_regionkey

AND o.o_orderstatus IN ('F', 'P')

AND o.o_orderpriority IN ('2-HIGH')

AND c.c_mktsegment IN ('BUILDING')

AND l.l_shipinstruct IN ('COLLECT COD')

AND l.l_shipmode IN ('MAIL')

AND l.l_returnflag IN ('R', 'N')

AND l.l_discount BETWEEN 0 AND 0.1

AND l.l_quantity BETWEEN 6 AND 47

AND r.r_name IN ('AFRICA')

AND r2.r_name IN ('AMERICA')

AND s.s_acctbal BETWEEN 2115.2 AND 2263.02

ORDER BY L_KEY

;

The result of the SQL modification for Oracle with dimensio server is shown in

Figure 5.7.

Page 90: Framework for SQL Modification and Analysis

74

Figure 5.7 The result of the connection to Oracle database via the proxy and dimensio

server by the specification in Table 5.5

The result of direct connection to Oracle database with the original SQL statement,

which without connection to the proxy and dimensio server is shown in Figure 5.8.

Page 91: Framework for SQL Modification and Analysis

75

Figure 5.8 The result of direct connection to Oracle database by the specification in

Table 5.5

As demonstrated, the result of direct connection to Oracle database took 32.106

seconds to process, while the result of connection to Oracle database via the proxy and

dimensio server took 0.05 second to process. The factor of improvement was 642 faster.

5.3 Demonstration for SQL analysis

This section demonstrates using the framework for SQL timing analysis especially for

two testing cases. The first one is tested with MS SQL Server just to provide timing

information for the Customer, and the second one is tested with Oracle database not only to

provide timing information for the Customer but also develop a functionality of it to

demonstrate SQL timing analysis.

Page 92: Framework for SQL Modification and Analysis

76

5.3.1 The SQL Analysis with Microsoft SQL Server

This subsection demonstrates the use of the proxy with Microsoft SQL Server for the

SQL timing analysis. But this thesis does not develop functionality at the Customer for that

analysis, it just observes network traffic between a database client the destination database

server and gains useful information for the Customer. This testing case for the SQL analysis

has the specification as detailed in Table 5.6.

Figure 5.9 shows the logfile of this testing case that keeps SQL timing analysis

information after the DB client and the DB server finished processing the query.

Table 5.6 The specification for testing the SQL analysis with MS SQL Server

Title Description

Database client A DB client developed by using Microsoft Visual Basic 2010

Express, working with the SqlClient data provider for SQL

Server.

Database server Microsoft SQL Server 2008 (RTM) - 10.0.1600.22 (Intel X86)

Jul 9 2008 14:43:34 Copyright (c) 1988-2008 Microsoft

Corporation Express Edition with Advanced Services on

Windows NT 5.1 <X86> (Build 2600: Service Pack 3)

Original SQL statement SELECT * FROM EMP WHERE empno=@v_no and

ename=@v_name and tax=@v_tax;

Page 93: Framework for SQL Modification and Analysis

77

PPluginSQLServer ...

PP@@@60809@@@req_SP_PREPEXEC_@@@1347052836.332432@@@0@@@SELEC

T * FROM EMP WHERE empno=@v_no and ename=@v_name and

tax=@v_tax@@@:1=903 :2=Alexander :3=75.5

PP@@@60809@@@ans_SP_PREPEXEC_@@@1347052836.333186@@@0@@@4@@

@1@@@0

Figure 5.9 The SQL timing analysis information when working with MS SQL Server

5.3.2 The SQL Analysis with Oracle database

This subsection demonstrates the use of the proxy with Oracle database for the SQL

timing analysis. Additionally with this testing case for the SQL timing analysis, this thesis

has not only provided SQL timing analysis information by observing network traffic

between a database client the destination database server and gaining useful information for

the Customer, but also developed functionality for the Customer for that analysis. This testing

case for the SQL analysis has the specification as detailed in Table 5.7.

Figure 5.10 shows the logfile of this testing case that keeps SQL timing analysis

information after the DB client and the DB server finished processing the query.

Table 5.7 The specification for testing the SQL analysis with Oracle database

Title Description

Database client Oracle SQL Developer 3.0.03

Database server Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 -

64bit

Query VARIABLE v_no NUMBER;

VARIABLE v_name VARCHAR2(10);

VARIABLE v_tax FLOAT;

Page 94: Framework for SQL Modification and Analysis

78

EXECUTE :v_no :=903;

EXECUTE :v_name :='Alexander';

EXECUTE :v_tax :=75.5;

SELECT * FROM EMP WHERE empno=:v_no and

ename=:v_name and tax=:v_tax;

PPluginOracle ...

PP@@@32791@@@req_prep_bind_exec_@@@1346992948.608229@@@0@@@SELEC

T * FROM EMP WHERE empno=:v_no and ename=:v_name and

tax=:v_tax@@@:1=903 :2=Alexander :3=75.5

PP@@@32791@@@ans_prep_bind_exec_free_@@@1346992948.609630@@@4@@@4

@@@1@@@1403

Figure 5.10 The SQL timing analysis information when working with Oracle database

The information orders in Figure 5.10 are arranged differently depending on where

the packet is sent from. If the packet is sent from database client, the information are arranged

in form 1 and its call_type has prefix ‗req_‘, otherwise they are organized in form 2 and its

call_type has prefix ‗ans_‘. The two forms are shown in Figure 5.11.

Form1:

PP@@@<client_port>@@@<call_type>@@@<timestamp>@@@<statement_i

d>@@@<SQL_statement>@@@<parameters>

Form2:

PP@@@<client_port>@@@<call_type>@@@<timestamp>@@@<statement_i

d>@@@<#col>@@@<#row>@@@<additional_info>

Figure 5.11 The two arrangement forms of SQL timing analysis information

Page 95: Framework for SQL Modification and Analysis

79

Additionally, Figure 5.12 shows a sample of the result after the Customer finished

performing the functionality for the SQL timing analysis by using the information.

The full length for the SQL timing analysis information in Figure 5.11 and the full

length of the result of the SQL timing analysis in Figure 5.12 are put in Appendix A of this

thesis.

In the next chapter discusses the results of this thesis work, some of obstacles it

encountered, and further research and improvements.

Figure 5.12 The sample result of SQL timing analysis with Oracle database

Page 96: Framework for SQL Modification and Analysis

80

Chapter 6 Outlook

This chapter summarizes the results of the thesis work, discusses obstacles this thesis

encountered, and speaks more about further research and improvements.

As the results, the thesis has finished the implementation of the framework for SQL

modification and analysis. Moreover the results have proved that the framework is able to

support development of different database protocols; like TNS for Oracle database, TDS for

Microsoft SQL Server, and DRDA for IBM DB2, to modify SQL statement, especially

SELECT SQL statement, in network traffic. But providing the information for the SQL

timing analysis, this thesis has demonstrated only two testing cases that work with Oracle

database and Microsoft SQL Server. Furthermore the thesis has developed sample

functionalities on the Customer using the information for SQL timing analysis and displaying

the result of the analysis.

The challenges that this thesis dealing with were complexities of database protocols,

the framework and its interfaces design. The most complicated database protocol that this

thesis had dealt with was TNS for Oracle database because the protocol is not open standard

protocol as TDS for Microsoft SQL Server and DRDA for IBM DB2. Therefore, the thesis

had to do several times of trials and errors in order to gain the exact meaning of each bit each

byte each column in network traffic sent between Oracle database client and server. This is

time consuming and can delay the design of interfaces of the framework for SQL

modification and analysis. Since, the thesis has to find common characteristics among

database protocols in order to develop standardized interfaces in the framework.

Page 97: Framework for SQL Modification and Analysis

81

6.1 Future work

Possible further improvement of the framework is to measure overheads, such as

proxy overhead and network overhead in different circumstances, e.g. when dealing with

many client connections at the same time etc. Next is to extend functionalities, such as

modifying properties related to SQL statements such as binding variables, generating

resultset independently from databases, supporting further database protocols, e.g. MySQL,

PostgreSQL, Sybase, Informix etc, changing the database protocol of client's requests or

database's responses. At last but not least, the final goal that can be suggested for further

research is to replace databases completely with the fully developed framework.

Page 98: Framework for SQL Modification and Analysis

82

Bibliography

[Pho11] P. Phoonsarakun, "Extending a Proxy to Analyze and Modify Data Streams

based on the Oracle Database Protocol," 2011.

[Drda11] DRDA V5 Vol. 1: Distributed Relational Database Architecture (DRDA). The

Open Group, July 2011.

[Tds11] [MS-TDS] Tabular Data Stream Protocol Specification. Microsoft Corporation,

December 2011.

[CV10] J. Conrad and J. Viescas, Microsoft Access 2010 Inside Out. O'Reilly,

September 1, 2010.

[SZT12] B. Schwartz, P. Zaitsev, and V. Tkachenko, High Performance MySQL:

Optimization, Backups, and Replication. O'Reilly, March 30, 2012.

[Fla00] R. M. Flannery, Informix Handbook. Prentice Hall PTR, August 14, 2000.

[GAS96] J. Garbus, W. E. al, and D. Solomon, Sybase SQL Server 11 Unleashed. Sams,

April 16, 1996.

[OHR11] L. H. P. R. Regina Obe, PostGIS in Action. Manning Publications; . Pap/Psc

edition, April 28, 2011.

[Cod70] E. F. Codd, "A Relational Model of Data for Large Shared Data Banks," vol.

13, no. 6, p. 377–387, 1970.

[CB74] D. D. Chamberlin and R. F. Boyce, "SEQUEL: A Structured English Query

Language," in The 1974 ACM SIGFIDET Workshop on Data Description,

Access and Control (Association for Computing Machinery), 1974, p. 249–264.

[Opp04] A. Oppel, "Databases Demystified," in McGraw-Hill Osborne Media , San

Francisco CA, 2004, p. 90–91.

[Ope12] (2012) GreenSql. [Online]. http://www.greensql.net/

[Ora77] (1977) Oracle Timeline. [Online].

http://oracle.com.edgesuite.net/timeline/oracle/

[IBM83] (1983) History of IBM. [Online]. http://www-

03.ibm.com/ibm/history/history/year_1983.html

[ANSI12] ANSI. [Online]. www.ansi.org

[ISO12] ISO. [Online]. www.iso.org

[GS09] J. Goodson and R. A. Steward, The Data Access Handbook: Achieving Optimal

Database Application Performance and Scalability. Prentice Hall, March 2009.

[IS99] "Database Language SQL—Part 2: Foundation (SQL/Foundation)," 1999.

[Rub07] C. Rubinson, "Nulls, Three-Valued Logic, and Ambiguity in SQL: Critiquing

Date‘s Critique," SIGMOD Record, December 2007.

[Dat05] C. J. Date, Database in Depth: Relational Theory for Practitioners. O‘Reilly,

2005.

Page 99: Framework for SQL Modification and Analysis

83

[ISO08] (2008, Jul.) ISO/IEC 9075(1-4,9-11,13,14) :2008. [Online]. www.iso.org

[KK10] J. Kurose and K. Ross, Computer Networking: a Top-Down Approach. Pearson,

2010.

[Tho06] K. Thomas, Beginning Ubuntu Linux: From Novice to Professtional. Apress,

2006, A proxy server helps speed up Internet access by stroing frequently

accessed pages.

[Hes12] K. Hess. Top 10 Enterprise Database Systems to Consider. [Online].

http://www.serverwatch.com/trends/article.php/3883441/Top-10-Enterprise-

Database-Systems-to-Consider.htm

[TG00] H. Toledo and J. Gennick, Oracle Net8 Configuration and Troubleshooting.

O‘Reilly, December 2000,

(http://oreilly.com/catalog/net8trouble/chapter/ch01.html).

[Ddm11] DRDA V5 Vol. 3: Distributed Data Management (DDM) Architecture. The

Open Group, July 2011.

[Fdoca11] DRDA V5 Vol. 2: Formatted Data Object Content Architecture (FD:OCA). The

Open Group , July 2011.

[GW09] L. Guo and H. Wu, Design and Implementation of TDS Protocol Analyzer.

IEEE, 2009.

[dim12] (2012) dimensio: Light Speed Data Access. [Online]. http://www.dimensio-

informatics.com

[Gre12] (2012) Product Overview. [Online]. http://www.greensql.com

[ADM09] M. AboElFotoh, T. Dean, and R. Mayor, "An Empirical Evaluation of a

Language-Based Security Testing Technique," 2009.

[KG04] J. Keogh and M. Giannini, OOP Demystified. McGraw-Hill/Osborne, 2004.

[HG04] M. A. Harding and E. Granada, "Method and Apparatus for Reducing Latency

Involved in Retrieving Web Page Components," U.S. Patent UNKNOWN,

2004.

[Sha86] M. Shapiro, "Structure and encapsulation in distributed systems: the Proxy

Principle," in Int. Conf. on Distributed Computer Sys, 1986, p. 198–204.

[CRS99] A. Cohen, S. Rangarajan, and N. Singh, "Supporting Transparent Caching with

Standard Proxy Caches," in 4th International Web Caching Workshop, 1999.

[KCZGD06] K. Kent, Z. Chevalier, T. Grance, and H. Dang, "Guide to Integrating Forensic

Techniques into Incident Response," in NIST Special Publication, 2006, pp.

800-86.

[Cas04] E. Casey, "Network traffic as a source of evidence: tool strengths, weaknesses,

and future needs," Elsevier Digital Investigation, pp. 28-43, 2004.

[Abb08] A. Abbasi, Oracle 10g Database Administration Concepts and implementation

Made Simple. 2008.

Page 100: Framework for SQL Modification and Analysis

84

Page 101: Framework for SQL Modification and Analysis

85

Biography

Pinaet Phoonsarakun is a student research assistant at dimensio informatics GmbH,

Chemnitz, Germany. He received his Bachelor degree in technical education, computer

technology from King Mongkut‘s University of Technology North Bangkok (KMUTNB) in

2008 with the first class honour. His research focuses on database optimization, database and

network protocols, and software engineering.

Page 102: Framework for SQL Modification and Analysis

86

Page 103: Framework for SQL Modification and Analysis

87

Appendix

Appendix A: The Result of the SQL Timing Analysis

The full length of the SQL timing analysis information of Figure 5.10 is demonstrated

in Figure A.1 and the full length of the result of SQL timing analysis of Figure 5.12 is

demonstrated in Figure A.2.

PPluginOracle ...

PP@@@39717@@@req_exec_@@@1347037415.182590@@@0@@@select * from v$version

where banner like '%Oracle%'@@@

PP@@@39717@@@ans_exec_free_@@@1347037415.186112@@@3@@@1@@@1@@@1403

PStatist:timeSqlFromServer:mTimeList.size(): 1

PP@@@39717@@@req_exec_@@@1347037415.187331@@@0@@@select USER from dual@@@

PP@@@39717@@@ans_exec_free_@@@1347037415.188364@@@2@@@1@@@1@@@1403

PStatist:timeSqlFromServer:mTimeList.size(): 2

PP@@@39717@@@req_exec_@@@1347037415.189295@@@0@@@alter session set

PLSCOPE_SETTINGS='identifiers:all'@@@

PP@@@39717@@@ans_exec_@@@1347037415.190293@@@3@@@1@@@0@@@0

PStatist:timeSqlFromServer:mTimeList.size(): 3

PP@@@39717@@@req_exec_@@@1347037415.190805@@@0@@@select

sys_context('USERENV','SESSIONID') from dual@@@

PP@@@39717@@@ans_exec_free_@@@1347037415.191789@@@2@@@1@@@1@@@1403

PStatist:timeSqlFromServer:mTimeList.size(): 4

PP@@@39717@@@req_exec_@@@1347037415.192275@@@0@@@select

sys_context('USERENV','SID') from dual@@@

PP@@@39717@@@ans_exec_free_@@@1347037415.193227@@@3@@@1@@@1@@@1403

PStatist:timeSqlFromServer:mTimeList.size(): 5

PP@@@39717@@@req_exec_@@@1347037415.193682@@@0@@@alter session set

PLSQL_OPTIMIZE_LEVEL=0@@@

PP@@@39717@@@ans_exec_@@@1347037415.194525@@@2@@@1@@@0@@@0

PStatist:timeSqlFromServer:mTimeList.size(): 6

PP@@@39717@@@req_exec_@@@1347037415.194926@@@0@@@alter session set

PLSQL_DEBUG=true@@@

PP@@@39717@@@ans_exec_@@@1347037415.195747@@@3@@@1@@@0@@@0

PStatist:timeSqlFromServer:mTimeList.size(): 7

PP@@@39717@@@req_prep_bind_exec_@@@1347037415.196260@@@0@@@select count(1)

from all_objects where object_name = :1 @@@:1=APEX_RELEASE

PP@@@39717@@@ans_prep_bind_exec_free_@@@1347037415.198753@@@2@@@1@@@1@@@140

3

PStatist:timeSqlFromServer:mTimeList.size(): 8

PP@@@39717@@@req_exec_@@@1347037415.199662@@@0@@@select version_no from

apex_release@@@

Page 104: Framework for SQL Modification and Analysis

88

PP@@@39717@@@ans_exec_free_@@@1347037415.201085@@@3@@@1@@@1@@@1403

PStatist:timeSqlFromServer:mTimeList.size(): 9

PP@@@39717@@@req_exec_@@@1347037415.202339@@@0@@@select parameter,value

from nls_session_parameters

union all SELECT 'DB_TIMEZONE' name, DBTIMEZONE value FROM DUAL

union all SELECT 'SESSION_TIMEZONE' name, SESSIONTIMEZONE value FROM DUAL

union all SELECT 'SESSION_TIMEZONE_OFFSET' name, TZ_OFFSET(SESSIONTIMEZONE)

value from DUAL

union all SELECT parameter, value FROM nls_database_parameters WHERE

parameter='NLS_CHARACTERSET' @@@

PP@@@39717@@@ans_exec_@@@1347037415.203498@@@2@@@2@@@10@@@0

PStatist:timeSqlFromServer:mTimeList.size(): 10

PP@@@39717@@@req_fetch_@@@1347037415.204047@@@2@@@@@@

PP@@@39717@@@ans_fetch_@@@1347037415.204742@@@2@@@2@@@10@@@0

PStatist:timeSqlFromServer:mTimeList.size(): 10

PP@@@39717@@@req_fetch_@@@1347037415.205314@@@2@@@@@@

PP@@@39717@@@ans_fetch_free_@@@1347037415.205989@@@2@@@2@@@1@@@1403

PStatist:timeSqlFromServer:mTimeList.size(): 10

PP@@@39717@@@req_exec_@@@1347037415.206466@@@0@@@ALTER SESSION SET

TIME_ZONE = 'Europe/Berlin'@@@

PP@@@39717@@@ans_exec_@@@1347037415.207532@@@3@@@2@@@0@@@0

PStatist:timeSqlFromServer:mTimeList.size(): 11

PP@@@39717@@@req_exec_@@@1347037415.208836@@@0@@@SELECT DBTIMEZONE FROM

DUAL@@@

PP@@@39717@@@ans_exec_@@@1347037415.210210@@@2@@@1@@@1@@@0

PStatist:timeSqlFromServer:mTimeList.size(): 12

PP@@@39717@@@req_exec_@@@1347037415.211955@@@0@@@select parameter,value

from nls_session_parameters

union all SELECT 'DB_TIMEZONE' name, DBTIMEZONE value FROM DUAL

union all SELECT 'SESSION_TIMEZONE' name, SESSIONTIMEZONE value FROM DUAL

union all SELECT 'SESSION_TIMEZONE_OFFSET' name, TZ_OFFSET(SESSIONTIMEZONE)

value from DUAL

union all SELECT parameter, value FROM nls_database_parameters WHERE

parameter='NLS_CHARACTERSET' @@@

PP@@@39717@@@ans_exec_@@@1347037415.213202@@@2@@@2@@@10@@@0

PStatist:timeSqlFromServer:mTimeList.size(): 13

PP@@@39717@@@req_fetch_@@@1347037415.214076@@@2@@@@@@

PP@@@39717@@@ans_fetch_@@@1347037415.215014@@@2@@@2@@@10@@@0

PStatist:timeSqlFromServer:mTimeList.size(): 13

PP@@@39717@@@req_fetch_@@@1347037415.215587@@@2@@@@@@

PP@@@39717@@@ans_fetch_free_@@@1347037415.216228@@@2@@@2@@@1@@@1403

PStatist:timeSqlFromServer:mTimeList.size(): 13

PP@@@39717@@@req_exec_@@@1347037415.217496@@@0@@@select parameter,value

from nls_session_parameters

union all SELECT 'DB_TIMEZONE' name, DBTIMEZONE value FROM DUAL

union all SELECT 'SESSION_TIMEZONE' name, SESSIONTIMEZONE value FROM DUAL

union all SELECT 'SESSION_TIMEZONE_OFFSET' name, TZ_OFFSET(SESSIONTIMEZONE)

value from DUAL

union all SELECT parameter, value FROM nls_database_parameters WHERE

parameter='NLS_CHARACTERSET' @@@

PP@@@39717@@@ans_exec_@@@1347037415.218677@@@3@@@2@@@10@@@0

PStatist:timeSqlFromServer:mTimeList.size(): 14

PP@@@39717@@@req_fetch_@@@1347037415.219512@@@3@@@@@@

Page 105: Framework for SQL Modification and Analysis

89

PP@@@39717@@@ans_fetch_@@@1347037415.223510@@@3@@@2@@@10@@@0

PStatist:timeSqlFromServer:mTimeList.size(): 14

PP@@@39717@@@req_fetch_@@@1347037415.224270@@@3@@@@@@

PP@@@39717@@@ans_fetch_free_@@@1347037415.225006@@@3@@@2@@@1@@@1403

PStatist:timeSqlFromServer:mTimeList.size(): 14

PP@@@39717@@@req_exec_@@@1347037415.253432@@@0@@@select 1 from sys.obj$

where 1=0@@@

PP@@@39717@@@ans_exec_free_@@@1347037415.256283@@@2@@@2@@@0@@@942

PStatist:timeSqlFromServer:mTimeList.size(): 15

PP@@@39717@@@req_prep_bind_exec_@@@1347037418.107234@@@0@@@SELECT * FROM

EMP WHERE empno=:v_no and ename=:v_name and tax=:v_tax@@@:1=903

:2=Alexander :3=75.5

PP@@@39717@@@ans_prep_bind_exec_free_@@@1347037418.108646@@@4@@@4@@@1@@@140

3

PStatist:timeSqlFromServer:mTimeList.size(): 16

PP@@@39717@@@req_exec_@@@1347037422.18475@@@0@@@

SELECT 1 FROM dual

@@@

PP@@@39717@@@ans_exec_free_@@@1347037422.19564@@@3@@@1@@@1@@@1403

PStatist:timeSqlFromServer:mTimeList.size(): 17

PP@@@39717@@@req_exec_@@@1347037422.20607@@@0@@@

SELECT dbms_transaction.local_transaction_id

FROM dual

@@@

PP@@@39717@@@ans_exec_free_@@@1347037422.21857@@@2@@@1@@@1@@@1403

PStatist:timeSqlFromServer:mTimeList.size(): 18

Figure A.1 The full length of Figure 5.10

PStatist:timeDisplay:

******************************

* SQL Timing Analysis Result *

* (re-arranged by time used) *

******************************

=== Order Nr. 0 ===

Total Time Use : 821 (usec)

Statement Nr. : 7

Client Port : 39717

Stmt ID : 3

SQL : alter session set PLSQL_DEBUG=true

Parameters :

= Timing (in usec) =

exec_ : (1) 821

prep_bind_exec_: (0) 0

prep_ : (0) 0

bind_exec_ : (0) 0

fetch_ : (0) 0

fetch_free_ : (0) 0

= resultset =

Total Column : 1

Total Row : 0

Page 106: Framework for SQL Modification and Analysis

90

Message Code : 0

= traffic =

MaxRows/Fetch : 0

Total Packet : 2

=== Order Nr. 1 ===

Total Time Use : 843 (usec)

Statement Nr. : 6

Client Port : 39717

Stmt ID : 2

SQL : alter session set PLSQL_OPTIMIZE_LEVEL=0

Parameters :

= Timing (in usec) =

exec_ : (1) 843

prep_bind_exec_: (0) 0

prep_ : (0) 0

bind_exec_ : (0) 0

fetch_ : (0) 0

fetch_free_ : (0) 0

= resultset =

Total Column : 1

Total Row : 0

Message Code : 0

= traffic =

MaxRows/Fetch : 0

Total Packet : 2

=== Order Nr. 2 ===

Total Time Use : 952 (usec)

Statement Nr. : 5

Client Port : 39717

Stmt ID : 3

SQL : select sys_context('USERENV','SID') from dual

Parameters :

= Timing (in usec) =

exec_ : (1) 952

prep_bind_exec_: (0) 0

prep_ : (0) 0

bind_exec_ : (0) 0

fetch_ : (0) 0

fetch_free_ : (0) 0

= resultset =

Total Column : 1

Total Row : 1

Message Code : 1403

= traffic =

MaxRows/Fetch : 10

Total Packet : 2

=== Order Nr. 3 ===

Total Time Use : 984 (usec)

Statement Nr. : 4

Client Port : 39717

Page 107: Framework for SQL Modification and Analysis

91

Stmt ID : 2

SQL : select sys_context('USERENV','SESSIONID') from dual

Parameters :

= Timing (in usec) =

exec_ : (1) 984

prep_bind_exec_: (0) 0

prep_ : (0) 0

bind_exec_ : (0) 0

fetch_ : (0) 0

fetch_free_ : (0) 0

= resultset =

Total Column : 1

Total Row : 1

Message Code : 1403

= traffic =

MaxRows/Fetch : 10

Total Packet : 2

=== Order Nr. 4 ===

Total Time Use : 998 (usec)

Statement Nr. : 3

Client Port : 39717

Stmt ID : 3

SQL : alter session set PLSCOPE_SETTINGS='identifiers:all'

Parameters :

= Timing (in usec) =

exec_ : (1) 998

prep_bind_exec_: (0) 0

prep_ : (0) 0

bind_exec_ : (0) 0

fetch_ : (0) 0

fetch_free_ : (0) 0

= resultset =

Total Column : 1

Total Row : 0

Message Code : 0

= traffic =

MaxRows/Fetch : 0

Total Packet : 2

=== Order Nr. 5 ===

Total Time Use : 1033 (usec)

Statement Nr. : 2

Client Port : 39717

Stmt ID : 2

SQL : select USER from dual

Parameters :

= Timing (in usec) =

exec_ : (1) 1033

prep_bind_exec_: (0) 0

prep_ : (0) 0

bind_exec_ : (0) 0

fetch_ : (0) 0

Page 108: Framework for SQL Modification and Analysis

92

fetch_free_ : (0) 0

= resultset =

Total Column : 1

Total Row : 1

Message Code : 1403

= traffic =

MaxRows/Fetch : 10

Total Packet : 2

=== Order Nr. 6 ===

Total Time Use : 1066 (usec)

Statement Nr. : 11

Client Port : 39717

Stmt ID : 3

SQL : ALTER SESSION SET TIME_ZONE = 'Europe/Berlin'

Parameters :

= Timing (in usec) =

exec_ : (1) 1066

prep_bind_exec_: (0) 0

prep_ : (0) 0

bind_exec_ : (0) 0

fetch_ : (0) 0

fetch_free_ : (0) 0

= resultset =

Total Column : 2

Total Row : 0

Message Code : 0

= traffic =

MaxRows/Fetch : 0

Total Packet : 2

=== Order Nr. 7 ===

Total Time Use : 1089 (usec)

Statement Nr. : 17

Client Port : 39717

Stmt ID : 3

SQL :

SELECT 1 FROM dual

Parameters :

= Timing (in usec) =

exec_ : (1) 1089

prep_bind_exec_: (0) 0

prep_ : (0) 0

bind_exec_ : (0) 0

fetch_ : (0) 0

fetch_free_ : (0) 0

= resultset =

Total Column : 1

Total Row : 1

Message Code : 1403

= traffic =

MaxRows/Fetch : 50

Page 109: Framework for SQL Modification and Analysis

93

Total Packet : 2

=== Order Nr. 8 ===

Total Time Use : 1250 (usec)

Statement Nr. : 18

Client Port : 39717

Stmt ID : 2

SQL :

SELECT

dbms_transaction.local_transaction_id FROM dual

Parameters :

= Timing (in usec) =

exec_ : (1) 1250

prep_bind_exec_: (0) 0

prep_ : (0) 0

bind_exec_ : (0) 0

fetch_ : (0) 0

fetch_free_ : (0) 0

= resultset =

Total Column : 1

Total Row : 1

Message Code : 1403

= traffic =

MaxRows/Fetch : 50

Total Packet : 2

=== Order Nr. 9 ===

Total Time Use : 1374 (usec)

Statement Nr. : 12

Client Port : 39717

Stmt ID : 2

SQL : SELECT DBTIMEZONE FROM DUAL

Parameters :

= Timing (in usec) =

exec_ : (1) 1374

prep_bind_exec_: (0) 0

prep_ : (0) 0

bind_exec_ : (0) 0

fetch_ : (0) 0

fetch_free_ : (0) 0

= resultset =

Total Column : 1

Total Row : 1

Message Code : 0

= traffic =

MaxRows/Fetch : 1

Total Packet : 2

=== Order Nr. 10 ===

Total Time Use : 1412 (usec)

Statement Nr. : 16

Page 110: Framework for SQL Modification and Analysis

94

Client Port : 39717

Stmt ID : 4

SQL : SELECT * FROM EMP WHERE empno=:v_no and ename=:v_name

and tax=:v_tax

Parameters : :1=903 :2=Alexander :3=75.5

= Timing (in usec) =

exec_ : (0) 0

prep_bind_exec_: (1) 1412

prep_ : (0) 0

bind_exec_ : (0) 0

fetch_ : (0) 0

fetch_free_ : (0) 0

= resultset =

Total Column : 4

Total Row : 1

Message Code : 1403

= traffic =

MaxRows/Fetch : 50

Total Packet : 2

=== Order Nr. 11 ===

Total Time Use : 1423 (usec)

Statement Nr. : 9

Client Port : 39717

Stmt ID : 3

SQL : select version_no from apex_release

Parameters :

= Timing (in usec) =

exec_ : (1) 1423

prep_bind_exec_: (0) 0

prep_ : (0) 0

bind_exec_ : (0) 0

fetch_ : (0) 0

fetch_free_ : (0) 0

= resultset =

Total Column : 1

Total Row : 1

Message Code : 1403

= traffic =

MaxRows/Fetch : 10

Total Packet : 2

=== Order Nr. 12 ===

Total Time Use : 2493 (usec)

Statement Nr. : 8

Client Port : 39717

Stmt ID : 2

SQL : select count(1) from all_objects where object_name = :1

Parameters : :1=APEX_RELEASE

= Timing (in usec) =

exec_ : (0) 0

prep_bind_exec_: (1) 2493

prep_ : (0) 0

Page 111: Framework for SQL Modification and Analysis

95

bind_exec_ : (0) 0

fetch_ : (0) 0

fetch_free_ : (0) 0

= resultset =

Total Column : 1

Total Row : 1

Message Code : 1403

= traffic =

MaxRows/Fetch : 10

Total Packet : 2

=== Order Nr. 13 ===

Total Time Use : 2529 (usec)

Statement Nr. : 10

Client Port : 39717

Stmt ID : 2

SQL : select parameter,value from nls_session_parameters

union all SELECT 'DB_TIMEZONE' name, DBTIMEZONE value FROM DUAL

union all SELECT 'SESSION_TIMEZONE' name, SESSIONTIMEZONE value FROM DUAL

union all SELECT 'SESSION_TIMEZONE_OFFSET' name,

TZ_OFFSET(SESSIONTIMEZONE) value from DUAL

union all SELECT parameter, value FROM nls_database_parameters WHERE

parameter='NLS_CHARACTERSET'

Parameters :

= Timing (in usec) =

exec_ : (1) 1159

prep_bind_exec_: (0) 0

prep_ : (0) 0

bind_exec_ : (0) 0

fetch_ : (1) 695

fetch_free_ : (1) 675

= resultset =

Total Column : 2

Total Row : 21

Message Code : 1403

= traffic =

MaxRows/Fetch : 10

Total Packet : 6

=== Order Nr. 14 ===

Total Time Use : 2826 (usec)

Statement Nr. : 13

Client Port : 39717

Stmt ID : 2

SQL : select parameter,value from nls_session_parameters

union all SELECT 'DB_TIMEZONE' name, DBTIMEZONE value FROM DUAL

union all SELECT 'SESSION_TIMEZONE' name, SESSIONTIMEZONE value FROM DUAL

union all SELECT 'SESSION_TIMEZONE_OFFSET' name,

TZ_OFFSET(SESSIONTIMEZONE) value from DUAL

union all SELECT parameter, value FROM nls_database_parameters WHERE

parameter='NLS_CHARACTERSET'

Parameters :

= Timing (in usec) =

Page 112: Framework for SQL Modification and Analysis

96

exec_ : (1) 1247

prep_bind_exec_: (0) 0

prep_ : (0) 0

bind_exec_ : (0) 0

fetch_ : (1) 938

fetch_free_ : (1) 641

= resultset =

Total Column : 2

Total Row : 21

Message Code : 1403

= traffic =

MaxRows/Fetch : 10

Total Packet : 6

=== Order Nr. 15 ===

Total Time Use : 2851 (usec)

Statement Nr. : 15

Client Port : 39717

Stmt ID : 2

SQL : select 1 from sys.obj$ where 1=0

Parameters :

= Timing (in usec) =

exec_ : (1) 2851

prep_bind_exec_: (0) 0

prep_ : (0) 0

bind_exec_ : (0) 0

fetch_ : (0) 0

fetch_free_ : (0) 0

= resultset =

Total Column : 2

Total Row : 0

Message Code : 942

= traffic =

MaxRows/Fetch : 50

Total Packet : 2

=== Order Nr. 16 ===

Total Time Use : 3522 (usec)

Statement Nr. : 1

Client Port : 39717

Stmt ID : 3

SQL : select * from v$version where banner like '%Oracle%'

Parameters :

= Timing (in usec) =

exec_ : (1) 3522

prep_bind_exec_: (0) 0

prep_ : (0) 0

bind_exec_ : (0) 0

fetch_ : (0) 0

fetch_free_ : (0) 0

= resultset =

Total Column : 1

Total Row : 1

Page 113: Framework for SQL Modification and Analysis

97

Message Code : 1403

= traffic =

MaxRows/Fetch : 10

Total Packet : 2

=== Order Nr. 17 ===

Total Time Use : 5915 (usec)

Statement Nr. : 14

Client Port : 39717

Stmt ID : 3

SQL : select parameter,value from nls_session_parameters

union all SELECT 'DB_TIMEZONE' name, DBTIMEZONE value FROM DUAL

union all SELECT 'SESSION_TIMEZONE' name, SESSIONTIMEZONE value FROM DUAL

union all SELECT 'SESSION_TIMEZONE_OFFSET' name,

TZ_OFFSET(SESSIONTIMEZONE) value from DUAL

union all SELECT parameter, value FROM nls_database_parameters WHERE

parameter='NLS_CHARACTERSET'

Parameters :

= Timing (in usec) =

exec_ : (1) 1181

prep_bind_exec_: (0) 0

prep_ : (0) 0

bind_exec_ : (0) 0

fetch_ : (1) 3998

fetch_free_ : (1) 736

= resultset =

Total Column : 2

Total Row : 21

Message Code : 1403

= traffic =

MaxRows/Fetch : 10

Total Packet : 6

******************************

* End of SQL Timing Analysis *

* (re-arranged by time used) *

******************************

Figure A.2 The full length of Figure 5.12

Page 114: Framework for SQL Modification and Analysis

98

Appendix B: Struct Variables

struct CBindVal

{

long bvl_num;

double bvl_flt;

std::string bvl_str;

tm bvl_ymd;

std::string bvl_rid;

char bvl_byt;

int byt_size;

int type;

/* type

BVL_NUM=number, BVL_FLT=float, BVL_STR=string, VL_YMD=datetime,

BVL_RID=rowid, BVL_BYT=bytes

*/

};

struct CBindVar

{

CBindVal* bVal;

int num;

};

struct CResultSet

{

int totalCol;

int totalRow;

struct CColumn

{

std::string name;

int len;

int type;

};

CColumn* cCol;

struct CRow

{

CBindVal* cVal;

};

CRow* cRow;

};