senagis paper.pdf


  • 7/26/2019 Senagis Paper.pdf


    SOCA

    DOI 10.1007/s11761-015-0186-x

    ORIGINAL RESEARCH PAPER

    An aggregated technique for optimization of SOAP performance in communication in Web services

    Kennedy Mutange Senagi¹ · George Okeyo² · Wilson Cheruiyot² · Michael Kimwele²

    Received: 12 July 2014 / Revised: 20 August 2015 / Accepted: 19 October 2015

    © Springer-Verlag London 2015

    Abstract Simple Object Access Protocol (SOAP) is one of several
    techniques used to implement Web services (WS). SOAP offers a
    lightweight and simple mechanism for the exchange of structured and
    typed information among computing devices in a decentralized,
    distributed computing environment. However, SOAP transmits data in
    Extensible Markup Language (XML) format. XML documents are large and
    verbose, which makes them a major performance hindrance for
    high-performance applications that process large volumes of data. In
    this paper, we develop, implement and evaluate an aggregated
    architecture for SOAP performance optimization in a disadvantaged
    network, i.e., one with 10 Mbps bandwidth. The aggregated architecture
    entailed: client side caching, document-literal Web Services
    Description Language (WSDL) description, simple database queries on the
    server side and the Gzip compression technique. The experimental
    results showed a relatively high turnaround time and low network
    throughput. Nevertheless, improved SOAP performance is evident in
    terms of bandwidth utilization and transfer time. This can be useful
    in disadvantaged networks.

    Kennedy Mutange Senagi (corresponding author)
    [email protected]

    George Okeyo
    [email protected]

    Wilson Cheruiyot
    [email protected]

    Michael Kimwele
    [email protected]

    1 Department of Information Technology, Dedan Kimathi

    University of Technology, Nyeri, Kenya

    2 Department of Computing, Jomo Kenyatta University

    of Agriculture and Technology, Nairobi, Kenya

    Keywords SOA · Web services · SOAP · XML · SOAP optimization ·
    Disadvantaged network and SOAP evaluation

    1 Introduction

    The Internet is growing very fast, and it has become an important tool
    for communication, providing services and sharing information. Many
    organizations, institutions and individuals have embraced the Internet
    in many ways, e.g., e-commerce, blogs, etc. Such services can be
    provided on the Internet by using Service-Oriented Architecture (SOA),
    Web-Oriented Architecture (WOA), etc. [1]. Various technologies
    implement SOA, including Common Object Request Broker Architecture
    (CORBA) [2], Java Remote Method Invocation (RMI) [3], Component Object
    Model (COM) [4] and Web services (WS) [5].

    Implementing WS using SOAP makes them platform independent and able to
    pass through firewalls without being blocked. Firewalls by default
    allow traffic through port 80, which Hypertext Transfer Protocol
    (HTTP) uses for communication. SOAP messages are encapsulated within
    HTTP and are typically transmitted using the HTTP POST operation. HTTP
    is a universal standard in the World Wide Web (WWW). Thus, WS work
    across heterogeneous systems, which makes them stand out among
    comparable technologies that are monolithic in nature [6,7].

    SOAP packages the actual messages transmitted in a communication
    channel and relies on XML to format them. WS can process
    XML-formatted SOAP messages. XML and SOAP rely on their own service
    description to deal with the service-specific characteristics of the
    messages received [6].
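To make the packaging described above concrete, the following is a minimal sketch of how a SOAP message wraps application data in an XML envelope and how a receiver unwraps it. It is an illustration only, not the prototype used in this paper: the application namespace `http://example.org/stock`, the operation name and the parameter names are hypothetical, and only the SOAP 1.1 envelope namespace is taken from the standard.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.org/stock"  # hypothetical application namespace


def build_envelope(operation: str, params: dict) -> bytes:
    """Wrap an operation and its parameters in a SOAP Envelope/Body."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        child = ET.SubElement(op, f"{{{APP_NS}}}{name}")
        child.text = str(value)  # every value travels as text
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


def parse_envelope(message: bytes):
    """Extract the operation name and its (string) parameters from a SOAP message."""
    root = ET.fromstring(message)
    body = root.find(f"{{{SOAP_NS}}}Body")
    op = body[0]  # first child of Body is the operation element
    operation = op.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in op}
    return operation, params


if __name__ == "__main__":
    msg = build_envelope("GetQuote", {"symbol": "ABC", "count": 3})
    print(msg.decode("utf-8"))
    print(parse_envelope(msg))
```

Note that the integer `3` comes back as the string `"3"`: converting such strings back to in-memory data types is part of the deserialization cost discussed in this section.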

    SOAP provides a lightweight and simple mechanism for the exchange of
    structured and typed information in decentralized, distributed
    environments using XML. XML has a redundant annotated data structure
    [6]. SOAP's XML-based message structure is verbose, which results in
    high network traffic and heavy XML parsing and processing. This causes
    a high computational burden, leading to high latency. SOAP's XML-based
    message format therefore hinders its performance and makes it
    unsuitable for high-performance scientific applications. The
    deserialization of SOAP messages, which comprises processing of XML
    data and conversion of strings to in-memory data types, is one of the
    major performance drawbacks in a SOAP message exchange [7,8].

    The main contribution of this paper is a novel technique that
    optimizes SOAP performance in a disadvantaged network. The technique
    aggregates: client side caching, document-literal WSDL description,
    simple database queries on the server side and Gzip compression. This
    technique improved the compression ratio, which enhanced bandwidth
    utilization. It also improved transfer time, which reduced the time to
    send messages from client to server and vice versa.

    The rest of this paper is organized as follows: Sect. 2 discusses
    related work, Sect. 3 presents the developed architecture and its
    implementation, Sect. 4 provides experimental results and discussion,
    and Sect. 5 concludes this paper.

    2 Related work

    The dependence of SOAP on XML for messaging is the major hindrance to
    performance for high-performance applications. There is a lot of
    research on optimizing SOAP performance in WS communication.

    One approach is client side caching. It has been embraced to improve
    traffic and latency between a service and underlying data providers
    [9–11]. Client side caching can store data temporarily in the Internet
    browser or in JavaScript data structures. In client side caching, data
    are stored temporarily by the client side browser on the local disk or
    in the Web browser's internal memory. Some browsers have a limited
    amount of storage space, which becomes a problem when the limits are
    exceeded [11]. Differential Serialization (DS) avoids serializing the
    whole message structure. In DS, once a serialized message has been
    sent by a SOAP communication endpoint, the client saves the message so
    that it can be reused as a template by subsequent messages. Subsequent
    messages that have the same structure or are identical can reuse the
    structure and avoid the serialization overhead involved in
    regenerating the structures from scratch. This technique improves
    response time [7,8,12–15].
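The template reuse behind Differential Serialization can be sketched as follows. This is a deliberately simplified illustration and not the algorithm of the cited works: the class name `TemplateSerializer`, the counter `full_serializations` and the use of a format string as the saved template are all assumptions made for this example (published DS implementations patch the saved serialized message in place rather than re-filling a format string).

```python
from xml.sax.saxutils import escape


class TemplateSerializer:
    """Simplified, hypothetical illustration of Differential Serialization."""

    def __init__(self):
        self._templates = {}           # structure key -> saved message template
        self.full_serializations = 0   # how often the structure was built from scratch

    def _build_template(self, operation, field_names):
        # Expensive path: generate the whole message structure once.
        self.full_serializations += 1
        fields = "".join(f"<{n}>{{{n}}}</{n}>" for n in field_names)
        return (
            '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
            f"<soap:Body><{operation}>{fields}</{operation}></soap:Body>"
            "</soap:Envelope>"
        )

    def serialize(self, operation, params):
        # Messages with the same operation and field names share one template.
        key = (operation, tuple(params))
        template = self._templates.get(key)
        if template is None:
            template = self._build_template(operation, key[1])
            self._templates[key] = template
        # Cheap path: only the changed values are written into the template.
        return template.format(**{n: escape(str(v)) for n, v in params.items()})


if __name__ == "__main__":
    serializer = TemplateSerializer()
    print(serializer.serialize("GetQuote", {"symbol": "ABC"}))
    print(serializer.serialize("GetQuote", {"symbol": "XYZ"}))
    print("full serializations:", serializer.full_serializations)
```

The second call reuses the structure saved by the first, so the structure is generated only once for the whole sequence of similar messages.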

    Moreover, WSDL describes the public interface to a specific Web
    service. The WSDL binding describes how the service is bound to the
    SOAP messaging protocol [16]. Common SOAP binding styles are
    RPC-encoded and document-literal. RPC has more overhead than
    document-literal [1]. Phil et al. [1] and IBM [11] recommended
    adoption of document-literal over RPC style. RPC style requires 15%
    more time than document-literal [17]. Experiments built on the
    document-literal encoding style were faster than RPC [18].

    Compression was also noted to be a promising solution for handling
    SOAP's large and verbose XML messages. A lossless compression
    algorithm exploits statistical redundancy to represent the sender's
    data more concisely without errors. Lossless compression is well
    suited to SOAP's XML messages, which contain redundant text [18].
    Lossless compression algorithms include Gzip, Bzip2, Fast Infoset (FI)
    and Efficient XML Interchange (EXI) [18–20]. The Gzip compression
    algorithm was recommended for disadvantaged networks [19].
    Compression improved bandwidth utilization and response time of SOAP
    messages [9,10,18,19,21,22]. One compression trade-off is the
    additional CPU processing time that compression requires [23].
    However, with increasing hardware processing capabilities, this
    trade-off is beneficial compared to the cost of increasing bandwidth,
    which is widely constrained [7,21].

    Server side caching improves response time. It is a SOAP performance
    optimization technique explored and supported by [9,11,24]. Server
    side caching differs slightly from client side caching in that data
    are temporarily stored in serialized objects [11]. In server side data
    chunking, the client specifies the range of data in the request. The
    server then composes the chunk and returns it. This improves the
    performance of loading large amounts of data by doing so in chunks
    [10,25]. Moreover, in server side caching, database caching gives a
    better way of modifying data. It eliminates the need to parse an
    entire XML structure, which is computationally expensive [11,26].
    Another SOAP performance optimization technique is Differential
    Deserialization (DDS), which works on the server side. The DDS
    technique has been supported by [8,9,13,14,27,28]. Deserialization is
    an expensive process that involves the conversion of SOAP's XML
    messages to application objects [14,27,29]. Because of the relatively
    high memory requirements experienced in DDS, checkpointing was
    introduced [28]. Moreover, [29] introduced Differential Checkpointing
    (DCP). DCP optimized DDS by improving its speed and reducing its
    memory requirements. DDS is somewhat similar to Differential
    Serialization (DS): both take advantage of a sequence of similar
    messages.
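The server side data chunking described above, in which the client names a range and the server composes and returns only that chunk, can be sketched as below. The function names `get_chunk` and `fetch_all`, the `has_more` flag and the in-memory list standing in for a database table are hypothetical choices for this illustration and are not taken from the cited works.

```python
def get_chunk(rows, start, count):
    """Server side: compose and return only the requested range of rows."""
    if start < 0 or count <= 0:
        raise ValueError("start must be >= 0 and count must be > 0")
    chunk = rows[start:start + count]
    return {
        "start": start,
        "rows": chunk,
        "has_more": start + len(chunk) < len(rows),  # tells the client to ask again
    }


def fetch_all(rows, chunk_size):
    """Client side: load a large result set by requesting one chunk at a time."""
    loaded, start = [], 0
    while True:
        response = get_chunk(rows, start, chunk_size)
        loaded.extend(response["rows"])
        if not response["has_more"]:
            return loaded
        start += chunk_size


if __name__ == "__main__":
    table = [f"record-{i}" for i in range(10)]  # stand-in for a database table
    print(get_chunk(table, 0, 4))
    print(len(fetch_all(table, 4)))
```

Each response stays small regardless of the total result size, which is the property that makes chunking attractive on a constrained link.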

    3 Developed architecture and implementation

    This research came up with an aggregation of the following techniques:
    client side caching, document-literal WSDL description, simple
    database queries on the server side and the Gzip compression
    technique. We adopted and modified the architecture proposed by [9] in
    aggregating these techniques. Figure 1 shows the aggregated SOAP
    performance optimization architecture.

    Fig. 1 Aggregated architecture for SOAP performance optimization. It
    entails client side caching, simple server side database queries,
    compression technique and document-literal style description of WSDL

    The aggregated architecture shown in Fig. 1 was realized in a software
    prototype. The software prototype had three major components: a client
    side component, a server side component and a database component. The
    client side was developed in Ext JS (a JavaScript library) embedded in
    an Active Server Pages (ASP.NET) Web page. The server side was
    implemented as an ASP.NET .asmx Web service. C# was the back-end
    programming language for both client side and server side programming.
    The ASP.NET Web services were run both compressed and uncompressed,
    i.e., with and without the Gzip compression algorithm. The database
    component was built on MySQL server.

    We set up an experiment in which we varied the file size of the
    responses from the Web server while holding the other conditions of
    the experiment environment constant. The file (a SOAP-based XML
    message) from the server contained pure textual data. The experiment
    environment was as follows: a Windows Server operating system
    installed on the server computer and Windows XP installed on three
    client computers. Network bandwidth was set to 10 Mbps to simulate a
    disadvantaged network. All computers had 1 GB Random Access Memory
    (RAM) and 3.2 GHz processor speed. Client and server computers were
    interconnected with a switch. Microsoft Internet Explorer was used as
    the default browser. Fiddler Web Debugger [30] was used to profile the
    client, thus collecting actual performance statistics in the network.
    NetBalance Software was used to limit bandwidth to 10 Mbps. Figure 2
    shows client-server communication.

    4 Experimental results and discussion

    Experimental results and discussions are categorized as: compression
    ratio percentage, time to transfer SOAP messages, time to process SOAP
    messages, Round Trip Time (RTT) and throughput.

    4.1 Compression ratio percentage

    Figure 3 shows a line graph illustrating the change in compression
    ratio percentage against file sizes of the SOAP response. The SOAP
    messages contained text data. The general trend of this graph shows
    the line sloping to the right. This indicates that smaller file sizes
    exhibit better compression ratios than large file sizes after being
    compressed with the Gzip algorithm. Note that a 100% compression ratio
    means perfect compression, while 0% compression means total
    compression failure. The average compression ratio percentage observed
    was 67.01%. The SOAP-based XML message data structure is verbose,
    i.e., it has redundant textual characteristics and uses tags to
    delimit data [6,8,9]. Lossless compression algorithms, e.g., Gzip,
    exploit statistical redundancy to represent data in a compressed
    format [19].

    Nevertheless, large files have more redundant textual characters and
    use more tags to delimit data [30]. Therefore, the Gzip compression
    algorithm collected more statistical information for large file sizes,
    which made the large file sizes portray larger compression ratio
    percentages as compared to smaller file sizes [31].

    Fig. 2 Client-server communication. This shows how a SOAP request is
    composed on the client side and then sent to the server via the
    communication channel. The server processes the SOAP request, composes
    a SOAP response and then forwards it to the client computer, which
    receives it, decompresses it and renders it in the Web browser

    Fig. 3 Change in compression ratio percentage against file sizes of
    SOAP response
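A compression ratio percentage of the kind discussed in this subsection can be measured with a few lines of code. The sketch below assumes the percentage is the space saving, (1 - compressed size / original size) x 100, which matches the statement that 100% means perfect compression and 0% means no gain; the paper does not spell out its exact formula, so this definition is an assumption. The helper `make_soap_response` and its row layout are hypothetical, and the numbers it produces are not the paper's measurements.

```python
import gzip


def compression_ratio_percent(payload: bytes) -> float:
    """Space saving achieved by Gzip, as a percentage of the original size."""
    if not payload:
        raise ValueError("payload must not be empty")
    compressed = gzip.compress(payload)
    return (1 - len(compressed) / len(payload)) * 100


def make_soap_response(n_rows: int) -> bytes:
    """Build a hypothetical, tag-heavy SOAP response with n_rows text records."""
    rows = "".join(
        f"<row><id>{i}</id><name>item {i}</name></row>" for i in range(n_rows)
    )
    return (
        '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
        f"<soap:Body><GetItemsResponse>{rows}</GetItemsResponse></soap:Body>"
        "</soap:Envelope>"
    ).encode("utf-8")


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        payload = make_soap_response(n)
        ratio = compression_ratio_percent(payload)
        print(f"{len(payload):>8} bytes -> {ratio:.2f}% saved")
```

Running the loop on payloads of different sizes is one way to reproduce a curve like the one in Fig. 3 for a given data set.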

    4.2 Time to transfer SOAP messages

    Figure 4 presents a graph depicting the change in transfer time
    against file size for compressed and uncompressed SOAP requests. The
    trend of the graph in Fig. 4 shows that both the compressed and
    uncompressed lines rise steadily, indicating that smaller files have a
    smaller transfer time than larger files. The line representing
    compressed runs below uncompressed. This indicates that uncompressed
    files take much longer to transfer than compressed files. Gzip
    compression reduced the size of a file being transferred by an average
    compression ratio percentage of 67.01%. Large files exhibit a larger
    transfer time than smaller files because smaller files have fewer
    bytes, which take less time to be transferred in a communication
    channel. This can theoretically be derived from the bandwidth
    equation (4.1).

    Bandwidth (Mbps) = megabits transferred / seconds (4.1)

    Fig. 4 Change in transfer time against file size of compressed and
    uncompressed SOAP requests
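Rearranging Eq. (4.1) gives an idealized transfer time: the number of bits to send divided by the bandwidth. The sketch below applies this to a hypothetical 1,000,000-byte response on the 10 Mbps link, before and after the 67.01% average size reduction reported above. It ignores latency, protocol headers and congestion, so the figures are illustrative rather than measured values; the function name and the example file size are assumptions.

```python
def transfer_time_seconds(size_bytes: float, bandwidth_mbps: float) -> float:
    """Idealized transfer time: bits to send divided by link bandwidth."""
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth must be positive")
    bits = size_bytes * 8                      # 1 byte = 8 bits
    return bits / (bandwidth_mbps * 1_000_000)  # 1 Mbps = 1,000,000 bits per second


if __name__ == "__main__":
    bandwidth = 10            # Mbps, the disadvantaged network in the experiment
    original = 1_000_000      # bytes, hypothetical uncompressed SOAP response
    saving = 0.6701           # average compression ratio reported in Sect. 4.1
    compressed = original * (1 - saving)

    print(f"uncompressed: {transfer_time_seconds(original, bandwidth):.5f} s")
    print(f"compressed:   {transfer_time_seconds(compressed, bandwidth):.5f} s")
```

Under these assumptions the uncompressed message needs 0.8 s on the wire and the compressed one roughly a third of that, which is consistent with the compressed line running below the uncompressed line in Fig. 4.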

    4.3 Time to process SOAP messages

    Figure 5 shows a line graph representing the change in time to process
    a SOAP request against the change in file size for compressed and
    uncompressed messages. Processing time is the time difference from
    when the server got the request to the time the server began to
    compose the SOAP response [30]. It mainly comprises the time to run
    the SQL query and the time involved in compressing the SOAP message.

    The graph shows the lines representing compressed and uncompressed
    rising moderately. This indicates that an increase in file size
    results in an increase in the time to process a SOAP request for both
    compressed and uncompressed messages. Nevertheless, the uncompressed
    line runs below the compressed line. This shows that compressed
    messages take more time to process than uncompressed ones. This could
    be attributed to the fact that compression has trade-offs, e.g., extra
    Central Processing Unit (CPU) processing time on the server side.

    Fig. 5 Change in time to process SOAP request against change in file
    size of ASP.NET compressed and ASP.NET uncompressed

    4.4 Round trip time

    Figure 6 shows a graph describing the change in round trip time
    against file sizes for compressed and uncompressed messages. The
    general trend of the graph shows that all the lines rise steadily.
    This indicates that large files exhibit a higher RTT than smaller
    files. Moreover, the line representing compressed runs higher than
    that of uncompressed, indicating that compressed files recorded a
    higher RTT than uncompressed files. Equation (4.2) was used to
    calculate RTT. It shows that the transfer time discussed in Sect. 4.2
    and the processing time discussed in Sect. 4.3 both contribute to the
    overall RTT. RTT is the turnaround time of the SOAP message.

    RTT = Transfer time + Processing time (4.2)

    4.5 Throughput

    Figure 7 shows a line graph outlining the change in throughput against
    the change in file size for compressed and uncompressed messages. The
    general trend of the graph shows that the lines representing
    compressed and uncompressed both slope to the right. This indicates
    that smaller file sizes exhibit better throughput values than larger
    file sizes. Nevertheless, the compressed line runs slightly below the
    uncompressed line. This shows that uncompressed messages exhibit a
    better throughput value than compressed ones. Throughput was
    calculated using Eq. (4.3), which measures the number of requests
    processed by the Web server per second. Throughput is highly affected
    by the round trip time. As discussed in Sect. 4.4, compressed messages
    recorded a higher RTT than uncompressed messages, which resulted in
    uncompressed exhibiting better throughput values than compressed. The
    slope becomes significantly smaller when the file size is greater than
    2367 KB for both ASP.NET compressed and ASP.NET uncompressed. This is
    because the change in throughput diminishes as the file size
    increases.

    Throughput = Requests / RTT (4.3)

    Fig. 6 Change in round trip time against file sizes of ASP.NET
    compressed and uncompressed requests

    Fig. 7 Change in throughput against change in file size of compressed
    and uncompressed
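Equations (4.2) and (4.3) can be combined in a short calculation that shows why a compressed message can transfer faster yet still yield lower throughput. The timing values below are hypothetical placeholders chosen only to illustrate the arithmetic; they are not the measurements behind Figs. 6 and 7, and the function names are assumptions.

```python
def round_trip_time(transfer_time: float, processing_time: float) -> float:
    """Eq. (4.2): RTT is the sum of transfer time and processing time."""
    return transfer_time + processing_time


def throughput(requests: int, rtt: float) -> float:
    """Eq. (4.3): requests handled per unit of round trip time."""
    if rtt <= 0:
        raise ValueError("RTT must be positive")
    return requests / rtt


if __name__ == "__main__":
    # Hypothetical timings in seconds for one request of a given file size.
    uncompressed_rtt = round_trip_time(transfer_time=0.80, processing_time=0.20)
    compressed_rtt = round_trip_time(transfer_time=0.26, processing_time=0.90)

    print(f"uncompressed: RTT={uncompressed_rtt:.2f} s, "
          f"throughput={throughput(1, uncompressed_rtt):.2f} req/s")
    print(f"compressed:   RTT={compressed_rtt:.2f} s, "
          f"throughput={throughput(1, compressed_rtt):.2f} req/s")
```

With these placeholder values, the extra server side processing outweighs the saving in transfer time, so the compressed case has the higher RTT and the lower throughput, mirroring the relationship reported in Sects. 4.4 and 4.5.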

    5 Conclusion and future work

    In this paper, we developed an aggregated architecture that entailed:
    client side caching, simple server side database queries, compression
    technique and document-literal style description of WSDL. From our
    experimental results, we recorded an improved compression ratio, which
    was significant for better bandwidth utilization and SOAP transfer
    time. This resulted in improved speeds while SOAP messages are in
    transit in the communication channel. However, a relatively high
    turnaround time and low network throughput were recorded.
    Notwithstanding, this research recommends the aggregated architecture
    for disadvantaged networks that have bandwidth speed of