Performance Issues of Web Services

CSCI 8710, November 29-30, 2006

Kraemer

Web Services

Services available via the Internet that complete tasks or conduct transactions.
Self-contained, modular applications that can be described, published, and invoked over the Internet.
Can be automatically invoked by application programs.
May be invoked at one site, or may combine the results of several services executed at different sites.

Performance concerns differ from standard client/server (C/S)

May involve both web service processing and network delays
May be accessed by a wide variety of devices: desktop computers, PDAs, mobile phones, other servers
Access via wireless communication networks: dynamic connectivity, low bandwidth, high latency
Unpredictable nature of requests
Highly bursty
Varies with geographical location of clients, day of week, time of day
Highly variable size of requested objects
“Robot” access
    Autonomous software agents that can consume significant amounts of system resources

Types of servers providing Web Services

Web servers
Transaction servers
Proxy servers
Cache servers
Wireless gateway servers
Mirror servers

Common problems

Insufficient bandwidth at peak times
Overloaded servers
Uneven server loads
Delivery of dynamic content
Shortage of connections between application servers and database servers
Failure of third-party servers
Delivery of multimedia content

Example: Bill Paying Service

Portal offers bill paying service
Customers can pay a variety of bills through the service
Uses services provided by others:
    Debit authorization (100 tps capability)
    Electronic funds transfer
    Customer authentication

Example: Bill Paying Service, continued

Portal B is the bill paying service
Treat the overall web service as the ‘system’
Treat the component services as ‘devices’
What is the capacity of B, given that the debit authorization service can support 100 tps and that each payment transaction requires 2 visits to the debit authorization service?

Forced Flow Law: Xi = Vi * X0
100 = 2 * X0
X0 = 50 tps
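
The capacity calculation above can be sketched in Python. This is a minimal illustration, assuming each component service is described by a hypothetical (capacity in tps, visits per transaction) pair. Only the debit authorization figures come from the example; the function and variable names are made up for illustration.

```python
def max_system_throughput(services):
    """Upper bound on system throughput X0 from the relation Xi = Vi * X0.

    Each service i must sustain Vi * X0 requests per second, so X0 cannot
    exceed capacity_i / visits_i for any service. The tightest bound wins.
    """
    return min(capacity / visits for capacity, visits in services.values())


# Only the debit authorization numbers are given in the example.
services = {"debit_authorization": (100.0, 2.0)}  # (capacity in tps, visits per payment)
portal_capacity = max_system_throughput(services)
print(f"Portal B capacity: {portal_capacity:.0f} tps")
```

If capacities for the other component services were known, adding them to the dictionary would show which one limits the portal.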

Web server elements

HTML and XML

Most documents on the Web are written using HTML, a “markup language”
Most consist of text and inline images
Can also include other multimedia objects
Generates multiple requests, one for the document and one for each inline image: a single click by the user may generate a series of requests
XML uses tags and attributes to define/delimit data
Application must interpret the meaning of the tags

Hardware and Operating System

Hardware view: performance is a function of:
    Number and speed of processors
    Amount of main memory
    Bandwidth and storage capacity of disk subsystem
    Bandwidth of the NIC
OS considerations:
    Performance, scalability, reliability, robustness

Content

Performance affected by:
    Content size
    Content structure
    Hyperlinks
    Popularity of content

Perception of Performance

User view:
    Fast response time; no connections refused
Management view:
    High throughput; high availability

Need to have quantitative measurements that describe behavior of Web service

Metrics

Two most important:
    Response time, in seconds
    Throughput, in http_ops/sec or bits/sec

Other metrics

Hit
    Any connection to a web site, including in-line requests and errors
    Difficult to compare across sites
Visit
    Series of page requests by a user at a single site
    Inter-request times < timeout_value
Session
    Series of consecutive and related requests made during a single visit
    Inter-request times < timeout_value
User-perceived response time
    Set of geographically distributed agents poll the WS
Error rate
    Increase indicates degrading performance
    Example: overflow of pending connection queue
For streaming services:
    Jitter
    Startup latency

Most common measurements of Web service performance

End-to-end response time
Site response time
Throughput (req/sec)
Throughput (Mbps)
Errors/sec
Visitors/day
Unique visitors/day

Example - Travel Agency

Monitor for 30 minutes: 9000 HTTP requests
Three types of objects delivered:
    HTML pages (30%, avg. size 11,200 bytes)
    Images (65%, avg. size 17,200 bytes)
    Video clips (5%, avg. size 439,000 bytes)
What is the throughput? 9000 requests / 1800 sec = 5 req/sec

What is the throughput in Kbps (taking 1 Kbit = 1024 bits)?
Xr = (total_req * class% * avg. size * 8) / (time * 1024)
Xhtml = (9000 * 0.30 * 11,200 * 8) / (1800 * 1024) = 131.25 Kbps
Ximage = (9000 * 0.65 * 17,200 * 8) / (1800 * 1024) = 436.72 Kbps
Xvideo = (9000 * 0.05 * 439,000 * 8) / (1800 * 1024) = 857.42 Kbps
X0 = 131.25 + 436.72 + 857.42 = 1425.39 Kbps

To support the Web traffic, the network connection should be at least a T1 line (1.544 Mbit/s).

http://en.wikipedia.org/wiki/Mbit/s
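
The per-class throughput figures for the travel agency can be reproduced with a short Python sketch. It assumes 1 Kbit = 1024 bits, which is the convention that yields the numbers in the example; the function and dictionary names are illustrative only.

```python
def class_throughput_kbps(total_requests, fraction, avg_size_bytes, seconds):
    """Throughput contributed by one object class, in Kbps (1 Kbit = 1024 bits)."""
    bits_sent = total_requests * fraction * avg_size_bytes * 8
    return bits_sent / seconds / 1024


# (share of requests, average size in bytes) for each class in the example
classes = {
    "html": (0.30, 11_200),
    "image": (0.65, 17_200),
    "video": (0.05, 439_000),
}

per_class = {
    name: class_throughput_kbps(9000, share, size, 1800)
    for name, (share, size) in classes.items()
}
total_kbps = sum(per_class.values())

for name, kbps in per_class.items():
    print(f"X_{name} = {kbps:.2f} Kbps")
print(f"X0 = {total_kbps:.2f} Kbps")
```

Although video clips are only 5% of requests, they account for more than half of the bits sent.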

QoS indicators for Web Services

Response time
Availability
    Percentage of time a service is ‘live’ (serving customer requests)
Reliability
    Probability that the WS will perform in a satisfactory manner for a given period of time under specified operating and load conditions
Predictability
Cost

Input data needed to monitor QoS

Traffic
Performance
Usage patterns

Knowledge of average and peak load

Where are the delays?

Four categories:
    DNS lookup phase
    TCP connection set-up phase
    Server execution time
    Network time

DNS lookup phase
    Browser converts the server name in the URL into an IP address to establish the TCP connection
    If the server name can’t be resolved by the local cache, a query is sent to a higher-level DNS server
    For leading e-commerce sites, avg. lookup times are between 0.01 and 0.11 sec.
    Fastest sites achieve 0.001 sec.

Anatomy of a Web Transaction

Browser
Network
Server

Anatomy of a Web Transaction: the Browser

User clicks on hyperlink; requests document
Client (browser) checks local cache for document
In case of a hit:
    Browser returns the document from the cache
    User response time: R’Browser,hit
In case of a miss:
    Browser asks DNS to map server hostname to IP address
    Client opens a TCP connection to the server defined by the URL of the link
    Client sends an HTTP request to the server
    Browser formats and displays document and renders images
    Returned document is stored in browser cache
    User response time: R’Browser,miss

Anatomy of a Web Transaction: the Network

Imposes delays in delivering info from client to server (R’N1) and from server to client (R’N2).
Delays are a function of the components on the path between them:
    Modems, routers, comm links, bridges, relays
R’Network = total time the HTTP request spends in the network = R’N1 + R’N2

Anatomy of a Web transaction: the Server

Request arrives from client
Server parses the request according to HTTP
Server executes the requested method (GET, HEAD, etc.)
If GET, server looks up the file in its document tree by using the file system; file may be in cache or on disk
Server reads contents of file from disk or cache and writes them to the network port
When the file send is complete, server closes the connection (if non-persistent HTTP)
R’Server = time spent in execution of the HTTP request
    Includes service time and waiting time at the server

Anatomy of a Web transaction

If document not found in client’s cache:
    Response time is the sum of residence times at all resources
    Rmiss = R’Browser,miss + R’Network + R’Server
If a hit:
    Rhit = R’Browser,hit
Typically: Rhit << Rmiss
Average response time, R, over NT requests, where pc is the fraction of requests served from the cache:
    R = pc * Rhit + (1 - pc) * Rmiss

Example

User wants to analyze the impact of the browser’s local cache size on the Web response time perceived by the user
20% of requests are serviced by the local cache, with R = 400 msec
R for remotely serviced requests = 3 sec
Previous experiments indicate that tripling the cache size results in a hit rate of 45%
R_orig = 0.20 * 0.4 + 0.80 * 3.0 = 2.48 sec
R_new = 0.45 * 0.4 + 0.55 * 3.0 = 1.83 sec
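
The weighted average used in this example is easy to check in code. The sketch below assumes times are in seconds and uses an illustrative function name; the hit ratios and response times are those given in the example.

```python
def average_response_time(hit_ratio, r_hit, r_miss):
    """Mean response time when a fraction hit_ratio of requests is served from the cache."""
    return hit_ratio * r_hit + (1 - hit_ratio) * r_miss


r_orig = average_response_time(0.20, 0.4, 3.0)  # current cache size
r_new = average_response_time(0.45, 0.4, 3.0)   # cache three times larger
print(f"R_orig = {r_orig:.2f} sec, R_new = {r_new:.2f} sec")
print(f"Reduction: {100 * (r_orig - r_new) / r_orig:.0f}%")
```

Because a miss costs far more than a hit, the average is dominated by the miss term, so raising the hit ratio pays off directly.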

Bottlenecks

Bottleneck = the component that limits system performance

Need to identify the bottleneck to improve performance

Example

Home user
Takes too long to download a medium-size page (avg. size 20 KB)
Considering upgrading to a computer with a 2X faster CPU
How will this affect response time?

Example, continued

Assume:
    R’Network = 7.5 sec
    R’Server = 3.6 sec
    R’Browser,miss = 0.3 sec
R = R’Network + R’Server + R’Browser,miss
R = 7.5 sec + 3.6 sec + 0.3 sec = 11.4 sec
With a 2X faster CPU, R’Browser,miss drops to 0.15 sec:
Rnew = 7.5 + 3.6 + 0.15 = 11.25 sec
Not much difference: the CPU is not the bottleneck
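
A small Python sketch makes the bottleneck argument concrete. It assumes, as the example does, that a faster CPU only shortens the browser component; the function name and the dictionary keys are illustrative.

```python
def total_response_time(components, speedups=None):
    """Sum residence times, dividing each component's time by its speedup factor."""
    speedups = speedups or {}
    return sum(t / speedups.get(name, 1.0) for name, t in components.items())


# Residence times in seconds, from the example
components = {"network": 7.5, "server": 3.6, "browser": 0.3}

r_before = total_response_time(components)
r_after = total_response_time(components, {"browser": 2.0})  # 2X faster client CPU
bottleneck = max(components, key=components.get)

print(f"R = {r_before:.2f} sec, Rnew = {r_after:.2f} sec")
print(f"Largest component: {bottleneck}")
```

Trying `{"network": 2.0}` instead shows how much more a faster network connection would help.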

Example

Pharma co. plans an intranet for training and display of images of molecules
Training sessions have 100 people
Assume 80% active at any one time
Each user performs avg. of 100 ops/hour
Each op requests avg. of 5 images
Avg. size of requested image is 25600 bytes
What is the minimum bandwidth of the network connection to the image server?

Example, continued

100 * 0.80 * 100 ops/hour * 5 images/op * 25600 bytes/image * 8 bits/byte * 1 hr/3600 sec
(100 * 0.80 * 100 * 5 * 25600 * 8) / 3600 = 2.28 Mbps
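
The same sizing calculation as a Python sketch, with every factor of the product named. The function name is an assumption for illustration, and Mbps here means 10^6 bits per second.

```python
def required_bandwidth_bps(users, active_fraction, ops_per_hour, objects_per_op, object_bytes):
    """Average bandwidth in bits/sec needed to carry the described workload."""
    bits_per_hour = users * active_fraction * ops_per_hour * objects_per_op * object_bytes * 8
    return bits_per_hour / 3600  # 3600 seconds per hour


bw_bps = required_bandwidth_bps(
    users=100, active_fraction=0.80, ops_per_hour=100, objects_per_op=5, object_bytes=25_600
)
print(f"Minimum bandwidth: {bw_bps / 1e6:.2f} Mbps")
```

This is an average; peak demand during a session could be higher, so the figure is a lower bound on the link speed.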

Web Infrastructure

Three major delay sources:
    “last mile”
        Link between end user and phone company switch, or DSL or cable connection to service provider
    ISPs
        Recently, more bandwidth added
        Improvements via caching, load balancing, more servers
    ‘backbone’ of network
        Collection of interconnected network providers
        Connect to each other to exchange traffic (peering)
        Public peering: at major interconnection points (NAPs, network access points; MAEs, metropolitan area exchanges)
        Delays may occur at peering points

Basic Components

Servers
Browsers
Firewalls
    Protect data, programs, and computers on a private network from the uncontrolled activities of untrusted users and software on other computers
    Screen network traffic going through them, using software, network hardware, and computers
    Potential performance bottleneck

Proxy, Cache, Mirror

Techniques for improving web performance and security
Try to reduce:
    Access time to web documents
    Network bandwidth required for document transfers
    Demand on servers with very popular documents

Proxy server

Special type of web server that acts as an agent: server to the client, client to the server
Accepts requests from clients, forwards them to web servers
Receives responses from remote servers, forwards them back to the client
Originally designed to provide web access for users on private networks who had to go through a firewall
Can be configured to cache relayed responses
Benefits:
    Improves access speed by bringing data closer to consumer
    Cuts down on network traffic
    Reduces server load
    Increases availability in the web
Problems:
    Ensuring that cached docs are up-to-date
    What’s worth caching? For how long?

Caching

Used in the Web:
    Client-side, at the browser
    In the network, a caching proxy
Evaluating caching effectiveness:
    Hit ratio = requests_satisfied / total_requests
    Byte hit ratio = hit ratio weighted by doc size
    Data transferred = bytes transferred / time
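
To make the difference between hit ratio and byte hit ratio concrete, here is a minimal Python sketch. The request log is invented for illustration (it is not data from the lecture), and the byte hit ratio is taken to mean bytes served from the cache divided by total bytes requested.

```python
def cache_metrics(requests):
    """Return (hit ratio, byte hit ratio) for a list of (size_in_bytes, was_hit) records."""
    total_requests = len(requests)
    total_bytes = sum(size for size, _ in requests)
    hit_requests = sum(1 for _, was_hit in requests if was_hit)
    hit_bytes = sum(size for size, was_hit in requests if was_hit)
    return hit_requests / total_requests, hit_bytes / total_bytes


# Hypothetical log: small documents hit often, large ones less so.
log = [(4800, True), (4800, True), (32500, False), (4800, True), (32500, True)]
hit_ratio, byte_hit_ratio = cache_metrics(log)
print(f"hit ratio = {hit_ratio:.2f}, byte hit ratio = {byte_hit_ratio:.2f}")
```

A cache can score well on hit ratio while saving little bandwidth if its hits are mostly small documents, which is why both metrics are tracked.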

Example

Manager wants to install a caching proxy server on a corporate intranet with > 2000 users
Use for 6 months, then evaluate
Consider two cases:
    Cache holds small documents, avg. size 4800 bytes, hit ratio 60%
    Cache holds medium documents, avg. size 32500 bytes, hit ratio 20%
Monitor for one hour, observe 28800 requests

Cache efficiency

Saved_BW = (num_req * hit_ratio * avg_size * 8) / time
Saved_BW_small = (28800 * 0.60 * 4800 * 8) / 3600 sec = 184 Kbps
Saved_BW_med = (28800 * 0.20 * 32500 * 8) / 3600 sec = 416 Kbps
Holding larger documents can save more BW
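
The two cases can be compared with a short Python sketch. It assumes 1 Kbit = 1000 bits, which matches the 184 and 416 Kbps figures in this example; the function name is illustrative.

```python
def saved_bandwidth_kbps(num_requests, hit_ratio, avg_size_bytes, seconds):
    """Bandwidth no longer drawn from origin servers because the cache answered."""
    bits_from_cache = num_requests * hit_ratio * avg_size_bytes * 8
    return bits_from_cache / seconds / 1000


saved_small = saved_bandwidth_kbps(28_800, 0.60, 4_800, 3600)
saved_medium = saved_bandwidth_kbps(28_800, 0.20, 32_500, 3600)
print(f"small documents:  {saved_small:.0f} Kbps saved")
print(f"medium documents: {saved_medium:.0f} Kbps saved")
```

The lower hit ratio of the medium-document cache is more than offset by the larger size of each hit.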

Mirroring

Replicating site content at other servers
Requires:
    Regular updates
    DNS to direct browsers to secondary sites when primary is busy
Goals:
    Increase availability
    Balance server load
    Thus increasing quality of service

Example

Manufacturing co., employee portal, too slow for European users
Idea: install mirror site in Paris
What are the bandwidth savings?

Example: Mirror site in Paris

Current avg. BW is 35 Mbps
40% of load from Europe
42% of traffic could be served from a cache
Cacheable amount: 35 * 0.42 = 14.7 Mbps
Estimate cache hit ratio at 38%
Saved_BW = 14.7 Mbps * 0.38 = 5.6 Mbps
40% of traffic from Europe, so: 5.6 * 0.40 = 2.24 Mbps could be served from the cache in Paris
6.4% savings on current BW usage at server (2.24 / 35)
Improvement in perceived response time for European users
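
The chain of percentages in the Paris example can be followed step by step in Python. This sketch uses unrounded intermediate values, so its results differ slightly from the figures in the example, which round 5.586 up to 5.6 before continuing. Variable names are illustrative.

```python
current_bw_mbps = 35.0
cacheable_share = 0.42   # fraction of traffic that could be cached
cache_hit_ratio = 0.38   # estimated hit ratio
europe_share = 0.40      # fraction of load coming from Europe

cacheable_mbps = current_bw_mbps * cacheable_share
hit_mbps = cacheable_mbps * cache_hit_ratio
paris_mbps = hit_mbps * europe_share
savings_pct = 100 * paris_mbps / current_bw_mbps

print(f"cacheable traffic:    {cacheable_mbps:.1f} Mbps")
print(f"served from a cache:  {hit_mbps:.1f} Mbps")
print(f"served from Paris:    {paris_mbps:.2f} Mbps")
print(f"savings at the server: {savings_pct:.1f}%")
```

The savings at the primary server are modest; the main benefit is the shorter path for European users.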

Content Delivery Networks (CDN)

Cache or replicate content as needed to meet demands from clients over the Web
Coordinated caching systems implemented through proprietary networks and data centers
Employ a DNS-redirecting mechanism
    Tries to assign the best location from which to serve the requested content

Content Delivery Networks (CDN), continued

DNS-redirecting mechanism:
    Client requests URL; browser generates a DNS request for the IP address corresponding to the domain name in the URL
    CDN controls the DNS service for this domain name
    CDN answers DNS requests with the IP address of a selected server rather than the IP address of the original server
    Uses a routing function to select the “best” server: client location, id of requested content, load of CDN network and servers, and proximity of CDN servers to client are all considered
CDN should provide:
    Scalability, high availability, manageability, performance

The WAP Infrastructure

WAP = Wireless Application Protocol
    Architecture + set of protocols for wireless devices to access Web services at regular Web sites
Wireless device communicates with WAP gateway over wireless network
WAP gateway communicates with servers

Docs for wireless devices are written in a form of XML known as WML (Wireless Markup Language)
    Can also use WMLScript
WML docs:
    Structured as a set of “cards”, units of user interaction
    Deck = set of cards
    Users navigate between cards

WML decks + WMLScripts are stored in regular web servers on the Internet
Retrieved by WAP gateway via HTTP
Web server response is binary encoded by WAP gateway and sent to wireless device via lightweight protocols
    Designed to minimize BW requirements

WAP protocol stack

Server Architectures

Web Server
Application Server
Transaction and Database Server
Streaming Server
Multi-tier Architecture

Web Server

Listens for HTTP requests
Establishes requested connection
Sends requested file
Returns to listening mode
Can handle more than one request at a time:
    Fork a copy of the HTTP process for each request
    Multi-threaded HTTP program
    Pool of running processes

Dynamic content

Can use client-side or server-side programs
Can improve performance by pushing to client-side

Application Server

Software that handles all application operations between browser-based customers and back-end databases
    Receives client request
    Executes business logic, interacting with transaction and/or DB servers
Can be implemented in many ways:
    CGI scripts, FastCGI, server applications, server-side scripts

Transaction and Database Server

Transaction Processing (TP) monitor provides:
    An application programming interface
    A set of program development tools
    A system to monitor and control execution of transaction programs
DB server:
    Executes and monitors transaction processing applications

Streaming Server

Initially, audio and video were “download and play” technologies
Streaming media begins to play “almost” immediately
    Client request arrives
    Server retrieves video and audio data and begins to deliver them over the network
    Video and audio are compressed (MPEG, MP3)
    Typically have control part and data part

Example

Company plans to offer multimedia (MM) online training
Employee retrieves lecture of video, audio, slides; 30 minute duration
What is the number of streaming servers needed to serve the lecture presentation during the busiest period of the day, 4-5 pm?

Example, continued

400 employees at peak
One MM server can stream presentations to 150 viewers simultaneously
What is the average number of simultaneous viewers during the peak period?
Use Little’s Law: N = λ * R
    λ = req/time = 400 viewers / 60 min
    R = 30 min
    N = 30 * 400/60 = 200
200 viewers / 150 viewers per server = 1.33, so need two MM servers
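
The server count follows from Little's Law and a round-up. The Python sketch below assumes, as the example does, that the 400 requests are spread evenly over the peak hour; the function name is illustrative.

```python
import math


def streaming_servers_needed(arrivals, period_min, duration_min, viewers_per_server):
    """Apply Little's Law (N = arrival rate * time in system) and round up to whole servers."""
    avg_simultaneous = arrivals * duration_min / period_min
    return avg_simultaneous, math.ceil(avg_simultaneous / viewers_per_server)


viewers, servers = streaming_servers_needed(
    arrivals=400, period_min=60, duration_min=30, viewers_per_server=150
)
print(f"Average simultaneous viewers: {viewers:.0f}")
print(f"Streaming servers needed: {servers}")
```

Since N is an average, requests that cluster at the start of the hour could push the instantaneous count above 200.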

Multi-tier Architecture

Web-based apps usually use a 3-tier architecture:
    Presentation layer
        User interface (browser & HTML, XML, etc.)
    Application layer
        Business logic
        Collection of rules to implement application logic
        May also contain Java applets, ActiveX controls, etc.
    Data service layer
        Persistent data

Example

Application layer designed to support 400 simultaneous processes
App process:
    Receives client request
    Executes app logic, interacting with DB server
Monitoring shows:
    App process executes for 150 msec between DB requests
    DB server handles 550 req/sec
    400 app processes running during peak period

What if ...?

The application servers are replaced by new servers with 2X speed
Each application server is characterized by Z, the “think time”: time between receiving a reply from the DB server and submitting a new DB request
DB layer is characterized by throughput, X, in req/sec
Interactive Response Time Law: R = N/X - Z

What if ...?, continued

DB response time:
    R = 400/550 - 0.15 = 0.577 sec = 577 msec
After CPU upgrade, app processing time should be 75 msec
DB response time now:
    Rnew = 400/550 - 0.075 = 0.652 sec = 652 msec
Improvement in app layer may not lead to improvement overall
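
The effect of the upgrade can be checked with the relation R = N/X - Z. The Python sketch below assumes, as the calculation above does, that DB throughput stays at 550 req/sec after the upgrade; the function name is illustrative.

```python
def db_response_time(n_processes, db_throughput, think_time):
    """R = N/X - Z, with think_time and the result in seconds."""
    return n_processes / db_throughput - think_time


r_before = db_response_time(400, 550, 0.150)  # app processing 150 msec
r_after = db_response_time(400, 550, 0.075)   # app processing 75 msec after 2X upgrade

cycle_before = r_before + 0.150  # DB response time plus app time per DB request
cycle_after = r_after + 0.075

print(f"DB response time: {r_before * 1000:.0f} msec -> {r_after * 1000:.0f} msec")
print(f"Time per cycle:   {cycle_before * 1000:.0f} msec -> {cycle_after * 1000:.0f} msec")
```

With the DB throughput held fixed, the time per cycle is N/X in both cases: the processing time saved in the application layer is spent waiting on the database instead.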

Dynamic Load Balancing

Heavy traffic load adversely impacting performance
    Add more servers
    Buy bigger (faster) servers
    Need to do cost-performance analysis

Dynamic Load Balancing, continued

Web cluster:
    Multiple web servers
    Single location addressed by one URL and a single virtual IP address
    Incoming requests routed among servers in user-transparent way
    Switch acts as dispatcher, mapping virtual IP address to actual address
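
The dispatcher role can be sketched as a small Python class. This is a toy model only: the round-robin selection policy is an assumption (the slides do not say how the switch picks a server, and a dynamic balancer would typically also consider server load), and all addresses and names are hypothetical.

```python
from itertools import cycle


class Dispatcher:
    """Maps one virtual IP address onto a pool of real server addresses."""

    def __init__(self, virtual_ip, real_servers):
        if not real_servers:
            raise ValueError("need at least one real server")
        self.virtual_ip = virtual_ip
        self._rotation = cycle(real_servers)  # round-robin order (an assumption)

    def route(self, destination_ip):
        """Return the real server that should handle a request sent to destination_ip."""
        if destination_ip != self.virtual_ip:
            raise ValueError(f"{destination_ip} is not served by this cluster")
        return next(self._rotation)


# Hypothetical addresses, for illustration only.
dispatcher = Dispatcher("203.0.113.10", ["10.0.0.1", "10.0.0.2", "10.0.0.3"])
chosen = [dispatcher.route("203.0.113.10") for _ in range(4)]
print(chosen)
```

Clients only ever see the virtual address, so servers can be added to or removed from the pool without any change on the client side.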

Web cluster

Networks

Bandwidth
    Measures the rate at which data can be sent through the network
    Usually expressed in bps
Latency
    Time needed for a bit (or small packet) to travel across the network

Bandwidth for different types of networks

Planning

Streaming service offers training videos
Training session -> 15 min video at 300 Kbps
What is the impact if videos go to 25 min?
Service supports 35 simultaneous sessions
Average BW needed (now):
    35 * 300 Kbps = 10.5 Mbps
Average number of simultaneous sessions (now):
    N = 35
    N = λ * R
    35 = λ * 15
    λ = 35/15 sessions/min; assume this remains the same
Average number of simultaneous sessions (new):
    Nnew = λ * 25 = 35/15 * 25 = 58.33
Average BW needed (new):
    58.33 * 300 Kbps = 17.5 Mbps
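
The planning steps can be written as a short Python sketch. It assumes the session arrival rate stays the same when videos get longer, as the example states, and takes 1 Mbps = 1000 Kbps; variable names are illustrative.

```python
sessions_now = 35        # average simultaneous sessions today
video_min_now = 15       # current video length, minutes
video_min_new = 25       # proposed video length, minutes
stream_kbps = 300        # bit rate of one stream

# Little's Law, N = arrival_rate * R, solved for the arrival rate
arrival_rate = sessions_now / video_min_now  # sessions started per minute

sessions_new = arrival_rate * video_min_new
bw_now_mbps = sessions_now * stream_kbps / 1000
bw_new_mbps = sessions_new * stream_kbps / 1000

print(f"simultaneous sessions: {sessions_now} -> {sessions_new:.2f}")
print(f"average bandwidth:     {bw_now_mbps:.1f} Mbps -> {bw_new_mbps:.1f} Mbps")
```

Both the session count and the bandwidth scale by 25/15, the ratio of the video lengths.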

Example

Training videos, avg. size 950 MB
100 students, 80% active at one time
Each user requests 2 clips/hour
BW needed to support:
    (0.80 * 100) * 2 * (8 * 950 Mbit) / 3600 sec = 337.8 Mbps
Need a 622 Mbps ATM network to support
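
A Python sketch of the same estimate, assuming 1 MB = 8 Mbit (decimal units) as the calculation above does. The function name is illustrative.

```python
def clip_bandwidth_mbps(students, active_fraction, clips_per_hour, clip_megabytes):
    """Average bandwidth in Mbps for active users each downloading clips at the given rate."""
    megabits_per_hour = students * active_fraction * clips_per_hour * clip_megabytes * 8
    return megabits_per_hour / 3600


needed_mbps = clip_bandwidth_mbps(
    students=100, active_fraction=0.80, clips_per_hour=2, clip_megabytes=950
)
print(f"Bandwidth needed: {needed_mbps:.1f} Mbps")
```

The result is roughly 338 Mbps, which is why a link in the 622 Mbps class is called for in the example.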
