Performance Issues of Web Services
CSCI 8710, November 29-30, 2006
Kraemer
Web Services
  Services available via the Internet that complete tasks or conduct transactions.
  Self-contained, modular applications that can be described, published, and invoked over the Internet.
  Can be automatically invoked by application programs.
Web Services
  May be invoked at one site or may combine results of several services executed at different sites.
Performance concerns differ from standard client/server (C/S)
  May involve both web service processing and network delays
  May be accessed by a wide variety of devices -- desktop computers, PDAs, mobile phones, other servers
  Access via wireless communication networks: dynamic connectivity, low bandwidth, high latency
Performance concerns differ from standard C/S
  Unpredictable nature of requests
    Highly bursty
    Varies with geographical location of clients, day of week, time of day
  Highly variable size of requested objects
  “Robot” access
    Autonomous software agents that can consume significant amounts of system resources
Types of servers providing Web Services
  Web servers
  Transaction servers
  Proxy servers
  Cache servers
  Wireless gateway servers
  Mirror servers
Common problems
  Insufficient bandwidth at peak times
  Overloaded servers
  Uneven server loads
  Delivery of dynamic content
  Shortage of connections between application servers and database servers
  Failure of third-party servers
  Delivery of multimedia content
Example: Bill Paying Service
  Portal offers bill paying service
  Customers can pay a variety of bills through the service
  Uses services provided by others:
    Debit authorization (100 tps capability)
    Electronic funds transfer
    Customer authentication
Example: Bill Paying Service
  Portal B is the bill paying service
  Treat the overall web service as the ‘system’
  Treat the component services as ‘devices’
  What is the capacity of B, given that the debit authorization service can support 100 tps and that each payment transaction requires 2 visits to the debit authorization service?
  Forced Flow Law: Xi = Vi * X0, where Xi is the throughput of device i, Vi is the number of visits to device i per transaction, and X0 is the system throughput
    100 = 2 * X0
    X0 = 50 tps
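The capacity calculation above can be sketched in a few lines of Python. The function and variable names are illustrative assumptions, not part of the slides; the only inputs are the 100 tps device capacity and the 2 visits per transaction stated in the example.

```python
def system_capacity(device_capacity_tps: float, visits_per_transaction: float) -> float:
    """Upper bound on system throughput X0 imposed by one device.

    Rearranges the relation Xi = Vi * X0 used in the example to X0 = Xi / Vi.
    """
    if visits_per_transaction <= 0:
        raise ValueError("visits_per_transaction must be positive")
    return device_capacity_tps / visits_per_transaction


# Debit authorization: 100 tps capacity, visited twice per payment transaction.
bill_pay_capacity = system_capacity(device_capacity_tps=100, visits_per_transaction=2)
print(f"Capacity of portal B: {bill_pay_capacity} tps")
```

If several component services were characterized this way, the smallest of their individual bounds would limit the portal.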
Web server elements
HTML and XML
  Most documents on the Web are written in HTML, a “markup language”
  Most consist of text and inline images
  Can also include other multimedia objects
  Generates multiple requests: one for the document and one for each inline image -- a single click by the user may generate a series of requests
  XML uses tags and attributes to define/delimit data
  Application must interpret the meaning of the tags
Hardware and Operating System
  Hardware view: performance is a function of:
    Number and speed of processors
    Amount of main memory
    Bandwidth and storage capacity of disk subsystem
    Bandwidth of the NIC
  OS considerations:
    Performance, scalability, reliability, robustness
Content
  Performance affected by:
    Content size
    Content structure
    Hyperlinks
    Popularity of content
Perception of Performance
  User view:
    Fast response time; no connections refused
  Management view:
    High throughput; high availability
  Need quantitative measurements that describe the behavior of the Web service
Metrics
  Two most important:
    Response time -- seconds
    Throughput -- http_ops/sec, also bits/sec
Other metrics
  Hit
    Any connection to a web site, including in-line requests and errors
    Difficult to compare across sites
  Visit
    Series of page requests by a user at a single site
    Inter-request times < timeout_value
  Session
    Series of consecutive and related requests made during a single visit
    Inter-request times < timeout_value
Other metrics
  User-perceived response time
    Set of geographically distributed agents poll the WS
  Error rate
    Increase indicates degrading performance
    Examples: overflow of pending connection queue
  For streaming services:
    Jitter
    Startup latency
Most common measurements of Web service performance
  End-to-end response time
  Site response time
  Throughput (req/sec)
  Throughput (Mbps)
  Errors/sec
  Visitors/day
  Unique visitors/day
Example - Travel Agency
  Monitor for 30 minutes:
    9000 HTTP requests
    Three types of objects delivered:
      HTML pages (30%, avg. size 11,200 bytes)
      Images (65%, avg. size 17,200 bytes)
      Video clips (5%, avg. size 439,000 bytes)
  What is the throughput?
    9000 requests/1800 sec = 5 req/sec
  What is the throughput in Kbps?
    Xr = (total_req * class% * avg. size * 8)/(time * 1024), with avg. size in bytes and 1 Kb = 1024 bits
    Xhtml = (9000 * 0.30 * 11,200 * 8)/(1800 * 1024) = 131.25 Kbps
    Ximage = (9000 * 0.65 * 17,200 * 8)/(1800 * 1024) = 436.72 Kbps
    Xvideo = (9000 * 0.05 * 439,000 * 8)/(1800 * 1024) = 857.42 Kbps
    X0 = 131.25 + 436.72 + 857.42 = 1425.39 Kbps
  To support the Web traffic, the network connection should be at least a T1 line (1.544 Mbit/s).
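As a cross-check of the travel agency numbers, the per-class throughput can be computed as follows. This is a minimal sketch: the names are illustrative, and it assumes 1 Kbit = 1024 bits, which is the convention that reproduces the figures on the slide.

```python
MONITOR_SECONDS = 30 * 60      # 30-minute monitoring window
TOTAL_REQUESTS = 9000
BITS_PER_KBIT = 1024           # assumption: matches the slide's results

# (fraction of requests, average object size in bytes) per object class
object_classes = {
    "html": (0.30, 11_200),
    "image": (0.65, 17_200),
    "video": (0.05, 439_000),
}


def class_throughput_kbps(fraction: float, avg_bytes: int) -> float:
    """Throughput contributed by one object class, in Kbps."""
    bits = TOTAL_REQUESTS * fraction * avg_bytes * 8
    return bits / MONITOR_SECONDS / BITS_PER_KBIT


requests_per_sec = TOTAL_REQUESTS / MONITOR_SECONDS
per_class_kbps = {
    name: class_throughput_kbps(fraction, size)
    for name, (fraction, size) in object_classes.items()
}
total_kbps = sum(per_class_kbps.values())

for name, kbps in per_class_kbps.items():
    print(f"{name}: {kbps:.2f} Kbps")
print(f"total: {total_kbps:.2f} Kbps at {requests_per_sec:.0f} req/sec")
```

Note that video clips are only 5% of the requests but account for more than half of the bandwidth.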
QoS indicators for Web Services
  Response time
  Availability
    Percentage of time a service is ‘live’ (serving customer requests)
  Reliability
    Probability that the WS will perform in a satisfactory manner for a given period of time under specified operating and load conditions
  Predictability
  Cost
Input data needed to monitor QoS
  Traffic
  Performance
  Usage patterns
    Knowledge of average and peak load
Where are the delays?
  Four categories:
    DNS lookup phase
    TCP connection set-up phase
    Server execution time
    Network time
  DNS lookup phase
    Browser converts the server name in the URL into an IP address to establish the TCP connection
    If the server name can’t be resolved from the local cache, a query is sent to a higher-level DNS server
    For leading e-commerce sites, avg. lookup times are between 0.01 and 0.11 sec.
    Fastest sites achieve 0.001 sec.
Anatomy of a Web Transaction
  Browser
  Network
  Server
Anatomy of a Web Transaction: the Browser
  User clicks on a hyperlink; requests document
  Client (browser) checks local cache for the document; in case of a hit: returns the document; user response time is R’Browser,hit
  In case of a miss:
    Browser asks DNS to map server hostname to IP address
    Client opens a TCP connection to the server defined by the URL of the link
    Client sends an HTTP request to the server
    Browser formats and displays the document and renders images
    Returned document is stored in the browser cache
    User response time: R’Browser,miss
Anatomy of a Web Transaction: the Network
  Imposes delays in delivering info from client to server (R’N1) and from server to client (R’N2).
  Delays are a function of the components on the path between them:
    Modems, routers, comm links, bridges, relays
  R’Network = total time the HTTP request spends in the network = R’N1 + R’N2
Anatomy of a Web Transaction: the Server
  Request arrives from client
  Server parses the request according to HTTP
  Server executes requested method (GET, HEAD, etc.)
  If GET, server looks up the file in its document tree by using the file system; file may be in cache or on disk
  Server reads contents of the file from disk or cache and writes them to the network port
  When the file send is complete, close the connection (if non-persistent HTTP)
  R’Server = time spent in execution of the HTTP request; includes service time and waiting time at the server
Anatomy of a Web Transaction
  If document not found in client’s cache:
    Response time is the sum of the residence times at all resources
    Rmiss = R’Browser,miss + R’Network + R’Server
  If a hit:
    Rhit = R’Browser,hit
  Typically: Rhit << Rmiss
  Average response time, R, over NT requests, where pc is the fraction of requests served from the local cache:
    R = pc * Rhit + (1 - pc) * Rmiss
Example
  User wants to analyze the impact of the browser's local cache size on the Web response time perceived by the user
  20% of requests serviced by local cache, with R = 400 msec
  R for remotely serviced requests = 3 sec
  Previous experiments indicate that tripling the cache size results in a hit rate of 45%
  R_orig = 0.20 * 0.4 + 0.80 * 3.0 = 2.48 sec
  R_new = 0.45 * 0.4 + 0.55 * 3.0 = 1.83 sec
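The weighted average used in this example is easy to reproduce. The sketch below uses illustrative names and takes the hit probabilities (20% and 45%) and the response times (0.4 sec and 3 sec) from the example.

```python
def average_response_time(p_hit: float, r_hit: float, r_miss: float) -> float:
    """Average response time when a fraction p_hit of requests is served from cache."""
    if not 0.0 <= p_hit <= 1.0:
        raise ValueError("p_hit must be between 0 and 1")
    return p_hit * r_hit + (1.0 - p_hit) * r_miss


R_HIT_SEC = 0.4    # served from the browser cache
R_MISS_SEC = 3.0   # served remotely

r_orig = average_response_time(0.20, R_HIT_SEC, R_MISS_SEC)
r_new = average_response_time(0.45, R_HIT_SEC, R_MISS_SEC)
print(f"original cache: {r_orig:.2f} sec, 3x cache: {r_new:.2f} sec")
```

Varying `p_hit` in a loop is a quick way to see how much hit rate would be needed to reach a target response time.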
Bottlenecks
  Bottleneck = the component that limits system performance
  Need to identify the bottleneck to improve performance
Example
  Home user
  Takes too long to download a medium-size page (avg. size 20 KB)
  Considering upgrading to a machine with a 2X faster CPU
  How will this affect response time?
Example, continued
  Assume:
    R’network = 7.5 sec
    R’server = 3.6 sec
    R’Browser,miss = 0.3 sec
  R = R’network + R’server + R’Browser,miss
  R = 7.5 sec + 3.6 sec + 0.3 sec = 11.4 sec
  With a 2X faster CPU, R’Browser,miss drops to 0.15 sec:
  Rnew = 7.5 + 3.6 + 0.15 = 11.25 sec
  Not much difference ... the CPU is not the bottleneck
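A small what-if script makes the same point for any component. This is a sketch under a simplifying assumption: the component with the largest residence time is treated as the bottleneck, and speeding a component up by a factor divides its residence time by that factor. The names are illustrative.

```python
# Residence times from the example, in seconds
components = {"network": 7.5, "server": 3.6, "browser_miss": 0.3}


def response_time(parts: dict) -> float:
    """Total response time as the sum of residence times."""
    return sum(parts.values())


def with_speedup(parts: dict, name: str, factor: float) -> dict:
    """Return a copy of parts in which one component is `factor` times faster."""
    upgraded = dict(parts)
    upgraded[name] = parts[name] / factor
    return upgraded


r_before = response_time(components)
r_after = response_time(with_speedup(components, "browser_miss", 2.0))
improvement_pct = 100.0 * (r_before - r_after) / r_before
largest_component = max(components, key=components.get)

print(f"before: {r_before:.2f} sec, after CPU upgrade: {r_after:.2f} sec")
print(f"improvement: {improvement_pct:.1f}%, largest component: {largest_component}")
```

Applying the same 2X factor to `"network"` instead shows where an upgrade would actually pay off.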
Example
  Pharma co. plans an intranet for training and display of images of molecules
  Training sessions have 100 people
  Assume 80% active at any one time
  Each user performs avg. of 100 ops/hour
  Each op requests avg. of 5 images
  Avg. size of requested image is 25600 bytes
  What is the minimum bandwidth of the network connection to the image server?
Example, continued
  100 * 0.80 * 100 ops/hour * 5 images/op * 25600 bytes/image * 8 bits/byte * 1 hr/3600 sec
  (100 * 0.80 * 100 * 5 * 25600 * 8)/3600 = 2.28 Mbps
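The bandwidth estimate multiplies out the full chain of factors, including the 100 ops/hour per user. A minimal sketch, with illustrative names and 1 Mbps taken as 10^6 bits/sec:

```python
def required_bandwidth_bps(users: int, active_fraction: float, ops_per_hour: float,
                           objects_per_op: float, object_bytes: int) -> float:
    """Average bandwidth in bits/sec needed to deliver the requested objects."""
    bits_per_hour = (users * active_fraction * ops_per_hour
                     * objects_per_op * object_bytes * 8)
    return bits_per_hour / 3600


image_server_bps = required_bandwidth_bps(
    users=100, active_fraction=0.80, ops_per_hour=100,
    objects_per_op=5, object_bytes=25_600,
)
image_server_mbps = image_server_bps / 1e6
print(f"minimum bandwidth: {image_server_mbps:.2f} Mbps")
```

This is an average; peak demand within the hour could be higher.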
Web Infrastructure
  Three major delay sources:
    “Last mile”
      Link between end user and phone company switch, or DSL or cable connection to service provider
    ISPs
      Recently, more bandwidth added
      Improvements via caching, load balancing, more servers
    ‘Backbone’ of network
      Collection of interconnected network providers
      Connect to each other to exchange traffic (peering)
      Public peering: at major interconnection points (NAPs, Network Access Points; MAEs, Metropolitan Area Exchanges)
      Delays may occur at peering points
Basic Components
  Servers
  Browsers
  Firewalls
    Protect data, programs, and computers on a private network from the uncontrolled activities of untrusted users and software on other computers
    Screen network traffic going through them, using software, network hardware, computers
    Potential performance bottleneck
Proxy, Cache, Mirror
  Techniques for improving web performance and security
  Try to reduce:
    Access time to web documents
    Network bandwidth required for document transfers
    Demand on servers with very popular documents
Proxy server
  Special type of web server that acts as an agent: server to the client, client to the server
  Accepts requests from clients, forwards them to web servers
  Receives responses from remote servers, forwards them back to the client
  Originally designed to provide web access for users on private networks who had to go through a firewall
Proxy server
  Can be configured to cache relayed responses
  Benefits:
    Improves access speed by bringing data closer to the consumer
    Cuts down on network traffic
    Reduces server load
    Increases availability in the web
  Problems:
    Ensuring that cached docs are up-to-date
    What’s worth caching? For how long?
Caching
  Used in the Web:
    Client-side, at the browser
    In the network, at a caching proxy
  Evaluating caching effectiveness:
    Hit ratio = requests_satisfied/total_requests
    Byte hit ratio = hit ratio weighted by doc size
    Data transferred = bytes transferred/time
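The difference between hit ratio and byte hit ratio is easiest to see on a small request log. The sketch below is illustrative only: the log format, the function name, and the five sample entries are made up for the demonstration and do not come from the slides.

```python
from typing import Iterable, Tuple


def cache_metrics(log: Iterable[Tuple[bool, int]]) -> Tuple[float, float]:
    """Return (hit ratio, byte hit ratio) for a log of (was_hit, size_bytes) entries."""
    requests = hits = total_bytes = hit_bytes = 0
    for was_hit, size_bytes in log:
        requests += 1
        total_bytes += size_bytes
        if was_hit:
            hits += 1
            hit_bytes += size_bytes
    if requests == 0 or total_bytes == 0:
        raise ValueError("log must contain at least one non-empty request")
    return hits / requests, hit_bytes / total_bytes


# Hypothetical log: small objects hit the cache, large ones miss.
sample_log = [
    (True, 1_000),
    (True, 2_000),
    (False, 50_000),
    (True, 1_000),
    (False, 46_000),
]
hit_ratio, byte_hit_ratio = cache_metrics(sample_log)
print(f"hit ratio: {hit_ratio:.0%}, byte hit ratio: {byte_hit_ratio:.0%}")
```

In this made-up log most requests are hits, yet few bytes are saved, which is why both ratios are worth tracking.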
Example
  Manager wants to install a caching proxy server on a corporate intranet with > 2000 users
  Use for 6 months -> then evaluate
  Consider two cases:
    Cache holds small documents, avg. size 4800 bytes, hit ratio 60%
    Cache holds medium documents, avg. size 32500 bytes, hit ratio 20%
  Monitor for one hour, observe 28800 requests
Cache efficiency
  Saved_BW = (num_req * hit_ratio * avg_size * 8)/time, with avg_size in bytes
  Saved_BW_small = (28800 * 0.60 * 4800 * 8)/3600 sec = 184 Kbps
  Saved_BW_med = (28800 * 0.20 * 32500 * 8)/3600 sec = 416 Kbps
  Holding larger documents can save more BW
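The two cache configurations can be compared with a short calculation. This sketch uses illustrative names and assumes 1 Kbps = 1000 bits/sec, which is the convention that matches the 184 and 416 Kbps figures.

```python
def saved_bandwidth_kbps(num_requests: int, hit_ratio: float,
                         avg_bytes: int, seconds: int) -> float:
    """Bandwidth no longer fetched from origin servers thanks to cache hits."""
    saved_bits = num_requests * hit_ratio * avg_bytes * 8
    return saved_bits / seconds / 1000  # assumption: 1 Kbps = 1000 bps


REQUESTS = 28_800
WINDOW_SECONDS = 3_600

saved_small = saved_bandwidth_kbps(REQUESTS, 0.60, 4_800, WINDOW_SECONDS)
saved_medium = saved_bandwidth_kbps(REQUESTS, 0.20, 32_500, WINDOW_SECONDS)
print(f"small docs: {saved_small:.0f} Kbps, medium docs: {saved_medium:.0f} Kbps")
```

The lower hit ratio of the medium-document cache is outweighed by the larger size of each hit.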
Mirroring
  Replicating site content at other servers
  Requires:
    Regular updates
    DNS to direct browsers to secondary sites when primary is busy
  Goals:
    Increase availability
    Balance server load
    Thus increasing quality of service
Example
  Manufacturing co., employee portal, too slow for European users
  Idea: install mirror site in Paris
  What are the bandwidth savings?
Example: Mirror site in Paris
  Current avg. BW is 35 Mbps
  40% of load from Europe
  42% of traffic could be served from caching
  Cacheable amount: 35 * 0.42 = 14.7 Mbps
  Estimate cache hit ratio at 38%
  Saved_BW = 14.7 Mbps * 0.38 = 5.6 Mbps
  40% of traffic from Europe, so: 5.6 * 0.40 = 2.24 Mbps could be served from cache in Paris
  6.4% savings on current BW usage at server (2.24/35)
  Improvement in perceived response time for European users
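The mirror-site estimate is a chain of three fractions applied to the current bandwidth. A minimal sketch with illustrative names; because it does not round the intermediate 5.6 Mbps, it yields about 2.23 Mbps rather than the 2.24 Mbps shown in the example.

```python
CURRENT_BW_MBPS = 35.0
CACHEABLE_FRACTION = 0.42   # share of traffic that could be served from a cache
CACHE_HIT_RATIO = 0.38      # estimated hit ratio
EUROPE_FRACTION = 0.40      # share of load coming from Europe

cacheable_mbps = CURRENT_BW_MBPS * CACHEABLE_FRACTION
saved_everywhere_mbps = cacheable_mbps * CACHE_HIT_RATIO
saved_by_paris_mbps = saved_everywhere_mbps * EUROPE_FRACTION
savings_pct = 100.0 * saved_by_paris_mbps / CURRENT_BW_MBPS

print(f"served from Paris cache: {saved_by_paris_mbps:.2f} Mbps "
      f"({savings_pct:.1f}% of current server bandwidth)")
```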
Content Delivery Networks (CDN)
  Cache or replicate content as needed to meet demands from clients over the Web
  Coordinated caching systems implemented through proprietary networks and data centers
  Employ a DNS-redirecting mechanism
    Tries to assign the best location from which to serve the requested content
Content Delivery Networks (CDN)
  DNS-redirecting mechanism:
    Client requests URL; browser generates a DNS request for the IP address corresponding to the domain name in the URL
    CDN controls the DNS service for this domain name
    CDN answers DNS requests with the IP address of a selected server rather than the IP address of the original server
    Uses a routing function to select the “best” server: client location, id of requested content, load of CDN network and servers, proximity of CDN servers to client are all considered
  CDN should provide:
    Scalability, high availability, manageability, performance
The WAP Infrastructure
  WAP = Wireless Application Protocol
    Architecture + set of protocols for wireless devices to access Web services at regular Web sites
  Wireless device communicates with WAP gateway over wireless network
  WAP gateway communicates with servers
The WAP Infrastructure
  Docs for wireless devices are written in a form of XML known as WML (Wireless Markup Language)
    Can also use WMLScript
  WML docs
    Structured as a set of “cards”, units of user interaction
    Deck = set of cards
    Users navigate between cards
The WAP Infrastructure
  WML decks + WMLScripts stored in regular web servers on the Internet
  Retrieved by WAP gateway via HTTP
  Web server response is binary encoded by the WAP gateway and sent to the wireless device via lightweight protocols
    Designed to minimize BW requirements
WAP protocol stack
Server Architectures
  Web Server
  Application Server
  Transaction and Database Server
  Streaming Server
  Multi-tier Architecture
Web Server
  Listens for HTTP requests
  Establishes requested connection
  Sends requested file
  Returns to listening mode
  Can handle more than one request at a time:
    Fork a copy of the HTTP process for each request
    Multi-threaded HTTP program
    Pool of running processes
Dynamic content
  Can use client-side or server-side programs
  Can improve performance by pushing work to the client side
Application Server
  Software that handles all application operations between browser-based customers and back-end databases
  Receives client request
  Executes business logic, interacting with transaction and/or DB servers
  Can be implemented in many ways:
    CGI scripts, FastCGIs, server-applications, server-side scripts
Transaction and Database Server
  Transaction Processing (TP) monitor provides:
    An application programming interface
    A set of program development tools
    A system to monitor and control execution of transaction programs
  DB server:
    Executes and monitors transaction processing applications
Streaming Server
  Initially, audio and video were “download and play” technologies
  Streaming media begins to play “almost” immediately
    Client request arrives
    Server retrieves video and audio data and begins to deliver them over the network
    Video and audio are compressed (MPEG, MP3)
    Typically have control part and data part
Example
  Company plans to offer MM online training
  Employee retrieves a lecture of video, audio, and slides; 30 minute duration
  What is the number of streaming servers needed to serve the lecture presentation during the busiest period of the day, 4-5 pm?
Example
  400 employees at peak
  One MM server can stream presentations to 150 viewers simultaneously
  What is the average number of simultaneous viewers during the peak period?
  Use Little’s Law: N = λ * R
    λ = req/time = 400 viewers/60 min
    R = 30 min
    N = 30 * 400/60 = 200
  Need two MM servers (200/150, rounded up)
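The server count follows from Little's Law (average number in the system = arrival rate times time in the system) plus a round-up. A small sketch with illustrative names; the rounding before `math.ceil` only guards against floating-point noise.

```python
import math


def average_in_system(arrival_rate: float, residence_time: float) -> float:
    """Little's Law: N = arrival rate * average time spent in the system."""
    return arrival_rate * residence_time


ARRIVALS_PER_MIN = 400 / 60    # 400 viewers start a lecture during the peak hour
LECTURE_MINUTES = 30
VIEWERS_PER_SERVER = 150

avg_viewers = average_in_system(ARRIVALS_PER_MIN, LECTURE_MINUTES)
servers_needed = math.ceil(round(avg_viewers, 6) / VIEWERS_PER_SERVER)
print(f"average simultaneous viewers: {avg_viewers:.0f}, servers needed: {servers_needed}")
```

Because 200 is an average, sizing for short-term peaks may call for extra headroom.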
Multi-tier Architecture
  Web-based apps usually in 3-tier architecture:
    Presentation layer
      User interface (browser & HTML, XML, etc.)
    Application layer
      Business logic
      Collection of rules to implement application logic
      May also contain Java applets, ActiveX controls, etc.
    Data service layer
      Persistent data
Example
  Application layer designed to support 400 simultaneous processes
  App process:
    Receives client request
    Executes app logic, interacting with DB server
  Monitoring shows:
    App process executes for 150 msec between DB requests
    DB server handles 550 req/sec
    400 app processes running during peak period
What if?
  The application servers are replaced by new servers with 2X speed
  Each application server is characterized by Z, “think time”: the time between receiving a reply from the DB server and submitting a new DB request
  DB layer is characterized by throughput, X, in req/sec
  Interactive Response Time Law: R = N/X - Z
What if?
  DB response time:
    R = 400/550 - 0.15 = 0.577 sec = 577 msec
  After CPU upgrade, app processing time should be 75 msec
  DB response time now, assuming the DB server still delivers 550 req/sec:
    Rnew = 400/550 - 0.075 = 0.652 sec = 652 msec
  Improvement in app layer may not lead to improvement overall
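The what-if analysis can be replayed with the relation R = N/X - Z. This sketch uses illustrative names, the 550 req/sec throughput that appears in the calculation, and the assumption that DB throughput does not change when the application servers get faster.

```python
def db_response_time(n_processes: int, db_throughput: float, think_time: float) -> float:
    """Response time R = N/X - Z seen by N clients with think time Z at throughput X."""
    return n_processes / db_throughput - think_time


N_APP_PROCESSES = 400
DB_THROUGHPUT = 550.0   # req/sec, assumed unchanged by the app-server upgrade

r_before = db_response_time(N_APP_PROCESSES, DB_THROUGHPUT, think_time=0.150)
r_after = db_response_time(N_APP_PROCESSES, DB_THROUGHPUT, think_time=0.075)
cycle_before = r_before + 0.150   # time per DB interaction cycle, before
cycle_after = r_after + 0.075     # time per DB interaction cycle, after

print(f"DB response time: {r_before * 1000:.0f} msec -> {r_after * 1000:.0f} msec")
print(f"cycle time: {cycle_before:.3f} sec -> {cycle_after:.3f} sec")
```

With throughput fixed, the time saved in the application layer simply reappears as waiting time at the DB server, so the cycle time stays at N/X.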
Dynamic Load Balancing
  Heavy traffic load adversely impacting performance
    Add more servers
    Buy bigger (faster) servers
  Need to do cost-performance analysis
Dynamic Load Balancing
  Web cluster:
    Multiple web servers
    Single location addressed by one URL and a single virtual IP address
    Incoming requests routed among servers in a user-transparent way
    Switch acts as dispatcher, mapping virtual IP address to actual address
Web cluster
Networks
  Bandwidth
    Measures the rate at which data can be sent through the network
    Usually expressed in bps
  Latency
    Time needed for a bit (or small packet) to travel across the network
Bandwidth for different types of networks
Planning
  Streaming service offers training videos
  Training session -> 15 min video at 300 Kbps
  What is the impact if videos go to 25 min?
  Service supports 35 simultaneous sessions
  Average BW needed (now)
    35 * 300 Kbps = 10.5 Mbps
  Average number of simultaneous sessions (now)
    N = 35
    N = λ * R
    35 = λ * 15
    λ = 35/15 sessions/min ... assume this remains the same
  Average number of simultaneous sessions (new)
    Nnew = λ * 25 = 35/15 * 25 = 58.33
  Average BW needed (new)
    58.33 * 300 Kbps = 17.5 Mbps
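The planning question applies Little's Law twice: once to back out the session arrival rate from today's load, and once to project the load for longer videos. A minimal sketch with illustrative names, assuming the arrival rate stays constant and 1 Mbps = 1000 Kbps:

```python
STREAM_KBPS = 300
SESSIONS_NOW = 35
VIDEO_MIN_NOW = 15
VIDEO_MIN_NEW = 25

# Little's Law, N = rate * R, solved for the arrival rate (sessions per minute)
arrival_rate = SESSIONS_NOW / VIDEO_MIN_NOW

# Same arrival rate, longer residence time
sessions_new = arrival_rate * VIDEO_MIN_NEW

bw_now_mbps = SESSIONS_NOW * STREAM_KBPS / 1000
bw_new_mbps = sessions_new * STREAM_KBPS / 1000

print(f"sessions: {SESSIONS_NOW} -> {sessions_new:.2f}")
print(f"bandwidth: {bw_now_mbps:.1f} Mbps -> {bw_new_mbps:.1f} Mbps")
```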
Example
  Training videos, avg. size 950 MB
  100 students, 80% active at one time
  Each user requests 2 clips/hour
  BW needed to support:
    (0.80 * 100) * 2 * (8 * 950)/3600 sec = 337.8 Mbps
  Need a 622 Mbps ATM network to support this load
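The final sizing step, picking a link that covers the computed demand, can be sketched as a lookup over candidate link speeds. The candidate list below is an assumption made for illustration (standard nominal rates for T1, T3, OC-3, and OC-12); it is not the table from the slides. The calculation also assumes 1 MB = 10^6 bytes.

```python
from typing import Optional

# Assumed candidate links with nominal rates in Mbps, smallest first
CANDIDATE_LINKS = [("T1", 1.544), ("T3", 44.736), ("OC-3", 155.52), ("OC-12", 622.08)]


def required_mbps(students: int, active_fraction: float,
                  clips_per_hour: float, clip_megabytes: float) -> float:
    """Average bandwidth in Mbps needed to deliver the requested clips."""
    megabits_per_hour = students * active_fraction * clips_per_hour * clip_megabytes * 8
    return megabits_per_hour / 3600


def smallest_sufficient_link(demand_mbps: float) -> Optional[str]:
    """Name of the first candidate link whose rate covers the demand, else None."""
    for name, capacity_mbps in CANDIDATE_LINKS:
        if capacity_mbps >= demand_mbps:
            return name
    return None


video_demand_mbps = required_mbps(100, 0.80, 2, 950)
chosen_link = smallest_sufficient_link(video_demand_mbps)
print(f"demand: {video_demand_mbps:.1f} Mbps, smallest sufficient link: {chosen_link}")
```

The 622 Mbps class link is the first candidate above the roughly 338 Mbps demand, which agrees with the slide's conclusion.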