
Benchmarking a Site with Realistic Workload

G. Ballocca, R. Politi, V. Russo*
CSP S.c.a r.l., Via Livorno, 60 - 10144 Torino, Italy

G. Ruffo
Dip. di Informatica, Università di Torino, Corso Svizzera, 185 - 10149 Torino, Italy

{giovanni.ballocca, roberto.politi}@csp.it, ruffo@di.unito.it

Abstract

The rapid growth in the number of Web users, and the consequent importance of Capacity Planning, have led to the development of Web benchmarking tools widely available on the market. One of the most common criticisms of this approach is that the synthetic workload produced by web stressing tools is far from realistic. This paper deals with a benchmarking methodology based on a workload characterization generated from the log files. The Customer Behavior Model Graph (CBMG) has been proposed in [13] as a workload characterization of an e-Commerce site. Here we discuss how the CBMG methodology has a wider field of application and how to use this model to efficiently improve a fully integrated Web Stressing Tool. We also evaluate the differences between our approach and other models based on different characterizations.

1 Introduction

The World Wide Web is one of the most used interfaces to access remote data and services (both commercial and non-commercial). The number of actors involved in these transactions is growing very quickly. As a consequence, users may experience very slow connections to popular web sites during rush hours [5]. On average, users leave a site after 8 seconds of busy wait. Despite the growing development of server and network technologies, Web performance problems are still quite difficult to solve: the Web is a composite mixture of multimedia sources, static and dynamic objects, local and distributed architectures, and so on.

Four metrics are commonly used to evaluate the performance of a Web site: (1) requests served per second; (2) throughput in bytes per second; (3) round trip time (RTT); (4) errors. In order to improve the overall performance, all these parameters should be considered as well.

Since 1995, web servers have dramatically increased the number of connections served per second. Throughput strongly depends on network capacity. However, network capacity is improving faster than server capacity, and enhancements of the network infrastructure (Gigabit wide-area networks, ISDN, xDSL lines, cable modems, and so on) reduce network latency. Latency, measured by RTT (Round Trip Time), is one of the key elements of Web performance, and depends on network conditions and web server capacity. A high error rate is the most immediate perception of web server unreliability. Server errors are mainly generated for two reasons: bad implementation or inadequate capacity. Avoiding and/or detecting implementation errors is easier than providing adequate capacity. If a server has a low capacity (e.g., a real-time streaming server which can only serve 200 requests per second), many clients receive a "connection refused" message: the service is not available and clients may be lost. Since the main bottleneck is at the server side, the expected workload is difficult to define (see [7] for a survey on workload characterization problems). The continuing evolution of the web framework, from a simple client/server architecture to a complex distributed system, has caused a series of consequences: a client request can be fulfilled by many servers; routing strategies can be defined at different levels (the client may originate requests to the primary server as well as to one of its mirrors); DNS servers can resolve the address at different hierarchical levels; Web switches are often used to dispatch requests in a pool of servers providing load balancing [8]. Moreover, workload characterization deals with clients, servers and proxies [16]. In this context, HTTP request/response messages between clients and server are also definitely influenced by cache activities. Different monitoring strategies can be chosen to describe web workload in order to capture the desired metrics (we only need information traced at the server side). To evaluate web server performance, either commercial or open-source web stressing tools are widely available (e.g., OpenSTA [15] and LoadRunner [14]).

*V. Russo was supported by a grant from CSP during the preparation of this paper.


Page 2: [IEEE 2002 IEEE International Workshop on Workload Characterization - Austin, TX, USA (2002.11.25-2002.11.25)] 2002 IEEE International Workshop on Workload Characterization - Benchmarking

They are both used to perform load and stress tests on a replicated web site. During the testing period, monitoring agents collect a set of system resource parameters (CPU and RAM utilization rates, I/O disk accesses, network traffic, and so on). The generated workload is usually based on a predefined user session, which is replicated many times by a number of Virtual Users (see Section 4). In the rest of the paper we present a methodology to evaluate the performance of a web site, in order to avoid "lack of planning" problems. This methodology is based on the usage of a stressing tool which imposes a synthetic yet realistic workload on the web site under test. In Section 2, we briefly review related work about benchmarking and workload characterization. In Section 3, a quick survey of a workload characterization model called Customer Behavior Model Graph (CBMG) [13] is given as well. After introducing the stressing approach in detail (Section 4), we propose an integration between existing benchmarking tools and the CBMG workload characterization model (Section 5), in order to test the web site in a more realistic scenario than using random navigational patterns. Before concluding, we outline future directions of this research, giving an experimental evaluation of it in Section 6.

2 Related Work

One of the most important steps in Capacity Planning is performance prediction: the goal is to estimate performance measures of the web farm under examination for a given set of parameters (e.g., response time, throughput, CPU and RAM utilization, number of I/O disk accesses, and so on).

There are two approaches to predicting performance: it is possible to use a benchmarking suite to perform load and stress tests, and/or to use a performance model. A performance model [12] predicts the performance of a web system as a function of a system description and workload parameters. There are simulation models [11] and analytical models (e.g., the analytical queueing network model proposed in [12]). Both kinds of model output response times, throughput and resource utilizations to be analyzed in order to plan an adequate capacity for the web system. Benchmarking suites are largely used in the industry: experts prefer to measure the system responses directly, and they think that no model can really substitute the real architecture.

Current stressing tools (see Section 4) replicate a synthetic session made of a sequence of object requests. On the other side, performance models can take a workload characterization as input. It is interesting to allow benchmarking suites to take a workload characterization as input as well, in order to feed them a series of "realistic" sessions. In such a domain, we would have two immediate advantages: the results from performance models and benchmarking suites become comparable on a common basis (i.e., both use the same workload characterization), and stressing tools produce a more realistic workload than the one obtained by replicating a "random" session.

In the literature, workload characterization has often been used to generate synthetic workload [6, 4, 17] for benchmarking purposes. In the particular case of web workload, much research has been conducted to understand the nature of web traffic and its influence on the performance of the server [9, 3, 4]. A set of invariants has been detected, i.e., phenomena common to many web servers and to the related network traffic. A list of the most important invariants can be found in [3].

The reference benchmarking tools using a web workload characterization are SPECweb99 [17] (an evolution of the SPECweb96 package) and SURGE [4]. Both systems perform a sequence of requests to the web server under test, respecting some distributional models. The main difference is that the SPECweb family workload is based on independent HTTP operations per second (not related to specific user sessions), while SURGE traffic is generated in terms of (virtual) user sessions. In this perspective, SURGE is similar to other stressing tools like OpenSTA or Mercury's LoadRunner, because it emulates web sessions alternating requests for web files and idle times.

The main scope of this paper is to show how an existing stressing tool can be extended to generate a realistic workload. The resulting system would have two important features: first, it is user-friendly, because traffic generation from the adopted workload characterization is an automatic process. Second, the characterization process starts from the web log files, i.e., the generated traffic is strongly based on the particular site topology. For these reasons, we compare our prototype, derived from the OpenSTA tool, with the basic version of OpenSTA, in order to analyze their different behaviors.

Of course, the nice feature of adapting synthetic traffic to the site's own characteristics has the drawback that we need to have request log files before the workload can be generated. This is undesired when a research study based on empirical comparison is conducted, because it is well known that obtaining such data from organisations is not an easy task. But it is the authors' opinion that this is not the case for the end user of a



stressing tool: an expert analyst normally gains access to log files because she has to perform many preliminary traffic analyses, e.g., using tools like analog [2] and so on.

We plan to compare our system with SURGE and SPECweb99 in a future work. An analysis of the generated traffic for a group of test web servers is also planned: the goal will be to show how CBMG-based generated traffic respects the distributional models underlying the invariants of Web workload.

3 Workload Characterization using CBMG

Available Web Stressing Tools (see Section 4) use a (set of) predefined session(s) to perform a synthetic test on the system under examination. At the end, the tool measures the returned results. One of the main problems with benchmarking a web site is how to define such a (set of) session(s) that can help predict real-world performance. If these sessions reflect the expected load and navigational patterns, then the load test will be considered more realistic. Moreover, different kinds of users have to be considered and, as a consequence, different kinds of navigational attitudes should be represented during the emulation process.

Of course, any group of sessions selected (or invented) by a web administrator launching the stressing tool will not be able to substitute the real workload. We propose to use a workload characterization extracted from past traffic data saved at the server side. This characterization can be used to automatically generate realistic workload, taking into account the basic components of navigational patterns (i.e., average number of sessions per day, average user think times, most accessed resources, and so on), as well as different kinds of users.

CBMG [13] can be applied to give such parameters as input to a stressing tool. The CBMG is a state transition graph proposed for workload characterization of e-Commerce sites. It is our opinion that such a model has a wider field of application, and that it can also be used to generate realistic workload for every kind of site. We will use a CBMG for each group of users (or group of sessions) with similar navigational patterns. Another important feature of CBMGs is that they can be automatically extracted from web log files.

A CBMG is a graph where each node represents a possible state. For example, if the site under examination provides a web-based e-mail service (for instance, hotmail.com or yahoo.com), each (static or dynamic) page can be assigned to one of the states in Table 1. Existing links between pages belonging to different states are represented as graph transitions. The weight associated to each arc in the graph is the transition probability (Figure 1). Observe that the user can leave the site from every page.

Table 1: CBMG states of a Web mail service (entries recovered from the original table include: View MBox; pages allowing registered users to log in and new users to create their own account; pages to manage the user's address book, or to select a particular receiver to send a message; on-line catalogues, latest news, ...).

Figure 1: The CBMG of a typical user of a web mail service (e.g., hotmail.com, yahoo.com, ...).

Formally, a CBMG is represented by a pair (P, Z) of n x n matrices, where n is the number of considered states, P = [p_ij] contains the transition probabilities between states, and Z = [z_ij] represents the average client-side think times between states (the think time of a transition is the difference between two consecutive request arrival times minus the server execution time¹).

A CBMG model is highly scalable: the number of states changes from site to site and it can be fixed before generating the model from log files. In principle, the number of states can also change when a site is modified. In this case, the model can be regenerated setting a new number of states; however, the number of states changes only when the site is strongly modified and this modification influences the entire site structure. Normally, modifications affect only a (set of) page(s): we need to upgrade only the affected state definition, which is made of a list of pages.

¹In our experiments (see Section 6), we did not consider execution times, because we used Apache standard log files and, as a consequence, we could not retrieve such information.
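To make the (P, Z) pair concrete, the following minimal Python sketch encodes a four-state CBMG. The state names mirror those used in the experiment of Section 6, while the probabilities and think times are invented for illustration only and are not values measured in this paper.

import numpy as np

# Hypothetical 4-state CBMG; all numbers below are illustrative, not measured.
states = ["Home", "Browse", "Registration", "Exit"]

# P[i][j] = probability of moving from state i to state j (each row sums to 1).
P = np.array([
    [0.00, 0.60, 0.20, 0.20],   # from Home
    [0.10, 0.50, 0.15, 0.25],   # from Browse
    [0.05, 0.25, 0.00, 0.70],   # from Registration
    [0.00, 0.00, 0.00, 1.00],   # Exit is absorbing
])

# Z[i][j] = average client-side think time (seconds) on the i -> j transition.
Z = np.array([
    [0.0,  8.0, 12.0, 0.0],
    [6.0, 10.0, 15.0, 0.0],
    [9.0,  7.0,  0.0, 0.0],
    [0.0,  0.0,  0.0, 0.0],
])

assert np.allclose(P.sum(axis=1), 1.0)   # sanity check: every row is a probability distribution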

Four different processes are involved in the building phase of the workload characterization:

1. Merging and filtering web logs: a Web site may be composed of several servers. All the log files in the cluster are merged together. Moreover, all log entries not influencing the identification of user sessions are filtered out (for instance, requests for embedded objects, such as images, video clips, and so on).

2. Getting sessions: log files contain request-oriented information. During this phase, data are transformed into a session-oriented format. A session is characterized by a sequence of requests from the same IP address. If there is a pause longer than a certain amount of time (on average 30 minutes, depending upon the web site) between a request from the same IP address and the next one, then the session is considered closed. After this process, a Request Log File is generated, where each session is identified by a unique ID and is represented by a sequence of page-requests.

3. Transforming Sessions: let us assume that n is the number of states. For each session S, we create a point X_S = (C_S, W_S), where C_S = [c_ij] is an n x n matrix of transition counts between states i and j, and W_S = [w_ij] is an n x n matrix of accumulated client-side think times (a minimal code sketch of steps 2 and 3 follows this list);

4. CBMGs clustering: from all the X_S points, a number of CBMGs is generated. A K-means clustering algorithm is adopted in order to find a relatively small number of CBMGs that model different users with similar navigational behaviors [10].
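The sketch announced in step 3 follows: a rough Python rendition of the sessionization and transformation steps, assuming the merged and filtered log has already been parsed into (IP, epoch seconds, URL) tuples sorted by time. The page-to-state mapping and the four-state layout are assumptions made for illustration, not the authors' code.

SESSION_TIMEOUT = 30 * 60        # 30 minutes of inactivity closes a session
N_STATES = 4                     # n, fixed by the analyst (e.g. Home, Browse, Registration, Exit)

def page_to_state(url):
    """Map a page URL to a CBMG state index (hypothetical mapping)."""
    if url == "/":
        return 0                 # Home
    if url.startswith("/cgi-bin/register"):
        return 2                 # Registration
    return 1                     # Browse

def split_sessions(entries):
    """Step 2: entries is a list of (ip, timestamp, url) tuples sorted by time.
    A session is a run of requests from one IP with gaps below the timeout."""
    open_sessions, closed = {}, []
    for ip, ts, url in entries:
        last, reqs = open_sessions.get(ip, (None, []))
        if last is not None and ts - last > SESSION_TIMEOUT:
            closed.append(reqs)
            reqs = []
        reqs.append((ts, url))
        open_sessions[ip] = (ts, reqs)
    closed.extend(reqs for _, reqs in open_sessions.values())
    return closed

def session_to_point(session):
    """Step 3: turn one session into the point X_S = (C_S, W_S)."""
    C = [[0.0] * N_STATES for _ in range(N_STATES)]
    W = [[0.0] * N_STATES for _ in range(N_STATES)]
    for (t0, u0), (t1, u1) in zip(session, session[1:]):
        i, j = page_to_state(u0), page_to_state(u1)
        C[i][j] += 1
        W[i][j] += t1 - t0       # server execution time is not subtracted (standard Apache logs)
    return C, W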

As pointed out in [13] (where the algorithms seen above are explained in detail), the CBMG gives other useful information, such as the average number of visits to each state, the average session length and the buy-to-visit ratio. These metrics are important to analyze an e-Commerce site, but their influence on such an analysis is out of the scope of this paper.
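The average number of visits to each state can be obtained from P with the standard expected-visits computation for a chain that terminates in the Exit state. The sketch below is our own illustration of that computation, not the implementation used in [13].

import numpy as np

def average_visits(P, entry=0, exit_state=None):
    """Average number of visits per session to each non-Exit state, i.e. the solution
    of V[j] = (1 if j == entry else 0) + sum_i V[i] * P[i][j] over the transient states."""
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    exit_state = n - 1 if exit_state is None else exit_state
    transient = [s for s in range(n) if s != exit_state]
    Q = P[np.ix_(transient, transient)]          # transitions among non-Exit states
    e = np.zeros(len(transient))
    e[transient.index(entry)] = 1.0              # the session starts with one visit to the entry state
    V = np.linalg.solve(np.eye(len(transient)) - Q.T, e)
    return dict(zip(transient, V))               # state index -> expected visits per session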

4 Stressing Framework

A Web Stressing Tool is a benchmarking suite. In this Section, we describe the architecture of a generic stressing tool, which covers the larger part of the available web stressing tools.

A stressing tool, at least, is made of two modules: a script recorder and a load generator.

The script recorder captures the HTTP browse session performed by the customer and saves it as a sequence of pairs (object-request, think time). The captured object-requests are web page requests, CGI execution requests (e.g., after a form submission), image requests, dynamic element requests and others. The think time is the time that the user "spends" in the transition from one state to another, by clicking a link, compiling a form, typing a URL or, simply, by reading a page. Traced requests and think times are converted to a sequence of commands of a (proprietary) scripting language. This script is used during the load test, which consists simply in emulating navigational patterns by running the script at the client side.

The load generator module provides means to generate a synthetic workload according to the recorded scripts.

A stressing client, as shown in Figure 2, is a load generator that runs a test; during a test session, the stressing client executes one or more scripts by means of each Virtual User (VU). We can assign many VUs to each script, batching them in time and number according to our analysis and measurement needs. Moreover, each stressing client includes a performance collector for gathering the client's performance parameters and an SNMP agent for sending measurement results to the master stressing client. In other words, each stressing client outputs a synthetic HTTP workload which reflects the recorded workload. This workload is replicated for each VU assigned to the given stressing client.


Figure 2: The architecture of a stressing client.
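As a rough illustration of what a recorded script and its replay look like, the sketch below uses a hypothetical script format (a plain list of request/think-time pairs with a placeholder host name) and ordinary Python threads as virtual users; real tools such as OpenSTA or LoadRunner use their own scripting languages and far more sophisticated VU engines.

import threading
import time
import urllib.request

# A recorded session: a sequence of (object-request, think time in seconds) pairs.
# The host name is a placeholder, not a real site.
script = [
    ("http://site.under.test/", 5.0),
    ("http://site.under.test/program.html", 12.0),
    ("http://site.under.test/cgi-bin/register", 0.0),
]

def virtual_user(recorded_script):
    """One VU: replay the recorded requests, sleeping for the recorded think times."""
    for url, think_time in recorded_script:
        try:
            urllib.request.urlopen(url, timeout=10).read()
        except OSError:
            pass                 # errors would be counted by the real tool, not here
        time.sleep(think_time)

# The load generator assigns the same script to many VUs and runs them concurrently.
vus = [threading.Thread(target=virtual_user, args=(script,)) for _ in range(50)]
for vu in vus:
    vu.start()
for vu in vus:
    vu.join()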

As shown in Figure 3, a plurality of stressing clients, under the control of the master stressing client, launch the



generated HTTP workload against the targeted web farm via the Internet or a LAN.

The master stressing client, which includes an SNMP manager, collects the performance measurements of the web farm and of the other stressing clients via the SNMP protocol, and produces a report of the performance observed during the test.


Figure 3: The stressing framework

These tests are usually performed at night time so as to avoid stressing the web site when the service should be accessible. A better (but more expensive) approach is to perform load and stress tests on a replicated architecture.

5 Generation of realistic web server workload

5.1 Preliminary considerations about CBMG

Using CBMG to generate Web traffic has the obvious advantage of creating a synthetic workload which reflects the topology of the site (i.e., CBMGs are defined in terms of navigational patterns) and the behavior of real users of the given web site (i.e., CBMGs are extracted from web logs). Of course, there are some disadvantages: CBMG can be used only when analysts already have web logs and, as a consequence, this technique cannot be used with a new site, or when log files are not easily available. In those cases, we can perform preliminary analyses with SPECweb99, SURGE or even with standard OpenSTA. These initial tests can be useful to carefully configure a new site with an expected workload. But for an accurate capacity planning of an existing site, our proposed analysis is suggested. Moreover, in [9] it has been observed that the distribution of the transferred (i.e., requested) file sizes can show a heavy tail. One reason behind this behavior is to be found in the effects of caching algorithms inside web browsers. The CBMG is extracted directly from the log files, which represent only the set of file requests seen at the server side. The cache impact on the workload is filtered out, and traffic generated from the CBMG would eventually behave as expected.

5.2 From CBMG to Scripts

The basic idea is to use the CBMG to automatically create a group of scripts that can be used by the stressing clients. In particular, we extract a session from each cluster (i.e., CBMG) of our workload characterization.

As shown in Section 3, from the analysis of web logs we can build many CBMG diagrams, modelled by matrices, that are batched in K clusters, where the value of K, as discussed in [12], can be reasonably chosen in the interval [3, 6]. These clusters describe K different user navigational behaviors for the given web site. Let us call χ_1, χ_2, ..., χ_K these clusters. Let us define the cardinality |χ_i| of a cluster as the number of sessions belonging to cluster χ_i. Later in this Section we will use the weight percentage wp_i of a cluster χ_i. We define this value as the normalized cardinality of the cluster χ_i, i.e. wp_i = |χ_i| / Σ_{j=1}^{K} |χ_j|.

Now, for each χ_i, we randomly select a session S_i which is representative of the entire cluster, because its distance from the other internal points of the cluster is less than its distance from external points. Once a session S_i, with 1 ≤ i ≤ K, identified by its own ID session number, has been randomly selected from each cluster, we can start building a script reflecting S_i.

The procedure for building a script is described in the following. After S_i has been selected, we build the script by using the consecutive page-requests associated to the session (as shown in Figure 4); this information can be retrieved from the Request Log File created during the creation of the CBMGs (see Section 3). It is useful to observe that a CBMG describes a navigational pattern through a series of logical states and not through requests for specific pages. As a consequence, we use the Request Log File to reconstruct the original session.

To complete the script reflecting the representative session S_i, we need think time values between consecutive page-requests. These think times can be taken from the matrix Z of the CBMG that includes S_i. In Z we find the average think times between the states of the given CBMG. Let us suppose that a session S_i is



composed of successive requests to the following pages: φ_1, φ_2, ..., φ_s. Let us call σ_l the state of page φ_l, with l such that 1 ≤ l ≤ s; in the script, after the request for φ_l and before requesting page φ_{l+1}, we make the VU wait for a think time of value z_{σ_l, σ_{l+1}}, where z_{σ_l, σ_{l+1}} is the average think time between the corresponding states. In cases where the traffic burst is quite huge (e.g., the official site of an international sport event), we can make our scenario more realistic by selecting more than one session from the same cluster. To maintain the efficiency of our approach, instead of selecting different sessions, we can duplicate the same final script, slightly changing the think times between successive requests. In this case, we can set the think time value to z_{σ_l, σ_{l+1}} + τ, where τ is a value randomly generated for each different script built from the same session, such that -ε ≤ τ ≤ +ε².

²The value ε > 0 should be small enough and should be selected by the analyst.

Figure 4: Building the final script from the Request Log File (a random value is added to the think times).

Now, for each session S_i selected from one of the K clusters, we have (at least) one script emulating a representative navigational pattern. As a consequence, we have (at least) K scripts (in the following, CBMG-scripts) Sc_1, Sc_2, ..., Sc_K, each representative of a user model.

These scripts must be run by the stressing clients. For each stressing client we should set a number of VUs executing the generated CBMG-scripts. To perform a stress test, we have to emulate an incoming traffic bigger than the traced workload. Let us call f the multiplication factor that we will use to reach higher traffic peaks. If U is the total number of VUs that should be run by the stressing clients, we set its value to f multiplied by the number of users that accessed the site during the analyzed period. Each VU is associated to one of the K generated scripts. The association between a virtual user and a CBMG-script Sc_i (generated from session S_i) is influenced by the weight percentage of the cluster χ_i: if wp_i is the weight percentage of cluster χ_i, then the total number of virtual users executing Sc_i will be wp_i x U.
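A small numeric sketch of this allocation (the cluster cardinalities and the factor f below are invented for illustration, not taken from the experiments):

# Cluster cardinalities |chi_i| obtained from the clustering step (illustrative).
cardinalities = [500, 300, 200]          # K = 3 clusters
traced_users = sum(cardinalities)        # users/sessions observed in the analyzed period
f = 3                                    # multiplication factor for the stress test

U = f * traced_users                     # total number of virtual users to run

# wp_i = |chi_i| / sum_j |chi_j|; the number of VUs running script Sc_i is wp_i * U.
weights = [c / traced_users for c in cardinalities]
vus_per_script = [round(wp * U) for wp in weights]

print(vus_per_script)                    # -> [1500, 900, 600]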

Of course, the presented implementation is not the best one: it has to be considered as a prototype for testing the basic idea. The best implementation should permit the creation of a different script for each session. In this perspective, the CBMG should be used as an automatic generator of new scripts. The final version of this package will not just replicate a group of selected sessions using matrices P and Z as distributional references; it will always create new sessions reflecting the given workload characterization. Transition probabilities and think times, saved respectively in matrices P and Z, will be used instead for an automatic generation of sequences of GET and WAIT commands. The script generator should move from state to state following the given probabilities, starting from Home. When it moves to a CBMG state, it will select a web page belonging to the given state³. When the Exit state is reached, the session is terminated.
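A minimal sketch of such a generator is given below, under the assumption that each state carries a list of its pages ranked by popularity; the Zipf-based page choice follows the footnote, while the concrete P, Z and page lists are placeholders, not data from the paper.

import random

def zipf_pick(pages):
    """Pick a page from a popularity-ranked list with probability proportional to 1/rank."""
    weights = [1.0 / rank for rank in range(1, len(pages) + 1)]
    return random.choices(pages, weights=weights, k=1)[0]

def generate_script(P, Z, states, pages_by_state, home=0, exit_state=None):
    """Walk the CBMG from Home until Exit, emitting GET and WAIT commands."""
    if exit_state is None:
        exit_state = len(states) - 1
    script, current = [], home
    while current != exit_state:
        script.append(("GET", zipf_pick(pages_by_state[states[current]])))
        nxt = random.choices(range(len(states)), weights=P[current], k=1)[0]
        if nxt != exit_state:
            script.append(("WAIT", Z[current][nxt]))
        current = nxt
    return script

# Example call with the illustrative 4-state CBMG sketched in Section 3:
# pages_by_state = {"Home": ["/"], "Browse": ["/program.html", "/venue.html"],
#                   "Registration": ["/cgi-bin/register"], "Exit": []}
# print(generate_script(P.tolist(), Z.tolist(), states, pages_by_state))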

5.3 Stressing Prototype

For our experiments, we used a Web Stressing Tool called OpenSTA (Open System Testing Architecture) [15], which is an Open Source stressing tool, released under the GPL license and built for the Windows 2000/NT platform on a CORBA architecture. In accordance with the schema presented in Section 4, we can find the script recorder module, called Script Modeler. Another module generates the synthetic HTTP workload, and another one collects performance data of the local system and of the remote web farm. OpenSTA receives performance data from remote systems via the SNMP protocol. A master client can control many other OpenSTA clients, both through a LAN and through the Internet. OpenSTA allows us to simulate thousands of VUs, batched in groups and for a long period of time.

We implemented a package called CBMGmaker to automatically create the CBMG workload characterization from log files. The CBMGmaker is a collection

³The web page can be selected using Zipf's law [1]; in fact, it has been shown that the popularity distribution for files on a Web server follows Zipf's law, i.e., if files are ordered from most popular to least popular, then the number of references to a file tends to be inversely proportional to its rank.



of Perl scripts that help us to analyze web server log files and to build the CBMG graphs. According to the CBMG procedure proposed in [13] and described in Section 3, we implemented a group of modules, namely Merger.pl, Log2Req.pl, Req2Session.pl, and Clu4CBMG.pl.

Merger.pl merges the log files of a local network of servers; Log2Req.pl filters out non-interesting log file entries in order to produce a cleaned log file. Req2Session.pl outputs two files: the Requests Log File, where the requests are sorted by ID session number and by client identifier, and the Matrices File, where the pair of matrices (C, W) (see Section 3) is stored for each detected session.

Clu4CBMG.pl implements the clustering algorithm. It takes as input the Matrices File seen above and the number of clusters to produce, and outputs the CBMGs File. Here, for each of the K clusters, a pair of matrices is saved, storing the transition probabilities between states and the average think times.

In this package we also included a configuration file, CBMGconfig.cfg, where the analyst can set: the number n of states, assigning a name to each one; the number K of clusters; and the file extensions identifying the object-request entries to filter out from the web log files (e.g., .jpg, .gif, and so on).

Finally, we implemented another Perl script, called CBMG2scripts.pl, that implements the conversion procedures described in Section 5.2.

6 A preliminary experimental comparison between Random and Realistic synthetic workload

In this Section we provide a brief description of the test environment configuration, methodology and experimental results. The goal of this experimental task is to show that CBMG-based traffic is sensibly different from random traffic. We are currently working on a wider spectrum of experimental settings, making comparisons with SPECweb99 and other tools to see if there is a significant difference in the accuracy of performance estimations. Comparisons are made with respect to real monitored traffic, and the considered parameters are based on resource consumption measures (CPU, RAM, disks, ...). The results presented here report some observations made on a single web server, but current experiments are being made on a group of monitored web servers. As a consequence, this empirical discussion must be considered only a preliminary test of the developed prototype.

Test Environment Architecture


The test environment was composed of a web server hosting the web site and a group of client machines accessing the web site through a LAN. The web server machine hosted a copy of the web site of a scientific conference held on May 16th, 2002. It was composed of a bunch of static HTML pages providing information about the conference and a few dynamic pages, including forms and response pages for the registration section. The web server used was Apache 1.3.14. Every client machine was equipped with a web stressing tool (namely, OpenSTA v1.3.20).

The CBMG states are four: Home, Browse, Registration, Exit. The Registration state contains a CGI script permitting the user to subscribe to the conference.

Test Methodology

The first step was the analysis of the access log file of the real web site. Using the Analog [2] utility, the log file was analyzed to plot the profile of user accesses to the site during the period from April 15th, 2002 to May 15th, 2002 (the 30 days preceding the event).

The average number of accesses during the 24 hours of the day has been calculated. These data were used to provide a histogram representing the average number of accesses to the site during each hour of the day (in the following, the Average Traffic Profile). Figure 5 illustrates the results.


Figure 5: The average number of accesses during the 24 hours of the day.
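Computing the Average Traffic Profile is straightforward once the access log has been reduced to a list of request timestamps; the helper below is a hypothetical sketch, not part of the Analog utility or of CBMGmaker.

from collections import Counter
from datetime import datetime

def average_traffic_profile(timestamps, n_days=30):
    """timestamps: datetime objects of the (filtered) requests in the observation window.
    Returns the average number of accesses for each hour of the day (24 values)."""
    per_hour = Counter(ts.hour for ts in timestamps)
    return [per_hour[h] / n_days for h in range(24)]

# Example with two dummy requests (real timestamps come from the parsed access log):
profile = average_traffic_profile([datetime(2002, 4, 15, 10, 3), datetime(2002, 4, 15, 10, 47)])
# profile[h] is then used to set how many virtual users are active during hour h of the test,
# scaled by 1 for the first group of measurements and by 3 for the second.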

Then the CBMGmaker was used to create, from the log file, four synthetic user sessions describing a user surfing the web site, so as to mimic the real user behavioral patterns. Each of the patterns was characterized. The first pattern represented a user entering the site and immediately exiting. The second one described a user entering the site, browsing to the registration form, and exiting after the registration. The third one described a user surfing to browse the conference program (skim reading). The fourth one described an in-depth visit to the site.

The aim of this analysis was to produce a few



CBMG-scripts to be fed to the stressing tool hosted on the clients. Moreover, a further script was generated (in the following, the Random-Script), simulating random access to the web pages of the site.

Test execution

The tests were carried out to collect the following data: (1) number of active users vs. elapsed time; (2) HTTP response time vs. elapsed time; (3) CPU usage vs. elapsed time.

The measurements have been carried out in two phases. In the first one the Random-Script was used, i.e., the standard usage of OpenSTA. For the first group of measurements the stressing clients were configured to generate a number of virtual users reproducing the Average Traffic Profile. The second group of measurements was carried out using a traffic profile in which each channel had a height three times the height of the corresponding channel in the Average Traffic Profile. For lack of space, in Figure 6 we illustrate only the results for the second group of experiments, without reporting CPU utilization metrics.

Figure 6: Second group of experiments: benchmarking the site with just one (random) session.

During the second phase the CBMG-scripts described above were used. Again, the stressing clients were first configured to generate a number of requests reproducing the Average Traffic Profile and then a traffic distribution having three times the height of the Average Traffic Profile.

Figure 7 shows the results for the part of the experiment comparable with the metrics reported in Figure 6.

Figure 7: Second group of experiments using realistic benchmarking.

Experiment discussion

The experimental results show that, when a stressing tool uses CBMG-scripts instead of Random-scripts, the generated traffic yields higher average response times than the classic approach (Figures 6 and 7), and higher peaks were reached more often in the CBMG case than in the Random one.

The CBMG-based approach is much more pessimistic. For that reason, it should be considered an improvement of the existing stressing framework, because it lets administrators avoid unexpected crises.

7 Future Work and Conclusion

A Workload Characterization based on Customer Behavior Model Graphs has been proposed to improve existing web stressing tools: using a CBMG model, it is possible to produce a realistic workload instead of stressing the web site with a randomly selected session replicated over and over.



A prototype has been implemented permitting the analyst to produce CBMGs from log files and stressing scripts from CBMGs.

Future planned work is to compare the CBMG-based tool with other important web benchmarking packages based on workload characterization, such as SURGE and SPECweb99. The authors would also like to analyze the generated traffic against known web traffic invariants, in order to find out how much this synthetic workload can be considered representative of real traffic. Moreover, a study of secure web server (HTTPS) workload is planned in order to generalize our results in a wider and more complex environment. Finally, we plan to apply this entire methodology to some important case studies and to compare predicted performance with actual measured metrics.

8 Acknowledgment

This paper has been realized at WTLAB, a consortium between the Dipartimento di Informatica at the University of Turin and CSP. The authors would also like to thank Roberto Borri, Matteo Sereno, Francesco Bergadano, Monica Resemini and the anonymous reviewers for their helpful comments and suggestions.

References

[1] V. Almeida, A. Bestavros, M. Crovella and A. de Oliveira, "Characterizing Reference Locality in the WWW". In Proc. of the 1996 Intern. Conference on Parallel and Distributed Information Systems (PDIS'96), pages 92-103, December 1996.

[2] Analog, http://www.analog.cx

[3] M. Arlitt and C. Williamson, "Web Server Workload Characterization: The Search for Invariants". In Proc. of the 1996 ACM SIGMETRICS Conference, Philadelphia, PA. ACM Press.

[4] P. Barford and M.E. Crovella, "Generating Representative Web Workloads for Network and Server Performance Evaluation". In Proc. of the 1998 ACM SIGMETRICS Conference, Philadelphia, PA. ACM Press.

[5] P. Barford and M.E. Crovella, "Critical Path Analysis of TCP Transactions". IEEE/ACM Trans. Networking, 9(3):238-248, June 2001.

[6] L. Bertolotti and M. Calzarossa, "Workload Characterization of Mail Servers". In Proc. of SPECTS'2000, July 16-20, 2000, Vancouver, Canada.

[7] M. Calzarossa and G. Serazzi, "Workload Characterization: A Survey". Proc. of the IEEE, 81(8):1136-1150, 1993.

[8] V. Cardellini, M. Colajanni and P.S. Yu, "Dynamic Load Balancing on Web-Server Systems". IEEE Internet Computing, 3(3):28-39, May-June 1999.

[9] M.E. Crovella and A. Bestavros, "Self-Similarity in World Wide Web Traffic: Evidence and Possible Causes". In Proc. of the 1996 ACM SIGMETRICS Conference, pp. 151-160, July 1996. ACM Press.

[10] D. Ferrari, G. Serazzi and A. Zeigner, "Measurement and Tuning of Computer Systems". Prentice-Hall, 1983.

[11] M.H. MacDougall, "Simulating Computer Systems: Techniques and Tools". Cambridge, MA: MIT Press, 1987.

[12] D.A. Menascé and V.A.F. Almeida, "Scaling for E-Business: Technologies, Models, Performance, and Capacity Planning". Prentice Hall, Upper Saddle River, NJ, 2000.

[13] D.A. Menascé, V.A.F. Almeida, R.C. Fonseca and M.A. Mendes, "A Methodology for Workload Characterization for E-commerce Servers". In Proc. of the 1999 ACM Conference on Electronic Commerce, Denver, CO, Nov. 1999.

[14] Mercury Interactive LoadRunner, http://www.mercuryinteractive.com

[15] OpenSTA (Open System Testing Architecture), http://www.opensta.org

[16] J.E. Pitkow, "Summary of WWW Characterizations". The Web Journal, 2000.

[17] Standard Performance Evaluation Corporation, http://www.specbench.org

[18] O.R. Zaïane, M. Xin and J. Han, "Discovering Web Access Patterns and Trends by Applying OLAP and Data Mining Technology on Web Logs". In Proc. of Advances in Digital Libraries Conf. (ADL'98), Santa Barbara, CA, April 1998, pp. 19-29.
