
httperf—A Tool for Measuring Web Server Performance

David Mosberger, HP Research Labs

[email protected], http://www.hpl.hp.com/personal/David Mosberger/

Tai Jin, HP Research Labs

[email protected], http://www.hpl.hp.com/personal/TaiJin/

Abstract

This paper describes httperf, a tool for measuring web server performance. It provides a flexible facility for generating various HTTP workloads and for measuring server performance. The focus of httperf is not on implementing one particular benchmark but on providing a robust, high-performance tool that facilitates the construction of both micro- and macro-level benchmarks. The three distinguishing characteristics of httperf are its robustness, which includes the ability to generate and sustain server overload, its support for the HTTP/1.1 protocol, and its extensibility to new workload generators and performance measurements. In addition to reporting on the design and implementation of httperf, this paper also discusses some of the experiences and insights gained while realizing this tool.

1 Introduction

A web system consists of a web server, a number of clients, and a network that connects the clients to the server. The protocol used to communicate between the client and server is HTTP [2]. In order to measure server performance in such a system it is necessary to run a tool on the clients that generates a specific HTTP workload. Currently, web server measurements are conducted using benchmarks such as SPECweb or WebStone [6, 8] which simulate a fixed number of clients. Given that the potential user base of an Internet-based server is on the order of hundreds of millions of users, it is clear that simulating a fixed and relatively small number of clients is often insufficient. For this reason, Banga and Druschel [1] recently argued the case for measuring web servers with tools that can generate and sustain overload, which is effectively equivalent to simulating an infinite user base. They also presented a tool called “s-clients” that is capable of generating such loads. The s-clients approach is similar to httperf in that both are capable of sustaining overload, but they differ significantly in design and implementation. For example, httperf completely separates the issue of how to perform HTTP calls from issues such as what kind of workload and measurements should be used. Consequently, httperf can readily be used to perform various kinds of web-server related measurements, including SPECweb/WebStone-like measurements, s-client-like measurements, or new kinds of measurements such as the session-based measurements that we will discuss briefly in Section 4.

Creating httperf turned out to be a surprisingly difficult task due to factors inherent in the problem and shortcomings of current operating systems (OSes). The first challenge is that a web system is a distributed system and as such inherently more difficult to evaluate than a centralized system that has little or no concurrency and a synchronized clock.

Second, HTTP in general and HTTP/1.0 in particular cause connection usage patterns that TCP was not designed for. Some of these problems have been fixed in response to the experience gained from running web servers. However, since tools such as httperf run on the client side, they exercise the system in a manner that is quite different from the way servers do. As a consequence, there are a number of additional issues that such a test tool needs to guard against.

A third challenge is that the world-wide web is a highly dynamic system. Almost every part in it—server and client software, network infrastructure, web content, and user access patterns—is subject to relatively frequent and fundamental changes. For a test tool to remain useful over a period of time requires a design that makes it relatively easy to extend and modify as the need arises.

The rest of this paper is organized as follows: the next section gives a brief introduction on how to use httperf. Section 3 describes the overall design of the tool and presents the rationale for the most important design choices. Section 4 discusses the current state of httperf and some of the more subtle implementation issues discovered so far. Finally, Section 5 presents some concluding remarks.

2 An Example of Using httperf

To convey a concrete feeling of how httperf is used, this section presents a brief example of how to measure the request throughput of a web server. The simplest way to achieve this is to send requests to the server at a fixed rate and to measure the rate at which replies arrive. Running the test several times and with monotonically increasing request rates, one would expect to see the reply rate level off when the server becomes saturated, i.e., when it is operating at its full capacity.

To execute such a test, it is necessary to invoke httperf on the client machines. Ideally, the tool should be invoked simultaneously on all clients, but as long as the test runs for several minutes, startup differences in the range of seconds do not cause significant errors in the end result. A sample command line is shown below:

httperf --server hostname \
        --port 80 --uri /test.html \
        --rate 150 --num-conn 27000 \
        --num-call 1 --timeout 5

This command causes httperf to use the web server on the host with IP name hostname, running at port 80. The web page being retrieved is “/test.html” and, in this simple test, the same page is retrieved repeatedly. The rate at which requests are issued is 150 per second. The test involves initiating a total of 27,000 TCP connections and on each connection one HTTP call is performed (a call consists of sending a request and receiving a reply). The timeout option selects the number of seconds that the client is willing to wait to hear back from the server. If this timeout expires, the tool considers the corresponding call to have failed. Note that with a total of 27,000 connections and a rate of 150 per second, the total test duration will be approximately 180 seconds, independent of what load the server can actually sustain.

Once a test finishes, several statistics are printed. An example output of httperf is shown in Figure 1. The figure shows that there are six groups of statistics, separated by blank lines. The groups consist of overall results, results pertaining to the TCP connections, results for the requests that were sent, results for the replies that were received, CPU and network utilization figures, as well as a summary of the errors that occurred (timeout errors are common when the server is overloaded).

A typical performance graph that can be obtained with the statistics reported by httperf is shown in Figure 2. For this particular example, the server consisted of Apache 1.3b2 running on an HP NetServer with one 200 MHz P6 processor. The server OS was Linux v2.1.86. The network consisted of a 100baseT Ethernet and there were four client machines running HP-UX 10.20. As the top-most graph shows, the achieved throughput increases linearly with offered load until the server starts to become saturated.

Total: connections 27000 requests 26701 replies 26701 test-duration 179.996 s

Connection rate: 150.0 conn/s (6.7 ms/conn, <=47 concurrent connections)
Connection time [ms]: min 1.1 avg 5.0 max 315.0 median 2.5 stddev 13.0
Connection time [ms]: connect 0.3

Request rate: 148.3 req/s (6.7 ms/req)
Request size [B]: 72.0

Reply rate [replies/s]: min 139.8 avg 148.3 max 150.3 stddev 2.7 (36 samples)
Reply time [ms]: response 4.6 transfer 0.0
Reply size [B]: header 222.0 content 1024.0 footer 0.0 (total 1246.0)
Reply status: 1xx=0 2xx=26701 3xx=0 4xx=0 5xx=0

CPU time [s]: user 55.31 system 124.41 (user 30.7% system 69.1% total 99.8%)
Net I/O: 190.9 KB/s (1.6*10^6 bps)

Errors: total 299 client-timo 299 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0

Figure 1: Example of Basic Performance Statistics

[Figure 2 plots reply rate [1/s], errors [%], and response time [ms] (1 ms to 4096 ms, log scale) against offered load (0 to 2000) for the configuration: Linux 100bt 1K queue=32 procs=32 clients=4 duration=180s timeout=1s CPUs=1.]

Figure 2: Example Server Performance Graph

The server saturates at a load of 800 calls per second. As offered load is increased beyond that point, server throughput starts to fall off slightly as an increasing amount of time is spent in the kernel to handle network packets for calls that will eventually fail (due to client timeouts). This is also reflected in the error graph, which shows the percentage of calls that failed: once the server is saturated, the number of calls that fail increases quickly as more and more calls experience excessive delays. The third and final graph in the figure shows the average response time for successful calls. The graph shows that response time starts out at about 2 ms and then gradually increases until the server becomes saturated. Beyond that point, response time for successful calls remains largely constant at 43 ms per call.

3 Design

The two main design goals of httperf were (a) predictable and good performance and (b) ease of extensibility. Good performance is achieved by implementing the tool in C and paying attention to the performance-critical execution paths. Predictability is improved by relying as little as possible on the underlying OS. For example, httperf is designed to run as a single-threaded process using non-blocking I/O to communicate with the server and with one process per client machine. With this approach, CPU scheduling is trivial for the OS, which minimizes the risk of excessive context switching and poor scheduling decisions. Another example is timeout management: rather than depending on OS mechanisms, httperf implements its own specialized and light-weight timer management facility that avoids expensive system calls and POSIX signal delivery wherever possible.

Based on experiences with an earlier test tool, it was clear that httperf would undergo fairly extensive changes during its lifetime. To accommodate this need for change, httperf is logically divided into three different parts: the core HTTP engine, workload generation, and statistics collection. The HTTP engine handles all communication with the server and as such takes care of connection management and HTTP request generation and reply handling. Workload generation is responsible for initiating appropriate HTTP calls at the appropriate times so a particular workload is induced on the server. The third part, statistics collection, is responsible for measuring various quantities and producing relevant performance statistics. Interactions between these three parts occur through a simple yet general event-signalling mechanism. The idea is that whenever something interesting occurs inside httperf, an event is signalled. Parties interested in observing a particular event can register a handler for the event. These handlers are invoked whenever the event is signalled. For example, the basic statistics collector measures the time it takes to establish a TCP connection by registering event handlers for the events that signal the initiation and establishment of a connection, respectively. Similarly, a workload generator responsible for generating a particular URL access pattern can register a handler for the event indicating the creation of a new call. Whenever this handler gets invoked, the URL generator can insert the appropriate URL into the call without having to concern itself with the other aspects of call creation and handling.
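To make the event mechanism concrete, the following is a minimal sketch of how such a registration and signalling facility could look in C. The event names, the handler signature, and the connection-time collector shown here are illustrative assumptions; they do not reproduce httperf's actual internal API.

/* Sketch of an event-signalling mechanism of the kind described above.
 * Names and types are illustrative, not httperf's actual API. */
#include <stdio.h>
#include <sys/time.h>

typedef enum { EV_CONN_INITIATED, EV_CONN_ESTABLISHED, EV_NUM_EVENTS } event_t;
typedef void (*handler_t)(event_t ev, void *obj);

#define MAX_HANDLERS 8
static handler_t handlers[EV_NUM_EVENTS][MAX_HANDLERS];
static int num_handlers[EV_NUM_EVENTS];

/* Parties interested in an event register a handler for it. */
static void event_register(event_t ev, handler_t h)
{
    if (num_handlers[ev] < MAX_HANDLERS)
        handlers[ev][num_handlers[ev]++] = h;
}

/* Whenever something interesting happens, the event is signalled and all
 * registered handlers are invoked. */
static void event_signal(event_t ev, void *obj)
{
    for (int i = 0; i < num_handlers[ev]; ++i)
        handlers[ev][i](ev, obj);
}

/* Example observer: a statistics collector measuring connection setup time. */
struct conn { struct timeval start; };

static void on_conn_initiated(event_t ev, void *obj)
{
    (void) ev;
    gettimeofday(&((struct conn *) obj)->start, NULL);
}

static void on_conn_established(event_t ev, void *obj)
{
    struct conn *c = obj;
    struct timeval now;
    (void) ev;
    gettimeofday(&now, NULL);
    printf("connection established after %.1f ms\n",
           (now.tv_sec - c->start.tv_sec) * 1e3 +
           (now.tv_usec - c->start.tv_usec) / 1e3);
}

int main(void)
{
    struct conn c;
    event_register(EV_CONN_INITIATED, on_conn_initiated);
    event_register(EV_CONN_ESTABLISHED, on_conn_established);
    event_signal(EV_CONN_INITIATED, &c);    /* signalled by the HTTP engine */
    event_signal(EV_CONN_ESTABLISHED, &c);
    return 0;
}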

3.1 Sustaining Overload

As alluded to earlier, an important design issue is how to sustain an offered load that exceeds the capacity of the web server. The problem is that once the offered rate exceeds the server's capacity, the client starts building up resources at a rate that is proportional to the difference between the offered and sustained rates. Since each client has only a finite amount of resources available, sooner or later the client would run out of resources and therefore be unable to generate any new requests. For example, suppose that each httperf process can have at most 2,000 TCP connections open at any given time. If the difference between the offered and sustained rates is 100 requests per second, a test could last at most 20 seconds. Since web server tests usually require minutes to reach a stable state, such short test durations are unacceptable. To solve this problem, httperf times out calls that have been waiting for a server response for too long. The length of this timeout can be selected through command-line options.

With this timeout approach, the amount of client resources used up by httperf is bounded by the timeout value. In the worst-case scenario where the server does not respond at all, httperf will never use more than the amount of resources consumed while running httperf for the duration of the timeout value. For example, if connections are initiated at a rate of 100 per second and the timeout is 5 seconds, at most 500 connections would be in use at any given time.

3.1.1 Limits to Client-Sustainable Load

It is interesting to consider just what exactly limits the offered load a client can sustain. Apart from the obvious limit that the client's CPU imposes, there is a surprising variety of resources that can become the first-order bottleneck. It is important to keep these limits in mind so as to avoid the pitfall of mistaking client performance limits for server performance limits. The three most important client bottlenecks we have identified so far are described below; a short sketch following the list works through the resulting bounds.

Size of TCP port space: TCP port numbers are 16 bits wide. Of the 64K available port numbers, 1,024 are typically reserved for privileged processes. This means that a client machine running httperf can make use of at most 64,512 port numbers. Since a given port number cannot be reused until the TCP TIME_WAIT state expires, this can seriously limit the client-sustainable offered rate. Specifically, with a 1 minute timeout (common for BSD-derived OSes) the maximum sustainable rate per client is about 1,075 requests per second. With the RFC-793 [5] recommended value of 4 minutes, the maximum rate would drop to just 268 requests per second.

Number of open file descriptors: Most operating systems limit both the total and per-process number of file descriptors that can be opened. The system-wide number of open files is normally not a limiting factor and hence we will focus on the latter. Typical per-process limits are in the range from 256 to 2,048. Since a file descriptor can be reused as soon as an earlier descriptor has been closed, the TCP TIME_WAIT state plays no role here. Instead, the duration that is of interest here is the httperf timeout value. Assuming a value of 5 seconds and a limit of 2,000 open file descriptors per process, a maximum rate of about 400 requests per second could be sustained. If this becomes the first-order bottleneck in a client, it is possible to avoid it either by tuning the OS to allow a larger number of open file descriptors or by decreasing the httperf timeout value. Note that decreasing the timeout value effectively truncates the lifetime distribution of TCP connections. This effect has to be taken into consideration when selecting an appropriate value. Another seemingly obvious solution would be to run multiple processes on a single machine. However, as will be explained in Section 4.1, there are other reasons that make this approach undesirable.

Socket buffer memory: Each TCP connection contains a socket receive and send buffer. By default, httperf limits send buffers to 4KB and receive buffers to 16KB. With limits in the kilobyte range, these buffers are typically the dominant per-connection costs as far as httperf memory consumption is concerned. The offered load that a client can sustain is therefore also limited by how much memory is available for socket buffers. For example, with 40MB available for socket buffers, a client could sustain at most 2,048 concurrent TCP connections (assuming a worst-case scenario where all send and receive buffers are full). This limit is rarely encountered, but for memory-constrained clients, httperf supports options to select smaller limits for the send and receive buffers.
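The short program below restates the back-of-the-envelope arithmetic behind these three limits. The constants are the example values quoted above (a 60 s or 240 s TIME_WAIT, a 2,000 descriptor limit with a 5 s timeout, and 4 KB plus 16 KB socket buffers in 40 MB of memory); actual limits depend on the OS and the httperf configuration.

/* Back-of-the-envelope calculation of the three client-side limits above. */
#include <stdio.h>

int main(void)
{
    /* 1. TCP port space: 64K ports minus the 1,024 reserved ones; a port
     *    can be recycled only after TIME_WAIT expires. */
    double usable_ports = 65536.0 - 1024.0;               /* 64,512 ports */
    printf("port limit, 60 s TIME_WAIT:  %d req/s\n", (int) (usable_ports / 60.0));
    printf("port limit, 240 s TIME_WAIT: %d req/s\n", (int) (usable_ports / 240.0));

    /* 2. Open file descriptors: a descriptor is tied up for at most the
     *    httperf timeout (5 s in the example). */
    double fd_limit = 2000.0, timeout = 5.0;
    printf("descriptor limit: %d req/s\n", (int) (fd_limit / timeout));

    /* 3. Socket buffer memory: 4 KB send + 16 KB receive buffer per
     *    connection, with 40 MB available in the worst case. */
    double mem_kb = 40.0 * 1024.0, per_conn_kb = 4.0 + 16.0;
    printf("memory limit: %d concurrent connections\n", (int) (mem_kb / per_conn_kb));
    return 0;
}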

The above list of potential client performance bottlenecks is of course by no means exhaustive. For example, older OSes often exhibit poor performance when faced with several hundred concurrent TCP connections. Since it is often difficult to predict the exact rate at which a client will start to become the performance bottleneck, it is essential to empirically verify that observed performance is indeed a reflection of the server's capacity and not that of the client's. A safe way to achieve this is to vary the number of test clients, making sure that the observed performance is independent of the number of client machines that participate in the test.

3.2 Measuring Throughput

Conceptually, measuring throughput is simple: issue a certain number of requests, count the number of replies received, and divide that number by the time it took to complete the test. Unfortunately, this approach has two problems: first, to get a quantitative idea of the robustness of a particular measurement, it is necessary to run the same test several times. Since each test run is likely to take several minutes, a fair amount of time has to be spent to obtain just a single data point. Equally important, computing only one throughput estimate for the entire test hides variations that may occur at time scales shorter than that of the entire test. For these reasons, httperf samples the reply throughput once every five seconds. The throughput samples can optionally be printed in addition to the usual statistics. This allows observing throughput during all phases of a test. Also, with a sample period of 5 seconds, running a test for at least 3 minutes results in enough throughput samples that confidence intervals can be computed without having to make assumptions on the distribution of the samples [3].
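As an illustration of what can be done with such samples, the sketch below computes a mean reply rate and a normal-approximation confidence interval from a set of 5-second samples. The sample values, the helper function, and the choice of the normal approximation are assumptions for illustration; this is not httperf's statistics code.

/* Sketch: mean and confidence interval from 5-second reply-rate samples,
 * using the normal approximation (reasonable with 36 or more samples). */
#include <math.h>
#include <stdio.h>

static void reply_rate_stats(const double *samples, int n, double z)
{
    double sum = 0.0, sumsq = 0.0;
    for (int i = 0; i < n; ++i) {
        sum   += samples[i];
        sumsq += samples[i] * samples[i];
    }
    double mean   = sum / n;
    double var    = (sumsq - n * mean * mean) / (n - 1);   /* sample variance */
    double stddev = sqrt(var);
    double half   = z * stddev / sqrt((double) n);          /* interval half-width */
    printf("reply rate: %.1f +/- %.1f replies/s (stddev %.1f, %d samples)\n",
           mean, half, stddev, n);
}

int main(void)
{
    /* Hypothetical 5-second throughput samples. */
    double samples[] = { 148.2, 150.1, 147.9, 149.5, 146.8, 150.3 };
    reply_rate_stats(samples, 6, 1.96);   /* z = 1.96 for a 95% interval */
    return 0;
}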

4 Implementation

In this section, we first present the capabilities of the current version of httperf and then we discuss some of the more subtle implementation issues discovered so far. In the third part, we mention some possible future directions for httperf.

The HTTP core engine in httperf currently supports both HTTP/1.0 and HTTP/1.1. Among the more interesting features of this engine are support for persistent connections, request pipelining, and the “chunked” transfer-encoding [2, 4]. Higher-level HTTP processing is enabled by the fact that the engine exposes each reply header-line and all of the reply body to the other parts of httperf by signalling appropriate events. For example, when one of the workload generators required simple cookie support, the necessary changes were implemented and tested in a matter of hours.

The current version of httperf supports two kinds of workload generators: request generators and URL generators.

Request Generation: Request generators initiate HTTP calls at the appropriate times. At present, there are two such generators: the first one generates new connections deterministically and at a fixed rate, and each connection is used to perform a command-line specified number of pipelined HTTP calls. By default, the number of pipelined calls per connection is one, which yields HTTP/1.0-like behavior in the sense that each connection is used for a single call and is closed afterwards.

The second request generator creates sessions deterministically and at a fixed rate. Each session consists of a specified number of call-bursts that are spaced out by the command-line specified user think-time. Each call-burst consists of a fixed number of calls. Call-bursts mimic the typical browser behavior where a user clicks on a link, which causes the browser to first request the selected HTML page and then the objects embedded in it.

URL Generation: URL generators create the desired sequence of URLs that should be accessed on the server. The most primitive generator simply generates the same, command-line specified URL over and over again.

The second generator walks through a fixed set of URLs at a given rate. With this generator, the web pages are assumed to be organized as a 10-ary directory tree (each directory contains up to ten files or sub-directories) on the server. This generator is useful, for example, to induce a specific file buffer cache miss rate on the server under test. A small sketch of such a directory-tree URL mapping follows.
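The sketch below shows one way a file index could be mapped onto a path in a 10-ary directory tree, with each decimal digit selecting a sub-directory and the last digit selecting a file. The path layout and the dirN/fileN.html names are purely illustrative assumptions; they are not the layout used by httperf's generator.

/* Hypothetical mapping of a file index onto a 10-ary directory tree. */
#include <stdio.h>

/* Write the URL for file number 'idx' (0 <= idx < 10^depth) into 'buf'. */
static void make_url(char *buf, size_t len, unsigned idx, int depth)
{
    size_t pos = 0;
    unsigned div = 1;
    for (int i = 1; i < depth; ++i)
        div *= 10;
    for (int level = 0; level < depth - 1; ++level) {
        pos += snprintf(buf + pos, len - pos, "/dir%u", (idx / div) % 10);
        div /= 10;
    }
    snprintf(buf + pos, len - pos, "/file%u.html", idx % 10);
}

int main(void)
{
    char url[128];
    /* Walk through the first few URLs of a three-level tree. */
    for (unsigned i = 0; i < 5; ++i) {
        make_url(url, sizeof url, i, 3);
        printf("%s\n", url);
    }
    make_url(url, sizeof url, 123, 3);
    printf("%s\n", url);   /* prints /dir1/dir2/file3.html */
    return 0;
}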

As far as statistics collectors are concerned, httperf always collects and prints the basic information shown in Figure 1. The only other statistics collector at this time is one that collects session-related information. It measures similar quantities as the basic connection statistics, with the main difference being that the unit of measurement is the session instead of the connection.

We now proceed to discuss some of the implementation issues that conspire to make writing a robust, high-performance test tool difficult.

4.1 Scheduling Granularity

The process scheduling granularity of today's OSes is in the millisecond range. Some support one millisecond, but most use a timer tick of around 10 milliseconds. This often severely limits the accuracy with which a given workload can be generated. For example, with a timer tick of 10 milliseconds, deterministically generating a rate of 150 requests per second would have to be implemented by sending one request during even-numbered timer ticks and two requests during odd-numbered ticks. While the average rate is achieved, the bursts sent during the odd-numbered ticks could cause server-queue overflows that in turn could severely affect the observed behavior. This is not to say that measuring web servers with bursty traffic is a bad idea (quite the opposite is true); the problem here is that the burstiness was introduced by the OS, not because the tester requested it.

To avoid depending on OS scheduling granularity, httperf executes in a tight loop that checks for network I/O activity via select() and keeps track of real time via gettimeofday(). This means that httperf consumes all available CPU cycles (on a multiprocessor client, only one CPU will be kept busy in this way). This approach works fine because the only other important activity is the asynchronous receiving and processing of network packets. Since this activity executes as a (soft-) interrupt handler, no scheduling problem arises. However, executing in a tight loop does imply that only one httperf process can run per client machine (per client CPU, to be more precise). It also means that care should be taken to avoid unnecessary background tasks on the client machine while a test is in progress.
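The following is a minimal, self-contained sketch of this style of busy-wait loop: it polls select() with a zero timeout and uses gettimeofday() to decide when the next request is due. The 100 ms demo duration and the stub issue_request() are assumptions for illustration; httperf's real loop also manages open connections and timeouts.

/* Sketch of a busy-wait event loop: zero-timeout select() plus
 * gettimeofday()-based pacing.  Illustrative only, not httperf's loop. */
#include <stdio.h>
#include <sys/select.h>
#include <sys/time.h>

static double now(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec * 1e-6;
}

/* Stand-in for initiating a new HTTP call on a fresh connection. */
static void issue_request(int n)
{
    printf("request %d issued at t=%.3f\n", n, now());
}

int main(void)
{
    double rate = 150.0;               /* target: 150 requests per second */
    double interval = 1.0 / rate;
    double next_due = now(), end = now() + 0.1;   /* run for 100 ms */
    int issued = 0;

    while (now() < end) {
        /* Poll for network I/O without blocking; with no sockets open this
         * returns immediately.  The real tool would FD_SET() its open
         * connections here and process whatever became ready. */
        fd_set rd;
        struct timeval zero = { 0, 0 };
        FD_ZERO(&rd);
        (void) select(0, &rd, NULL, NULL, &zero);

        /* Issue requests whose deadline has passed; no OS timer is used,
         * so accuracy is limited only by gettimeofday(). */
        while (now() >= next_due) {
            issue_request(++issued);
            next_due += interval;
        }
    }
    printf("issued %d requests in 0.1 s (target %.0f)\n", issued, rate * 0.1);
    return 0;
}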

4.2 Limited Number of Ephemeral Ports

Many TCP implementations restrict the TCP ports available to sockets that are not bound to a specific local address to the so-called ephemeral ports [7]. Ephemeral ports are typically in the range from 1,024 to 5,000. This has the unfortunate effect that even moderate request rates may cause a test client to quickly run out of port numbers. For example, assuming a TIME_WAIT state duration of one minute, the maximum sustainable rate would be about 66 requests per second.

To work around this problem, httperf can optionally maintain its own bitmap of ports that it believes to be available. This solution is not ideal because the bitmap is not guaranteed to be accurate. In other words, a port may not be available, even though httperf thinks otherwise. This can cause additional system calls that could ordinarily be avoided. It is also suboptimal because it means that httperf duplicates information that the OS kernel has to maintain at any rate. While not optimal, the solution works well in practice.

A subtle issue in managing the bitmap is the order in which ports are allocated. In a first implementation, httperf reused the most recently freed port number as soon as possible (in order to minimize the number of ports consumed by httperf). This worked well as long as both the client and server machines were UNIX-based. Unfortunately, a TCP incompatibility between UNIX and NT breaks this solution. Briefly, the problem is that UNIX TCP implementations allow pre-empting the TIME_WAIT state if a new SYN segment arrives. In contrast, NT disallows such pre-emption. This has the effect that a UNIX client may consider it legitimate to reuse a given port at a time NT considers the old connection still to be in the TIME_WAIT state. Thus, when the UNIX client attempts to create a new connection with the reused port number, NT will respond with a TCP RESET segment that causes the connection attempt to fail. In the case of httperf this had the effect of dramatically reducing the apparent throughput the NT server could sustain (half the packets failed with a “connection reset by peer” error). This problem is avoided in the current version of httperf by allocating ports in strict round-robin fashion.
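To make the bookkeeping concrete, the following is a sketch of a client-side port bitmap that hands out ports in strict round-robin order, so that a just-released port is reused as late as possible. The port range, data layout, and function names are illustrative assumptions rather than httperf's actual code, and a caller would still have to cope with bind() failing when the bitmap turns out to be stale.

/* Sketch of a round-robin port allocator over a client-side bitmap. */
#include <stdio.h>

#define PORT_MIN  1024
#define PORT_MAX  65535
#define NUM_PORTS (PORT_MAX - PORT_MIN + 1)

static unsigned char in_use[NUM_PORTS];   /* 1 = believed to be in use */
static int next_port = 0;                 /* round-robin scan position */

/* Return a port believed to be free, or -1 if the whole space is in use. */
static int port_alloc(void)
{
    for (int i = 0; i < NUM_PORTS; ++i) {
        int slot = (next_port + i) % NUM_PORTS;
        if (!in_use[slot]) {
            in_use[slot] = 1;
            next_port = (slot + 1) % NUM_PORTS;   /* keep scanning forward */
            return PORT_MIN + slot;
        }
    }
    return -1;
}

static void port_free(int port)
{
    in_use[port - PORT_MIN] = 0;
}

int main(void)
{
    int a = port_alloc(), b = port_alloc();
    printf("allocated %d and %d\n", a, b);
    port_free(a);
    /* Even though 'a' is free again, round-robin allocation hands out the
     * next higher port first, giving TIME_WAIT time to expire on the peer. */
    printf("next allocation: %d\n", port_alloc());
    return 0;
}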

4.3 Slow System Calls

A final issue with implementing httperf is that, even on modern systems, some OS operations are relatively slow when dealing with several thousand TCP control blocks. The use of hash tables to look up TCP control blocks for incoming network traffic is standard nowadays. However, it turns out that at least some BSD-derived systems still perform linear control block searches for the bind() and connect() system calls. This is unfortunate because, in the case of httperf, these linear searches can easily use up eighty or more percent of its total execution time. This, once again, can severely limit the maximum load that a client can generate.

Fortunately, this is an issue only when running a test that causes httperf to close the TCP connection—as long as the server closes the connection, no problem occurs. Nevertheless, it would be better to avoid the problem altogether. Short of fixing the OS, the only workaround we have found so far is to change httperf so it closes connections by sending a RESET instead of going through the normal connection shutdown handshake. This workaround may be acceptable for certain cases, but should not be used in general. The reason is that closing a connection via a RESET may cause data corruption in future TCP connections or, more likely, can lead to needlessly tying up server resources. Also, a RESET artificially lowers the cost of closing a connection, which could lead to overestimating a server's capacity. With these reservations in mind, we observe in passing that at least one popular web browser (IE 4.01) appears to be closing connections in this manner.

4.4 Future Directions

In its current form, httperf is already useful for performing several web server measurement tasks, but its development has by no means come to a halt. Indeed, there are several features that are likely to be added. For example, we believe it would be useful to add a workload generator that attempts to mimic the real-world traffic patterns observed by web servers. To a first degree of approximation, this could be done by implementing a SPECweb-like workload generator. Another obvious and useful extension would be to modify httperf to allow log-file based URL generation. Both of these extensions can be realized easily thanks to the event-oriented structure of httperf.

Another fruitful direction would be to modify httperf to make it easier to run tests with multiple clients. At present, it is the tester's responsibility to start httperf on each client machine and to collect and summarize the per-client results. A daemon-based approach, where a single command line would control multiple clients, could be a first step in this direction.

5 Conclusions

The experience gained from designing and implementing httperf clearly suggests that realizing a robust and high-performance tool for assessing web server performance is a non-trivial undertaking. Given the increasing importance of the web and the need for quantitative performance analysis, the importance of such tools will continue to increase as well. While httperf is certainly not a panacea, it has proven useful in a number of web-related measurement tasks and is believed to be flexible and performant enough that it could provide a solid foundation for realizing macro-level benchmarks such as SPECweb. For those cases where httperf may not be the solution of choice, we hope that the experience and lessons reported in this paper will be helpful in avoiding the most common pitfalls in measuring web server performance.

Acknowledgments

We would like to thank the anonymous reviewers for their helpful comments. Our thanks also go to Rick Jones, Rich Friedrich, and Gita Gopal for reviewing and commenting on draft versions of this paper on extremely short notice.

Availability

The httperf tool is available in source code form and free of charge from the following URL:

ftp://ftp.hpl.hp.com/pub/httperf/

References

[1] Gaurav Banga and Peter Druschel. Measuring the capacity of a web server. In USENIX Symposium on Internet Technologies and Systems, pages 61–71, Monterey, CA, December 1997. http://www.usenix.org/publications/library/proceedings/usits97/banga.html.

[2] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, and T. Berners-Lee. Hypertext Transfer Protocol – HTTP/1.1. Internet Engineering Task Force, January 1997. ftp://ftp.internic.net/rfc/rfc2068.txt.

[3] Raj Jain. The Art of Computer Systems Performance Analysis. John Wiley & Sons, New York, NY, 1991.

[4] H. Nielsen, J. Gettys, A. Baird-Smith, E. Prud'hommeaux, Hakon W. Lie, and C. Lilley. Network performance effects of HTTP/1.1, CSS1, and PNG. In Proceedings of the SIGCOMM '97 Symposium, pages 155–166, Cannes, France, October 1997. Association for Computing Machinery. http://www.acm.org/sigcomm/sigcomm97/papers/p102.html.

[5] Jon Postel. Transmission Control Protocol. DARPA, September 1981. ftp://ftp.internic.net/rfc/rfc793.txt.

[6] SPEC. An explanation of the SPECweb96 benchmark, December 1996. http://www.specbench.org/osg/web96/webpaper.html.

[7] W. Richard Stevens. TCP/IP Illustrated: The Protocols, volume 1. Addison-Wesley, Reading, MA, 1994.

[8] Gene Trent and Mark Sake. WebSTONE: The first generation in HTTP server benchmarking, February 1995. http://www.mindcraft.com/webstone/paper.html.