TRANSCRIPT
Copyright © 2005 HyPerformix, Inc.
Moving Beyond Test and Guess
Using Modeling with Load Testing to Improve
Web Application Readiness
Keith Smith
Authors: Richard Gimarc, Amy Spellmann & Jim Reynolds
HyPerformix, Inc.
Kansas City CMG 2005
Agenda
• Load testing goals & effectiveness
• Bridging the gap from test to production
• Natural partnership: Load Testing & Modeling
• Case study – FMStocks
– Data collection & analysis
– Model construction & validation
– What-If scenario evaluation
– Evaluating predictive accuracy
• Summary
Why Load Test?
• Verify that applications meet SLAs for availability and performance
• Mitigate risk
• Provide assurance that applications will deploy successfully
• Demonstrate application readiness
Application Readiness
• The practice of fully preparing a business software application for deployment to customers, partners and internal users
• The goal of application readiness is to verify that an application provides
– Functionality that meets user needs
– Accuracy to consistently produce the correct results
– Reliability to perform according to SLAs
Load Testing Goals
• Software
– Ensure applications are ready to be deployed from a performance & functionality perspective
• Hardware
– Verify hardware requirements for production
• Tuning
– Tune/optimize application performance
Load Testing Challenges
• Time consuming
– Significant time required to perform tests limits the number of configurations that can be evaluated
– Breadth of exploration is limited by time and test environment
• Expensive
– Test environments that mirror production can be prohibitively expensive
– Difficult to match production workloads, databases, mainframes, etc.
• Predicting human behavior
– Difficult to predict exactly how the system will be used
– Which use cases will mimic reality?
Modeling Improves Load Test Effectiveness
• Time
– Tasks that take hours in Test take minutes to model
– Utilizes a scientific, repeatable methodology
– Quickly evaluates hardware and software configurations
– Guides the load test
– Increases the scope and coverage of Test
• Money
– The model is a virtual test lab
– Does not require hardware to mirror production
– Extrapolates test to production
• Future
– Rapid analysis of new & different usage scenarios
– Provides ongoing capacity planning capability
Value of Modeling with Load Testing
• “We saved $500k in hardware investments for our production environment by combining modeling with our load testing effort.”
• “We identified excess server capacity for the planned application deployment – which we were able to use for other capacity constrained applications.”
The BIG Question
If
– You have successfully completed your load testing
– Your load test environment does not mirror production
– You were time-constrained and did not evaluate all usage scenarios
Then
– Are you confident your production deployment will succeed?
What do you do?
– Test & Guess
– Apply Performance Modeling
Load Test Decision Points: Where Modeling Assists
[Diagram: across the Design → Develop → Test → Production lifecycle, modeling assists at two decision points: determining the most cost-effective usage scenarios and test configurations, and extrapolating from the test environment to production.]
Coverage of Usage Scenarios
• Load testing should focus on the most critical scenario(s)
– High frequency
– Resource intensive
• Challenges
– Which scenarios are most likely to match reality?
– How does performance vary with different use cases?
– What if end users do not actually use the application the way we tested it?
– Which configurations are likely to provide the best performance?
• Modeling efficiently evaluates different scenarios to guide the focus of the load test
Example of System Sensitivity to Usage Scenarios
[Chart: Web Server, App Server, and DB Server utilization (0-100%) under four different usage scenarios; the distribution of load across the tiers shifts with the scenario.]
Bridging the Gap: Test to Production
Bridging the Gap #1: Test Lab Topology
Bridging the Gap #1: Production Topology

[Diagram: production topology; 6 app servers across 2 data centers connected by a private WAN.]
Bridging the Gap #2: Test Lab Topology
Bridging the Gap #2: Production Topology

[Diagram: production topology; 5 app servers (V880 & E450), firewalls and a DMZ facing the Internet, and an external credit authorization service.]
Load Testing & Modeling – A Natural Partnership
• Well-established practices
– Load & Stress Testing
– Performance Modeling
• Common goal
– Assuring application performance readiness
• Different perspective & approach
– Load Tester – deals with reality
– Modeler – creates a representation of reality
Question: How do you integrate these two practices to form a synergistic partnership?
Two Views of Accuracy
Load Testers
– Their tests are "the ultimate in performance simulation"*
– If the test mirrored production, then they are 100% accurate
– If the test differs from production, there is no way to estimate its accuracy; it's guesswork

Modelers
– Models do not have to be 100% accurate to be useful
– A repeatable, proven process is always better than a guess
– Providing directional guidance requires 70-80% accuracy
– Detailed response time analysis may require 85-95% accuracy
* Neil Gunther, “How to Get Unbelievable Load Test Results”
The Cost of Modeling
Additional data requirements
– Business function flow
– Business function cost per tier
– Differences between test & production

Modeling is extra work
– Requires running additional tests
– Load testing offers a convenient & controlled environment
– Extra data collection cost can be minimized with proper planning

Modeling is less expensive than mirroring production.
Case Study - FMStocks
• Demonstrate how to integrate Modeling into the Load Testing process
• Describe the additional data collection tasks
• Show how the model was built
• Validate the model's accuracy
• Perform What-If scenarios to evaluate scalability
• Evaluate the model's predictive accuracy
FMStocks
Introducing FMStocks 2000: scales to support tens of thousands of concurrent users! Built on Microsoft Windows 2000, Microsoft SQL Server 7.0, and .Net Services.
FMStocks Terminology
• Business Functions – Transactions
– Login
– Buy
– Sell
– View Portfolio
– View Summary
– Logout
• Business Processes – Sequences of business functions
– View Account (80%): Login, View Summary, View Portfolio, Logout
– Stock Purchase (10%): Login, View Portfolio, Buy, View Portfolio, Logout
– Sell Stock (10%): Login, Sell, View Portfolio, Logout
(A sketch expressing this mix as a workload generator follows below.)
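To make the mix concrete, here is a minimal Python sketch that samples business processes according to the 80/10/10 split above; the names and structure are illustrative, not FMStocks code.

```python
import random

# FMStocks business processes: (mix weight, business function sequence).
PROCESSES = {
    "View Account":   (0.80, ["Login", "View Summary", "View Portfolio", "Logout"]),
    "Stock Purchase": (0.10, ["Login", "View Portfolio", "Buy", "View Portfolio", "Logout"]),
    "Sell Stock":     (0.10, ["Login", "Sell", "View Portfolio", "Logout"]),
}

def sample_process(rng=random):
    """Pick a business process according to the 80/10/10 mix and
    return its name and sequence of business functions."""
    r, cumulative = rng.random(), 0.0
    for name, (weight, functions) in PROCESSES.items():
        cumulative += weight
        if r < cumulative:
            return name, functions
    return name, functions  # guard against floating-point round-off

if __name__ == "__main__":
    name, funcs = sample_process()
    print(name, "->", " / ".join(funcs))
```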
Model Data Requirements - WIFR
• Workload
– Describes the load placed on the modeled system. Examples include business function arrival rate, number of users, and response time.
• Infrastructure
– Describes the hardware, network, and software subsystems. This execution environment supports the application and workload.
• Flow
– Documents the step-by-step progress of business functions through the infrastructure.
• Resource
– Resource requirements describe the processing performed on each tier for each business function. Examples of resources are CPU seconds, I/Os, and memory.
(One way to keep these four categories together during collection is sketched below.)
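For bookkeeping during data collection, the four WIFR categories can be held in a simple container; this is a hypothetical structure for illustration, not the schema of any HyPerformix tool.

```python
from dataclasses import dataclass, field

@dataclass
class WIFRInputs:
    """Illustrative container for the four WIFR input categories."""
    workload: dict = field(default_factory=dict)        # e.g., {"Buy": {"arrival_rate_per_sec": 2.0}}
    infrastructure: dict = field(default_factory=dict)  # e.g., {"AppServer": {"cpus": 4}}
    flow: dict = field(default_factory=dict)            # e.g., {"Buy": ["Client", "Web", "App", "DB"]}
    resource: dict = field(default_factory=dict)        # e.g., {"Buy": {"AppServer": {"cpu_sec": 0.012}}}
```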
Data Collection
• Tools
– LoadRunner
– Ethereal (LAN analyzer)
– Windows System Monitor (a.k.a. PerfMon)
• Three types of tests
– Single business function trace: T1 tests
• Capture business function flow
– Single business function load test: T2 tests
• Determine business function resource usage
– Traditional load tests: T3 tests
• Validation targets
WIFR Population with the T Tests
• Pre-Test Data Collection
– Infrastructure: interview and examination of the load test environment
• Single Business Function Trace (T1)
– Flow: tier-to-tier business function flow
– Resource: network bytes transmitted between tiers per business function
• Single Business Function Load Test (T2)
– Workload: response time and throughput for T2 validation
– Infrastructure: hardware component utilization
– Resource: business function resource usage at each tier (e.g., CPU, I/O)
• Mixed Business Function Stress Test (T3)
– Workload: response time and throughput for T3 validation
– Infrastructure: hardware component utilization
FMStocks - Austin Test Lab
[Diagram: Austin test lab as modeled in SES/strategizer; two client load drivers generating the Vuser workload, with Web Server, App Server, and DB Server subsystems running the FMStocks application processes on a 100 Mb switched Ethernet LAN.]
Data Collection: T1 Tests
[Timeline: LoadRunner submits each business function one at a time (Login, View Summary, View Portfolio, Buy, Sell, Logout), 15 seconds apart, with a 60-second pause at the end of the cycle.]

• Created LoadRunner script to step through each business function
• LoadRunner submitted the business functions as shown above
• Network traffic collected using Ethereal
• Transformed the raw packet trace into a business function flow (a parsing sketch follows below)
T1 Testing Time: 2 hours
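The trace-to-flow transformation can be sketched as follows, assuming the Ethereal capture has been exported as CSV rows of (time, source IP, destination IP); the IP-to-tier mapping and the definition of a "turn" as a direction reversal are our assumptions.

```python
import csv
from collections import Counter

# Hypothetical mapping from lab IP addresses to tiers.
TIER = {"10.0.0.1": "Client", "10.0.0.2": "WebServer",
        "10.0.0.3": "AppServer", "10.0.0.4": "DBServer"}

def count_turns(trace_csv):
    """Count request/response 'turns' per tier pair: a turn is tallied
    each time the traffic direction between a pair of tiers reverses."""
    turns = Counter()
    last_direction = {}
    with open(trace_csv) as f:
        for _time, src, dst in csv.reader(f):
            pair = frozenset((TIER[src], TIER[dst]))
            direction = (TIER[src], TIER[dst])
            if last_direction.get(pair) not in (None, direction):
                turns[tuple(sorted(pair))] += 1
            last_direction[pair] = direction
    return turns
```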
Copyright © 2005 HyPerformix, Inc. 30
T1 Result: Business Function Flow - Buy
[Sequence chart: packet flow for the Buy business function among Client, Web Server, App Server, and DB Server, from 10:25:14.190 to 10:25:14.280; 32 turns, 5 turns, and 4 turns on the Client-Web Server, Web Server-App Server, and App Server-DB Server hops, respectively.]
Data Collection: T2 Tests
Login Buy Buy Buy Logout...
• Created LoadRunner script to drive single business functions
• LoadRunner submitted the business functions as shown above
• LoadRunner collected driver and server metrics
• Use utilization and throughput to calculate resource usage
T2 Testing Time: 8 hours
Copyright © 2005 HyPerformix, Inc. 32
T2 Result: Buy Throughput & Utilization
[Charts: LoadRunner output for the Buy T2 test at 2, 4, and 6 Vusers; transactions per second (0-100) and App Server CPU utilization (0-50%) over the test interval.]

Apply the Utilization Law to determine business function resource usage:

CPU_BF (sec/BF) = (CPU Utilization × N_CPU) / BF Throughput (BF/sec)
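A minimal sketch of that calculation; the numbers are illustrative, not the paper's measurements.

```python
def cpu_seconds_per_bf(cpu_utilization, n_cpus, bf_per_sec):
    """Utilization Law: service demand per business function equals
    total CPU-seconds consumed per second divided by throughput.

    cpu_utilization -- average utilization over the interval, as a
                       fraction (0.40 for 40%)
    n_cpus          -- number of processors in the server
    bf_per_sec      -- measured business function throughput
    """
    return (cpu_utilization * n_cpus) / bf_per_sec

# Example: a 1-CPU app server 40% busy while sustaining 33 Buys/sec
# costs roughly 0.012 CPU seconds per Buy.
print(cpu_seconds_per_bf(0.40, 1, 33.0))
```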
Copyright © 2005 HyPerformix, Inc. 33
FMStocks – Data Analysis & Model Creation
[Diagram: the test lab topology (clients, Web Server, App Server, DB Server on switched Ethernet) as represented in the SES/strategizer model.]

Data Analysis: 4 days. Model Creation: 1 day.
Validation Methodology (T3 Tests)
Vuser counts of 100, 200, 300, 400, 500, 600, 650, 700, 750

• Perform T3 tests: LoadRunner drives the tests and collects the validation metrics (server utilization, network utilization, response time, throughput)
• Determine low, medium & high load levels
• Extract T3 validation metrics
• Evaluate the model using the same load levels
• Compare results: modeled versus measured (a comparison sketch follows below)
• Successful comparison? Done. Otherwise, refine the model & repeat; you may have to refine the model to account for differences between measured and modeled results.
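The compare step can be mechanized along these lines; the 20% tolerance mirrors the accuracy bands discussed earlier, and the data is illustrative.

```python
def validate(measured, modeled, tolerance=0.20):
    """Compare modeled results against measurements at matching load
    levels; return the points whose relative error exceeds tolerance."""
    failures = []
    for load in measured:
        err = abs(modeled[load] - measured[load]) / measured[load]
        if err > tolerance:
            failures.append((load, round(err, 3)))
    return failures

# Illustrative response times (sec) by Vuser count.
measured = {100: 0.08, 400: 0.12, 750: 0.35}
modeled  = {100: 0.07, 400: 0.11, 750: 0.20}
print(validate(measured, modeled))  # -> [(750, 0.429)]: refine the model
```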
Copyright © 2005 HyPerformix, Inc. 35
FMStocks Validation Results
• Hardware utilization comparison was on target
• Modeled throughput matched measurements
• Modeled response times were not tracking with measurements
• Model predictions were always lower than the measured values
• Significant deviation at 600 Vusers
• Indicates software contention (no hardware contention)
• Further data analysis identified contention in the Web Server
Initial Validation Results – Throughput
[Charts: measured vs. modeled transaction throughput (0-50 transactions per second) at 100-800 Vusers for Login, Buy, Sell, ViewPortfolio, and ViewSummary.]
Modeled transaction throughput matches measurements.
Copyright © 2005 HyPerformix, Inc. 37
Initial Validation – CPU & Network Utilization
[Charts: measured vs. modeled CPU utilization for the App, DB, and Web Servers, plus network utilization, at 100-750 Vusers.]

Utilizations match measurements.
Initial Validation Results – Response Time
[Charts: modeled vs. measured response times (0-0.5 seconds) at 100-800 Vusers for Login, Buy, Sell, ViewPortfolio, and ViewSummary.]

BUT: response time increases rapidly in the measured system, deviating from model results at 500 Vusers.
Initial Validation Results Analysis: Software Contention in the Web Server
Web server request rate grows as expected with load.
Web server “contentions” curve matches increasing response time.
[Charts: LoadRunner statistics for 200-800 Vusers; web requests per second (ISAPI Ext, ASP.Net, POST, and GET requests) grow steadily with load, while the total number of contentions grows exponentially (trendline y = 2.0988e^(0.0111x)).]
Validation Model Refinement
• Load test identified software queuing in the Web server at increasing load levels
• Determine refinement options
√ Load-dependent delay function for each business function
– Queuing algorithm or thread limitation (requires the ability to decipher code & configuration parameters)
• Add contention function (ADN Behavior) to compute software delay in Optimizer
• Repeat validation runs
[Chart: software queuing delay function for the Buy business function; measured response time delta (0-1.0 sec) vs. load (1.0-5.0 business functions per second), fit with the power curve y = 0.00088x^4.21849.]
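A delay function of this form can be recovered from the measured response time deltas with a log-log regression; the sketch below synthesizes points from the published curve and fits them back, as a check of the method.

```python
import numpy as np

def fit_power_law(load, delay):
    """Fit delay = a * load**b by linear regression in log-log space."""
    b, log_a = np.polyfit(np.log(load), np.log(delay), 1)
    return np.exp(log_a), b

# Synthesize points from the published curve y = 0.00088 * x**4.21849
# and recover its coefficients.
load = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
delay = 0.00088 * load ** 4.21849
a, b = fit_power_law(load, delay)
print(a, b)  # -> approximately 0.00088, 4.21849
```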
New Validation Results
• Updated the model to reflect the observed contention
• Repeated the validation runs with the model
• Evaluated modeled vs. measured results
– Updated model tracks accurately with measurements (10-20%)
– Business function response time
– CPU utilization
– Network utilization
Modeled results within 10-20% of measurements!
Validation Results: Response Time [100 to 750 Vusers]

[Charts: measured vs. modeled response times for the Buy, Sell, ViewPortfolio, and ViewSummary transactions at 100-750 Vusers.]
Validation Results: Utilization [100 to 750 Vusers]
[Charts: measured vs. modeled CPU utilization for the Web, Application, and Database Servers, plus Web Server-to-Driver network link utilization, at 100-750 Vusers.]
Validation Results Summary
• Network utilization tracked well in both the initial and final validation
• CPU utilization tracked well in both cases
• Accuracy of response time predictions is closely linked to the representation of software constraints
Confident the model can be used for What-if scenarios
Copyright © 2005 HyPerformix, Inc. 45
What-If Modeling Scenarios
• Goal
– Ensure application readiness to support 1,000 users
• Questions
– What load test configuration is required to support 1,000 users?
– What are the sensitive application components?
– How accurate is the model's extrapolation to production?
Modeling Scenario - Baseline Scalability
• Determine baseline capacity
– How many users can our baseline configuration support?
– Increase the number of users until the system "breaks"
• System component becomes saturated
• Response time degradation
– Identify the limiting component (a knee-detection sketch follows below)
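Locating the "break" in a series of model runs can be mechanized with a simple slope test on the response time curve; the threshold and data are illustrative.

```python
def find_knee(vusers, response_times, growth_factor=3.0):
    """Return the first load level where response time grows more than
    growth_factor times faster (per added Vuser) than on the previous
    step: a crude stand-in for eyeballing the knee of the curve."""
    slopes = [
        (response_times[i + 1] - response_times[i]) / (vusers[i + 1] - vusers[i])
        for i in range(len(vusers) - 1)
    ]
    for i in range(1, len(slopes)):
        if slopes[i - 1] > 0 and slopes[i] > growth_factor * slopes[i - 1]:
            return vusers[i + 1]
    return None

# Illustrative numbers shaped like the baseline scenario (knee near 900):
v = [100, 300, 500, 700, 900, 1100]
rt = [0.10, 0.12, 0.15, 0.20, 0.60, 2.50]
print(find_knee(v, rt))  # -> 900
```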
Copyright © 2005 HyPerformix, Inc. 47
Modeling Scenario – Baseline Scalability: Business Function Response Time

[Chart: simulated response times (0-4.0 seconds) at 100-1,200 Vusers for Buy, Login, Sell, ViewPortfolio, and ViewSummary; knee at 900 Vusers.]
Baseline Scalability Extrapolation: Response Time Comparison

[Charts: measured vs. modeled response times (0-2.0 seconds) for the Buy, Sell, ViewPortfolio, and ViewSummary transactions at 750, 800, 900, and 1,000 Vusers.]

Response times track well up to 1,000 Vusers.
Modeling Scenario – Baseline Scalability: Business Function Throughput

[Chart: simulated throughput (0-60 transactions per second) at 100-1,200 Vusers; limit at 900 Vusers.]
Modeling Scenario – Baseline Scalability: Web Server Transaction Processing

[Chart: Web Server transactions in progress (0-80) and queue length (0-240) at 100-1,100 Vusers; transactions in progress plateau while the queue grows, revealing the software thread constraint. A sketch of this effect follows below.]
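The plateau-plus-queue pattern is characteristic of a hard thread cap. Here is a back-of-the-envelope sketch; the FMStocks thread limit and per-request service time are not published, so these parameters are assumptions.

```python
def web_server_state(arrival_rate, service_time, max_threads):
    """Crude steady-state picture of a thread-limited server: work in
    progress is capped at max_threads, and demand beyond the cap's
    capacity (max_threads / service_time) piles up in the queue."""
    offered = arrival_rate * service_time         # threads' worth of demand
    in_progress = min(offered, max_threads)
    capacity = max_threads / service_time         # max sustainable rate
    overflow_rate = max(0.0, arrival_rate - capacity)  # queue growth, req/sec
    return in_progress, overflow_rate

# Assumed parameters: 0.2 sec per request, 50 worker threads.
for rate in (100, 200, 300, 400):
    busy, excess = web_server_state(rate, service_time=0.2, max_threads=50)
    print(rate, round(busy, 1), round(excess, 1))
# In-progress work plateaus at 50 while the queue grows past 250 req/sec.
```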
Copyright © 2005 HyPerformix, Inc. 51
Modeling Scenario – Baseline Scalability: Server CPU Utilization

[Chart: DB, App, and Web Server CPU utilization (0-100%) at 100-1,200 Vusers; limit at 900 Vusers.]
Modeling Scenario – Baseline Scalability: Network Utilization

[Chart: network link utilization (0-100%) at 100-1,200 Vusers; limit at 900 Vusers.]
Modeling Scenario: Hardware Configuration Changes
• The Web Server was the bottleneck in the baseline
– Software constraint
– Highest CPU utilization
• Relieve the Web Server software constraint
– Replace the single 4-way 1.5 GHz Web Server with two 2-way 2.4 GHz servers
– This spreads the load across two Web Servers, decreasing the load on each
• Rerun the scalability scenario
– Increase the number of Vusers to 1,000+
– Can we defer response time degradation?
– Can we safely support 1,000 users?
Baseline vs. Web Server Upgrade: Response Time Comparison

[Chart: response time (0-3.5 seconds) at 0-1,400 Vusers for Scenario 1 (baseline) and Scenario 2 (upgrade); the knee moves from 900 Vusers to 1,100 Vusers with the upgrade.]
Modeling Scenario – Web Server Upgrade: Business Function Throughput

[Chart: throughput (0-70 transactions per second) at 100-1,300 Vusers for Login, Buy, Sell, ViewPortfolio, and ViewSummary; limit at 1,100 Vusers.]
Modeling Scenario Summary
• Baseline Scalability
– Simulated business function response times remain under 4 seconds for up to 1,200 Vusers
– Response times start to degrade at 900 Vusers
– Primary bottleneck is the Web Server software constraint
– The network is the secondary bottleneck
• Hardware Upgrade to Two Web Servers
– Improves throughput
– More fully utilizes the hardware infrastructure
– Response times start to degrade at 1,100 Vusers (+200 Vusers)
Use upgraded configuration for Load Test
Extrapolating to Production
• Is the model accurate enough to extrapolate to the production load of 1,000 users with the upgrade?
• The model was demonstrated to be accurate up to 750 users (baseline validation)
• How accurate is the model with 750+ users?
– Run load tests with the same user loads & hardware changes that we just modeled in the scenarios
• Baseline Scalability
• Web Server Upgrade (increase to 2 Web Servers)
– Compare model predictions to load test measurements
Baseline Scalability Extrapolation: Utilization Comparison

[Charts: measured vs. modeled CPU utilization for the Web Server, Application Server, and Database Server, plus Web Server-to-Driver network utilization, at 750, 800, 900, and 1,000 Vusers.]

Utilizations are good up to 1,000 Vusers.
Web Server Upgrade Extrapolation: Response Times Comparison

[Charts: measured vs. modeled response times (0-0.30 seconds) for the Buy, Sell, View Portfolio, and ViewSummary transactions at 600, 800, and 1,000 Vusers.]

Response times are good up to 800 Vusers.
Web Server Upgrade Extrapolation: Utilization Comparison

[Charts: measured vs. modeled CPU utilization for the Web, Application, and Database Servers, plus network utilization, at 600, 800, and 1,000 Vusers.]

Utilizations are good up to 1,000 Vusers.

Note: the DB server in the load test changed to a 1-CPU server!
Effect of Hyper-Threading on CPU Measurements
With hyper-threading enabled, the operating system measures what it sees – the logical processors. The test results show that processor time measurements for a fixed amount of work depend on the level of activity for each logical processor. The level of activity determines whether the physical processor operates in single-task state or multi-task state. When in multi-task state, wait time due to the contention of logical processors appears to be included in the measurement of logical processor time.
Johnson, S. “Measuring CPU Time from Hyper-Threading Enabled Intel Processors”, Proceedings of the Computer Measurements Group 2003 International Conference.
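The practical consequence for modelers: applying the Utilization Law to logical-processor measurements yields an apparent CPU cost per business function that rises with load, even though the real work per function is constant. A sketch with illustrative numbers follows.

```python
def apparent_cpu_per_bf(logical_utilization, n_logical, throughput):
    """Apply the Utilization Law to logical-processor measurements.
    With hyper-threading, wait time from contention between logical
    processors is folded into '% Processor Time', so this apparent
    cost per business function inflates as load increases."""
    return (logical_utilization * n_logical) / throughput

# Illustrative: the same workload measured at light and heavy load.
print(apparent_cpu_per_bf(0.20, 4, 60.0))   # light load  -> ~0.013 sec/BF
print(apparent_cpu_per_bf(0.60, 4, 130.0))  # heavy load  -> ~0.018 sec/BF
```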
Copyright © 2005 HyPerformix, Inc. 62
Throughput vs. Response Time
[Chart: response time (0-0.400 seconds) vs. throughput (0-160 business functions/second), hyperthreaded vs. non-hyperthreaded.]
Consistent response time across throughput range
Copyright © 2005 HyPerformix, Inc. 63
CPU Utilization vs. CPU per Business Function
[Chart: CPU seconds per business function (0-0.030) vs. Application Server CPU utilization (0-100%), non-hyperthreaded vs. hyperthreaded.]
HT CPU/BF gradually increases with utilization
Copyright © 2005 HyPerformix, Inc. 64
Throughput vs. CPU per Business Function
[Chart: CPU seconds per business function (0-0.030) vs. throughput (0-160 business functions per second), non-hyperthreaded vs. hyperthreaded.]
HT CPU/BF gradually increases with throughput
Copyright © 2005 HyPerformix, Inc. 65
Simple Linear Analytical Model
[Chart: CPU utilization (0-100%) vs. throughput (0-140 business functions/second); measured data plotted against linear models assuming constant CPU/BF of 0.014, 0.019, and 0.025.]

Constant CPU/BF is not a good predictor for HT server performance. (A sketch contrasting constant and load-dependent CPU/BF models follows below.)
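One way to capture this in a simple analytical model is to let the apparent service time inflate with utilization. The functional form S(U) = S0 * (1 + alpha * U) and the parameters below are assumptions for illustration, not IPS Optimizer's algorithm.

```python
def util_constant(throughput, cpu_per_bf, n_cpus):
    """Linear analytical model: U = X * S / N, capped at 100%."""
    return min(1.0, throughput * cpu_per_bf / n_cpus)

def util_load_dependent(throughput, base_cpu_per_bf, n_cpus, alpha=0.5):
    """Assumed form S(U) = S0 * (1 + alpha * U): apparent service time
    inflates with utilization, as observed on hyper-threaded servers.
    Solve U = X * S(U) / N by fixed-point iteration."""
    u = 0.0
    for _ in range(100):
        u = min(1.0, throughput * base_cpu_per_bf * (1 + alpha * u) / n_cpus)
    return u

# Assumed 2-CPU server with a light-load cost of 0.014 CPU sec/BF.
for x in (40, 80, 120):
    print(x, round(util_constant(x, 0.014, 2), 2),
          round(util_load_dependent(x, 0.014, 2), 2))
```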
Copyright © 2005 HyPerformix, Inc. 66
Non-Hyper-Threaded Application Server: IPS Optimizer

[Chart: measured vs. predicted Application Server CPU utilization (0-100%) across 0-160 business functions/second.]

Modeled utilization tracks nicely with measurement.
Hyper-Threaded Application Server: IPS Optimizer, without Hyper-Threading Simulation

[Chart: measured vs. predicted Application Server CPU utilization (0-100%) across 0-160 business functions/second.]

The model does not represent the increase in CPU/BF as load increases.
Hyper-Threaded Application Server: IPS Optimizer, with Hyper-Threading Simulation

[Chart: measured vs. predicted Application Server CPU utilization (0-100%) across 0-160 business functions/second.]

HT model results track measurements.
Summary - Extrapolating to Production
• Can the model be used to extrapolate?
– Yes!
– The model's predicted utilizations are good
– Response time predictions tracked with the load test
• Are the model predictions good enough to
– Guide the load testing effort? Yes!
– Make a business decision about production deployment? Yes! Good enough!
Summary – Moving Beyond Test and Guess
• Load Testing is essential to ensuring application performance-readiness
• Load Testing has it’s limits: time and money
• Performance modeling improves the effectiveness of Load Testing
• Modeling has minimal impact to the Load Test effort
• Modeling & Load Testing are natural partners