performance evaluation of computer systems introduction

1

Performance Evaluation of Computer Systems

Introduction

By Behzad Akbari

Tarbiat Modares University, Spring 2009

In the Name of the Most High

2

Outline

- Introduction to performance evaluation
- Objectives of performance evaluation
- Techniques of performance evaluation
- Metrics in performance evaluation

3

Introduction

- Computer system users, administrators, and designers are all interested in performance evaluation.
- The goal of system performance evaluation is to provide the highest performance at the lowest cost.
- Computer performance evaluation plays an important role in the selection of computer systems, the design of systems and applications, and the analysis of existing systems.

4

Objectives of Performance Study

- Evaluating design alternatives (system design)
- Comparing two or more systems (system selection)
- Determining the optimal value of a parameter (system tuning)
- Finding the performance bottleneck (bottleneck identification)
- Characterizing the load on the system (workload characterization)
- Determining the number and sizes of components (capacity planning)
- Predicting the performance at future loads (forecasting)

5

Basic Terms

- System: any collection of hardware, software, and network components.
- Metrics: the criteria used to analyze the performance of the system or its components.
- Workloads: the requests made by the users of the system.

6

Performance Evaluation Activities

Performance evaluation of a system can be done at different stages of system development:

- System in the planning and design stage
  - Use high-level models to obtain performance estimates for alternative system configurations and alternative designs.
- System is operational
  - Measure the system behavior with a view to improving performance.
  - Develop a validated model that can be used for performance prediction and capacity planning.

7

Techniques for Performance Evaluation

- Performance measurement
  - Obtain measurement data by observing the events and activities on an existing system
- Performance modeling
  - Represent the system by a model and manipulate the model to obtain information about system performance

8

Performance Measurement

- Measure the performance directly on a system
- Need to characterize the workload placed on the system during measurement
- Generally provides the most valid results
- Nevertheless, not very flexible
  - May be difficult (or even impossible) to vary some workload parameters

9

Performance Modeling

- Model
  - An abstraction of the system obtained by making a set of assumptions about how the system works
  - Captures the essential characteristics of the system
- Reasons for using models
  - Experimenting with the real system may be too costly, too risky, or too disruptive to system operation
  - System may only be in the design stage

10

Performance Modeling

- Workload characterization
  - Capture the resource demands and intensity of the load brought to the system
- Performance metrics
  - The measure of interest, such as mean response time, the number of transactions completed per second, the ratio of blocked connection requests, etc.

11

Performance Modeling

- Solution methods
  - Analytic modeling
  - Simulation modeling

12

Analytic Modeling

- Mathematical methods are used to obtain solutions to the performance measures of interest
- Numerical results are easy to compute if a simple analytic solution is available
- Useful approach when one only needs rough estimates of performance measures
- Solutions to complex models may be difficult to obtain

13

Simulation Modeling

- Develop a simulation program that implements the model
- Run the simulation program and use the data collected to estimate the performance measures of interest
- A system can be studied at an arbitrary level of detail
- It may be costly to develop and run the simulation program

14

Stochastic Model

- The model contains some random input components, which are characterized by probability distributions, e.g., the time between arrivals to a system by an exponential distribution
- The output is also random, and provides probability distributions of the performance measures of interest
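As a minimal sketch of a random input component, the following Python snippet draws interarrival times from an exponential distribution and compares the sample mean with the theoretical mean. The arrival rate `ARRIVAL_RATE`, the seed, and the function name `sample_interarrival_times` are illustrative assumptions, not values from these slides.

```python
import random

ARRIVAL_RATE = 2.0  # assumed for illustration: 2 arrivals per time unit on average


def sample_interarrival_times(n, rate, seed=0):
    """Draw n exponentially distributed interarrival times with the given rate."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [rng.expovariate(rate) for _ in range(n)]


times = sample_interarrival_times(100_000, ARRIVAL_RATE)
sample_mean = sum(times) / len(times)

# For an exponential distribution the theoretical mean is 1 / rate.
print(f"sample mean = {sample_mean:.3f}, theoretical mean = {1 / ARRIVAL_RATE:.3f}")
```

Because the input is random, the output (here the sample mean) is random too; it only approaches the theoretical value as the number of samples grows.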

15

Queuing Model

- The most commonly used model to analyze the performance of computer systems and networks
- Single queue: models a component of the overall system, such as a CPU, disk, or communication channel
- Network of queues: models system components and their interaction
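To make the single-queue case concrete, here is a hedged sketch for one specific textbook model, the M/M/1 queue (Poisson arrivals, exponential service times, one server). The slides do not name this model; the choice of M/M/1, the function name `mm1_metrics`, and the example rates are assumptions made for illustration.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state metrics of an M/M/1 queue (stable only if arrival_rate < service_rate)."""
    if not 0 < arrival_rate < service_rate:
        raise ValueError("need 0 < arrival_rate < service_rate for a stable queue")
    utilization = arrival_rate / service_rate
    mean_response_time = 1.0 / (service_rate - arrival_rate)
    # Little's law: mean number in system = arrival rate * mean response time.
    mean_jobs_in_system = arrival_rate * mean_response_time
    return {
        "utilization": utilization,
        "mean_response_time": mean_response_time,
        "mean_jobs_in_system": mean_jobs_in_system,
    }


# Hypothetical disk that can serve 100 requests/s and is offered 80 requests/s.
metrics = mm1_metrics(arrival_rate=80.0, service_rate=100.0)
print(metrics)
```

With these made-up rates the server is busy 80% of the time, and a request spends 0.05 s in the system on average.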

16

Steps in Performance Modeling

17

Commonly Used Performance Metrics

- Response time
  - Turnaround time
  - Reaction time
  - Stretch factor
- Throughput
  - Operations per second
    - Jobs per second
    - Requests per second
    - Millions of Instructions Per Second (MIPS)
    - Millions of Floating Point Operations Per Second (MFLOPS)
    - Packets Per Second (PPS)
    - Bits per second (bps)
    - Transactions Per Second (TPS)
- Efficiency
- Utilization

18

Commonly Used Performance Metrics (Cont…)

- Reliability
  - R(t)
  - MTTF
- Availability
  - Mean Time to Failure (MTTF)
  - Mean Time to Repair (MTTR)
  - MTTF / (MTTF + MTTR)

19

Response Time

- Interval between the user's request and the system's response

[Figure: timeline marking the user's request and the system's response]

20

Response Time (cont…)

- Can have two measures of response time
- Both are acceptable, but measure 2 is preferred if execution is long

[Figure: timeline with the events User Starts Request, User Finishes Request, System Starts Execution, System Starts Response, and System Finishes Response, and the intervals Reaction Time, Response Time 1, and Response Time 2]

21

Response Time (cont…)

- Turnaround time: time between submission of a job and completion of output
  - For batch job systems
- Reaction time: time between submission of a request and beginning of execution
  - Usually need to measure inside the system since nothing is externally visible
- Stretch factor: ratio of response time at a given load to response time at minimal load
  - Most systems have higher response time as load increases
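The response-time measures above can be sketched as simple differences of timestamps. This is an illustration under stated assumptions: the class `RequestTimeline`, its field names, and the sample timestamps are hypothetical, and both response times are assumed to be measured from the moment the user finishes the request (to the start of the response for measure 1, to the end of the response for measure 2).

```python
from dataclasses import dataclass


@dataclass
class RequestTimeline:
    """Timestamps (in seconds) for one request; field names are illustrative."""
    user_finishes_request: float
    system_starts_execution: float
    system_starts_response: float
    system_finishes_response: float

    def reaction_time(self):
        # Submission of the request until execution begins.
        return self.system_starts_execution - self.user_finishes_request

    def response_time_1(self):
        # Assumed definition: until the system starts responding.
        return self.system_starts_response - self.user_finishes_request

    def response_time_2(self):
        # Assumed definition: until the system finishes responding.
        return self.system_finishes_response - self.user_finishes_request


def stretch_factor(response_time_at_load, response_time_at_minimal_load):
    """Ratio of response time at a given load to response time at minimal load."""
    return response_time_at_load / response_time_at_minimal_load


req = RequestTimeline(0.0, 0.2, 1.0, 1.5)  # made-up timestamps
print(req.reaction_time(), req.response_time_1(), req.response_time_2())
print(stretch_factor(3.0, 1.5))
```

A stretch factor of 2 here means the request takes twice as long under the given load as it does on a nearly idle system.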

22

Throughput

- Rate at which requests can be serviced by the system (requests per unit time)

23

Efficiency

- Ratio of maximum achievable throughput (ex: 9.8 Mbps) to nominal capacity (ex: 10 Mbps): 98%
- For multiprocessor systems, ratio of the performance of an n-processor system to that of a one-processor system (in MIPS or MFLOPS)

[Figure: efficiency versus number of processors]
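The two ratios above are plain divisions, sketched below. The link figures (9.8 Mbps and 10 Mbps) come from the slide; the function name `efficiency` and the MIPS ratings per processor count are made-up values for illustration only.

```python
def efficiency(achieved, reference):
    """Return achieved / reference; both values must use the same unit."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return achieved / reference


# Link example from the slide: 9.8 Mbps achievable on a nominal 10 Mbps link.
link_efficiency = efficiency(9.8, 10.0)
print(f"link efficiency: {link_efficiency:.0%}")

# Hypothetical MIPS ratings for 1, 2, 4, and 8 processors (invented numbers).
mips_by_processors = {1: 100.0, 2: 190.0, 4: 340.0, 8: 560.0}
baseline = mips_by_processors[1]
ratio_to_one_processor = {
    n: efficiency(mips, baseline) for n, mips in mips_by_processors.items()
}
print(ratio_to_one_processor)
```

In the invented data the ratio grows more slowly than the processor count, which is the usual reason for plotting it against the number of processors.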

24

Utilization

- Typically, the fraction of time a resource is busy serving requests
  - Time not being used is idle time
- System managers often want to balance resources to have the same utilization
  - Ex: equal load on CPUs
  - But this may not be possible. Ex: CPU when I/O is the bottleneck
- May not be a fraction of time
  - Processors: busy / total
  - Memory: fraction used / total
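A minimal sketch of time-based utilization, computed from a list of busy intervals observed on one resource. The interval data, the observation window, and the assumption that each busy interval corresponds to exactly one completed request are hypothetical; the throughput line simply reuses the same observation window.

```python
def utilization(busy_intervals, observation_time):
    """Fraction of the observation window during which the resource was busy.

    busy_intervals: non-overlapping (start, end) pairs inside the window.
    """
    busy = sum(end - start for start, end in busy_intervals)
    return busy / observation_time


def throughput(completed_requests, observation_time):
    """Requests completed per unit time."""
    return completed_requests / observation_time


# Made-up trace: three requests served during a 10-second window.
intervals = [(0.0, 2.0), (3.0, 4.5), (6.0, 9.0)]
window = 10.0

u = utilization(intervals, window)
x = throughput(len(intervals), window)  # assumes one request per busy interval
print(f"utilization = {u:.2f}, idle fraction = {1 - u:.2f}, throughput = {x:.2f} req/s")
```

For a resource such as memory, the same ratio would be taken over capacity (used / total) rather than over time.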

25

Miscellaneous Metrics

- Reliability
  - Probability of errors or mean time between errors (error-free seconds)
- Availability
  - Fraction of time the system is available to service requests (the fraction not available is downtime)
  - Mean Time To Failure (MTTF) is the mean uptime
    - Useful because even when availability is high (downtime is small), failures may still be frequent, which is no good for long requests

26

Definition of Reliability

Recommendation E.800 of the International Telecommunication Union (ITU-T) defines reliability as follows:

“The ability of an item to perform a required function under given conditions for a given time interval.”

In this definition, an item may be a circuit board, a component on a circuit board, a module consisting of several circuit boards, a base transceiver station with several modules, a fiber-optic transport-system, or a mobile switching center (MSC) and all its subtending network elements. The definition includes systems with software.

27

Basic Definitions of Reliability

- X: time to failure of a system
- F(t): distribution function of system lifetime
- f(t): density function of system lifetime

Reliability R(t):

$$R(t) = P(X > t) = 1 - F(t)$$

Mean Time To system Failure:

$$\mathrm{MTTF} = E[X] = \int_0^\infty t\, f(t)\, dt = \int_0^\infty R(t)\, dt$$
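As a worked example of these definitions, assume an exponentially distributed lifetime with constant failure rate $\lambda$. This distribution and the numerical value of $\lambda$ are assumptions chosen for illustration; the slides do not fix a lifetime distribution.

```latex
\begin{align*}
F(t) &= 1 - e^{-\lambda t}, \qquad f(t) = \lambda e^{-\lambda t}, \\
R(t) &= 1 - F(t) = e^{-\lambda t}, \\
\mathrm{MTTF} &= \int_0^\infty R(t)\, dt = \int_0^\infty e^{-\lambda t}\, dt = \frac{1}{\lambda}.
\end{align*}
% Example: \lambda = 10^{-4} failures per hour gives MTTF = 10^{4} hours,
% and R(1000\ \text{h}) = e^{-0.1} \approx 0.905.
```

Both forms of the MTTF integral give the same result here, since $\int_0^\infty t\, \lambda e^{-\lambda t}\, dt = 1/\lambda$ as well.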

28

Definition of Availability

Availability is closely related to reliability, and is also defined in ITU-T Recommendation E.800 as follows:

"The ability of an item to be in a state to perform a required function at a given instant of time or at any instant of time within a given time interval, assuming that the external resources, if required, are provided."

An important difference between reliability and availability is that reliability refers to failure-free operation during an interval, while availability refers to failure-free operation at a given instant of time, usually the time when a device or system is first accessed to provide a required function or service

29

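Availability (Cont…)

Instantaneous (point) availability A(t):

$$A(t) = P(\text{system working at } t)$$

Let H(t) be the convolution of F and G, where G(t) is the distribution function of the system repair time and g(t) is its density function:

$$H(t) = \int_0^t F(t - x)\, g(x)\, dx$$

Then:

$$A(t) = R(t) + \int_0^t A(t - x)\, dH(x) \tag{1}$$

Instantaneous availability is never less than reliability:

$$A(t) \ge R(t)$$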

30

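Availability (Cont…)

The system is working at time t in one of two mutually exclusive ways:

- It never failed in (0, t); probability: $R(t)$
- It first failed and got repaired at some time x < t, and is up at the end of the interval (x, t); probability: $\int_0^t A(t - x)\, dH(x)$

[Figure: timeline from 0 to t, with the first repair completed in the interval (x, x + dx)]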

31


- MTTR: Mean Time to Repair
- Y: repair period of the system

$$\mathrm{MTTR} = E[Y] = \int_0^\infty t\, g(t)\, dt$$

Availability and Reliability are related but different!

32

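We can show from equation (1) that the steady-state availability is:

$$A_{ss} = \frac{\mathrm{MTTF}}{\mathrm{MTTF} + \mathrm{MTTR}}$$

Also:

$$\text{downtime (in minutes per year)} = (1 - A_{ss}) \times 8760 \times 60$$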
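The two formulas above translate directly into code. In this sketch the function names and the MTTF and MTTR values (in hours) are hypothetical; the constant 8760 × 60 is the number of minutes in a 365-day year, as in the slide.

```python
MINUTES_PER_YEAR = 8760 * 60  # 8760 hours in a 365-day year


def steady_state_availability(mttf, mttr):
    """A_ss = MTTF / (MTTF + MTTR); both arguments must use the same time unit."""
    return mttf / (mttf + mttr)


def downtime_minutes_per_year(availability):
    """Expected downtime in minutes per year for a given steady-state availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


# Made-up system: fails on average every 999 hours and takes 1 hour to repair.
a_ss = steady_state_availability(mttf=999.0, mttr=1.0)
downtime = downtime_minutes_per_year(a_ss)
print(f"A_ss = {a_ss:.3f}, downtime = {downtime:.1f} minutes per year")
```

An availability of 0.999 sounds high, yet it still allows close to nine hours of downtime per year.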

36

Three Rules of Validation

- Do not trust the results of a simulation model until they have been validated by analytical modeling or measurements.
- Do not trust the results of an analytical model until they have been validated by a simulation model or measurements.
- Do not trust the results of a measurement until they have been validated by simulation or analytical modeling.
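The first two rules can be illustrated by cross-checking a small simulation against a known analytic result. The sketch below assumes an M/M/1 queue (a model chosen here for illustration, not taken from the slides) with made-up rates; the function name `simulate_mm1_mean_response_time` is hypothetical. The analytic mean response time of an M/M/1 queue is 1 / (service rate − arrival rate).

```python
import random


def simulate_mm1_mean_response_time(arrival_rate, service_rate, n_jobs, seed=1):
    """Estimate the mean response time of a FIFO single-server queue by simulation."""
    rng = random.Random(seed)
    arrival_time = 0.0
    server_free_at = 0.0
    total_response = 0.0
    for _ in range(n_jobs):
        arrival_time += rng.expovariate(arrival_rate)
        # A job starts service when it arrives or when the server frees up.
        start = max(arrival_time, server_free_at)
        server_free_at = start + rng.expovariate(service_rate)
        total_response += server_free_at - arrival_time
    return total_response / n_jobs


lam, mu = 0.5, 1.0  # assumed arrival and service rates
simulated = simulate_mm1_mean_response_time(lam, mu, n_jobs=200_000)
analytic = 1.0 / (mu - lam)  # M/M/1 mean response time
relative_error = abs(simulated - analytic) / analytic
print(f"simulated = {simulated:.3f}, analytic = {analytic:.3f}, "
      f"relative error = {relative_error:.2%}")
```

Agreement between the two numbers raises confidence in both models, but neither replaces a comparison with measurements on the real system.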
