Chapter 3 SELECTION OF TECHNIQUES AND METRICS

M. Keshtgary, Spring 91

Overview

- One or more systems, real or hypothetical
- You want to evaluate their performance
- What technique do you choose? Analytic modeling? Simulation? Measurement?
- What metrics do you use?

Outline

- Selecting an Evaluation Technique
- Selecting Performance Metrics
- Case Study
- Commonly Used Performance Metrics
- Setting Performance Requirements
- Case Study

Selecting an Evaluation Technique (1 of 4)

- Which life-cycle stage is the system in?
  - Measurement is possible only when something already exists
  - If the system is new, analytic modeling and simulation are the only options
- When are results needed? (often, yesterday!)
  - Analytic modeling may be the only choice
  - Simulation and measurement can take about the same time
- What tools and skills are available?
  - Languages to support simulation
  - Tools to support measurement (e.g., packet sniffers, source code to add monitoring hooks)
  - Skills in analytic modeling (e.g., queueing theory)

Selecting an Evaluation Technique (2 of 4)

- What level of accuracy is desired?
  - Analytic modeling is coarse (if it turns out to be accurate, even the analysts are surprised!)
  - Simulation has more detail, but may abstract away key system details
  - Measurement may sound real, but the workload, configuration, etc. may still not represent reality
  - Accuracy can range from high to none without proper design
- Even with accurate data, you still need to draw proper conclusions
  - E.g.: so the response time is 10.2351 with 90% confidence. So what? What does it mean?

Selecting an Evaluation Technique (3 of 4)

- What alternatives need to be explored?
  - Trade-offs are easiest to explore with analytic models, moderate with simulation, and most difficult with measurement
- Cost?
  - Measurement is generally the most expensive
  - Analytic modeling is the cheapest (pencil and paper)
  - Simulation is often cheap, but some tools are expensive (traffic generators, network simulators)

Selecting an Evaluation Technique (4 of 4)

- Saleability?
  - It is much easier to convince people with measurements
  - Most people are skeptical of analytic modeling results since they are hard to understand; such models are often validated with simulation before use
- Two or more techniques can be used together
  - Validate one with another
  - Most high-quality performance analysis papers have an analytic model plus simulation or measurement

Summary Table for Evaluation Technique Selection

  Criterion                 Modeling    Simulation      Measurement
  1. Stage                  Any         Any             Prototype or later
  2. Time required          Small       Medium          Varies
  3. Tools                  Analysts    Some languages  Instrumentation
  4. Accuracy               Low         Moderate        Varies
  5. Trade-off evaluation   Easy        Moderate        Difficult
  6. Cost                   Small       Medium          High
  7. Saleability            Low         Medium          High

  (Criteria are listed roughly from more important to less important.)

Hybrid modeling

- Sometimes it is helpful to use two or more techniques simultaneously
- Do not trust the results of a simulation model until they have been validated by analytical modeling or measurements
- Do not trust the results of an analytical model until they have been validated by a simulation model or measurements
- Do not trust the results of a measurement until they have been validated by simulation or analytical modeling

Outline

- Selecting an Evaluation Technique
- Selecting Performance Metrics
- Case Study
- Commonly Used Performance Metrics
- Setting Performance Requirements
- Case Study

Selecting Performance Metrics

First, list all the services offered by your system. For each service requested, ask whether the service was done or not done, and, if done, whether it was done correctly.

Examples

- A gateway in a computer network offers the service of forwarding packets to specified destinations on heterogeneous networks. When presented with a packet, it may forward the packet correctly, it may forward it to the wrong destination, or it may be down, in which case it will not forward it at all.
- A database offers the service of responding to queries. When presented with a query, it may answer correctly, it may answer incorrectly, or it may be down and not answer it at all.

The system performs the service correctly

- If the system performs the service correctly, its performance is measured by the time taken to perform the service, the rate at which the service is performed, and the resources consumed while performing the service. These three metrics, related to time, rate, and resource for successful performance, are also called responsiveness, productivity, and utilization metrics.
- The utilization gives an indication of the percentage of time the resources of the gateway are busy for the given load level. The resource with the highest utilization is called the bottleneck (see the sketch after this list).

The system performs the service incorrectly

- If the system performs the service incorrectly, an error has occurred. We classify the errors and determine the probability of each class of errors. For example, in the case of the gateway, we may want to find the probability of single-bit errors, two-bit errors, and so on. We may also want to find the probability of a packet being partially delivered (a fragment).
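A minimal sketch of the utilization and bottleneck idea above: given the measured fraction of time each resource is busy at a given load, the bottleneck is simply the resource with the highest utilization. The resource names and numbers below are hypothetical, for illustration only.

```python
# Hypothetical per-resource utilizations (fraction of time busy) at a given load level.
# These values are illustrative, not taken from the text.
utilization = {
    "gateway CPU": 0.72,
    "gateway memory": 0.45,
    "outgoing link": 0.91,
}

# The bottleneck is the resource with the highest utilization.
bottleneck = max(utilization, key=utilization.get)
print(f"Bottleneck: {bottleneck} ({utilization[bottleneck]:.0%} busy)")
```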

The system does not perform the service

- If the system does not perform the service, it is said to be down, failed, or unavailable. Once again, we classify the failure modes and determine the probability of each class. For example, the gateway may be unavailable 0.01% of the time due to processor failure and 0.03% of the time due to software failure.

Speed, reliability, and availability

- The metrics associated with the three outcomes, namely successful service, error, and unavailability, are also called speed, reliability, and availability metrics.

Selecting Performance Metrics (1 of 3)

[Diagram: a request to the system has three possible outcomes. Done correctly: measured by time (responsiveness), rate (productivity), and resource consumption (utilization), i.e., the speed metrics. Done incorrectly (error i): measured by the probability of each error class and the time between errors, i.e., the reliability metrics. Not done (event k): measured by the duration of each event and the time between events, i.e., the availability metrics.]

Selecting Performance Metrics (2 of 3)

- The mean is usually what matters, but do not overlook the effect of variability
- Individual vs. global metrics (for systems shared by many users) may be at odds
  - Increasing an individual metric may decrease a global one, e.g., improving one user's response time at the cost of overall throughput
  - Increasing a global metric may not be the most fair, e.g., improving throughput at the expense of cross traffic
- Performance optimizations of the bottleneck have the most impact
  - E.g., response time of a Web request: client processing 1 s, network latency 500 ms, server processing 10 s, so the total is 11.5 s
  - Improve the client by 50%? Total becomes 11 s
  - Improve the server by 50%? Total becomes 6.5 s (a worked sketch follows this list)

Selecting Performance Metrics (3 of 3)

- There may be more than one set of metrics, e.g., resources: queue size, CPU utilization, memory use
- Criteria for selecting a subset of metrics:
  - Low variability: needs fewer repetitions
  - Non-redundancy: if two metrics give essentially the same information, it is less confusing to study only one (e.g., queue size and delay may provide identical information)
  - Completeness: should capture all possible outcomes (e.g., one disk may be faster but may return more errors, so add a reliability metric)

Outline

- Selecting an Evaluation Technique
- Selecting Performance Metrics
- Case Study
- Commonly Used Performance Metrics
- Setting Performance Requirements
- Case Study
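A rough sketch of the bottleneck example above; the 1 s / 500 ms / 10 s figures come from the slide, and the helper function name is just illustrative.

```python
# Components of the Web request response time from the example above (seconds).
client, latency, server = 1.0, 0.5, 10.0

def total(client_t, latency_t, server_t):
    """Total response time is the sum of the three components."""
    return client_t + latency_t + server_t

print(total(client, latency, server))        # 11.5 s baseline
print(total(client * 0.5, latency, server))  # 11.0 s after a 50% faster client
print(total(client, latency, server * 0.5))  # 6.5 s after a 50% faster server (the bottleneck)
```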

Case Study (1 of 5)

- The system consists of end-hosts sending packets through routers
- Congestion occurs when the number of packets at a router exceeds its buffering capacity
- Goal: compare two congestion control algorithms
- A user sends a block of packets to a destination; there are four possible outcomes:
  A) Some packets are delivered in order
  B) Some packets are delivered out of order
  C) Some packets are delivered more than once
  D) Some packets are dropped

Case Study (2 of 5)

For A), straightforward metrics exist:
 1) Response time: delay for an individual packet
 2) Throughput: number of packets per unit time
 3) Processor time per packet at the source
 4) Processor time per packet at the destination
 5) Processor time per packet at each router
Since large response times can cause extra (unnecessary) retransmissions:
 6) Variability in response time is also important

Case Study (3 of 5)

For B), out-of-order packets cannot be delivered to the user immediately; they are often discarded (considered dropped), or alternatively stored in destination buffers awaiting the arrival of the intervening packets:
 7) Probability of out-of-order arrivals
For C), duplicate packets consume resources without any use:
 8) Probability of duplicate packets
For D), dropped packets are undesirable for many reasons:
 9) Probability of lost packets
Also, excessive loss can cause disconnection:
 10) Probability of disconnect

Case Study (4 of 5)

Since this is a multi-user system, we also want fairness:
 11) Fairness: a function of the variability of throughput across users. For any given set of user throughputs (x1, x2, ..., xn), the fairness index is:

    f(x1, x2, ..., xn) = (Σ xi)² / (n · Σ xi²)

The index lies between 0 and 1. If all users get the same throughput, the index is 1. If k users get equal throughput and the remaining n - k get zero, the index is k/n. (A small code sketch follows below.)
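A minimal sketch of the fairness index defined above; the throughput values are made up for illustration.

```python
def fairness_index(throughputs):
    """Fairness index: (sum x_i)^2 / (n * sum x_i^2), always between 0 and 1."""
    n = len(throughputs)
    return sum(throughputs) ** 2 / (n * sum(x * x for x in throughputs))

print(fairness_index([10.0, 10.0, 10.0, 10.0]))  # 1.0: all users get the same throughput
print(fairness_index([10.0, 10.0, 0.0, 0.0]))    # 0.5: k = 2 of n = 4 users get equal shares -> k/n
```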

Case Study (5 of 5)

- After a few pilot experiments, throughput and delay were found to be redundant: higher throughput came with higher delay. Instead, the two were combined into power = throughput / delay. A higher power means either a higher throughput or a lower delay; in either case it is considered better than a lower power (see the short sketch after this list).
- The variance in response time was found to be redundant with the probability of duplication and the probability of disconnection, so the variance in response time was dropped.
- Thus, nine metrics were left for the study.
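A tiny sketch of the power metric just described; the numbers are illustrative only.

```python
def power(throughput, delay):
    """Power combines two redundant metrics: higher throughput or lower delay gives higher power."""
    return throughput / delay

# Two hypothetical operating points with the same throughput but different delay.
print(power(throughput=100.0, delay=2.0))  # 50.0
print(power(throughput=100.0, delay=1.0))  # 100.0 -> the lower-delay point has higher power
```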

Outline

- Selecting an Evaluation Technique
- Selecting Performance Metrics
- Case Study
- Commonly Used Performance Metrics
- Setting Performance Requirements
- Case Study

Commonly Used Performance Metrics

- Response time
- Turnaround time
- Reaction time
- Stretch factor
- Throughput
- Operations per second
- Capacity
- Efficiency
- Utilization
- Reliability
- Uptime
- MTTF

Response Time (1 of 2)

- Response time: the interval between a user's request and the system's response

  [Diagram: timeline showing the user's request followed by the system's response.]

- This definition is simplistic, since requests and responses are not instantaneous: users spend time typing the request, and the system takes time to output the response

Response Time (2 of 2)

- Two measures of response time are possible:
  - Response time 1: from the end of the user's request to the start of the system's response
  - Response time 2: from the end of the user's request to the end of the system's response
- Both are acceptable, but response time 2 is preferred if execution takes long
- Think time: the time from the end of the system's response until the user's next request

  [Diagram: timeline marking when the user starts and finishes the request, and when the system starts execution, starts the response, and finishes the response; reaction time, response time 1, response time 2, and think time are intervals between these events.]

Response Time+

- Turnaround time: the time between the submission of a job and the completion of its output; for batch systems, responsiveness is measured by turnaround time
- Reaction time: the time between the submission of a request and the beginning of its execution; this usually has to be measured inside the system, since nothing is externally visible
- Stretch factor: the ratio of the response time at a particular load to the response time at minimum load; most systems have higher response time as the load increases. For a timesharing system, for example, the stretch factor is defined as the ratio of the response time with multiprogramming to that without multiprogramming (a small sketch relating these timing quantities appears after the throughput slides)

Throughput (1 of 3)

- Throughput: the rate at which requests can be serviced by the system (requests per unit time)
  - Batch systems: jobs per second
  - Interactive systems: requests per second
  - CPUs: millions of instructions per second (MIPS), millions of floating-point operations per second (MFLOPS)
  - Networks: packets per second or bits per second
  - Transaction processing: transactions per second (TPS)

Throughput (2 of 3)

- After a certain load, the throughput stops increasing; in most cases, it may even start decreasing
- Nominal capacity: the maximum achievable throughput under ideal workload conditions. For computer networks, the nominal capacity is called the bandwidth and is usually expressed in bits per second
- Often the response time at maximum throughput is too high to be acceptable. In such cases, it is more interesting to know the maximum throughput achievable without exceeding a prespecified response time limit; this is called the usable capacity of the system
- In many applications, the knee of the throughput or response-time curve is considered the optimal operating point
- Knee capacity: the throughput at the knee

Throughput (3 of 3)

- Throughput increases as the load increases, up to a point

  [Diagram: throughput vs. load, marking the knee capacity, usable capacity, and nominal capacity; and response time vs. load, rising sharply beyond the knee.]

- Nominal capacity is the ideal (e.g., 10 Mbps); usable capacity is what is achievable in practice (e.g., 9.8 Mbps); the knee is where the response time goes up rapidly for a small increase in throughput
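Referring back to the response time timeline above, here is a rough sketch of how the timing metrics relate to the events of one interaction. All timestamps and the minimum-load response time are hypothetical.

```python
# Hypothetical event timestamps for one interaction, in seconds.
user_finishes_request    = 2.0
system_starts_execution  = 2.3
system_starts_response   = 4.1
system_finishes_response = 5.0
user_starts_next_request = 9.0

reaction_time   = system_starts_execution - user_finishes_request      # 0.3 s
response_time_1 = system_starts_response - user_finishes_request       # 2.1 s
response_time_2 = system_finishes_response - user_finishes_request     # 3.0 s
think_time      = user_starts_next_request - system_finishes_response  # 4.0 s

# Stretch factor: response time at this load relative to the minimum-load response time.
response_time_min_load = 1.5  # hypothetical
stretch_factor = response_time_2 / response_time_min_load              # 2.0

print(reaction_time, response_time_1, response_time_2, think_time, stretch_factor)
```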

Efficiency

- Efficiency: the ratio of maximum achievable throughput (usable capacity) to nominal capacity. For example, if the maximum throughput obtained from a 100-Mbps network is only 85 Mbps, its efficiency is 85%
- For a multiprocessor, efficiency is the ratio of the performance of the n-processor system to that of a one-processor system (in MIPS or MFLOPS)

  [Plot: efficiency as a function of the number of processors.]

Utilization

- Typically, the fraction of time a resource is busy serving requests; the time the resource is not being used is its idle time
- System managers often want to balance resources so that they have the same utilization, e.g., equal load on all CPUs, but this may not be possible (a small sketch follows)
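A small sketch of the efficiency and utilization calculations above; the measured values are hypothetical, and the 100-Mbps / 85-Mbps figures echo the example in the text.

```python
# Efficiency: maximum achievable throughput relative to nominal capacity.
nominal_capacity_mbps = 100.0   # e.g., a 100-Mbps network
max_throughput_mbps = 85.0      # hypothetical measured maximum
efficiency = max_throughput_mbps / nominal_capacity_mbps  # 0.85, i.e., 85%

# Utilization: fraction of time a resource is busy over a measurement interval.
busy_time_s = 42.0              # hypothetical busy time
interval_s = 60.0               # measurement interval
utilization = busy_time_s / interval_s  # 0.70, i.e., 70% busy, 30% idle

print(f"efficiency = {efficiency:.0%}, utilization = {utilization:.0%}")
```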

For processors, measuring utilization as busy time / total time makes sense. For other resources, such as memory, only a fraction of the resource may be used at a given time; their utilization is measured as the average fraction used over an interval.

Miscellaneous Metrics

- Reliability: probability of errors, or mean time between errors (error-free seconds)
- Availability: fraction of time the system is available to service requests (the fraction not available is the downtime)
- Mean Time To Failure (MTTF): the mean uptime. This is useful because even when availability is high (downtime is small), failures may still be frequent, which is no good for long requests
- Cost/performance ratio: total cost / throughput, used for comparing two systems. E.g., for a transaction processing system one may want dollars per TPS

Utility Classification

- HB: Higher is better (e.g., throughput)
- LB: Lower is better (e.g., response time)
- NB: Nominal is best (e.g., utilization)

  [Plots: utility vs. metric for the three cases, rising for HB, falling for LB, and peaking at the best value for NB.]

Outline

- Selecting an Evaluation Technique
- Selecting Performance Metrics
- Case Study
- Commonly Used Performance Metrics
- Setting Performance Requirements
- Case Study

Setting Performance Requirements (1 of 2)

Consider these typical requirement statements:

- "The system should be both processing and memory efficient. It should not create excessive overhead."
- "There should be an extremely low probability that the network will duplicate a packet, deliver it to a wrong destination, or change the data."

What's wrong?

Setting Performance Requirements (2 of 2)

General problems with such requirements:

- Nonspecific: no numbers, only qualitative words (rare, low, high, extremely small)
- Nonmeasurable: no way to measure and verify that the system meets the requirements
- Nonacceptable: numerical values are set based upon what can be achieved or what looks good; if set on what can be achieved, they may turn out to be too low to be acceptable
- Nonrealizable: numbers are based on what sounds good, but are not realistic
- Nonthorough: no attempt is made to specify all the parameters

Outline

- Selecting an Evaluation Technique
- Selecting Performance Metrics
- Case Study
- Commonly Used Performance Metrics
- Setting Performance Requirements
- Case Study

Setting Performance Requirements: Case Study (1 of 2)

Performance requirements for a high-speed LAN:

- Speed: if a packet is delivered, the time taken to do so is important
  A) Access delay should be less than 1 s
  B) Sustained throughput should be at least 80 Mb/s
- Reliability:
  A) Probability of a bit error less than 10^-7
  B) Probability of a frame error less than 1%
  C) Probability of a frame error not caught less than 10^-15
  D) Probability of a frame misdelivered due to an uncaught error less than 10^-18
  E) Probability of a duplicate frame less than 10^-5
  F) Probability of losing a frame less than 1%

Setting Performance Requirements: Case Study (2 of 2)

- Availability:
  A) Mean time to initialize the LAN < 15 ms
  B) Mean time between LAN initializations > 1 minute
  C) Mean time to repair < 1 hour
- All of the above values were checked for realizability by modeling, showing that LAN systems satisfying the requirements were possible

Part I: Things to Remember

- Systematic approach: define the system, list its services and metrics, list the parameters, decide the factors, choose the evaluation technique and workload, design the experiments, analyze the data, and present the results
- Selecting an evaluation technique: the life-cycle stage is the key. Other considerations are the time available, tools available, accuracy required, trade-offs to be evaluated, cost, and saleability of results
- Selecting metrics: for each service, list the time, rate, and resource consumption; for each undesirable outcome, measure the frequency and duration of the outcome; check for low variability, non-redundancy, and completeness
- Performance requirements should be SMART: Specific, Measurable, Acceptable, Realizable, and Thorough