
Variability in Architectural Simulations of Multi-threaded Workloads

Alaa R. Alameldeen and David A. Wood

University of Wisconsin-Madison

{alaa,david}@cs.wisc.edu

http://www.cs.wisc.edu/multifacet/

HPCA 2003, Alaa Alameldeen and David Wood

Motivation

Experimental scientists use statistics
Computer architects in simulation experiments don’t!
Why ignore statistics?
  Simulations are deterministic
This can lead to wrong conclusions!

Workload Variability

[Figure: OLTP; annotation: "Slower memory is better!"]

What Went Wrong?

Many possible executions for each configuration
Why? Different timing effects
  OS scheduling decisions
  Different orders of lock acquisition
  Different transaction mixes
This is magnified by short simulations
Variability can lead to wrong conclusions

Overview

Variability is a real phenomenon for multi-threaded workloads
  Runs from same initial state can be different
Variability is a challenge for simulations
  Simulations are short
Our solution accounts for variability
  Multiple runs, statistical techniques

Outline

Motivation and Overview
Variability in Real Systems
  Time and Space Variability
Variability in Simulations
Accounting for Variability
Conclusions

What is Variability?

Differences between multiple estimates of a workload’s performance
Time Variability:
  Performance changes during different phases of a single run
Space Variability:
  Runs starting from the same state follow different execution paths

Time Variability in Real Systems

[Figure: OLTP, one-second intervals]

Time Variability Example (Cont’d)

How is this handled in real experiments?
Solution: Run your experiment long enough!

[Figure: OLTP, one-minute intervals]

Space Variability in Real Systems

[Figure: OLTP, one-second averages, 5 runs]

Space Variability Example (Cont’d)

How is this handled in real experiments?
Same Solution: Run your experiment long enough!

[Figure: OLTP, one-minute averages, 5 runs; annotation: 16-day simulation]

Outline

Motivation and Overview
Variability in Real Systems
Variability in Simulations
  Simulation Infrastructure
  Injecting Randomness
  The Wrong Conclusion Ratio
Accounting for Variability
Conclusions

Simulation Infrastructure

Workloads
  Two scientific and five commercial benchmarks
Target System: E10000-like 16-node system
Full System Simulation
  Virtutech Simics running Solaris 8 on SPARC V9
  A blocking processor model (Simics)
  An OoO processor model (TFSim – Mauer et al., SIGMETRICS’02)
  Memory system simulator
    MOSI invalidation-based broadcast coherence protocol (Martin et al., HPCA-02)

Simulating Space Variability?

Simulations are deterministic
Variability cannot be ignored for multi-threaded applications
  One execution may not be representative
  Execution paths affect simulation conclusions
We need to obtain a space of results

Injecting Randomness

We introduce artificial random perturbations in each simulation run
For each memory access, latency in nanoseconds becomes Latency + r
  (r = -2, -1, 0, 1, 2 nanoseconds, uniform dist.)
Roughly models contention due to DMA traffic
Other methods are possible
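
The perturbation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the per-run seed, and the base latency used in the usage note are hypothetical and do not come from the authors' simulator.

```python
import random

# Values of r listed on the slide, in nanoseconds, drawn uniformly.
PERTURBATIONS_NS = (-2, -1, 0, 1, 2)


def perturbed_latency(base_latency_ns, rng):
    """Return Latency + r for one memory access."""
    return base_latency_ns + rng.choice(PERTURBATIONS_NS)


def run_latencies(base_latency_ns, num_accesses, seed):
    """Hypothetical helper: the perturbed latencies seen by one simulation run.

    A separate seed per run (an assumption of this sketch) keeps each run
    reproducible while letting different runs follow different paths.
    """
    rng = random.Random(seed)
    return [perturbed_latency(base_latency_ns, rng) for _ in range(num_accesses)]
```

For example, `run_latencies(80, 1000, seed=1)` and `run_latencies(80, 1000, seed=2)` give two different latency streams around an assumed 80 ns base latency, which is the kind of small timing difference that can steer two runs onto different execution paths.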

Simulated Space Variability

Space variability exists in our benchmarks

[Figure annotation: 20 runs, ~10 hrs sim.]

Quantifying Variability: The Wrong Conclusion Ratio (WCR)

WCR (16,32) = 18%
WCR (16,64) = 7.5%
WCR (32,64) = 26%

[Figure: OLTP, 20 runs, 50 Xacts]

Outline

Motivation and Overview
Variability in Real Systems
Variability in Simulations
Accounting for Variability
Conclusions

Confidence Intervals

Definition: Range of values expected to include population parameter (e.g. mean)
Confidence Probability: Probability that true mean lies inside confidence interval
For the same confidence probability:
  Sample Size ↑ → Confidence Interval ↓

Accounting for Space Variability

Simple solution: Estimate #runs such that confidence intervals do not overlap
Tests of hypotheses can be used (paper)

[Figure: OLTP]
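
One way to read "estimate #runs such that confidence intervals do not overlap" is as a stopping rule: keep adding runs for both configurations until their intervals separate. The sketch below is an assumption-laden illustration, not the authors' procedure. It swaps in a normal approximation (`statistics.NormalDist`) for the Student-t value, which understates the interval width for small run counts, and all function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Two-sided 95% critical value from the normal approximation (about 1.96).
# Assumption of this sketch: a Student-t value would be wider for few runs.
Z_95 = NormalDist().inv_cdf(0.975)


def interval(samples, z=Z_95):
    """Confidence interval (low, high) for the mean of the samples."""
    half = z * stdev(samples) / sqrt(len(samples))
    return mean(samples) - half, mean(samples) + half


def intervals_overlap(runs_a, runs_b):
    """True if the two confidence intervals share any point."""
    lo_a, hi_a = interval(runs_a)
    lo_b, hi_b = interval(runs_b)
    return lo_a <= hi_b and lo_b <= hi_a


def runs_until_separated(runs_a, runs_b, min_runs=2):
    """Smallest n at which the first n runs of each configuration give
    non-overlapping intervals, or None if the available runs never separate."""
    for n in range(min_runs, min(len(runs_a), len(runs_b)) + 1):
        if not intervals_overlap(runs_a[:n], runs_b[:n]):
            return n
    return None
```

A `None` result means the available runs do not support a conclusion at this confidence level, so more runs (or longer ones) would be needed.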

Conclusions

Short runs of multi-threaded workloads exhibit variability
Variability can lead to wrong simulation conclusions
Our Solution:
  Injecting randomness
  Multiple runs
  Apply statistical techniques

Backup Slides

Effects of OS Scheduling

WCR Definition

Percentage of comparison simulation experiments that reach a wrong conclusion
The correct conclusion is the relationship between averages of the two populations
WCR can be used to estimate the wrong conclusion probability for single experiments
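
The definition above can be turned into a small calculation. This sketch assumes that a "comparison experiment" is one run from each configuration, that all such pairings are enumerated, and that the metric is one where lower is better (such as cycles per transaction). The function name and the sample numbers in the usage note are illustrative only.

```python
from itertools import product
from statistics import mean


def wrong_conclusion_ratio(runs_a, runs_b):
    """Percentage of single-run pairings (one run from each configuration)
    whose ordering contradicts the ordering of the two sample means.

    Assumes lower values are better. A tie between two runs counts as wrong
    whenever configuration A has the lower mean.
    """
    a_is_faster = mean(runs_a) < mean(runs_b)
    pairs = list(product(runs_a, runs_b))
    wrong = sum(1 for a, b in pairs if (a < b) != a_is_faster)
    return 100.0 * wrong / len(pairs)
```

With made-up values `runs_a = [10, 12, 14]` and `runs_b = [11, 13, 15]`, three of the nine pairings rank configuration B ahead even though A has the lower mean, so the function returns about 33.3.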

Confidence Intervals - Equations

The confidence interval for the mean of a normally distributed infinite population:

$$\bar{y} - \frac{t\,s}{\sqrt{n}} \le \text{mean} \le \bar{y} + \frac{t\,s}{\sqrt{n}}$$

Sample size needed to limit mean relative error to r:

$$n = \left(\frac{t\,s}{r\,\bar{y}}\right)^2$$
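
A minimal sketch of both calculations, taking the sample mean, the sample standard deviation s, the sample size n, and a Student-t critical value t as inputs. The function names are hypothetical, and the small table of two-sided 95% t values (indexed by degrees of freedom, n - 1) is included only so the example is self-contained.

```python
from math import ceil, sqrt
from statistics import mean, stdev

# Two-sided 95% Student-t critical values for a few degrees of freedom (n - 1).
T_95 = {4: 2.776, 9: 2.262, 19: 2.093}


def confidence_interval(samples, t):
    """Return (low, high) = sample mean -/+ t * s / sqrt(n)."""
    n = len(samples)
    y_bar = mean(samples)
    half_width = t * stdev(samples) / sqrt(n)
    return y_bar - half_width, y_bar + half_width


def runs_needed(samples, t, r):
    """Sample size n = (t * s / (r * sample mean))**2, rounded up.

    Uses s and the mean from a pilot sample. Strictly, t depends on the final
    n, so this is a first estimate rather than an exact answer.
    """
    return ceil((t * stdev(samples) / (r * mean(samples))) ** 2)
```

For a made-up pilot sample `[10.0, 12.0, 11.0, 13.0, 9.0]` with `t = T_95[4]`, the interval is roughly 11 ± 1.96, and limiting the relative error to 5% (`r = 0.05`) calls for 64 runs.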

Hypothesis Testing

Tests whether there is no difference between two population means
  Hypothesis: μ_32 = μ_64 tests whether the two means of the 32 and 64 ROB configurations are different
Hypothesis is tested using sample means and variances
If hypothesis rejected → our conclusion is significant
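
As a rough illustration of testing the hypothesis from sample means and variances, the sketch below computes a two-sample t statistic. The slide does not say which variant of the test is used; Welch's form is an assumption here, the function names are hypothetical, and the critical value is supplied by the caller (for example from a t table).

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic for the null hypothesis that both means are equal."""
    standard_error = sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / standard_error


def reject_equal_means(sample_a, sample_b, t_critical):
    """True if |t| exceeds the caller-supplied critical value, i.e. the
    difference between the two configurations is statistically significant."""
    return abs(welch_t(sample_a, sample_b)) > t_critical
```

With made-up samples `[10, 12, 11, 13, 9]` and `[14, 16, 15, 17, 13]` the statistic is -4, which exceeds the two-sided 95% critical value of 2.306 for 8 degrees of freedom, so the hypothesis of equal means would be rejected.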

Accounting for Time Variability

Is time variability caused by the same effects that cause space variability?
  Use Analysis of Variance (ANOVA)
If time variability is caused by different effects, we need to obtain a time sample
  Observations obtained from different starting points
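
A one-way ANOVA is one plausible way to set this up: treat each starting point as a group and the perturbed runs from that starting point as its observations. The grouping and the function name are assumptions of this sketch; the slide does not spell out the exact design.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic = (between-group mean square) / (within-group mean square).

    groups: one list of observations per starting point. Needs at least two
    groups and some variation inside the groups, otherwise a division by
    zero occurs.
    """
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    k = len(groups)
    n_total = len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))
```

A large F, judged against an F-distribution critical value for (k - 1, N - k) degrees of freedom, would suggest that differences between starting points exceed what run-to-run (space) variability alone explains, which is the case where a time sample is needed.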

Multi-threaded Workloads and Simulation

Multi-threaded workloads are important
  Workloads for commercial servers
  New architectures support multi-threading
Performance metrics are different from traditional benchmarks
  Throughput-oriented (transactions)
  IPC is not appropriate (idle time!)
Simulation Challenge: Comparing systems running multi-threaded applications

Simulation of Multi-threaded Workloads

Simulation is slow!
  We cannot simulate the whole workload
Solution:
  Run for a fixed number of transactions
  Measure the per-transaction runtime (cycles per transaction)
  Use to compare different systems
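
The metric itself is simple arithmetic; a minimal sketch follows. The function names and the tuple layout are hypothetical, and averaging over several runs reflects the talk's broader recommendation rather than anything stated on this slide.

```python
from statistics import mean


def cycles_per_transaction(total_cycles, transactions):
    """Per-transaction runtime for one run of a fixed number of transactions."""
    if transactions <= 0:
        raise ValueError("transactions must be positive")
    return total_cycles / transactions


def mean_cycles_per_transaction(runs):
    """Average the metric over several (total_cycles, transactions) runs
    of the same configuration."""
    return mean(cycles_per_transaction(cycles, xacts) for cycles, xacts in runs)
```

For instance, two made-up runs of 50 transactions taking 1000 and 1200 cycles give 20 and 24 cycles per transaction, with a mean of 22.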