Variability in Architectural Simulations of Multi-threaded Workloads
Alaa R. Alameldeen and David A. Wood
University of Wisconsin-Madison
{alaa,david}@cs.wisc.edu
http://www.cs.wisc.edu/multifacet/
HPCA 2003
Motivation
Experimental scientists use statistics
Computer architects in simulation experiments don’t!
Why ignore statistics?
  Simulations are deterministic
This can lead to wrong conclusions!
Workload Variability
[Figure: OLTP, annotated "Slower memory is better!"]
What Went Wrong?
Many possible executions for each configuration
  Why? Different timing effects
  OS scheduling decisions
  Different orders of lock acquisition
  Different transaction mixes
This is magnified by short simulations
Variability can lead to wrong conclusions
Overview
Variability is a real phenomenon for multi-threaded workloads
  Runs from same initial state can be different
Variability is a challenge for simulations
  Simulations are short
Our solution accounts for variability
  Multiple runs, statistical techniques
Outline
Motivation and Overview
Variability in Real Systems
  Time and Space Variability
Variability in Simulations
Accounting for Variability
Conclusions
What is Variability?
Differences between multiple estimates of a workload’s performance
Time Variability:
  Performance changes during different phases of a single run
Space Variability:
  Runs starting from the same state follow different execution paths
Time Variability in Real Systems
[Figure: OLTP, one-second intervals]
Time Variability Example (Cont’d)
How is this handled in real experiments?
Solution: Run your experiment long enough!
[Figure: OLTP, one-minute intervals]
Space Variability in Real Systems
[Figure: OLTP, one-second averages, 5 runs]
Space Variability Example (Cont’d)
How is this handled in real experiments?
Same Solution: Run your experiment long enough!
[Figure: OLTP, one-minute averages, 5 runs, annotated "16-day simulation"]
Outline
Motivation and Overview
Variability in Real Systems
Variability in Simulations
  Simulation Infrastructure
  Injecting Randomness
  The Wrong Conclusion Ratio
Accounting for Variability
Conclusions
Simulation Infrastructure
Workloads
  Two scientific and five commercial benchmarks
Target System: E10000-like 16-node system
Full System Simulation
  Virtutech Simics running Solaris 8 on SPARC V9
  A blocking processor model (Simics)
  An OoO processor model (TFSim: Mauer et al., SIGMETRICS’02)
  Memory system simulator
    MOSI invalidation-based broadcast coherence protocol (Martin et al., HPCA-02)
Simulating Space Variability?
Simulations are deterministic
Variability cannot be ignored for multi-threaded applications
  One execution may not be representative
  Execution paths affect simulation conclusions
We need to obtain a space of results
Injecting Randomness
We introduce artificial random perturbations in each simulation run
For each memory access, latency in nanoseconds becomes Latency + r
  (r = -2, -1, 0, 1, 2 nanoseconds, uniform dist.)
Roughly models contention due to DMA traffic
Other methods are possible
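The perturbation rule on this slide could be sketched as follows. This is only an illustration, not the simulator's actual code: the function names, the 80 ns base latency, and the use of one seed per run are assumptions made for the example.

```python
import random

# Values of r from the slide, chosen with equal probability.
PERTURBATIONS_NS = (-2, -1, 0, 1, 2)

def perturbed_latency(base_latency_ns, rng):
    """Return Latency + r for one memory access."""
    return base_latency_ns + rng.choice(PERTURBATIONS_NS)

def simulate_run(base_latency_ns, num_accesses, seed):
    """Hypothetical stand-in for one simulation run: total perturbed latency."""
    rng = random.Random(seed)  # assumption: one seed per run keeps each run repeatable
    return sum(perturbed_latency(base_latency_ns, rng) for _ in range(num_accesses))

# Different seeds give different runs; the same seed reproduces the same run.
totals = [simulate_run(base_latency_ns=80, num_accesses=1000, seed=s) for s in range(5)]
```

Seeding each run separately keeps every individual run deterministic while still producing a space of results across runs.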
Simulated Space Variability
Space variability exists in our benchmarks
[Figure: 20 runs, ~10 hrs sim.]
Quantifying Variability: The Wrong Conclusion Ratio (WCR)
WCR (16,32) = 18%
WCR (16,64) = 7.5%
WCR (32,64) = 26%
[Figure: OLTP, 20 runs, 50 Xacts]
Outline
Motivation and Overview
Variability in Real Systems
Variability in Simulations
Accounting for Variability
Conclusions
Confidence Intervals
Definition:
  Range of values expected to include population parameter (e.g. mean)
Confidence Probability:
  Probability that true mean lies inside confidence interval
For the same confidence probability:
  Sample Size ↑ → Confidence Interval ↓
Accounting for Space Variability
Simple solution: Estimate #runs such that confidence intervals do not overlap
Tests of hypotheses can be used (paper)
[Figure: OLTP]
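One way to read the "simple solution" is to keep adding runs until the two configurations' confidence intervals separate. The sketch below assumes 95% intervals based on the Student t distribution and uses invented helper names; the slides do not say which confidence level or procedure was actually used.

```python
import math
import statistics

# Two-sided 95% Student-t critical values, keyed by degrees of freedom (n - 1).
# The table stops at 9, so this sketch handles at most 10 runs per configuration.
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
        6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262}

def confidence_interval(samples):
    """95% confidence interval for the mean of the given runs."""
    n = len(samples)
    mean = statistics.mean(samples)
    half_width = T_95[n - 1] * statistics.stdev(samples) / math.sqrt(n)
    return (mean - half_width, mean + half_width)

def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

def runs_until_separated(runs_a, runs_b):
    """Smallest run count at which the intervals stop overlapping, else None."""
    limit = min(len(runs_a), len(runs_b), 10)
    for n in range(2, limit + 1):
        if not intervals_overlap(confidence_interval(runs_a[:n]),
                                 confidence_interval(runs_b[:n])):
            return n
    return None

# Hypothetical cycles-per-transaction measurements for two configurations.
config_a = [100, 102, 98, 101, 99]
config_b = [110, 112, 108, 111, 109]
needed = runs_until_separated(config_a, config_b)
```

If the function returns None, the available runs are not enough to separate the two configurations at this confidence level.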
Conclusions
Short runs of multi-threaded workloads exhibit variability
Variability can lead to wrong simulation conclusions
Our Solution:
  Injecting randomness
  Multiple runs
  Apply statistical techniques
Backup Slides
Effects of OS Scheduling
WCR Definition
Percentage of comparison simulation experiments that reach a wrong conclusion
The correct conclusion is the relationship between averages of the two populations
WCR can be used to estimate the wrong conclusion probability for single experiments
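The definition above can be illustrated with a short sketch. It assumes that each "comparison experiment" pairs one run of configuration A with one run of configuration B, that all such pairs are enumerated, and that a tie counts as a wrong conclusion. These choices and the function name are assumptions for illustration.

```python
import statistics
from itertools import product

def wrong_conclusion_ratio(runs_a, runs_b):
    """Percentage of single-run comparisons that contradict the comparison of means.

    Inputs are per-run costs (lower is better), e.g. cycles per transaction.
    Assumes the two sample means differ.
    """
    a_is_better = statistics.mean(runs_a) < statistics.mean(runs_b)
    pairs = list(product(runs_a, runs_b))
    wrong = 0
    for a, b in pairs:
        if a == b or (a < b) != a_is_better:
            wrong += 1  # this single experiment would mislead us
    return 100.0 * wrong / len(pairs)

# Hypothetical data: A is better on average, yet 3 of the 9 pairings say otherwise.
wcr = wrong_conclusion_ratio([10, 12, 14], [11, 13, 15])
```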
Confidence Intervals - Equations
The confidence interval for the mean of a normally distributed infinite population:
$$\bar{y} - \frac{t\,s}{\sqrt{n}} \le \text{mean} \le \bar{y} + \frac{t\,s}{\sqrt{n}}$$
Sample size needed to limit mean relative error to r:
$$n = \left(\frac{t\,S}{r\,\bar{Y}}\right)^{2}$$
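As a worked example of the sample-size expression, the sketch below evaluates n = (t S / (r Y))^2 and rounds up. The input numbers are invented, and the t value 2.093 (two-sided 95%, 19 degrees of freedom) is an assumption. Since t itself depends on n, a real calculation may need to repeat the step with an updated t.

```python
import math

def runs_needed(sample_mean, sample_stdev, t_value, relative_error):
    """Evaluate n = (t * S / (r * Y))^2, rounded up to a whole number of runs."""
    return math.ceil((t_value * sample_stdev / (relative_error * sample_mean)) ** 2)

# Hypothetical pilot data: mean 1000 cycles/transaction, standard deviation 100,
# target relative error of 5%.
n = runs_needed(sample_mean=1000, sample_stdev=100, t_value=2.093, relative_error=0.05)
```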
Hypothesis Testing
Tests whether there is no difference between two population means
  Hypothesis: μ32 = μ64 tests whether the two means of the 32 and 64 ROB configurations are different
Hypothesis is tested using sample means and variances
If hypothesis rejected → Our conclusion is significant
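A test built from sample means and variances might look like the sketch below. It uses Welch's t statistic, which is an assumption, since the slides do not name the exact test. The critical value is supplied by the caller, for example from a t table.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Welch's t statistic for the difference between two sample means."""
    var_of_mean_a = statistics.variance(sample_a) / len(sample_a)
    var_of_mean_b = statistics.variance(sample_b) / len(sample_b)
    diff = statistics.mean(sample_a) - statistics.mean(sample_b)
    return diff / math.sqrt(var_of_mean_a + var_of_mean_b)

def reject_equal_means(sample_a, sample_b, critical_t):
    """True if |t| exceeds the critical value, i.e. the hypothesis of equal means is rejected."""
    return abs(welch_t(sample_a, sample_b)) > critical_t

# Hypothetical cycles-per-transaction samples for two ROB sizes.
rob_32 = [110, 112, 108, 111, 109]
rob_64 = [100, 102, 98, 101, 99]
significant = reject_equal_means(rob_32, rob_64, critical_t=2.306)
```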
Accounting for Time Variability
Is time variability caused by the same effects that cause space variability?
  Use Analysis of Variance (ANOVA)
If time variability is caused by different effects, we need to obtain a time sample
  Observations obtained from different starting points
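One possible reading of the ANOVA step is a one-way analysis in which each group holds the runs started from one starting point. A large F statistic would then suggest that the starting point matters beyond run-to-run (space) variability. The grouping and the function name are assumptions; the sketch only computes F and leaves the comparison against an F-table critical value to the reader.

```python
import statistics

def one_way_anova_f(groups):
    """F statistic: between-group mean square divided by within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical data: three starting points, three perturbed runs each.
f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```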
Multi-threaded Workloads and Simulation
Multi-threaded workloads are important
  Workloads for commercial servers
  New architectures support multi-threading
Performance metrics are different from traditional benchmarks
  Throughput-oriented (transactions)
  IPC is not appropriate (idle time!)
Simulation Challenge: Comparing systems running multi-threaded applications
Simulation of Multi-threaded Workloads
Simulation is slow!
  We cannot simulate the whole workload
Solution:
  Run for a fixed number of transactions
  Measure the per-transaction runtime (cycles per transaction)
  Use to compare different systems
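The metric itself is a simple ratio. A minimal sketch with invented numbers and hypothetical names:

```python
def cycles_per_transaction(total_cycles, transactions_completed):
    """Per-transaction runtime for one simulation run (lower is better)."""
    return total_cycles / transactions_completed

# Hypothetical runs of two systems, each stopped after 50 transactions.
system_a = cycles_per_transaction(total_cycles=5_000_000, transactions_completed=50)
system_b = cycles_per_transaction(total_cycles=4_600_000, transactions_completed=50)
```

A single pair of numbers like this is one sample from each system's space of possible runs, so a comparison based on it alone can be misleading.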