Post on 02-Jan-2016
ACMSE’04, AL
Department of Electrical and Computer Engineering - UAH
Execution Characteristics of SPEC CPU2000 Benchmarks: Intel C++ vs. Microsoft VC++
Swathi Tanjore Gurumani, Aleksandar Milenkovic
Electrical and Computer Engineering Department
University of Alabama in Huntsville

Outline
• Objective
• Background
• Problem Overview
• Performance Evaluation - Overview
• Experimental Setup
• Results
• Conclusion and Future Research

Problem Objective
Prove and stress the importance of designing architecture-aware compilers

Background - Application Performance
Advancement in processor technology
• Deep pipelining
• Multi-level cache hierarchy
• Improved branch predictors
• Out-of-order execution engine
• Advanced floating point
• Multimedia units
Compilers
• Optimization levels and switches
Compilers should keep up with processor technology

Architecture-aware Compilers
Compiler/hardware interaction can maximize application performance by
• Exploiting advances in processor technology
• Generating target-specific optimal code
- Path length reduction
- Efficient instruction selection
- Pipeline scheduling
- Instruction-level parallelism
- Memory penalty minimization

Performance Evaluation
A systematic process of data collection and analysis used to evaluate a system
Benchmarks -> Compile -> Exe -> Performance Metrics
Benchmark: a program that performs a strictly defined set of operations (a workload) and returns some form of result (a metric) describing how the tested computer performed.
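
The benchmark definition above can be illustrated with a minimal sketch. The workload (`sum_of_squares`) and the harness (`run_benchmark`) are hypothetical names chosen for illustration, not part of SPEC CPU2000; the metric here is simply the best wall-clock time over a few runs.

```python
import time


def sum_of_squares():
    # Hypothetical workload: a strictly defined set of operations.
    return sum(i * i for i in range(100_000))


def run_benchmark(workload, repeats=3):
    """Run the workload `repeats` times and return the best time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best


if __name__ == "__main__":
    # The returned number is the metric describing how the machine performed.
    print(f"best time: {run_benchmark(sum_of_squares):.6f} s")
```

Taking the best of several runs is one common way to reduce noise from other activity on the machine; a real suite such as SPEC defines its own run rules.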

Performance Evaluation – Previous Works
Study underlying architecture and characterize workloads
• Evaluation of Pentium Pro using SPEC 2000
• Evaluation of Pentium II using multimedia applications
Processor-centric optimization
• Xeon vs. Pentium III
• Pentium III vs. Pentium IV
Compilers and optimization
• Branch optimizations by different compilers

Problem Overview
Objective
Prove and stress the importance of architecture-aware compilers
How?
• Compile benchmarks using different compilers
• Use the same optimization switches
• Execute the binaries under a performance analyzer
• Analyze and compare the performance metrics collected
Same OS and hardware features, so any difference in metrics is due only to the compiler used

Experimental Setup
SPEC CPU2000 -> IC++ -> Exe -> VTune -> Performance Metrics
SPEC CPU2000 -> VC++ -> Exe -> VTune -> Performance Metrics
Processor: Pentium IV
Operating System: Windows 2000
Optimization Level: /O2
Input: Reference set from SPEC

SPEC CPU2000
Represents real user applications and is computation intensive
Can measure the performance of the processor, memory, and compiler
Does not stress I/O devices, networking, or the OS
Used CINT2000 and CFP2000
Name Description
164.gzip (INT) Data Compression written in C
176.gcc (INT) C Programming Language Compiler
177.mesa (FP) 3-D Graphics Library written in C
181.mcf (INT) Combinatorial Optimization written in C
186.crafty (INT) Chess – Game Playing written in C
197.parser (INT) Word Processing written in C
252.eon (INT) Computer Visualization written in C++
253.perlbmk (INT) PERL Programming Language written in C
254.gap (INT) Group Theory, Interpreter written in C
255.vortex (INT) Object Oriented database written in C

VTune Performance Analyzer
Simultaneous sampling of multiple events and real-time display using counter monitors
Supports time-based and event-based sampling
• To take advantage of Pentium IV’s EBS feature
Has low intrusion
• Samples collected provide a closer representation of the application’s actual performance
Events Collected
• Clockticks, instructions retired, loads retired, stores retired, branches retired, first-level cache misses, and mispredicted branches
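
The collected events can be turned into the derived metrics that the result charts appear to plot. The sketch below is an interpretation of the chart labels (instruction mix relative to instruction count, mispredictions relative to branches, cache misses relative to memory references). The names `EventCounts` and `derived_metrics` are hypothetical, the sample counts are made up, and treating memory references as loads plus stores is an assumption.

```python
from dataclasses import dataclass


@dataclass
class EventCounts:
    clockticks: int
    instructions: int
    loads: int
    stores: int
    branches: int
    cache_misses: int
    mispredicted: int


def derived_metrics(c):
    """Compute ratio metrics from raw event counts."""
    other = c.instructions - c.loads - c.stores - c.branches
    references = c.loads + c.stores  # assumption: memory references = loads + stores
    return {
        "cpi": c.clockticks / c.instructions,
        "load_fraction": c.loads / c.instructions,
        "store_fraction": c.stores / c.instructions,
        "branch_fraction": c.branches / c.instructions,
        "other_fraction": other / c.instructions,
        "mispredict_ratio": c.mispredicted / c.branches,
        "cache_miss_ratio": c.cache_misses / references,
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    sample = EventCounts(1500, 1000, 300, 100, 200, 20, 10)
    for name, value in derived_metrics(sample).items():
        print(f"{name}: {value:.3f}")
```

Ratios of this kind make two binaries comparable even when their absolute instruction counts differ.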

Compiler Optimizations
• Both compilers were used with the /O2 option
• Both invoke the same switches, with the same functions
• Microsoft VC++ has special switches to target Pentium (/G5) and Pentium Pro (/G6)
• The Intel C++ compiler optimizes performance for applications running on Intel architecture-based computers
Option  Effect
/Od     Disable optimization
/O1     Minimize size
/O2     Maximize speed
Performance gains from using IC++ are the result of
- profile-guided optimization
- pre-fetch instruction
- support for Streaming SIMD Extensions (SSE)
- data prefetching
- inter-procedural optimization

Comparison of Clock Ticks
On average, 10% performance gain with IC++
Performance gain more pronounced for the 3-D graphics library and the computer visualization application
[Figure: Clockticks Ratio by application, 164 through 255; series: MSVC++ and IC++]
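
A ratio chart of this kind can be produced by normalizing each benchmark's count to a baseline compiler. The sketch below assumes MSVC++ is the baseline and that the average gain is an arithmetic mean of the per-benchmark ratios; the slides do not state which mean was used. The benchmark names and counts are hypothetical, not the measured data.

```python
def normalized_ratios(baseline, candidate):
    """Return candidate/baseline for every benchmark in the baseline."""
    return {name: candidate[name] / baseline[name] for name in baseline}


def mean_gain(ratios):
    """Average gain, taken here as 1 minus the arithmetic mean of the ratios."""
    return 1.0 - sum(ratios.values()) / len(ratios)


if __name__ == "__main__":
    # Hypothetical clocktick counts, for illustration only.
    msvc = {"bench_a": 1000, "bench_b": 2000}
    icpp = {"bench_a": 900, "bench_b": 1600}
    ratios = normalized_ratios(msvc, icpp)
    print(ratios)
    print(f"mean gain: {mean_gain(ratios):.1%}")
```

A ratio below 1 means the candidate binary needed fewer clockticks than the baseline for the same workload.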

Comparison of Binaries
Code size (in bytes)
Benchmark      MSVC++       IC++
164.gzip       69,632       77,824
176.gcc        1,089,536    1,314,816
177.mesa       442,368      610,304
181.mcf        49,152       53,248
186.crafty     241,664      258,048
197.parser     118,784      131,072
252.eon        405,504      413,696
253.perlbmk    516,096      651,264
254.gap        356,352      413,696
255.vortex     417,792      454,656
VC++ produced smaller binaries
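
The size difference can be quantified directly from the table. The sketch below computes the relative growth of each IC++ binary over its MSVC++ counterpart; the byte counts are copied from the table, while the names `CODE_SIZE` and `growth_percent` are illustrative.

```python
# (MSVC++ bytes, IC++ bytes), copied from the code size table.
CODE_SIZE = {
    "164.gzip": (69_632, 77_824),
    "176.gcc": (1_089_536, 1_314_816),
    "177.mesa": (442_368, 610_304),
    "181.mcf": (49_152, 53_248),
    "186.crafty": (241_664, 258_048),
    "197.parser": (118_784, 131_072),
    "252.eon": (405_504, 413_696),
    "253.perlbmk": (516_096, 651_264),
    "254.gap": (356_352, 413_696),
    "255.vortex": (417_792, 454_656),
}


def growth_percent(sizes):
    """Percentage by which the IC++ binary exceeds the MSVC++ binary."""
    return {name: 100.0 * (ic - ms) / ms for name, (ms, ic) in sizes.items()}


if __name__ == "__main__":
    for name, pct in sorted(growth_percent(CODE_SIZE).items()):
        print(f"{name:12s} {pct:5.1f}%")
```

For this table the growth ranges from about 2% (252.eon) to about 38% (177.mesa), and every IC++ binary is larger than its MSVC++ counterpart.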

Comparison of Instruction Count
The 3-D graphics (177.mesa) and computer visualization (252.eon) applications show a much larger reduction in instruction count than the others
[Figure: Instruction Count by application, 164 through 255; series: MSVC++ and IC++]

Comparison of Loads
[Figure: Distribution of Loads (label: Icount) by application, 164 through 255; series: MSVC++ and IC++]

Comparison of Stores
[Figure: Distribution of Stores (label: Icount) by application, 164 through 255; series: MSVC++ and IC++]

Comparison of Branches
[Figure: Distribution of Branches (label: Icount) by application, 164 through 255; series: MSVC++ and IC++]
[Figure: Mispredicted Branches Ratio (label: Branches) by application, 164 through 255; series: MSVC++ and IC++]

Comparison of Other Instructions
[Figure: Distribution of Other Instructions (label: Icount) by application, 164 through 255; series: MSVC++ and IC++]

Comparison of Cache Misses
[Figure: I-Level Cache Misses Ratio (label: References) by application, 164 through 255; series: MSVC++ and IC++]

Conclusion & Future Research
Execution characteristics of the SPEC CPU2000 benchmarks were presented for VC++ and IC++
IC++ performed better than VC++ for all considered applications; the gain was more pronounced for graphics applications
The distributions of loads, stores, and branches were the same; the difference was in absolute numbers
No difference in branch prediction and memory references
Use: strengths and weaknesses of compilers
Future Directions
• Different optimization switches
• Usage of microbenchmarks for better control
Thank You!
Questions and Feedback…