Overview of SPEC and the
SPEC High Performance Group
Robert Henschel, Sandra Wienke, Bo Wang, Sunita Chandrasekaran, Jimmy Cheng,
Guido Juckeland, Junjie Li, Verónica G. Vergara Larrea
http://go.iu.edu/21BQ
June 24, 2018
Contents
• Tutorial Overview
• Intro to SPEC and SPEC HPG
• The SPEC Benchmark Philosophy
• SPEC HPG Benchmarks
ISC'18 SPEC Tutorial - Part A: Overview
Tutorial Overview
• Overview of SPEC and SPEC HPG (30 min)
• How to get, setup and run the SPEC benchmarks (2 h)
• How to interpret and compare SPEC benchmark results (45 min)
• Conclusion and Wrap-Up (10 min)
Contents
• Intro to SPEC and SPEC HPG
• The SPEC Benchmark Philosophy
• SPEC HPG Benchmarks
Standards Performance Evaluation Corporation (SPEC)
• SPEC is a non-profit corporation formed in 1988 to establish, maintain and endorse
standardized benchmarks and tools to evaluate performance and energy efficiency
for the newest generation of computing systems.
• Composed of four groups
• Graphics and Workstation Performance Group (GWPG)
• High Performance Group (HPG)
• Open Systems Group (OSG)
• Research Group (RG)
• https://www.spec.org
• https://www.spec.org/hpg/
Open Systems Group (OSG): largest & oldest group
• Cloud
• CPU
• Java
• Power
• Virtual Machine
• File Server
High Performance Group (HPG): HPC benchmarks
• MPI
• OpenMP
• Accelerator
- OpenCL
- OpenACC
- OpenMP 4.5
SPEC Members
135 organizations as of April 2018, including:
- 99 companies
- 36 academic institutions
SPEC High Performance Group (HPG)
HPG develops benchmarks for high-performance computing systems, using real-world applications.
30 organizations as of April 2018:
- 10 companies
- 20 academic institutions
Contents
• Intro to SPEC and SPEC HPG
• The SPEC Benchmark Philosophy
• SPEC HPG Benchmarks
SPEC Benchmark Philosophy
• The result of a SPEC benchmark suite is always a SPEC score.
• Higher is better
• Some benchmarks also have a power score, in addition to a performance score
• This score is always in relation to a reference machine.
• Each benchmark has its own reference machine
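The score formation can be sketched as follows: each component's measured run time is compared to the reference machine's time for that component, and the overall score is (in essence) the geometric mean of those ratios. The numbers below are made up for illustration only; they are not actual SPEC reference times:

```shell
# Illustrative sketch of a SPEC-style score (made-up times, not real reference data):
# ratio_i = reference_time_i / measured_time_i, score = geometric mean of the ratios.
awk 'BEGIN {
  split("100 200 400", ref, " ")   # reference machine times in seconds (hypothetical)
  split("50 100 100",  run, " ")   # system under test times in seconds (hypothetical)
  for (i = 1; i <= 3; i++) s += log(ref[i] / run[i])
  printf "score: %.2f\n", exp(s / 3)
}'
```

Ratios of 2, 2, and 4 give a geometric mean of about 2.52; higher is better, and a score above 1 means the system under test is faster than the reference machine for these components.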
SPEC Benchmark Philosophy cont’d
• SPEC (HPG) benchmarks are full applications.
• Including all the overhead of a real application
• SPEC harness ensures correctness of results.
• To detect “overly aggressive optimization”
• To guard against tampering
• Each benchmark suite has a set of run rules.
SPEC Benchmark Philosophy cont’d
• Hierarchy within benchmark suites
• Benchmark suite, e.g., SPEC OMP2012
• Dataset size, e.g., Large (“gross”)
• Component, e.g., 350.md
SPEC Benchmark Philosophy cont’d
• Benchmarks support “Base” and “Peak” configurations
• These yield separate SPEC scores.
• “Peak” runs allow for more freedom.
• Base runs
• The same compiler switches for all components
• The same parallelism
• Only portability switches allowed
SPEC Power
• SPEC provides a standard methodology to measure and report power usage, which can be incorporated into a SPEC benchmark.
• Normalizes the power usage across the full run of the suite
SPEC Power
[Diagram: a power and temperature daemon collects readings from a power meter and a temperature sensor attached to the System Under Test (SUT)]
Benchmark Development Process
• Group effort, with lots of discussions
• Final decisions are by vote, even though we strive for consensus
• Technical and managerial parts
• Find benchmark components and define run rules
• Using SPEC provided tools
• GIT, harness, “common rules”
• Websites, mailing lists, meeting venues
Result Submission Process
• Obtain and install the benchmark
• Perform a valid run
• Supply hardware and software description
• Submit result for review (and publication) to SPEC HPG
• 2 week review process
• (Define embargo period)
• Use the result as you like, respecting the SPEC fair use guidelines.
The Value of a Curated Result Repository
• Given appropriate hardware, a published result should be reproducible just with the information available in the submission.
• Peer-reviewed results are so much better than “everyone can upload a result”!
• The value of a benchmark suite lies in public results, their
correctness and the ability to compare them.
• SPEC has demonstrated that benchmarks can be sustainable over a long period of time!
Contents
• Intro to SPEC and SPEC HPG
• The SPEC Benchmark Philosophy
• SPEC HPG Benchmarks
SPEC CPU – Not an HPG Benchmark!!
• SPEC CPU (2006 and 2017) is the best-known SPEC benchmark.
• Created by the Open Systems Group of SPEC
• HPG uses the same framework: if you are familiar with running SPEC CPU, you can run SPEC HPG benchmarks (and the other way around!).
SPEC OMP2012
• Follow-on to SPEC OMP2001
• 14 applications
• Scales up to 512 threads
• Support for power measurement
• Citation:
Matthias S. Müller, John Baron, William C. Brantley, Huiyu Feng, Daniel Hackenberg, Robert Henschel, Gabriele
Jost, Daniel Molka, Chris Parrott, Joe Robichaux, Pavel Shelepugin, Matthijs van Waveren, Brian Whitney, and
Kalyan Kumaran. 2012. SPEC OMP2012 -- an application benchmark suite for parallel systems using OpenMP.
In Proceedings of the 8th international conference on OpenMP in a Heterogeneous World (IWOMP'12), Barbara
M. Chapman, Federico Massaioli, Matthias S. Müller, and Marco Rorro (Eds.). Springer-Verlag, Berlin, Heidelberg,
223-236. DOI=http://dx.doi.org/10.1007/978-3-642-30961-8_17
SPEC ACCEL
• SPEC Accel provides a comparative performance measure of
• Hardware accelerator devices (GPU, Co-processors, etc.)
• Supporting software tool chains (Compilers, Drivers, etc.)
• Host systems and accelerator interface (CPU, PCIe, etc.)
• Computationally-intensive parallel HPC applications and mini-apps
• Portable across multiple accelerators
• Three distinct benchmarks
• OpenACC v1.2
• OpenCL v1.1
• OpenMP 4.5
• Support for power measurement
SPEC MPI2007
• Large and medium data sets
• 13 applications
• Scales to 2048 MPI processes
• Power not supported
Under Development
• SPEC MPI+X
• First hybrid benchmark to address larger heterogeneous systems
• https://www.spec.org/hpg/search/
Join and Contribute
• Submit results
• Full members vs. associate members
• Contribute benchmark components
• Help with benchmark suite development
• Test release candidates
Result Submissions by Benchmark
[chart not included]
Annual Result Submissions
[chart not included]
Thank you!
ISC Feedback Forms: https://2018.isc-program.com/
“Using the SPEC HPG Benchmarks for Better Analysis and Evaluation of Current
and Future HPC Systems”
Contact
• SPEC Headquarters: [email protected]
• Robert Henschel, Indiana University, USA: [email protected]
• Sandra Wienke, RWTH Aachen University, Germany: [email protected]
• Bo Wang, RWTH Aachen University, Germany: [email protected]
Or meet us at the ISC18 show floor: JARA booth B-1320!
Thank you!
Questions?
How to get, setup and run
SPEC benchmarks
Robert Henschel, Sandra Wienke, Bo Wang, Sunita Chandrasekaran,
Guido Juckeland, Junjie Li, Verónica G. Vergara Larrea
http://go.iu.edu/21BQ
Contents
• Cluster login
• Overview of system requirements
• How to get SPEC benchmarks?
• Benchmark acquisition & licensing
• Download & unpacking
• How to setup SPEC benchmarks?
• Installation
• How to run SPEC benchmarks?
• Benchmark components & workloads
• Runspec & run rules
• Configuration files
• From base to peak runs
• Switch of compiler
• How to publish SPEC benchmark
results?
• Output files
• Reportable runs
• Process of publishing
June 24, 2018 ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks
Cluster login
• Read section “Overview” in handout
• Follow instructions in section “Setup”
in handout
• Then, stop.
• Follow along interactive demo
• Interactive demo time!
• We present SPEC OMP config files
• Outlook on SPEC ACCEL OpenACC
• Opportunity to follow the instructions interactively (also see handouts)
• Later: run benchmarks
• Choose OpenMP or OpenACC (whatever you like)
• Note: MPI runs take very long – not covered here
• Interpret results
hands-on
If you have any problems, let us know immediately! We are happy to help you!
System Requirements
• Different benchmark suites have different requirements
• SPEC OMP2012, SPEC MPI2007, SPEC ACCEL
• Supported operating systems: AIX, Linux, MacOS, Solaris, Windows (except very
old Windows)
• Please do not use Windows/Unix compatibility products
• Compatible processors
• CPU
• GPU
• APU
• Xeon Phi
OpenMP: http://spec.org/omp2012/Docs/system-requirements.html
MPI: http://spec.org/mpi2007/Docs/system-requirements.html
ACCEL: https://www.spec.org/accel/Docs/system-requirements.html
System Requirements
• Memory requirements
• OpenMP: 28GB for the whole system
• MPI: 1GB/rank (medium size) and 2GB/rank (large size)
• ACCEL: 4GB of host mem + 2GB of device mem
• Otherwise, you are measuring your paging file, not your system
• Disk space requirements
• OpenMP: 8GB
• MPI: 10GB (medium), 17GB (large, big endian), 24GB (large, little endian)
• ACCEL: 9GB
• Support of compilers
• C99, C++98 and Fortran-95 compilers + MPI library for SPEC MPI 2007
Acquisition of SPEC Benchmarks
• Single SPEC suites
• Commercial license
• Non-profit license
• SPEC membership
• Receive benchmarks for free
http://spec.org/order.html
https://www.spec.org/hpg/joining.html
Let’s go shopping!
Non-profit organizations get 100% off
• Commercial license
• Must be purchased via order form
• Required for commercial enterprises (that are not SPEC/HPG members) engaging in marketing, developing, testing, consulting for, and/or selling computers, computer services, accelerator devices, drivers, software, or other high-performance computing systems or components in the computer marketplace
• Non-commercial license
• Free of charge
• Organizations that do not require a
commercial license
• Valid for the organization (not individual)
• Institutional e-mail address required
Benchmark Suite Non-Profit Commercial
CPU2017 V1.0.2 $250 $1,000
ACCEL V1.2 free $2,000
OMP2012 V1.0 free $2,000
MPI2007 V2.0.1 free $2,000
SPECpower_ssj2008 V1.12 $400 $1,600
Press Release 2018: https://www.spec.org/news/hpgnonprofitpricing.html
Non-commercial download + definition: http://spec.org/hpgdownload.html
Download
Typically an ISO image, obtained via the order form or downloaded as a member.
Unpacking (when you can mount ISO images)

$> md5sum -c omp2012-1.0.iso.xz.md5                      # check for correct download
$> xz -d omp2012-1.0.iso.xz                              # unpack archive
$> mkdir spec-iso
$> mount -t iso9660 -o loop,ro omp2012-1.0.iso spec-iso  # mount ISO image in subdirectory

Useful hint: generate a subdirectory for every benchmark suite and move the downloads there!
Unpacking (when you cannot mount ISO images)
• Use the tar.xz file (available to members and upon request)
• OR: Copy tar archive from the ISO
• Use tools such as isoinfo or mc
$> md5sum -c omp2012-1.0.tar.xz.md5   # check for correct download
$> tar xvJf omp2012-1.0.tar.xz        # unpack archive

$> isoinfo -J -l -i omp2012-1.0.iso   # list files in ISO image
$> isoinfo -J -i omp2012-1.0.iso -x /install_archives/omp2012.tar.xz.md5 > omp2012.tar.xz.md5
$> isoinfo -J -i omp2012-1.0.iso -x /install_archives/omp2012.tar.xz > omp2012.tar.xz
# Extract the md5 file and tar ball from the ISO image
# (-i: ISO image; -x: extracts to stdout, redirect needed)
# Then: unpack the tar ball (see above)
Installation
http://spec.org/omp2012/Docs/install-guide-unix.html
$> ./install.sh
SPEC OMP2012 Installation
Top of the OMP2012 tree is '/home/hpclab00/SPEC_OpenMP'
Installing FROM /home/hpclab00/SPEC_OpenMP
Installing TO /home/hpclab00/SPEC_OpenMP
Is this correct? (Please enter 'yes' or 'no')
yes
Install SPEC suite [-d <dest dir>]
Type “yes” and hit enter
hands-on
Installation
The following toolsets are expected to work on your platform. If the
automatically installed one does not work, please re-run install.sh and
exclude that toolset using the '-e' switch.
The toolset selected will not affect your benchmark scores.
linux-suse10-amd64 For 64-bit AMD64/EM64T Linux systems running
SuSE Linux 10 or later, and other
compatible Linux distributions, including
some versions of RedHat Enterprise Linux
and Oracle Linux Server.
Built on SuSE Linux 10 with
GCC v4.1.0 (SUSE Linux)
linux-redhat72-ia32 For x86, IA-64, EM64T, and AMD64-based Linux
systems with GLIBC 2.2.4+.
Built on RedHat 7.2 (x86) with gcc 3.1.1
=================================================================
Attempting to install the linux-suse10-amd64 toolset...
Attempts to automatically determine the platform
Installation
=================================================================
Attempting to install the linux-suse10-amd64 toolset...
Checking the integrity of your source tree...
Checksums are all okay.
Removing previous tools installation
Unpacking binary tools for linux-suse10-amd64...
Checking the integrity of your binary tools...
Checksums are all okay.
Testing the tools installation (this may take a minute)
........................................................................o.....................
..............................................................................................
..............................
Automatic unpacking of files
Automatic testing of installation
Loading SPEC tools
[..]
Installation successful. Source the shrc or cshrc in
/home/hpclab00/SPEC_OpenMP
to set up your environment for the benchmark.
Hint how to proceed
$> . ./shrc
hands-on
Source shrc or cshrc! Without this nothing will work!
Sets up environment variables and paths for SPEC, e.g., $SPEC to the root path.
Benchmark components
• Coming from real-world applications
• May have completely different characteristics
• Identification: number and name
• First number: affiliation to specific SPEC suite
• Documentation: website/ benchmark tree
• Language, application domain,…
Intel VTune Amplifier results @ 2x Intel Broadwell-EP w/ 24 cores:

Intel VTune Amp.        350.md   363.swim
CPI rate                0.698    1.634
CPU utilization         69 %     23.6 %
Memory bound            11.8 %   48.5 %
% of packed FP instr.   9.7 %    100 %

Description of benchmarks: https://www.spec.org/omp2012/Docs/index.html
Workloads
Data set sizes
• test: data for a simple test that an executable is functional
• train: data for feedback-directed optimization
• ref: the real data set, required for all result reporting
Runtime increases from a few seconds to tens of minutes per benchmark
(depending on your configuration)
SPEC Tools
Tools are provided to ensure consistent operation of the benchmarks across a variety of platforms.
• specmake
• GNU make to build benchmarks
• runspec
• Primary tool in the suite
• Used to build the benchmarks, run them, and report their results
• Config file needed for usage (with detailed instructions on building/running the benchmarks)
• rawformat
• Results formatter needed for publishing SPEC results
• And more
Tool overview: http://spec.org/omp2012/Docs/tools-build.html
$> . ./shrc
$> runspec
runspec - Run SPEC benchmarks
• Select benchmarks (suite or single ones)
• Specify config file
• Specify options
• E.g., data set size, iteration number, thread number
• Specify actions
• E.g., compile single benchmarks, run them and validate them
$> runspec --config=tutISC18-openmp-intel --tune=base,peak 350

• --config: use a specific config file
• --tune=base,peak: base & peak runs
• 350: run the selected (single) benchmark; or run the whole suite, i.e., all
• Command-line options can overwrite parameters in the config file
• Reminder: shrc must be sourced before usage!

Runspec: https://www.spec.org/omp2012/Docs/runspec.html
Run rules
• Part of SPEC’s philosophy
(see Part A “Overview of SPEC and SPEC HPG”)
• Aim: fair and objective benchmarking → published results are meaningful, comparable to other results, and reproducible
• Public SPEC results must adhere to these rules (license agreement), or must be clearly described as estimates
• Estimates
• May fail to provide one or more of the characteristics of public results
• E.g., new chip design, prototype/research compilers
• Still encouraged to obey as many of the run rules as practical
• Deviations from the rules must be clearly disclosed
OpenMP: https://www.spec.org/omp2012/docs/runrules.html
MPI: https://www.spec.org/mpi/docs/runrules.html
ACCEL: https://www.spec.org/accel/docs/runrules.html
You can use SPEC results
as “estimates” in your
research environment (also for publications)!
Run rules
Run rules also cover (amongst others)
• Building
• Restrictions for base & peak runs
• Portability flags must be approved
• Running
• Data set sizes pre-specified (test, train, ref) (1)
• Number of iterations in a reportable result
OpenMP: https://www.spec.org/omp2012/docs/runrules.html
MPI: https://www.spec.org/mpi/docs/runrules.html
ACCEL: https://www.spec.org/accel/docs/runrules.html
(1) For MPI2007: medium and large data sets
Details see later
Test Sponsor: entity sponsoring the testing (defaults to hardware vendor); can be your university.
Tester: entity actually carrying out the tests (defaults to test sponsor); can be your name.
Config Files
• Contain instructions for
• Building benchmarks
• Running them
• Description of system under test
• How to write a config file?
• Often start off using a config file that someone else has previously written (1)
• E.g., directory $SPEC/config/
• E.g., SPEC result submissions similar to your system (2)
• Write your own (3)
Key for reproducibility!
(1) https://www.spec.org/accel/docs/runspec.html#about_config
(2) https://www.spec.org/omp2012/results/omp2012.html
(3) https://www.spec.org/accel/docs/config.html
Source: as of May 2018 (2)
Config Files
Use Case:
How to investigate single-node performance with SPEC OMP2012?
What we will investigate…
1. Basics on writing config files
2. From base to peak runs
3. Switch of compilers
Structure of Config Files
#########################################################
# The header section of the config file. Must appear
# before any instances of "default="
#########################################################
# what to do: build, validate = build + run + check + report
action = validate
# Number of iterations of a test
iterations = 1
# Tuning levels: base, peak
tune = base
# Dataset size: test, train, ref
size = test
# Number of used threads
threads = 48
# Environment variables will be set using "ENV_*"
env_vars = 1
# Output format: all, pdf, text, html and so on
output_format = text
flagsurl = $[top]/../flagsfile/Intel.xml
teeout = yes
# Run benchmarks according to your specific system config
# The variable "command" is the command used by spec
# submit = <system command> $command
#########################################################
# Software information
#########################################################
# Compilers. Using Intel compiler for example
default=default=default=default:
CC = icc
CXX = icpc
FC = ifort
F77 = ifort -f77rtl -fpconstant -intconstant
#########################################################
# Base
#########################################################
default=base=default=default:
OPTIMIZE = -O3 -qopenmp -ipo -xCORE-AVX2 -no-prec-div
COPTIMIZE = -ansi-alias
CXXOPTIMIZE = -ansi-alias
FOPTIMIZE = -align
# Environment variables at runtime
ENV_OMP_PROC_BIND = close
ENV_OMP_PLACES = cores
ENV_OMP_NESTED = FALSE
ENV_OMP_DYNAMIC = FALSE
#################################################################
# Portability flags for each benchmark
# The following flags should not have any impact on performance.
#################################################################
350.md=default=default=default:
FPORTABILITY = -free
367.imagick=default=default=default:
CPORTABILITY = -std=c99
357.bt331=default=default=default:
PORTABILITY = -mcmodel=medium
363.swim=default=default=default:
PORTABILITY = -mcmodel=medium
#################################################################
# Peak
#################################################################
350.md=peak=default=default:
OPTIMIZE = -O3 -qopenmp -ipo -xCORE-AVX2 -ansi-alias -opt-malloc-options=1
FOPTIMIZE = -fp-model fast=2 -no-prec-div -no-prec-sqrt -align array64byte
363.swim=peak=default=default:
OPTIMIZE = -O3 -qopenmp -ipo -xCORE-AVX2 -no-prec-div -ansi-alias -opt-streaming-stores always -opt-malloc-options=4
#################################################################
# Hardware and software information for the machine under test.
# This information will be extracted for a reportable run.
# An example configuration can be copied from the website
# https://www.spec.org/accel/results/accel_acc.html
#################################################################
company_name = SPEC Tutorial Company
test_sponsor = SPEC Tutorial Sponsor
tester = SPEC Tutorial Tester
license_num = SPEC Tutorial License
machine_name = SPEC Tutorial Machine
hw_vendor = NEC
hw_avail = NOV-2016
hw_cpu = Intel Xeon E5-2650 v4
hw_cpu_mhz = 2200
hw_cpu_max_mhz = 2900
hw_cpu_char000 = Intel Turbo Boost Technology up to 2.9GHz,
hw_cpu_char001 = Hyper-Threading on
hw_disk = SATA, Samsung SM863, 120GB, SSD
hw_fpu = Integrated
hw_memory = 128 GB (8 x 16 GB 2Rx8 PC4-2400T-R)
hw_model = NEC HPC 1812Rg
hw_ncpu = 24 cores, 2 chips, 12 cores/chip (HT on)
hw_nchips = 2
hw_ncores = 24
hw_ncoresperchip= 12
hw_nthreadspercore = 2
hw_ocache = None
hw_pcache = 32 KB I + 32 KB D on chip per core
hw_scache = 256 KB I+D on chip per core
hw_tcache = 30 MB I+D on chip per chip
sw_avail = Feb-2016
sw_compiler000 = C/C++/Fortran: Version 16.0.2.181 of Intel
sw_compiler001 = Parallel Studio XE
sw_file = NFS
sw_os000 = CentOS Linux release 7.3.1611 (Core)
sw_os001 = 3.10.0-514.26.2.el7.x86_64
sw_other = None
############################################################
# MD5 section. It will be created by SPEC automatically.
# It is used by SPEC to check whether an executable, if
# available, was created using the current compiler and flag
# settings.
#######################################################
Header section: 1st section, prior to named sections; usually runspec flags.
Named section: begins with a “section marker”, a one- to four-part string of the form benchmark[,…]=tuning[,…]=extension[,…]=machine[,…]:
MD5 section: automatically generated.
SPEC OMP Config File (1/8)
######################################################################
# The header section of the config file. Must appear
# before any instances of "default="
######################################################################
# what to do: build, validate = build + run + check + report
action = validate
# Number of iterations of a test
iterations = 1
# Tuning levels: base, peak
tune = base
# Dataset size: test, train, ref
size = test

• action: build compiles the benchmarks; validate builds them if necessary, runs them, and generates reports
• iterations: how many times to run each benchmark (e.g., 3 for a reportable run)
• tune: base uses the same flags for all benchmarks; peak allows a set of optimizations individually selected for each benchmark
• size: data set sizes (from small to big): test, train, ref (e.g., test for debugging a new set of compilation options)

config file: $SPEC/config/tutISC18-openmp-intel.cfg
hands-on
SPEC OMP Config File (2/8)
######################################################################
# The header section of the config file. Must appear
# before any instances of "default="
######################################################################
# Number of used threads
threads = 48
# Environment variables will be set using "ENV_*", see the next section
env_vars = 1
# Output format: all, pdf, text, html and so on
output_format = text
flagsurl = $[top]/../flagsfile/Intel.xml
teeout = yes

• threads: sets the number of threads (default 1); overwrites OMP_NUM_THREADS
• env_vars: enables environment settings via ENV_VAR = …; these apply to the build phase → rebuild if any changes
• output_format: different output formats possible; results are placed in $SPEC/results
• flagsurl: description of portability & tuning options (“flags file”) with information on the syntax of flags and their meanings; needed for valid reports
  Flags file: http://spec.org/omp2012/Docs/flag-description.html
  Variable substitution: http://spec.org/omp2012/Docs/config.html#sectionI.D
• teeout: displays the build commands to screen

config file: $SPEC/config/tutISC18-openmp-intel.cfg
hands-on
SPEC OMP Config File (3/8)
# How to run the benchmarks according to your specific system configuration
# The variable "command" is the command used by spec
# submit = <system command> $command
# e.g., submit = aprun -n 1 $command      (example 1: run job on one node)
# e.g., submit = taskset -c 0-23 $command (example 2: assign job to cores)

• How to execute the benchmarks: use $command for the SPEC command
• Preferred way to assign work to processors; may place benchmarks on desired processors or benchmark memory on a desired memory unit
• Especially needed for MPI runs
• Can be used to change the runtime environment: submit = export ENV_VAR=…; …
  → no rebuild if any changes occur

config file: $SPEC/config/tutISC18-openmp-intel.cfg
SPEC OMP Config File (4/8)
####################################################################
# Software information
####################################################################
# Compilers. Using Intel compiler for example
default=default=default=default:
CC = icc
CXX = icpc
FC = ifort
F77 = ifort -f77rtl -fpconstant -intconstant
############################################################################
# Base
############################################################################
default=base=default=default:
OPTIMIZE = -O3 -qopenmp -ipo -xCORE-AVX2 -no-prec-div
COPTIMIZE = -ansi-alias
CXXOPTIMIZE = -ansi-alias
FOPTIMIZE = -align

• Settings for base runs
  • Compiler flags or environment variables
• Named section: benchmark[,…]=tuning[,…]=extension[,…]=machine[,…]:

hands-on
config file: $SPEC/config/tutISC18-openmp-intel.cfg
SPEC OMP Config File (5/8)
June 24, 2018 32
# Environment variables at runtime
ENV_OMP_PROC_BIND = close
ENV_OMP_PLACES = cores
ENV_OMP_NESTED = FALSE
ENV_OMP_DYNAMIC = FALSE
############################################################################
# Portability flags for each benchmark
# The following flags should not have any impact on performance.
############################################################################
350.md=default=default=default:
FPORTABILITY = -free
357.bt331=default=default=default:
PORTABILITY = -mcmodel=medium
363.swim=default=default=default:
PORTABILITY = -mcmodel=medium
367.imagick=default=default=default:
CPORTABILITY = -std=c99

• Set portability flags if a benchmark cannot be built or executed correctly without them
• Environment variables
  • Active if env_vars is set to 1 (see prior slides)
  • Need to start with ENV_

config file: $SPEC/config/tutISC18-openmp-intel.cfg
hands-on
SPEC OMP Config File (6/8)
June 24, 2018 33
############################################################################
# Hardware and software information for the machine under test.
# This information will be extracted for a reportable run.
# An example configuration can be copied from the website
# https://www.spec.org/accel/results/accel_acc.html
############################################################################
company_name = SPEC Tutorial Company
test_sponsor = SPEC Tutorial Sponsor
tester = SPEC Tutorial Tester
license_num = SPEC Tutorial License
machine_name = SPEC Tutorial Machine
hw_vendor = NEC
hw_avail = NOV-2016
hw_cpu = Intel Xeon E5-2650 v4
hw_cpu_mhz = 2200
hw_cpu_max_mhz = 2900

• HW & SW description: needed only for reportable runs
  • The runspec tool captures this information in the submission file
  • Very detailed information on the host configuration, e.g. CPU

hands-on
config file: $SPEC/config/tutISC18-openmp-intel.cfg
SPEC OMP Config File (7/8)
June 24, 2018 34
hw_cpu_char000 = Intel Turbo Boost Technology up to 2.9GHz,
hw_cpu_char001 = Hyper-Threading on
hw_disk = SATA, Samsung SM863, 120GB, SSD
hw_fpu = Integrated
hw_memory = 128 GB (8 x 16 GB 2Rx8 PC4-2400T-R)
hw_model = NEC HPC 1812Rg
hw_ncpu = 24 cores, 2 chips, 12 cores/chip (HT on)
hw_nchips = 2
hw_ncores = 24
hw_ncoresperchip= 12
hw_nthreadspercore = 2
hw_ocache = None
hw_pcache = 32 KB I + 32 KB D on chip per core
hw_scache = 256 KB I+D on chip per core
hw_tcache = 30 MB I+D on chip per chip
ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks?
config file: $SPEC/config/tutISC18-openmp-intel.cfg
SPEC OMP Config File (8/8)
June 24, 2018 35
sw_avail = Feb-2016
sw_compiler000 = C/C++/Fortran: Version 16.0.2.181 of Intel
sw_compiler001 = Parallel Studio XE
sw_file = NFS
sw_os000 = CentOS Linux release 7.3.1611 (Core)
sw_os001 = 3.10.0-514.26.2.el7.x86_64
sw_other = None
############################################################################
# MD5 section. It will be created by SPEC automatically.
# It is used by SPEC to check whether an available executable was created
# using the current compiler and flags settings.
############################################################################
• Information on the software configuration, e.g. compilers
• MD5 section
  • Automatically generated by the SPEC tools
  • Used to check whether an executable was created using the current settings

config file: $SPEC/config/tutISC18-openmp-intel.cfg
Run the config file – Base run
• Run in base mode
• Use train data set
• Execute benchmarks 350.md and 363.swim
June 24, 2018 ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks? 36
$> runspec --config=tutISC18-openmp-intel[-complete] --tune=base --size=train 350 363
hands-on
batch script: $SPEC/config/tutISC18-openmp-intel.sh
$> bsub < tutISC18-openmp-intel.sh
Use your completed file or our provided file *-complete
hands-on
Submit to batch system.
Note: Machine is only available on the day of the tutorial.

Runtime [s]    Base Intel    Peak Intel    Base GNU
350.md         38.4
363.swim       4.40
(train data set, 48 threads, 2x Intel Broadwell EP @ 2.2 GHz (CLAIX), *24 threads)
Flags - From base to peak runs
Portability flags
• Allowed if a benchmark cannot be built and executed correctly without these flags
• Must be performance neutral
• Exception to the base rules: portability flags may differ from one benchmark to another (even in base)
• Requirements
  • Provided via the PORTABILITY flags
  • Must be approved by the SPEC HPG committee
June 24, 2018 ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks? 37
https://www.spec.org/omp2012/docs/runrules.html#section2.2.4
links
Flags - From base to peak runs
Base runs (recap from Part A)
• Common set of optimizations & environment settings for all benchmarks
• “baseline”
• single set of switches
• single-pass make process
• high degree of portability, safety, performance
• Must adhere to strict rules
• e.g. same compiler for all modules of a given language
• All flags, options must be the same, e.g., also the level of parallelism
• Only portability switches allowed
• More rules (base & peak) on names, library substitutions, data type sizes, source code changes
June 24, 2018 ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks? 38
Base optimization rules: https://www.spec.org/omp2012/Docs/runrules.html#section2.3
links
Flags - From base to peak runs
Peak runs
• Set of optimizations individually selected for each benchmark
• e.g. different compilers, flags
• Called “aggressive compilation”
Summary
• Many published results do not contain peak results (often coming from academic
institutions)
• Results submitted by vendors often contain peak results
June 24, 2018 ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks? 39
Peak optimization rules: https://www.spec.org/omp2012/Docs/runrules.html#section2.4
Published results OMP2012: https://www.spec.org/omp2012/results/omp2012.html
links
Flags - From base to peak runs
• Modifying the config file
• Once you have a config file that runs on your system, it is easy to modify it
• E.g. peak optimizations for better performance
• Example: tuning of two benchmarks
  • 350.md
    • Fortran
    • Physics: Molecular Dynamics
    • Computationally intensive
  • 363.swim
    • Fortran
    • Weather Prediction
    • Memory-bandwidth limited
June 24, 2018 40ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks?
SPEC OMP Config File – Peak run
June 24, 2018 41
############################################################################
# Peak
############################################################################
default=peak=default=default:
OPTIMIZE = -O3 -qopenmp -ipo -xCORE-AVX2 -no-prec-div
COPTIMIZE = -ansi-alias
CXXOPTIMIZE = -ansi-alias
FOPTIMIZE = -align
# [..] Environment variables
350.md=peak=default=default:
OPTIMIZE=-O3 -qopenmp -ipo -xCORE-AVX2 -ansi-alias -qopt-malloc-options=1
FOPTIMIZE=-fp-model fast=2 -no-prec-div -no-prec-sqrt -align array64byte
363.swim=peak=default=default:
OPTIMIZE=-O3 -qopenmp -ipo -xCORE-AVX2 -ansi-alias -qopt-streaming-stores always
-qopt-malloc-options=4
threads=24
• -qopt-malloc-options: alternate algorithm for malloc
• -fp-model fast=2: aggressive optimization of FP computations
• -no-prec-sqrt: less precise square-root computations / more performance
• -align array64byte: align arrays to 64 byte
• -qopt-streaming-stores always: use non-temporal stores (write through)
ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks?
FP optimizations
memory optimizations
Run the config file – Peak run
• Run in peak mode
• Use train data set
• Execute benchmarks 350.md and 363.swim
June 24, 2018 ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks? 42
$> runspec --config=tutISC18-openmp-intel[-complete] --tune=peak --size=train 350 363
hands-on
batch script: $SPEC/config/tutISC18-openmp-intel.sh
$> bsub < tutISC18-openmp-intel.sh
Use your completed file or our provided file *-complete
hands-on
Submit to batch system.
Note: Machine is only available on the day of the tutorial.

Runtime [s]    Base Intel    Peak Intel    Base GNU
350.md         38.4          34.4
363.swim       4.40          4.35*
(train data set, 48 threads, 2x Intel Broadwell EP @ 2.2 GHz (CLAIX), *24 threads)
Switch of compilers
• Rewriting the config file
• Major changes in a config file (e.g. compiler) need more adaptation
• Changes in compiler names, optimization flags, portability flags, peak tuning
• Example: from Intel to GNU compiler
• “Mix and match”
1. Start from previous Intel config file (that contains information on your hardware)
2. Use a published GNU config file for first ideas on flags
3. Optimize/ correct flags with respect to your hardware
June 24, 2018 ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks? 43
SPEC OMP Config File – Different compiler
June 24, 2018 ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks? 44
#######################################################################
# Software information
#######################################################################
# Compilers. Using GNU compiler for example
default=default=default=default:
CC = gcc
CXX = g++
FC = gfortran
F77 = gfortran
#######################################################################
# Base
#######################################################################
default=base=default=default:
OPTIMIZE = -Ofast -march=native -fopenmp -O3
# Environment variables at runtime
[..]

• New compiler executables
• New base flags
• Same or new environment variables

config file: $SPEC/config/tutISC18-openmp-gnu.cfg
hands-on
SPEC OMP Config File – Different compiler
June 24, 2018 ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks? 45
#######################################################################
# Portability flags for each benchmark
# The following flags should not have any impact on performance.
#######################################################################
350.md=default=default=default:
FPORTABILITY = -ffree-form -fno-range-check
357.bt331=default=default=default:
FPORTABILITY = -mcmodel=medium
363.swim=default=default=default:
FPORTABILITY = -mcmodel=medium
367.imagick=default=default=default:
CPORTABILITY = -std=c99
############################################################################
# Peak
############################################################################
[..]
New portability flags
New peak flags
config file: $SPEC/config/tutISC18-openmp-gnu.cfg
SPEC OMP Config File – Different compiler
June 24, 2018 ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks? 46
#######################################################################
# Hardware and software information for the machine under test.
# This information will be extracted for a reportable run.
# An example configuration can be copied from the website
# https://www.spec.org/omp2012/results/omp2012.html
#######################################################################
# Hardware information [..]
sw_avail = Feb-2016
sw_compiler = C/C++/Fortran: Version 8.1.0 of GNU
sw_file = NFS
sw_os000 = CentOS Linux release 7.3.1611 (Core)
sw_os001 = 3.10.0-514.26.2.el7.x86_64
sw_other = None

• Same hardware information
• New compiler information

config file: $SPEC/config/tutISC18-openmp-gnu.cfg
Run the config file – Different compiler
• Run in base mode
• Use train data set
• Execute benchmarks 350.md and 363.swim
June 24, 2018 ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks? 47
$> runspec --config=tutISC18-openmp-gnu[-complete] --tune=base --size=train 350 363
hands-on
$> bsub < tutISC18-openmp-gnu.sh
Use your completed file or our provided file *-complete
hands-on
Submit to batch system.
Note: Machine is only available on the day of the tutorial.

Runtime [s]    Base Intel    Peak Intel    Base GNU
350.md         38.4          34.4          220
363.swim       4.40          4.35*         4.56
(train data set, 48 threads, 2x Intel Broadwell EP @ 2.2 GHz (CLAIX), *24 threads)
batch script: $SPEC/config/tutISC18-openmp-gnu.sh
Config Files
Use Case:
How to investigate single-GPU performance with SPEC ACCEL?
What we will investigate (as an outlook)…
• SPEC OpenACC with PGI compiler
Not covered here
• SPEC OpenCL
• SPEC OpenMP with target offloading
June 24, 2018 ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks? 48
Later: Have some (hands-on) time to run SPEC
benchmarks
Choose from OpenMP or OpenACC (whatever
you like)
Later: Make more tests & look at results
SPEC ACCEL Config File (1/5)
June 24, 2018 49
######################################################################
# The header section of the config file. Must appear
# before any instances of "default="
######################################################################
# what to do: build, validate = build + run + check + report
action = validate
# Number of iterations of a test
iterations = 1
# Tuning levels: base, peak
tune = base
# Dataset size: test, train, ref
size = test
# Output format: all, pdf, text, html and so on
output_format = text
flagsurl = $[top]/../flagsfile/pgi.xml
teeout = yes
# How to run the benchmarks according to your specific system configuration
# The variable "command" is the command used by spec:
# submit = <system command> $command
• Nothing new here
• No setting of threads
config file: $SPEC/config/tutISC18-openacc-pgi.cfg
ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks?
SPEC ACCEL Config File (2/5)
June 24, 2018 50
####################################################################
# Software information
####################################################################
# Compilers. Using PGI compiler for example
default=default=default=default:
CC = pgcc
CXX = pg++
FC = pgfortran
#######################################################################
# Portability flags for each benchmark
# The following flags should not have any impact on performance.
#######################################################################
359.miniGhost=default=default=default:
EXTRA_LDFLAGS += -Mnomain
ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks?
config file: $SPEC/config/tutISC18-openacc-pgi.cfg
New compilers PGI for OpenACC
SPEC ACCEL Config File (3/5)
June 24, 2018 51
#######################################################################
# Base
#######################################################################
openacc=base=default=default:
OPTIMIZE = -fast -Mfprelaxed
FOPTIMIZE = -acc -ta=tesla:cc60
COPTIMIZE = -acc -ta=tesla:cc60
#######################################################################
# Peak
#######################################################################
openacc=peak=default=default:
OPTIMIZE = -fast -Mfprelaxed
FOPTIMIZE = -acc -ta=tesla:cc60
COPTIMIZE = -acc -ta=tesla:cc60
363.swim=peak=default=default:
FOPTIMIZE = -acc -ta=tesla:cc60,pinned

• pinned: use pinned memory
• Base settings for the openacc suite
• Peak settings for the openacc suite

config file: $SPEC/config/tutISC18-openacc-pgi.cfg
SPEC ACCEL Config File (4/5)
June 24, 2018 52
############################################################################
# Hardware and software information for the machine under test.
# This information will be extracted for a reportable run.
# An example configuration can be copied from the website
# https://www.spec.org/accel/results/accel_acc.html
############################################################################
company_name = SPEC Tutorial Company
test_sponsor = SPEC Tutorial Sponsor
tester = SPEC Tutorial Tester
license_num = SPEC Tutorial License
machine_name = SPEC Tutorial Machine
hw_vendor = NEC
hw_avail = NOV-2016
hw_cpu = Intel Xeon E5-2650 v4
hw_cpu_mhz = 2200
hw_cpu_max_mhz = 2900
[..]
hw_tcache = 30 MB I+D on chip per chip

• HW & SW description: nothing new so far

config file: $SPEC/config/tutISC18-openacc-pgi.cfg
SPEC ACCEL Config File (5/5)
June 24, 2018 53
sw_avail = Oct-2017
sw_compiler = PGI Accelerator Fortran/C/C++ Server, Release 17.10
sw_file = NFS
sw_os000 = CentOS Linux release 7.3.1611 (Core)
sw_os001 = 3.10.0-514.26.2.el7.x86_64
sw_other = NVIDIA CUDA 7.5
sw_state = Multi-user, run level 3
sw_base_ptrsize = 64-bit
hw_accel_connect = PCIe 2.0 16x
hw_accel_desc000 = NVIDIA Tesla P100-SXM2 GPU, 3584 CUDA cores, 1480MHz,
hw_accel_desc001 = 16 GB GDDR5 RAM
hw_accel_ecc = yes
hw_accel_model = Tesla P100
hw_accel_name = NVIDIA Tesla P100
hw_accel_type = GPU
hw_accel_vendor = NVIDIA
sw_accel_driver = NVIDIA UNIX x86_64 Kernel Module 390.46
# MD5 section [..]
• sw_compiler: PGI compiler
• sw_other: CUDA version
• hw_accel_* / sw_accel_driver: additional hardware information on the accelerator (here: an NVIDIA Pascal GPU)

config file: $SPEC/config/tutISC18-openacc-pgi.cfg
Run the config file – GPU run
• Run base and peak
• Use ref data set (to see difference between base and peak)
• Execute benchmarks 350.md and 363.swim
June 24, 2018 ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks? 54
$> runspec --config=tutISC18-openacc-pgi-complete --tune=all --size=train 350 363
batch script: $SPEC/config/tutISC18-openacc-pgi.sh
$> bsub < tutISC18-openacc-pgi.sh
hands-on
Submit to batch system.
Note: Machine is only
available on day of tutorial
• Run selected benchmark: 350
• Or: whole suite, e.g., openacc,
opencl, openmp
hands-on
Summary - SPEC OMP2012 & SPEC ACCEL
• SPEC OMP2012 for Host CPU
• Benchmarks implemented using OpenMP for host
• Need to specify #threads
• Processor/core can be chosen using environment variable, such as ENV_OMP_PLACES
• SPEC ACCEL for accelerators
• Including benchmarks implemented using OpenACC, OpenCL and OpenMP target
• Don’t need to specify #threads
• With OpenCL benchmarks: select specific device, modify workgroup sizes
June 24, 2018 55ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks?
Contents
• Cluster login
• Overview of system requirements
• How to get SPEC benchmarks?
• Benchmark acquisition & licensing
• Download & unpacking
• How to setup SPEC benchmarks?
• Installation
• How to run SPEC benchmarks?
• Benchmark components & workloads
• Runspec & run rules
• Configuration files
• From base to peak runs
• Switch of compiler
• How to publish SPEC benchmark
results?
• Output files
• Reportable runs
• Process of publishing
June 24, 2018 ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks? 56
Output files
• SPEC runs create results in result subdirectory
• Text files, “.txt”,
• Preview of the result as it would look on the SPEC website
• Log files, “.log”, “.log.debug”
• Verbose output of the benchmark run
• Raw files, “.rsf”,
• Above the “line” are editable fields about the run such as system or software configuration
• Below the “line” are the encoded results. Tampering with the results will corrupt the file.
June 24, 2018 ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks? 57
Publish results on SPEC website
• Publishing SPEC HPG results helps to get a rich set of different HW, compilers,
configurations, etc.
• But it’s not required
  • Note: non-SPEC members pay a publication fee
• Recap (Part A): Result Submission Process
  1. Obtain and install the benchmark
  2. Perform a valid run: adhere to all run rules + create a config file + do a reportable run
  3. Supply the hardware and software description: edit the documentation portion of the results ((raw) file)
  4. Submit the result for review (and publication) to SPEC HPG
     • 2-week review process
     • (Define embargo period)
  5. Use the result as you would like
June 24, 2018 ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks? 58
https://www.spec.org/omp2012/docs/runrules.html#section4.7
https://www.spec.org/hpg/submitting_results.html
links
Run rules
A published result means
1. Performance observation testing
• Generally no code modifications of provided sources allowed
• Tester supplies compiler, system, config files
• Tester provides description of performance-relevant conditions
2. Declaration of expected performance reproducing
• Observed level of performance obtainable by others (e.g., used by vendors)
• Components (e.g., hardware, OS) obtainable by others
3. Claim about maturity of performance methods
• E.g., correct code generation & improved performance for a class of programs larger than the
SPEC suite
June 24, 2018 ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks? 59
OpenMP: https://www.spec.org/omp2012/docs/runrules.html
MPI: https://www.spec.org/mpi/docs/runrules.html
ACCEL: https://www.spec.org/accel/docs/runrules.html
Defs: https://www.spec.org/omp2012/Docs/runrules.html#section4.2.1
links
Test Sponsor: entity
sponsoring the testing
(defaults to hardware
vendor) or can be
your university
Tester: entity actually
carrying out the tests
(defaults to test sponsor)
or can be your name
Reportable runs
Create a valid/compliant result: runspec --reportable [..]
• --tune [base|all]
• Entire SPEC suite (no single benchmarks)
• Workload: test, train, ref will be run; the ref results are taken
  • Verification for all three data set sizes
• #iterations = 3; the median is taken
• #threads: one fixed number in base (variable per benchmark in peak)
• Configuration disclosure (in config file or with rawformat – see next slide)
June 24, 2018 ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks? 60
Reportable run: https://www.spec.org/omp2012/Docs/runspec.html#section3.1.1
https://www.spec.org/omp2012/Docs/runspec.html#reportable
links
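The median-of-three rule above can be sketched in a few lines of Python. The benchmark names match the tutorial; the runtimes are made-up values for illustration only:

```python
from statistics import median

# A reportable run executes each benchmark 3 times on the ref workload
# and takes the median runtime (made-up runtimes in seconds).
runtimes = {
    "350.md": [38.4, 39.1, 38.7],
    "363.swim": [4.52, 4.40, 4.38],
}

# Select the median of the three iterations per benchmark.
selected = {name: median(times) for name, times in runtimes.items()}
print(selected)  # {'350.md': 38.7, '363.swim': 4.4}
```

Using the median rather than the mean makes the reported runtime robust against a single outlier iteration.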
Preparing a result for submission
• Flags and platform files
• XML files containing detailed descriptions of the compiler flags and platform settings.
• Edit documentation portion of results: rawformat
• Script used to format a raw file into text, html, Postscript, or PDF
• Also performs a submission check to determine whether a result is valid
June 24, 2018 ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks? 61
$> rawformat outputfile.rsf
   Runs offline verification of the result (similar to the submission check);
   produces the same output as online

$> rawformat -F path/to/flagsfile.xml
   Adds the flags file to the result
Rawformat: https://www.spec.org/omp2012/docs/utility.html#rawformat
links
Useful hint:
Make a backup copy of the
rawfile before editing.
Submitting results to SPEC
• Submission of SPEC results
1. Process your rsf-file through rawformat
to check for anything missing/ faulty
2. Attach your rsf-file to an e-mail to, e.g.,
3. Receive a reply with a sub-file attached
4. For updates, modify the sub-file and
attach to an e-mail to, e.g.,
• Submitted results reviewed before
publication by SPEC committee
• Schedule: 2 weeks until reply (see (3))
June 24, 2018 ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks? 62
https://www.spec.org/omp2012/docs/runrules.html#section4.7
https://www.spec.org/hpg/submitting_results.html
links
Source: https://www.spec.org/omp2012/results/omp2012.html, as of May 2018
Feedback
ISC Feedback Forms: https://2018.isc-program.com/
“Using the SPEC HPG Benchmarks for Better Analysis and Evaluation of Current
and Future HPC Systems”
Contact
• SPEC Headquarters: [email protected]
• Robert Henschel, Indiana University, USA: [email protected]
• Sandra Wienke, RWTH Aachen University, Germany: [email protected]
• Bo Wang, RWTH Aachen University, Germany: [email protected]
June 24, 2018 ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks? 63
Or meet us at the
ISC18 show floor:
JARA booth B-1320!
Contents
• Cluster login
• Overview of system requirements
• How to get SPEC benchmarks?
• Benchmark acquisition & licensing
• Download & unpacking
• How to setup SPEC benchmarks?
• Installation
• How to run SPEC benchmarks?
• Benchmark components & workloads
• Runspec & run rules
• Configuration files
• From base to peak runs
• Switch of compiler
• How to publish SPEC benchmark
results?
• Output files
• Reportable runs
• Process of publishing
June 24, 2018 ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks? 64
Hands-On
Follow the instructions on the
hands-on handout
Either complete config files (if not already done) or use *-complete files
You can test any configuration you like!
• Different flags
• Different runspec options
• Different benchmark suites,…
Additional small tasks are provided
Have a look at the results (result folder)
June 24, 2018 65ISC'18 SPEC Tutorial, Part B: How to get, setup and run SPEC benchmarks?
hands-on
If you have any problems, let us know
immediately! We are happy to help you!
How to interpret and compare SPEC benchmark results
Robert Henschel, Sandra Wienke, Bo Wang, Sunita Chandrasekaran, Jimmy Cheng,
Guido Juckeland, Junjie Li, Veronica G. Vergara Larrea
http://go.iu.edu/21BQ
Contents
Interpretation of Results
Comparing Results
Use Cases
June 24, 2018 ISC‘18 SPEC Tutorial, Part C: Interpreting and Comparing SPEC results 2
Contents
Interpretation of Results
Comparing Results
Use Cases
ISC‘18 SPEC Tutorial, Part C: Interpreting and Comparing SPEC results 3June 24, 2018
4
Fair Use
• Beyond creating compliant results, how the results can be used is governed by SPEC
• The source of the result must be clear
• The date of the result must be clear and correct
• All SPEC trademarks must be referenced
• Metrics must be disclosed
  • Derived metrics may be used provided the SPEC metric is given
• The basis of comparison must be disclosed (if applicable)
Full fair use rules can be found at: https://www.spec.org/fairuse.html
ISC‘18 SPEC Tutorial, Part C: Interpreting and Comparing SPEC resultsJune 24, 2018
SPEC Score
• The result is a SPEC Score, the geometric mean of all benchmark component ratios to the reference machine
  • Reference machine for SPEC OMP2012: Sun Fire X4140, 2x AMD Opteron 2384, 8 cores, 2 chips, 4 cores/chip, 2.7 GHz
  • SPEC Score of the reference machine is “1”
• Higher is better
ISC‘18 SPEC Tutorial, Part C: Interpreting and Comparing SPEC results 5June 24, 2018
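The scoring rule above can be sketched in Python. The per-benchmark ratios are made-up values; each ratio is a benchmark's reference runtime divided by its measured runtime, so the reference machine scores 1 on every component:

```python
from math import prod  # Python 3.8+

# Made-up per-benchmark ratios (reference runtime / measured runtime);
# the reference machine would have a ratio of 1.0 for every benchmark.
ratios = [12.0, 8.5, 15.2, 9.7]

# SPEC score: geometric mean of the component ratios,
# i.e. the n-th root of the product of n ratios.
spec_score = prod(ratios) ** (1.0 / len(ratios))
print(round(spec_score, 2))
```

Because a geometric mean is used, improving any single benchmark's ratio by a factor k raises the overall score by the same factor k^(1/n), no matter which benchmark it is.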
SPEC Score
[Chart: SPEC OMP2012 scores for the HPE Superdome Flex (Intel Xeon Gold 6154, 3.00 GHz)]
ISC‘18 SPEC Tutorial, Part C: Interpreting and Comparing SPEC results 6
June 24, 2018
SPEC Score
Let’s take a look at base vs. peak:
http://spec.org/omp2012/results/res2015q2/omp2012-20150527-00065.html
• Go to the OMP2012 results and search the page for “S4TR”; pick the last of the 3 results, the one with the “v3” Xeon.
Let’s take a look at energy:
http://spec.org/omp2012/results/res2014q4/omp2012-20141021-00060.html
• Go to the OMP2012 results and search the page for “R421”; pick the first result.
ISC‘18 SPEC Tutorial, Part C: Interpreting and Comparing SPEC results 7June 24, 2018
8
Results - http://spec.org/accel/results/accel.html
ISC‘18 SPEC Tutorial, Part C: Interpreting and Comparing SPEC resultsJune 24, 2018
June 22, 2018
SPEC® ACCEL™ OMP Result    Copyright 2015-2017 Standard Performance Evaluation Corporation
SPECaccel_omp_base = 3.40
SPECaccel_omp_energy_base = 4.54
SPECaccel_omp_peak = Not Run
SPECaccel_omp_energy_peak = --
Colfax International (Test Sponsor: Indiana University)
Xeon Phi 7210
Ninja Developer Platform Pedestal: Liquid Cooled
ACCEL license: 3440A               Test date: May-2017
Test sponsor: Indiana University   Hardware Availability: Aug-2016
Tested by: Indiana University      Software Availability: Jan-2017
Hardware
CPU Name: Intel Xeon Phi 7210
CPU Characteristics: Simultaneous multithreading (SMT) on, Turbo off.
CPU MHz: 1300
CPU MHz Maximum: 1300
FPU: Integrated
CPU(s) enabled: 64 cores, 1 chip, 64 cores/chip, 4 threads/core
CPU(s) orderable: 1 to 1 chip
Primary Cache: 32 KB I + 32 KB D on chip per core
Secondary Cache: 1 MB I+D on chip per tile (2 cores)
L3 Cache: None
Other Cache: None
Memory: 96 GB (6 x 16 GB 2Rx8 PC4-2400T-REB-11, ECC) + 16 GB MCDRAM
Disk Subsystem: Intel S3510 SSD 800GB, SATA3
Other Hardware: None
Power
Power Supply: 750W
Accelerator
Accel Model Name: Xeon Phi 7210
Accel Vendor: Intel
Accel Name: Xeon Phi 7210
Type of Accel: CPU
Accel Connection: N/A
Does Accel Use ECC: Yes
Accel Description: Second generation Xeon Phi self-bootable CPU, SMT on, Turbo off, flat DDR4+MCDRAM
Accel Driver: N/A
Software
Operating System: CentOS Linux release 7.2.1511 (Core), 3.10.0-327.13.1.el7.xppsl_1.3.3.151.x86_64
Compiler: Intel Parallel Studio XE 2017 Update 1 for Linux, Version 17.0.1.132 Build 20161005
File System: ext4
System State: Run level 3 (multi-user with networking)
Other Software: None
[Annotated result screenshot: metrics, ratios, power disclosure, hardware disclosure, and detailed results]
Contents
• Interpretation of Results
• Comparing results
• Use Cases
ISC'18 SPEC Tutorial, Part C: Interpreting and Comparing SPEC results, June 24, 2018
Comparing Results
For both SPEC OpenMP2012 and SPEC MPI2007, most results are part of a scalability or comparison study:
□ Increasing MPI ranks
□ Testing different compilers
□ ...
The next charts are created from published results, so take a look online.
Note that you cannot compare results between benchmark versions or between data sets!
System and Interconnect Comparison
• Cray XC30:
- 2x Xeon E5-2697 v2 (24C)
- Cray Aries interconnect
- Cray MPI
- Dragonfly
• Stampede2 node:
- Xeon Phi 7250 (68C)
- Intel Omni-Path interconnect
- Intel MPI
- Fat tree
• NEC HPC1812Rg-2 node:
- 2x Xeon E5-2650 v4 (24C)
- Intel Omni-Path interconnect
- Intel MPI
- Fat tree
• HPE SGI 8600 node:
- 2x Xeon Gold 6148 (40C)
- Dual-rail InfiniBand 4X EDR
- HPE SGI MPI
- Enhanced hypercube
[Chart: MPI2007 Medium SPEC scores vs. number of nodes (up to 32). Series: Stampede2 (Xeon Phi + Omni-Path, 64 ranks/node), BR2+ (Xeon + Aries, 24 ranks/node), NEC HPC1812Rg-2 (Xeon + Omni-Path, 24 ranks/node), SGI 8600 (Xeon + dual InfiniBand, 40 ranks/node)]
Same Programming Model on Different Hardware
[Chart: SPEC scores (0-3) of devices used in SPEC ACCEL OpenCL submissions]
Same Programming Model on Different Hardware
[Chart: SPEC scores (0-9) of devices used in SPEC ACCEL OpenACC submissions: AMD Radeon HD 7970, AMD FirePro S9150, Intel Xeon E5-2697 v2, Intel Xeon E5-2698 v3, NVIDIA Quadro 6000, NVIDIA Tesla C2070, NVIDIA Tesla K20, NVIDIA Tesla K40, NVIDIA GeForce GTX TITAN, NVIDIA Tesla K80, NVIDIA Tesla P100 (Intel host), NVIDIA Tesla P100 (OpenPOWER host)]
Compiler Evolution – PGI and Cray OpenACC
[Chart: SPEC ACCEL OpenACC scores on the IU Cray XK7 (NVIDIA Tesla K20) for PGI 14.1, 15.3, 16.4, 17.1, 18.4 and Cray 8.2.1, 8.3.1, 8.3.8]
Comparing Different MPI Libraries and OSes
[Chart: SPEC MPI2007 Medium scores vs. core count (32, 64, 128, 256, 512) on an IBM iDataPlex (Intel Xeon L5420), comparing Windows MS_MPI-1.0.6, Linux Intel_MPI-3.1, and Linux OpenMPI-1.3.1]
SPEC OMP2012
Performance and Energy
[Chart: SPEC OMP2012 performance and energy scores for AMD Opteron 2374 HE (KVM), AMD Opteron 2374 HE (native), Intel Xeon L7555 (native), and Intel Xeon E5-2680 v3 (native)]
SPEC ACCEL OpenCL
The effect of ECC (Using results for NVIDIA K40c, base)
[Chart: ratio of OpenCL base results (0-25%) per benchmark, showing the effect of ECC on an NVIDIA K40c]
Comparing Partial Results – Academic Fair Use
• Cray and IBM compilers support OpenMP 4.5 offload to GPUs. This shows the potential of
the Cray compiler but currently only 6 of 15 benchmarks work!
RPeak: KNL-7210 2.60 TFlops; K20 1.17 TFlops; ratio: 2.2x

                SPEC Score                         Speedup
Benchmark       KNL(MCDRAM)  KNL(DDR4)  K20       KNL(MCDRAM)  KNL(DDR4)
                Intel        Intel      Cray      vs K20       vs K20
503.postencil   1.99         0.700      1.26      1.6x         0.6x
504.polbm       3.42         0.754      0.898     3.8x         0.8x
514.pomriq      2.71         2.72       1.11      2.4x         2.4x
555.pseismic    2.83         1.06       1.43      2.0x         0.7x
560.pilbdc      8.43         1.97       4.61      1.8x         0.4x
570.pbt         27.4         20.2       18.2      1.5x         1.1x
Geometric Mean                                    2.1x         0.8x
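SPEC suites aggregate per-benchmark ratios into a single score using the geometric mean, which is also how the 2.1x and 0.8x overall speedups above are obtained. A minimal Python sketch reproducing them from the rounded per-benchmark ratios:

```python
from math import prod

def geomean(ratios):
    """Geometric mean: the n-th root of the product of n ratios."""
    return prod(ratios) ** (1.0 / len(ratios))

# Per-benchmark speedups over the K20 from the table above.
mcdram_vs_k20 = [1.6, 3.8, 2.4, 2.0, 1.8, 1.5]
ddr4_vs_k20 = [0.6, 0.8, 2.4, 0.7, 0.4, 1.1]

print(round(geomean(mcdram_vs_k20), 1))  # 2.1
print(round(geomean(ddr4_vs_k20), 1))    # 0.8
```

Unlike an arithmetic mean, the geometric mean keeps one outlier benchmark from dominating the overall score, which is why SPEC uses it.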
And Now for Something Totally Different
[Charts: HPL Rmax (TFLOPS) and SPEC OpenMP 2012 scores, side by side, for 2x E5-2650 v2, 2x E5-2697 v2, 2x E5-2680 v3, and 1x KNL 7210]
• HPL vs. SPEC OpenMP 2012
• How much information is publicly available about Top500 results?
Comparing Results – Advanced Search
Let's use the search function on the SPEC home page:
□ Advanced Search: https://www.spec.org/cgi-bin/osgresults?conf=omp2012
Example: Indiana University OpenPower results, showing the compiler, sorted by compiler first and thread count second.
Yes, you could have done that using copy and paste, but imagine doing this with SPEC CPU2006 results!
Comparing Results – Advanced Search
[Chart: SPEC OpenMP2012 scores vs. number of threads (20, 40, 80, 160) on an IBM S822LC (POWER8 + NVLink) for PGI 17.10, GCC 7.2.0, and Clang/Flang 4.0.1]
Comparing Results
Dump all records as CSV:
□ https://www.spec.org/cgi-bin/osgresults?conf=omp2012
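Once downloaded, the CSV dump can be post-processed with standard tools. A minimal Python sketch; the column names "System Name" and "Base Result" used here are assumptions, so check the header line of an actual download before relying on them:

```python
import csv
import io
from collections import defaultdict

def best_base_scores(csv_file):
    """Best base score per system in a results dump.

    The column names "System Name" and "Base Result" are assumptions;
    verify them against the header line of a real download."""
    best = defaultdict(float)
    for row in csv.DictReader(csv_file):
        try:
            score = float(row["Base Result"])
        except (KeyError, ValueError):
            continue  # skip rows with a missing or non-numeric score
        name = row.get("System Name", "unknown")
        best[name] = max(best[name], score)
    return dict(best)

# Tiny hand-made dump in the assumed format (not real published data):
sample = io.StringIO(
    "System Name,Base Result\n"
    "IBM S822LC,5.8\n"
    "IBM S822LC,6.4\n"
    "Cray XK7,1.9\n")
print(best_base_scores(sample))  # {'IBM S822LC': 6.4, 'Cray XK7': 1.9}
```

The same pattern extends to filtering by compiler, thread count, or test date, which is exactly what the advanced-search form does server-side.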
Contents
• Interpretation of Results
• Comparing results
• Use Cases
Use Cases
• System, accelerator and software vendors
• Application developers
• Users and HPC centers
• Researchers
• HPC tool developers
Use Cases – Vendors
• Marketing
• Drive benchmark development
• To utilize state of the art hardware/software features
• Internal validation suite
• Compiler
• OMP / MPI runtime libraries
• Prepare for RFPs (procurement)
Use Cases – Application Developers
• Include their application in the benchmark suite
• See results on many different systems
• Compare hardware and software stack
• Compilers
• Parallel runtimes
• Different versions of processors
• Different interconnects
Use Cases – HPC Centers
• Include the benchmarks in the RFP (procurement) process
• Use them for performance regression testing
• Hardware
• Software
• System configuration and tuning
• Power consumption
Use Cases – Researchers
• Scalability studies
• Novel implementations of parallel runtime libraries
• Detailed power consumption studies
• Comparison of parallel programming paradigms
Use Cases – HPC Tool Developers
• MUST
  • Implements MPI runtime correctness analysis; reports deadlocks and mismatches in types or collective arguments; scales to more than 16k MPI ranks.
  • SPEC MPIL2007 v2 (ref) with up to 2k ranks was used for:
    • Evaluation of general tool runtime overhead, i.e., (runtime with tool) / (runtime without tool)
    • Evaluation of the influence of specific changes in the analysis or tool infrastructure (e.g., guaranteeing complete results when the application crashes)
  • Publications: http://www.itc.rwth-aachen.de/go/id/fddi/lidx/1/file/540356
• Archer / ThreadSanitizer
  • Data race analysis for OpenMP programs
  • SPEC OMP (train) with up to 12 threads was used to evaluate tool runtime overhead for data race detection.
  • Publication: http://www.itc.rwth-aachen.de/go/id/fddi/lidx/1/file/706852
• OMPT interface of the Intel/LLVM OpenMP runtime
  • OMPT (OpenMP tools interface) implementation in the LLVM/Intel OpenMP runtime
  • Requirement by Intel: negligible overhead in the absence of an OMPT tool
  • SPEC OMP2012 (ref) was used to evaluate the overhead of the OMPT implementation and as its acceptance test.
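The tool overhead metric used for MUST and the OMPT acceptance testing is a simple runtime ratio. A one-line sketch; the runtimes below are made-up illustrations, not measured results:

```python
def tool_overhead(runtime_with_tool, runtime_without_tool):
    """Relative tool overhead: (runtime with tool) / (runtime without tool).
    A value of 1.0 means no measurable overhead."""
    return runtime_with_tool / runtime_without_tool

# Hypothetical runtimes in seconds for one SPEC benchmark run.
print(round(tool_overhead(378.0, 350.0), 2))  # 1.08
```

In practice this ratio is computed per benchmark and then summarized over the suite, so that a single slow benchmark does not hide uniformly low overhead elsewhere.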
Use Cases – MUST (MPI Correctness and Deadlocks)
Use Cases – OMPT interface in OpenMP runtime
Thank you!
ISC Feedback Forms: https://2018.isc-program.com/
“Using the SPEC HPG Benchmarks for Better Analysis and Evaluation of Current
and Future HPC Systems”
Contact
• SPEC Headquarters: [email protected]
• Robert Henschel, Indiana University, USA: [email protected]
• Sandra Wienke, RWTH Aachen University, Germany: [email protected]
• Bo Wang, RWTH Aachen University, Germany: [email protected]
Or meet us at the
ISC18 show floor:
JARA booth B-1320!
Questions?
Conclusion and Wrap-Up
Robert Henschel, Sandra Wienke, Bo Wang, Sunita Chandrasekaran, Jimmy Cheng, Guido Juckeland, Junjie Li, Veronica G. Vergara Larrea
http://go.iu.edu/21BQ
What you have learned
• What the SPEC HPG benchmark suites are
• How to utilize SPEC HPG benchmarks for research and commercial performance evaluations, and for comparing systems
• How to run the SPEC HPG benchmark suites
• How to interpret and publish results
ISC'18 SPEC Tutorial - Part D: Conclusion, June 24, 2018
Future SPEC HPG Benchmarks – MPI+X
• First hybrid benchmark, posing lots of challenges for run rules and metrics
• “+X” can be anything, including OpenMP, OpenACC, CUDA, TBB, Kokkos, pthreads, …
• Search program in 2017, benchmark integration workshops happening in 2018
• More than a dozen candidates have been submitted from 3 continents and 5 different countries, with more to come.
• Monetary incentive of up to $5000 if the application makes it into the final
benchmark.
• Please talk to us if you are interested in contributing a code!
• https://www.spec.org/hpg/search/
Why and How to Contribute to SPEC HPG
• Download and run SPEC HPG benchmarks
• Use published results in your papers
• Submit results
• Join SPEC HPG
• Result review
• Contribute an application to the MPI+X benchmark
• Help with benchmark development (work with programming model experts)
• Test new benchmark kits on your hardware
• SPEC members can publish results for free
Benchmark Development Process
• Group effort, with lots of discussions
• Working with experts who are developing the programming models
• Final decisions are made by vote; we strive for consensus
• Technical and infrastructure work
• Find benchmark components and define run rules
• Using SPEC provided tools
• GIT, SPEC harness, “common rules”
• Websites, mailing lists, meeting venues
Thank you!
Questions?