![Page 1: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/1.jpg)
![Page 2: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/2.jpg)
Cray Aries Custom Interconnect
2/11/2013 2
![Page 3: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/3.jpg)
Aries vs. Gemini
| Feature | Benefit |
| --- | --- |
| 3X increase in sustained injection bandwidth to 10 GB/sec | Improves the efficiency of a wide range of communication-intensive applications |
| Large increase in global bandwidth, from 3X at the low end to 20X or more at the high end | Benefits applications with complex communication patterns (unstructured meshes, adaptive mesh refinement, search and sort, data mining, etc.) |
| Hardware support for collectives | Especially helpful on large jobs (1000s of cores) that use collective operations. HW support is about a 2X improvement over our best SW algorithms |
| 3X increase in the rate of small puts and gets to 120M/sec | Key for new programming environments (PGAS), important to our DARPA mission partners and some of our more advanced customers |
| PCI-Express Gen3 interface | Benefits Cray, reduces our dependency on a particular CPU vendor |
![Page 4: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/4.jpg)
Cray XC30 Network
● The Cray XC30 system is built around the idea of optimizing
interconnect bandwidth and associated cost at every level
● Rank-1, PC board: ¢¢¢
● Rank-2, passive copper (Cu): $
● Rank-3, active optics: $$$$
![Page 5: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/5.jpg)
Cray XC30 System Building Blocks
● Compute Blade: 4 compute nodes
● Chassis (Rank 1 Network): 16 compute blades, 64 compute nodes, no cables
● Group (Rank 2 Network): passive electrical network, 2 cabinets, 6 chassis, 384 compute nodes
● System (Rank 3 Network): active optical network, hundreds of cabinets, up to 10s of thousands of nodes
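The node counts in the building blocks above follow from multiplying up the hierarchy. The sketch below reproduces that arithmetic; the constants come from the slide, while `groups_needed` is a hypothetical helper added only for illustration.

```python
# Counts taken from the building-blocks slide.
NODES_PER_BLADE = 4
BLADES_PER_CHASSIS = 16
CHASSIS_PER_GROUP = 6


def nodes_per_chassis() -> int:
    """Compute nodes on one Rank 1 network (one chassis)."""
    return NODES_PER_BLADE * BLADES_PER_CHASSIS


def nodes_per_group() -> int:
    """Compute nodes on one Rank 2 network (one two-cabinet group)."""
    return nodes_per_chassis() * CHASSIS_PER_GROUP


def groups_needed(total_nodes: int) -> int:
    """Hypothetical helper: smallest number of full groups holding total_nodes."""
    return -(-total_nodes // nodes_per_group())  # ceiling division


if __name__ == "__main__":
    print(nodes_per_chassis())    # 64
    print(nodes_per_group())      # 384
    print(groups_needed(10_000))  # 27
```

A system "up to 10s of thousands of nodes" therefore spans dozens of groups, which is where the Rank 3 optical network comes in.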
![Page 6: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/6.jpg)
Cray XC30 Modular Blades
![Page 7: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/7.jpg)
Cray XC30 Compute Blade Architecture
![Page 8: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/8.jpg)
Cray XC30 Fully Populated Compute Blade
![Page 9: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/9.jpg)
Cray XC I/O Module
Figure labels: PCIe card slots, riser assembly, Intel 2600 Series processor, Aries
![Page 10: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/10.jpg)
Cray XC30 Dragonfly Topology
![Page 11: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/11.jpg)
Cray XC30 Rank1 Network
o Chassis with 16 compute blades
o 128 Sockets
o Inter-Aries communication over backplane
o Per-packet adaptive routing
![Page 12: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/12.jpg)
Cascade – Local Electrical Network
● 4 nodes connect to a single Aries
● 16 Aries connected by backplane: “Green Network”
● 6 backplanes connected with copper cables in a 2-cabinet group: “Rank-2 Network”
● Active optical cables interconnect groups: “Rank-3 Network”
● 2-cabinet group: 768 sockets
![Page 13: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/13.jpg)
Cray XC30 Rank-2 Cabling
● Cray XC30 two-cabinet group
● 768 Sockets
● 96 Aries Chips
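The socket and Aries counts quoted for a chassis and a two-cabinet group can be cross-checked against the per-Aries figures given earlier. This is a small sketch of that check; the value of two sockets per node is an inference from the slides (128 sockets for the 64 nodes of a chassis), not something stated directly.

```python
# From the slides: 16 Aries per chassis, 6 chassis per group, 4 nodes per Aries.
ARIES_PER_CHASSIS = 16
CHASSIS_PER_GROUP = 6
NODES_PER_ARIES = 4
# Assumption: inferred from "128 Sockets" per 64-node chassis.
SOCKETS_PER_NODE = 2


def aries_per_group() -> int:
    """Aries chips in one two-cabinet group."""
    return ARIES_PER_CHASSIS * CHASSIS_PER_GROUP


def sockets_per_chassis() -> int:
    """CPU sockets served by one chassis backplane."""
    return ARIES_PER_CHASSIS * NODES_PER_ARIES * SOCKETS_PER_NODE


def sockets_per_group() -> int:
    """CPU sockets in one two-cabinet group."""
    return aries_per_group() * NODES_PER_ARIES * SOCKETS_PER_NODE


if __name__ == "__main__":
    print(aries_per_group())      # 96
    print(sockets_per_chassis())  # 128
    print(sockets_per_group())    # 768
```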
![Page 14: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/14.jpg)
Cray XC30 Routing
● With adaptive routing we select between minimal and non-minimal paths based on load
● The Cray XC30 Class-2 Group has sufficient bandwidth to support full injection rate for all 384 nodes with non-minimal routing
● Minimal route between any two nodes in a group is just two hops
● Non-minimal route requires up to four hops
Figure: routing diagram within a group, with points labeled S, D, M, and R
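The two-hop and four-hop figures can be illustrated with a toy model of one group. The sketch below assumes a particular wiring that the slides do not spell out: every Aries is linked to all others in its chassis (Rank 1), and to the Aries in the same slot of every other chassis (Rank 2). Under that assumption a minimal route changes slot and chassis at most once each, and a non-minimal route is two minimal routes joined at an intermediate Aries. Hops here mean Aries-to-Aries link traversals; all function names are illustrative.

```python
from itertools import product

CHASSIS_PER_GROUP = 6   # from the slides
ARIES_PER_CHASSIS = 16  # from the slides

# Every Aries in a group, named by (chassis, slot).
ALL_ARIES = list(product(range(CHASSIS_PER_GROUP), range(ARIES_PER_CHASSIS)))


def minimal_hops(src, dst):
    """Link traversals on a minimal route under the assumed wiring."""
    rank1_hop = int(src[1] != dst[1])  # change slot inside a chassis
    rank2_hop = int(src[0] != dst[0])  # change chassis, keeping the slot
    return rank1_hop + rank2_hop


def non_minimal_hops(src, via, dst):
    """Route minimally to an intermediate Aries, then minimally to dst."""
    return minimal_hops(src, via) + minimal_hops(via, dst)


def worst_case_minimal():
    return max(minimal_hops(s, d) for s, d in product(ALL_ARIES, repeat=2))


def worst_case_non_minimal():
    return max(
        non_minimal_hops(s, v, d) for s, v, d in product(ALL_ARIES, repeat=3)
    )


if __name__ == "__main__":
    print(worst_case_minimal())      # 2
    print(worst_case_non_minimal())  # 4
```

The exhaustive search over all 96 Aries confirms the worst cases in this model: two hops minimal, four hops non-minimal.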
![Page 15: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/15.jpg)
Cray XC30 – Rank-3 Network
● An all-to-all pattern is wired between the groups using optical cables (blue network)
● The global bandwidth can be tuned by varying the number of optical cables in the group-to-group connections
Example: A 7-group system is interconnected with 21 optical “bundles”. Each bundle can be 2 or more cables wide, subject to the group limit.
Figure: Group 0 through Group 6, wired all-to-all
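The 21 bundles in the 7-group example are the number of distinct group pairs in an all-to-all wiring, n(n-1)/2. A minimal sketch of that count follows; the function names are illustrative, and the cable total simply multiplies by the bundle width, using the minimum width of 2 quoted above.

```python
def optical_bundles(groups: int) -> int:
    """Number of group-to-group bundles in an all-to-all wiring."""
    if groups < 1:
        raise ValueError("need at least one group")
    return groups * (groups - 1) // 2


def bundles_per_group(groups: int) -> int:
    """Bundles terminating at each group (one to every other group)."""
    return groups - 1


def total_cables(groups: int, cables_per_bundle: int = 2) -> int:
    """Optical cables in the system for a uniform bundle width."""
    return optical_bundles(groups) * cables_per_bundle


if __name__ == "__main__":
    print(optical_bundles(7))    # 21
    print(bundles_per_group(7))  # 6
    print(total_cables(7))       # 42
```

Widening the bundles raises global bandwidth linearly in this count, which is the tuning knob the slide describes.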
![Page 16: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/16.jpg)
Adaptive Routing over the Blue Network
● An all-to-all pattern is wired between the groups
● Assume the minimal path from Group 0 to Group 3 becomes congested
● Traffic can “bounce off” any other intermediate group
● This doubles the load on the network but uses the full system bandwidth more effectively
Figure: Groups 0 through 4
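The "bounce off an intermediate group" idea can be sketched as a choice between one direct global link and two links through a third group. The code below is only an illustration of that trade-off, not a description of how the Aries hardware makes its decision: the load table, the cost rule (sum of link loads), and the tie-break in favor of the minimal path are all assumptions.

```python
def link(a, b):
    """Undirected global link between two groups."""
    return frozenset((a, b))


def choose_route(src, dst, groups, load):
    """Return the list of groups visited, preferring the minimal path on ties.

    `load` maps link(a, b) to a hypothetical congestion figure; links that
    are absent from the table count as idle.
    """
    best = [src, dst]
    best_cost = load.get(link(src, dst), 0)
    for via in groups:
        if via in (src, dst):
            continue
        cost = load.get(link(src, via), 0) + load.get(link(via, dst), 0)
        if cost < best_cost:
            best, best_cost = [src, via, dst], cost
    return best


def global_links_used(route):
    """Global links a route occupies: 1 if minimal, 2 if it bounces."""
    return len(route) - 1


if __name__ == "__main__":
    groups = range(5)
    print(choose_route(0, 3, groups, {}))                # [0, 3]
    print(choose_route(0, 3, groups, {link(0, 3): 10}))  # [0, 1, 3]
```

The bounced route occupies two global links instead of one, which is the doubling of network load the slide mentions.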
![Page 17: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/17.jpg)
Copper & Optical Cabling
Figure labels: optical connections, copper connections
![Page 18: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/18.jpg)
Cray Software
![Page 19: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/19.jpg)
Cray Programming Environment Distribution: Focus on Performance and Productivity
● Programming Languages: Fortran, C, C++, Python
● Compilers: Cray Compiling Environment (CCE), GNU, 3rd Party Compilers
● Programming models
• Distributed Memory (Cray MPT): MPI, SHMEM
• Shared Memory: OpenMP 3.0, OpenACC
• PGAS & Global View: UPC (CCE), CAF (CCE), Chapel
● Tools
• Environment setup: Modules
• Debuggers: DDT, lgdb
• Debugging Support Tools: Abnormal Termination Processing, STAT
• Performance Analysis: CrayPat, Cray Apprentice2
• Scoping Analysis: Reveal
● Optimized Scientific Libraries: LAPACK, ScaLAPACK, BLAS (libgoto), Iterative Refinement Toolkit, Cray Adaptive FFTs (CRAFFT), FFTW, Cray PETSc (with CASK), Cray Trilinos (with CASK)
● I/O Libraries: NetCDF, HDF5
● Legend: Cray developed; Licensed ISV SW; 3rd party packaging; Cray added value to 3rd party
![Page 20: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/20.jpg)
Cray Software Ecosystem
● CrayPAT
● Cray Apprentice2
● Cray Iterative Refinement Toolkit
● Cray PETSc, CASK
● DVS
● GNU
● Reveal
● Cray Linux Environment
![Page 21: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/21.jpg)
An Adaptive Linux OS optimized specifically for HPC
ESM – Extreme Scalability Mode
• No compromise scalability
• Low-Noise Kernel for scalability
• Native Comm. & Optimized MPI
• Application-specific performance tuning and scaling
CCM – Cluster Compatibility Mode
• No compromise compatibility
• Fully standard x86/Linux
• Standardized Communication Layer
• Out-of-the-box ISV Installation
• ISV applications simply install and run
Linux as the foundation
• Solid, robust, feature rich
• Familiar, easy-to-use
CLE run mode is set by the user on a job-by-job basis to provide full flexibility
![Page 22: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/22.jpg)
Cray Operating Systems Focus
Performance
• Maximize compute cycles delivered to applications while also providing necessary services
• Lightweight operating system on compute node
• Standard Linux environment on service nodes
• Optimize network performance through close interaction with hardware
• GPU infrastructure to support high performance
Stability and Resiliency
• Correct defects which impact stability
• Implement features to increase system and application robustness
Scalability
• Scale to large system sizes without sacrificing stability
• Provide system management tools to manage complicated systems
![Page 23: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/23.jpg)
Benchmarks
![Page 24: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/24.jpg)
Performance comparison: XE-IL vs XC30
| Tests (Units) | XE-Interlagos | XC | XC/XE |
| --- | --- | --- | --- |
| HPL (Tflops) | ~81% | ~86% | 106% |
| Star DGEMM (Gflops) | ~87% | ~102% | 117% |
| STREAMs (Gbytes/s/node) | 72 | 78 | 108% |
| RandomRing (Gbytes/s/rank) | ~0.055 | ~0.141 | 256% |
| Point-to-Point BW (Gbytes/s) | 2.8-5.6 | >8.5 | 157% - 314% |
| Nearest Node Point-to-Point Latency (usec) | 1.6-2.0 | <1.4 | 116% - 145% |
| GUPs | 2.66 | 15.6 | 525% |
| GFFT (Gflops) | 628 | 2221 | 354% |
| HAMR Sort (GiElements/sec) | 9.4 | 36.6 | 390% |

● Typical XE-IL vs. XC-Sandybridge for 256-512 compute nodes
● Actual results will depend on system size and configuration, and on problem size and definition
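The XC/XE column appears to be the XC value divided by the XE-Interlagos value, expressed as a percentage. The sketch below recomputes it for four of the throughput rows and compares against the listed figure; the one-point tolerance is an assumption to allow for rounding in the slide. The GUPs row is left out because its transcribed values (2.66 and 15.6) give roughly 586% rather than the listed 525%, and the transcript alone does not show which figure is off.

```python
# (XE-Interlagos value, XC value, listed XC/XE percent), as transcribed.
ROWS = {
    "STREAMs (Gbytes/s/node)": (72, 78, 108),
    "RandomRing (Gbytes/s/rank)": (0.055, 0.141, 256),
    "GFFT (Gflops)": (628, 2221, 354),
    "HAMR Sort (GiElements/sec)": (9.4, 36.6, 390),
}


def xc_over_xe_percent(xe: float, xc: float) -> int:
    """XC result as a percentage of the XE result, rounded to a whole number."""
    return round(100 * xc / xe)


def matches_listed(xe: float, xc: float, listed: int, tolerance: int = 1) -> bool:
    """True if the recomputed ratio is within `tolerance` points of the slide."""
    return abs(xc_over_xe_percent(xe, xc) - listed) <= tolerance


if __name__ == "__main__":
    for name, (xe, xc, listed) in ROWS.items():
        print(name, xc_over_xe_percent(xe, xc), listed)
```

Note that for the latency row the listed range is consistent with the inverse ratio (XE divided by XC), since lower latency is better.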
![Page 25: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/25.jpg)
WRF NewConus 2.5KM Benchmark
Figure: GFLOP/s (axis 0 to 18000) versus cores (axis 0 to 20000). Series: XE6 MC12 2.1GHz Gemini 1.1; XT6 MC12 2.1GHz SeaStar; IBM Power6; Intel Nehalem 6-core 2.93GHz; Intel Nehalem QC 2.8GHz; IBM BG/P; IBM BG/L; Cascade SB8 2.6GHz.
Computation and halo exchange costs only. http://www.mmm.ucar.edu/wrf/WG2/benchv3/
Pete Johnsen, Cray, Inc.
![Page 26: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/26.jpg)
AcuSolve* (CFD) Performance 70M ASMO model
At 16 nodes:
• Cray XE6: 512 Interlagos cores = 331 sec
• Cray XC30: 256 Sandy Bridge cores = 187 sec
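The two 16-node data points above can be turned into per-node and per-core comparisons. This is a small worked calculation using only the four numbers on the slide; the function names are illustrative.

```python
NODES = 16
XE6 = {"cores": 512, "seconds": 331}   # Interlagos
XC30 = {"cores": 256, "seconds": 187}  # Sandy Bridge


def cores_per_node(system: dict) -> int:
    return system["cores"] // NODES


def node_speedup() -> float:
    """How much faster the XC30 finishes on the same number of nodes."""
    return XE6["seconds"] / XC30["seconds"]


def core_seconds(system: dict) -> int:
    """Total core time consumed by the run."""
    return system["cores"] * system["seconds"]


def per_core_advantage() -> float:
    """Ratio of core-seconds consumed, XE6 over XC30."""
    return core_seconds(XE6) / core_seconds(XC30)


if __name__ == "__main__":
    print(cores_per_node(XE6), cores_per_node(XC30))  # 32 16
    print(round(node_speedup(), 2))                   # 1.77
    print(round(per_core_advantage(), 2))             # 3.54
```

So at equal node count the XC30 run is about 1.77 times faster while using half as many cores, which works out to roughly 3.5 times less core time for this model.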
Figure: AcuSolve ASMO model, elapsed time (sec.) versus cores, for Cray XE6 and Cray XC30; axis ticks run from 32 to 2048. Lower is better.
* Pre-release version of AcuSolve from Altair
![Page 27: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/27.jpg)
LS-DYNA benchmark Two car crash simulation, 2.4M elements, Hybrid parallel
Figure: elapsed time (s), ticks 1400 to 22400, versus number of cores, 0 to 5000, for Cascade and XE6. Lower is better.
XC30 provides significantly better per-core performance
![Page 28: Cray Aries Custom Interconnect - National Energy … · Cray XC30 Network The Cray XC30 system is built around the idea of optimizing interconnect bandwidth and associated cost at](https://reader033.vdocuments.us/reader033/viewer/2022052309/5b355f0d7f8b9a8b4b8d0179/html5/thumbnails/28.jpg)
The End