
Page 1

ORNL is managed by UT-Battelle for the US Department of Energy

What does Titan tell us about preparing for exascale supercomputers?

Jack Wells
Director of Science
Oak Ridge Leadership Computing Facility
Oak Ridge National Laboratory

HPC Day, 16 April 2015
West Virginia University, Morgantown, WV

Page 2

Outline

• U.S. DOE Leadership Computing Program
• Hardware Trends: Increasing Parallelism
• The Titan Project: Accelerating Leadership Computing
  – Application Readiness & Lessons Learned
  – Science on Titan
  – Industrial Partnerships
• The Summit Project: ORNL's pre-exascale supercomputer
  – Application Readiness

Page 3

DOE's Office of Science Computation User Facilities

• DOE is a leader in open high-performance computing
• Provides the world's most powerful computational tools for open science
• Access is free to researchers who publish
• Boosts US competitiveness
• Attracts the best and brightest researchers

NERSC: Edison is 2.57 PF
OLCF: Titan is 27 PF
ALCF: Mira is 10 PF

Page 4

Origin of the Leadership Computing Facility

Department of Energy High-End Computing Revitalization Act of 2004 (Public Law 108-423): The Secretary of Energy, acting through the Office of Science, shall
• Establish and operate Leadership Systems Facilities
• Provide access [to Leadership Systems Facilities] on a competitive, merit-reviewed basis to researchers in U.S. industry, institutions of higher education, national laboratories, and other Federal agencies

Page 5

What is the Leadership Computing Facility (LCF)?

• Collaborative DOE Office of Science user-facility program at ORNL and ANL
• Mission: provide the computational and data resources required to solve the most challenging problems
• 2 centers / 2 architectures to address the diverse and growing computational needs of the scientific community
• Highly competitive user allocation programs (INCITE, ALCC)
• Projects receive 10x to 100x more resource than at other generally available centers
• LCF centers partner with users to enable science and engineering breakthroughs (Liaisons, Catalysts)

Page 6

Three primary user programs for access to LCF

Distribution of allocable hours:
• 60% INCITE
• 30% ALCC (ASCR Leadership Computing Challenge)
• 10% Director's Discretionary

Page 7

Innovative and Novel Computational Impact on Theory and Experiment (INCITE)

Call for Proposals: April 15 – June 26, 2015. Allocations will be for calendar year 2016.

The INCITE program seeks proposals for high-impact science and technology research challenges that require the power of leadership-class systems.

INCITE is an annual, peer-reviewed allocation program that provides unprecedented computational and data science resources:
• 5.8 billion core-hours to be awarded for 2016 on the 27-petaflops Cray XK7 "Titan" and the 10-petaflops IBM BG/Q "Mira"
• Average awards expected to be ~75M and ~100M core-hours for Titan and Mira, respectively
• INCITE is open to any science domain
• INCITE seeks computationally intensive, large-scale research campaigns
• Proposal-writing webinars: April and May

Contact: Julia C. White, INCITE Manager, [email protected], www.doeleadershipcomputing.org

Page 8

INCITE resources for high-impact research

• Center expertise is key to the INCITE support model
• Multi-year proposals (1, 2, or 3 years)
  – Embrace the long-term effort of high-impact science

                        Argonne LCF          Oak Ridge LCF
System                  IBM Blue Gene/Q      Cray XK7
Name                    Mira                 Titan
Compute nodes           49,152               18,688
Node architecture       PowerPC, 16 cores    AMD Opteron, 16 cores + NVIDIA Kepler GPU
Processing cores        786,432              299,008
Memory per node (GB)    16                   32 + 6
Peak performance (PF)   10                   27

Page 9

ORNL increased system performance by 1,000 times, 2004-2010

Hardware scaled from single-core through dual-core to quad-core and dual-socket, 12-core SMP nodes. Scaling applications and system software was the biggest challenge.

• Cray X1: 3 TF
• 2005: Cray XT3, single-core, 26 TF
• 2006: Cray XT3, dual-core, 54 TF
• 2007: Cray XT4, dual-core, 119 TF
• 2008: Cray XT4, quad-core, 263 TF
• 2009: Cray XT5 systems, 12-core, dual-socket SMP, 2.3 PF

Page 10

[Figure: Hardware Trend of ORNL's Systems, 1985-2013]

Page 11

The Free Lunch is Over

Herb Sutter, Dr. Dobb's Journal, 2005, http://www.gotw.ca/publications/concurrency-ddj.htm

• Moore's Law continues (green)
• But CPU clock rates stopped increasing in 2003 (dark blue)
• Power (light blue) is capped by heat dissipation and $$$
• Single-thread performance is growing slowly (magenta)

Page 12

Microprocessor Performance "Expectation Gap"

[Figure: the "expectation gap" in single-processor performance, from the National Research Council (NRC) Computer Science and Telecommunications Board (2012)]

Page 13

Peak FLOPS per Node was the #1 Hardware Requirement in the 2009 User Survey

"Preparing for Exascale": OLCF Application Requirements and Strategy, December 2009

Page 14

Power is THE problem

Power consumption of the 2.3 PF (peak) Jaguar: 7 megawatts, equivalent to that of a small city (5,000 homes)

Page 15

Using traditional CPUs is not economically feasible

A 20 PF+ system: 30 megawatts (30,000 homes)

Page 16

Why GPUs? Hierarchical Parallelism

High performance and power efficiency on the path to exascale:
• Expose more parallelism through code refactoring and source-code directives
  – Doubles the CPU performance of many codes
• Use the right type of processor for each task
• Data locality: keep data near the processing
  – The GPU has high bandwidth to local memory for rapid access
  – The GPU has a large internal cache
• Explicit data management: explicitly manage data movement between CPU and GPU memories (see the sketch below)

CPU: optimized for sequential multitasking
GPU accelerator: optimized for many simultaneous tasks; 10x performance per socket; 5x more energy-efficient systems
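A minimal sketch of the directive-based approach described above, in C with OpenACC. The daxpy-style kernel and all names are illustrative only, not taken from any Titan application; the point is the explicit data region and the parallel loop directive:

    #include <stdlib.h>

    /* Illustrative kernel.  The data directive makes the CPU<->GPU
     * transfers explicit, and the parallel loop directive exposes the
     * loop's iterations to the accelerator. */
    void scale_add(int n, double a, const double *restrict x,
                   double *restrict y)
    {
        /* copyin: x is only read on the GPU; copy: y is read and
         * written, then copied back to host memory at region exit. */
        #pragma acc data copyin(x[0:n]) copy(y[0:n])
        {
            #pragma acc parallel loop
            for (int i = 0; i < n; ++i)
                y[i] = a * x[i] + y[i];
        }
    }

    int main(void)
    {
        int n = 1 << 20;
        double *x = malloc(n * sizeof *x);
        double *y = malloc(n * sizeof *y);
        for (int i = 0; i < n; ++i) { x[i] = 1.0; y[i] = 2.0; }
        scale_add(n, 3.0, x, y);   /* every y[i] is now 5.0 */
        free(x);
        free(y);
        return 0;
    }

Built without an OpenACC compiler, the pragmas are simply ignored and the same code runs on the CPU, which is one reason the directive approach helps preserve portability across architectures.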

Page 17

ORNL's "Titan" Hybrid System: Cray XK7 with AMD Opteron + NVIDIA Tesla processors

200 cabinets; 4,352 ft2 (404 m2); 8.9 MW peak power

SYSTEM SPECIFICATIONS:
• 27.1 PF/s peak performance (24.5 GPU + 2.6 CPU)
• 17.59 PF/s sustained performance (LINPACK)
• 18,688 compute nodes, each with:
  – 16-core AMD Opteron CPU
  – NVIDIA Tesla "K20x" GPU
  – 32 + 6 GB memory
• 710 TB total system memory
• 32 PB parallel file system (Lustre)
• Cray Gemini 3D torus interconnect
• 512 service and I/O nodes

You are throwing away 90% of the available performance if you are not using the GPUs.
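The "90%" figure follows directly from the peak breakdown above:

    24.5 PF (GPU) / 27.1 PF (total) ≈ 0.90

so a code that never touches the GPUs can reach at most about a tenth of the machine's peak.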

Page 18

Center for Accelerated Application Readiness (CAAR)

• We created CAAR as part of the Titan project to help prepare applications for accelerated architectures
• Goals:
  – Work with code teams to develop and implement strategies for exposing hierarchical parallelism in our users' applications
  – Maintain code portability across modern architectures
  – Learn from and share our results
• We selected six applications from across different science domains and algorithmic motifs

Page 19

Application Readiness for Titan

• WL-LSMS: Illuminating the role of material disorder, statistics, and fluctuations in nanoscale materials and systems.
• S3D: Understanding turbulent combustion through direct numerical simulation with complex chemistry.
• NRDF: Radiation transport, important in astrophysics, laser fusion, combustion, atmospheric dynamics, and medical imaging, computed on AMR grids.
• CAM-SE: Answering questions about specific climate change adaptation and mitigation scenarios; realistically representing features like precipitation patterns/statistics and tropical storms.
• Denovo: Discrete ordinates radiation transport calculations that can be used in a variety of nuclear energy and technology applications.
• LAMMPS: Molecular dynamics simulation of organic polymers for applications in organic photovoltaic heterojunctions, de-wetting phenomena, and biosensors.

Page 20

Application Power Efficiency of the Cray XK7: WL-LSMS for CPU-only and Accelerated Computing

• Runtime is 8.6x faster for the accelerated code
• Energy consumed is 7.3x less
  – The GPU-accelerated code consumed 3,500 kWh
  – The CPU-only code consumed 25,700 kWh

[Figure: power-consumption traces for identical WL-LSMS runs with 1,024 Fe atoms on 18,561 Titan nodes (99% of Titan)]
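The two headline numbers are mutually consistent: energy is average power multiplied by runtime, and

    25,700 kWh / 3,500 kWh ≈ 7.3

is slightly below the 8.6x speedup, which implies the accelerated run drew roughly 8.6 / 7.3 ≈ 1.2x the average power of the CPU-only run while finishing far sooner.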

Page 21

CAAR Lessons Learned

• Up to 1-3 person-years were required to port each code
  – It takes work, but it is an unavoidable step on the way to exascale
  – It also pays off on other systems: the ported codes often run significantly faster even CPU-only (Denovo 2x, CAM-SE >1.7x)
• An estimated 70-80% of developer time is spent in code restructuring, regardless of whether CUDA, OpenCL, or OpenACC is used (see the sketch below)
• Each code team must make its own choice of CUDA vs. OpenCL vs. OpenACC based on its specific case; the conclusion may be different for each code
• Science codes are under active development, so porting to the GPU can mean pursuing a "moving target," which is challenging to manage
• More available flops on the node should lead us to think of the new science opportunities enabled, e.g., more degrees of freedom per grid cell
• We may need to look in unconventional places to get the additional ~30x thread parallelism that may be needed for exascale, e.g., parallelism in time
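A generic sketch (not taken from any CAAR code) of the kind of restructuring referred to above: collapsing a loop nest so the accelerator sees one large iteration space instead of a short outer loop:

    /* Before restructuring, parallelizing only the outer loop yields
     * just nz coarse-grained tasks, far too few to fill a GPU.
     * collapse(3) exposes nz*ny*nx independent iterations instead. */
    void scale_grid(int nx, int ny, int nz,
                    double *restrict out, const double *restrict in)
    {
        #pragma acc parallel loop collapse(3) \
                    copyin(in[0:nx*ny*nz]) copyout(out[0:nx*ny*nz])
        for (int k = 0; k < nz; ++k)
            for (int j = 0; j < ny; ++j)
                for (int i = 0; i < nx; ++i)
                    out[(k*ny + j)*nx + i] = 0.5 * in[(k*ny + j)*nx + i];
    }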

Page 22

2014 Strategic Science Accomplishments from Titan

• Cosmology (Salman Habib, Argonne National Laboratory): Habib and collaborators used the HACC code on Titan's CPU-GPU system to conduct today's largest cosmological structure simulation at the resolutions needed for modern-day galactic surveys. [K. Heitmann, 2014, arXiv:1411.3396]

• Combustion (Jacqueline Chen, Sandia National Laboratories): Chen and collaborators for the first time performed direct numerical simulation of a jet flame burning dimethyl ether (DME) at new turbulence scales over space and time. [A. Bhagatwala et al., 2014, Proc. Combust. Inst. 35]

• Superconducting Materials (Paul Kent, ORNL): Kent and collaborators performed the first ab initio simulation of a cuprate. They were also the first team to validate quantum Monte Carlo simulations for high-temperature superconductors. [K. Foyevtsova et al., 2014, Phys. Rev. X 4]

• Molecular Science (Michael Klein, Temple University): Researchers at Procter & Gamble (P&G) and Temple University delivered a comprehensive picture, in full atomistic detail, of the molecular properties that drive skin-barrier disruption. [M. Paloncyova et al., 2014, Langmuir 30; C. M. MacDermaid et al., 2014, J. Chem. Phys. 141]

• Fusion (C. S. Chang, PPPL): Chang and collaborators used the XGC1 code on Titan to obtain a fundamental understanding of divertor heat-load width physics and its dependence on the plasma current in present-day tokamak devices. [C. S. Chang et al., 2014, Proceedings of the 25th Fusion Energy Conference, IAEA, October 13-18, 2014]

Page 23

2014 High-Impact Science at the OLCF

• Climate Science: Palaeoclimate evidence suggests that the El Niño Southern Oscillation (ENSO), the most important driver of interannual climate variability, has swung from periods of intense variability to relative quiescence. [Zhengyu Liu, Nature 515, 2014]

• Biological Sciences: Coarse-grained protein models on leadership computers allow the study of single-protein properties, DNA-RNA complexes, amyloid fibril formation, and protein suspensions in a crowded environment. [Fabio Sterpone, Chemical Society Reviews 43, 2014]

• Materials Sciences: Using OLCF resources, researchers were able to quantify the shape evolution of a platinum nanoparticle by tracking the propagation of different facets. [Hong-Gang Liao, Science 345, 2014]

• Climate Science: Using a general circulation model (GCM), the team showed that allowing Northern Hemisphere ice-sheet meltwater to flow into the ocean generates a bipolar seesaw effect, with the Southern Hemisphere warming at the expense of the Northern Hemisphere. [Christo Buizert, Science 345, 2014]

• Materials Sciences: Theory, modeling, and simulation can accelerate the process of materials understanding and design by providing atomic-level understanding of the underlying physicochemical phenomena. [Bobby Sumpter, Accounts of Chemical Research 47, 2014]

• Materials Sciences: First-principles many-body methods were used to study the spin dynamics and the superconducting pairing symmetry of a large number of iron-based compounds, showing that high-temperature superconductors have both dispersive high-energy and strong low-energy commensurate or nearly commensurate spin excitations. [Z. P. Yin, Nature Physics 10, 2014]

Page 24

Industrial Partnerships: Accelerating Competitiveness through Computational Sciences

[Chart: Number of OLCF Industry Projects Operational Per Calendar Year, 2005-2014; the count grows steadily, reaching 34 projects in 2014]

Page 25

Innovation through Industrial Partnerships

• Human skin barrier: Demonstrated that small molecules can have a large and varying impact on skin permeability depending on their molecular characteristics, important for product efficacy and safety
• Global flood maps: Developed fluvial and pluvial high-resolution global flood maps to enable insurance firms to better price risk and reduce loss of life and property
• Engine cycle-to-cycle variation: Developing a novel approach that uses massively parallel, multiple simultaneous combustion-cycle simulations to address cycle-to-cycle variations in spark-ignition engines
• Fuel-efficient jet engines: Conducting first-of-a-kind high-fidelity LES computations of flow in turbomachinery components for more fuel-efficient, next-generation jet engines
• Wind turbine resilience: First-time simulation of ice formation within million-molecule water droplets is expanding understanding of freezing at the molecular level to enhance wind-turbine resilience in cold climates
• Welding software: Evaluating large-scale HPC and GPU capability of critical welding-simulation software and further developing and testing a weld-optimization algorithm

Page 26

Innovation through Industrial Partnerships

• Aircraft design: Unexpected discovery of multiple solutions for the steady RANS equations with separated flow helps explain why numerical modeling sometimes fails to capture maximum lift
• Consumer product stability: Developed a method to measure the impact of additives, such as dyes and perfumes, on the properties of lipid systems such as fabric enhancer and other formulated products
• Gasoline engine injector: Optimizing multihole gasoline spray-injector nozzle designs for better in-cylinder fuel-air mixture distributions, greater fuel efficiency, and fewer physical prototypes
• Jet engine efficiency: Accurate predictions of the atomization of liquid fuel by aerodynamic forces enhance combustion stability, improve efficiency, and reduce emissions
• Li-ion batteries: New classes of solid inorganic Li-ion electrolytes could deliver high ionic and low electronic conductivity and good electrochemical stability
• Underhood cooling: Developed a new, efficient, and automatic analytical cooling-package optimization process leading to one-of-a-kind design optimization of cooling systems

Page 27

Innovation through Industrial Partnerships

• Catalysis: Demonstrated biomass as a viable, sustainable feedstock for hydrogen production for fuel cells; showed nickel is a feasible catalytic alternative to platinum
• Design innovation: Accelerating the design of shock-wave turbo compressors for carbon capture and sequestration
• Aircraft design: Simulated takeoff and landing scenarios improved a critical code for estimating characteristics of commercial aircraft, including lift, drag, and controllability
• Industrial fire suppression: Developing high-fidelity modeling capability for fire growth and suppression; fire losses account for 30% of U.S. property-loss costs
• Turbomachinery efficiency: Simulated unsteady flow in turbomachinery, opening new opportunities for design innovation and efficiency improvements
• Long-haul truck fuel efficiency: Simulations cut in half the time to develop a unique system of add-on parts that increases fuel efficiency by 7-12% for long-haul (18-wheeler) trucks

Page 28

Additional lessons from Titan

• Complexity (certainly increased parallelism) is the new norm
• The community needs to work together to help manage this
• Education and training are very important
• The LCF is very active in advanced training, for example the OLCF hackathon on OpenACC compiler directives:
  – 6 teams chosen, 2 mentors assigned per team, 6 weeks of guided planning and work, and 1 final intense week of coding
  – https://www.olcf.ornl.gov/training-event/hackathon-openacc2014/

Page 29

Roadmap to Exascale

• 2012: Titan, 27 PF, 600 TB DRAM, hybrid GPU/CPU
• 2017: OLCF-4, 100-250 PF, 4,000 TB memory, < 20 MW
• 2022: OLCF-5, 1 EF, 20 MW

Our science requires that we advance computational capability 1,000x over the next decade. What are the challenges?

Page 30

What is CORAL? (The partnership for the 2017 systems)

• CORAL is a collaboration of Oak Ridge, Argonne, and Lawrence Livermore labs to acquire three systems for delivery in 2017
• DOE's Office of Science (DOE/SC) and the National Nuclear Security Administration (NNSA) signed an MOU agreeing to collaborate on HPC research and acquisitions
• The grouping of DOE labs was based on common acquisition timings; the collaboration is a win-win for all parties:
  – It reduces the number of RFPs vendors have to respond to
  – It improves the number and quality of proposals
  – It allows pooling of R&D funds
  – It strengthens the SC/NNSA alliance on the road to exascale
  – It encourages sharing of technical expertise between the labs

Page 31

Accelerating Future DOE Leadership Systems ("CORAL")

The "Summit" system (OLCF) and "Sierra" system (LLNL):
• 5x-10x higher application performance
• IBM POWER CPUs, NVIDIA Tesla GPUs, Mellanox EDR 100 Gb/s InfiniBand
• Paving the road to exascale performance

Page 32

2017 OLCF Leadership System

Hybrid CPU/GPU architecture. Vendor: IBM (prime) / NVIDIA / Mellanox Technologies. At least 5x Titan's application performance.

Approximately 3,400 nodes, each with:
• Multiple IBM POWER9 CPUs and multiple NVIDIA Tesla GPUs using the NVIDIA Volta architecture
• CPUs and GPUs completely connected with high-speed NVLink
• Large coherent memory: over 512 GB (HBM + DDR4), all directly addressable from the CPUs and GPUs
• An additional 800 GB of NVRAM, which can be configured as either a burst buffer or as extended memory
• Over 40 TF peak performance

Dual-rail Mellanox EDR-IB full, non-blocking fat-tree interconnect.
IBM Elastic Storage (GPFS): 1 TB/s I/O and 120 PB disk capacity.
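These node-level numbers line up with the system-level targets elsewhere in this talk (a rough estimate, not a quoted specification): 3,400 nodes x 40 TF per node ≈ 136 PF, within the 100-250 PF range given for OLCF-4 on the roadmap slide.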

Page 33

How does Summit compare to Titan?

Feature                        Summit                        Titan
Application performance        5-10x Titan                   Baseline
Number of nodes                ~3,400                        18,688
Node performance               > 40 TF                       1.4 TF
Memory per node                > 512 GB (HBM + DDR4)         38 GB (GDDR5 + DDR3)
NVRAM per node                 800 GB                        0
Node interconnect              NVLink (5-12x PCIe 3)         PCIe 2
System interconnect            Dual-rail EDR-IB (23 GB/s)    Gemini (6.4 GB/s)
  (node injection bandwidth)
Interconnect topology          Non-blocking fat tree         3D torus
Processors                     IBM POWER9 + NVIDIA Volta     AMD Opteron + NVIDIA Kepler
File system                    120 PB, 1 TB/s, GPFS          32 PB, 1 TB/s, Lustre
Peak power consumption         10 MW                         9 MW

Page 34

Two Tracks for Future Large Systems

Hybrid multi-core:
• CPU/GPU hybrid systems
• Likely to have multiple CPUs and GPUs per node
• Small number of very fat nodes
• Expect data-movement issues to be much easier than on previous systems: coherent shared memory within a node
• Multiple levels of memory: on-package, DDR, and non-volatile

Many-core:
• Tens of thousands of nodes with millions of cores
• Homogeneous cores
• Multiple levels of memory: on-package, DDR, and non-volatile
• Unlike prior generations, future products are likely to be self-hosted

Examples:
• Summit at OLCF: hybrid CPU/GPU system; IBM/NVIDIA; 3,400 multi-socket nodes; POWER9/Volta; more than 512 GB coherent memory per node; target delivery date: 2017
• Cori at NERSC: self-hosted many-core system; Intel/Cray; 9,300 single-socket nodes; Intel Xeon Phi Knights Landing (KNL); 16 GB HBM, 64-128 GB DDR4; target delivery date: June 2016
• ALCF-3 at ALCF: 3rd-generation Intel Xeon Phi (Knights Hill, KNH); > 50,000 compute nodes; target delivery date: 2018

Page 35

ASCR Computing Upgrades At a Glance (CORAL acquisitions)

• NERSC now: Edison (Cray XC30; Intel Xeon E5-2695 v2, 12C, 2.4 GHz). System peak 2.6 PF; peak power 2 MW; total system memory 357 TB; node performance 0.460 TF; node processors Intel Ivy Bridge; 5,600 nodes; Aries interconnect; file system 7.6 PB, 168 GB/s, Lustre

• OLCF now: Titan. 27 PF; 9 MW; 710 TB; 1.452 TF/node; AMD Opteron + Nvidia Kepler; 18,688 nodes; Gemini; 32 PB, 1 TB/s, Lustre

• ALCF now: Mira. 10 PF; 4.8 MW; 768 TB; 0.204 TF/node; 64-bit PowerPC A2; 49,152 nodes; 5D torus; 26 PB, 300 GB/s, GPFS

• NERSC upgrade: Cori, planned installation 2016. > 30 PF; < 3.7 MW; ~1 PB DDR4 + high-bandwidth memory (HBM) + 1.5 PB persistent memory; > 3 TF/node; Intel Knights Landing many-core CPUs, plus Intel Haswell CPUs in a data partition; 9,300 nodes plus 1,900 nodes in the data partition; Aries; 28 PB, 744 GB/s, Lustre

• OLCF upgrade: Summit, planned installation 2017-2018. 150 PF; 10 MW; > 1.74 PB DDR4 + HBM + 2.8 PB persistent memory; > 40 TF/node; multiple IBM Power9 CPUs and multiple Nvidia Volta GPUs per node; ~3,500 nodes; dual-rail EDR-IB; 120 PB, 1 TB/s, GPFS

• ALCF upgrade: Aurora, planned installation 2018-2019. 180 PF; 13 MW; > 7 PB DRAM and persistent memory; node performance > 15x Mira; Intel Knights Hill many-core CPUs; > 50,000 nodes; Intel Omni-Path Architecture; 150 PB, 1 TB/s, Lustre

Page 36

Center for Accelerated Application Readiness: Summit

CAAR activities:
• Performance analysis of community applications
• Technical plan for code restructuring and optimization
• Deployment on OLCF-4

OLCF-4 issued a call for proposals in FY2015 for application-development partnerships between community developers, OLCF staff, and the OLCF vendor Center of Excellence.

Page 37

New CAAR Applications

Application   Domain                  Principal Investigator   Institution
ACME (N)      Climate Science         David Bader              Lawrence Livermore National Laboratory
DIRAC         Relativistic Chemistry  Lucas Visscher           Free University of Amsterdam
FLASH         Astrophysics            Bronson Messer           Oak Ridge National Laboratory
GTC (NE)      Plasma Physics          Zhihong Lin              University of California, Irvine
HACC (N)      Cosmology               Salman Habib             Argonne National Laboratory
LSDALTON      Chemistry               Poul Jørgensen           Aarhus University
NAMD (NE)     Biophysics              Klaus Schulten           University of Illinois at Urbana-Champaign
NUCCOR        Nuclear Physics         Gaute Hagen              Oak Ridge National Laboratory
NWCHEM (N)    Chemistry               Karol Kowalski           Pacific Northwest National Laboratory
QMCPACK       Materials Science       Paul Kent                Oak Ridge National Laboratory
RAPTOR        Engineering             Joseph Oefelein          Sandia National Laboratories
SPECFEM       Seismic Science         Jeroen Tromp             Princeton University
XGC (N)       Plasma Physics          C. S. Chang              Princeton Plasma Physics Laboratory

Page 38

Are we seeing the real exascale challenges?

Advanced simulation and modeling applications face:
• Scale
• Power
• Resilience
• The memory wall

While conquering the petascale problems of today, beware being eaten alive by the exascale problems of tomorrow.

Page 39

Conclusions

• Leadership computing is for the critically important problems that need the most powerful compute and data infrastructure
• Accelerated, hybrid multi-core computing solutions are performing well on real, complex scientific applications
  – But you must work to expose the parallelism in your codes
  – This refactoring of codes is largely common to all massively parallel architectures
• The Summit project will extend the hybrid multi-core supercomputer architecture into the (pre-)exascale era
• OLCF resources are available to industry, academia, and labs through open, peer-reviewed allocation mechanisms

Page 40

Page 41

National Climate-Computing Research Center (NCRC): a NOAA and DOE Partnership Project

• NCRC grew out of a 2009 interagency agreement between NOAA and the Department of Energy
• A subsequent work-for-others agreement directly with ORNL allows NOAA to take advantage of ORNL's expertise in acquiring, deploying, and operating world-class supercomputers
• It also provides additional opportunities for collaborative work between NOAA and ORNL
• NOAA uses Gaea to study the planet's notoriously complex climate from a variety of angles
  – Gaea is a 1.1-petaflop Cray XE6; the system's 40 cabinets contain 7,520 16-core AMD Interlagos processors
• NOAA is ORNL's partner in the Titan project

Page 42

Questions?

[email protected]

Page 43

Acknowledgements

This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.