suse high performance computing: it just keeps getting better · 2020-01-16 · the hpc universe is...


SUSE High Performance Computing:

It just keeps getting better

Jay Kruemcke

Sr. Product Manager, HPC, ARM, POWER

[email protected]


The HPC universe is expanding in new ways


CAGR 2016-2021:

• 5.6% Supercomputer (>$500K)

• 5.0% Divisional ($250K-$500K)

• 6.3% Departmental ($100K-$250K)

• 6.3% Workgroup (<$100K)

• HPC is a growth market, with a growing recognition of strategic value

• HPC ROI is very high

• $551 in revenue, on average, per dollar invested in HPC

• $52 in profit (or cost savings), on average, per dollar invested in HPC

• Key use cases:

• HPC in the cloud (incl. HPCaaS)

• Cognitive computing (incl. AI/ML/DL)

• HPDA (High Performance Data Analysis)

• IoT

• Key applications:

• Modeling and simulation

• Data analytics

Source: Hyperion Research, June 2017
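The CAGR figures on this slide compound over the five years from 2016 to 2021. As a rough worked example (the function and variable names are illustrative and do not come from the Hyperion report), each rate can be turned into a cumulative growth factor:

```python
# Cumulative growth implied by a compound annual growth rate (CAGR).
# The rates are the slide's 2016-2021 figures; all names are illustrative.

def growth_factor(cagr_percent: float, years: int) -> float:
    """Return the multiplier after `years` of growth at `cagr_percent` per year."""
    return (1.0 + cagr_percent / 100.0) ** years

SEGMENT_CAGR = {
    "Supercomputer (>$500K)": 5.6,
    "Divisional ($250K-$500K)": 5.0,
    "Departmental ($100K-$250K)": 6.3,
    "Workgroup (<$100K)": 6.3,
}

YEARS = 2021 - 2016  # five compounding periods

if __name__ == "__main__":
    for segment, rate in SEGMENT_CAGR.items():
        factor = growth_factor(rate, YEARS)
        print(f"{segment}: x{factor:.2f} over {YEARS} years")
```

At 5.6% a year, for instance, a segment ends the period roughly 31% larger than it started.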

SUSE High Performance Computing 2/19/2019 2


HPC Customer Pain Points

Complexity Maintenance Time to Solution

“My IT staff doesn’t have time to update and test all the different software components.”

• Better management software is needed, and the deployment approach needs to be updated to leverage HPC and cloud infrastructure

• Stack components are provided by multiple vendors, making the stack more challenging to maintain

“I need to maximize application performance, scale workloads, and minimize overhead.”

• Parallel software is lacking, with many applications needing a major redesign

• Stack components are provided by multiple vendors, making management more challenging

• Segmented into commercial and scientific, and there is not enough collaboration

“Composing a working HPC environment is difficult, time-consuming, requiring experts.”

• Clusters are hard to use and manage as they become more complex in heterogeneous environments

• Storage access time and data management are becoming new bottlenecks


Key HPC Partnerships


SUSE is the preferred HPE partner for Linux, HPC, OpenStack and Cloud Foundry solutions

SUSE technology is embedded on every HPE ProLiant Server to power the intelligent provisioning feature

Arm SoC partners driving HPC adoption in the modern data center

Catalyst UK initiative with HPE and SUSE

HPE Apollo 70 first SUSE “Yes” certification for an Arm server

Optimize infrastructure costs with increased server density on latest 64-bit Arm processors

Catalyst UK project: HPE, Arm, SUSE, and three leading UK universities establish one of the largest Arm-based supercomputer deployments in the world

Goal: Propel the Arm HPC ecosystem and exascale computing in the UK

• More than 12,000 Arm-based cores running across three universities

• 64 Apollo 70 systems per site

• Two 32-core Cavium ThunderX2 processors per system

• Running SUSE Linux Enterprise for High Performance Computing
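The "more than 12,000 cores" figure for Catalyst UK follows from the per-site numbers on this slide. A small sanity check, with constant and function names chosen purely for illustration:

```python
# Sanity check of the Catalyst UK core count using the figures on the slide.
# All names are illustrative; only the numbers come from the presentation.

SITES = 3                # three UK universities
SYSTEMS_PER_SITE = 64    # Apollo 70 systems per site
SOCKETS_PER_SYSTEM = 2   # two ThunderX2 processors per system
CORES_PER_SOCKET = 32    # 32 cores per processor


def total_cores(sites: int, systems: int, sockets: int, cores: int) -> int:
    """Multiply out the hierarchy: sites x systems x sockets x cores."""
    return sites * systems * sockets * cores


if __name__ == "__main__":
    per_site = total_cores(1, SYSTEMS_PER_SITE, SOCKETS_PER_SYSTEM, CORES_PER_SOCKET)
    overall = total_cores(SITES, SYSTEMS_PER_SITE, SOCKETS_PER_SYSTEM, CORES_PER_SOCKET)
    print(f"Cores per site: {per_site}")   # 4096
    print(f"Cores overall:  {overall}")    # 12288
```

The product is 12,288 cores, consistent with the slide's "more than 12,000".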


Cray Linux Environment (CLE) is based on SUSE Linux

Arm-powered Cray delivered to a UK consortium

Cray has a majority share of the Top500 sites


Isambard – UK Tier 2 HPC service from GW4

• Cray “Scout” XC50 series system

- 10,000+ Armv8 cores – Cavium ThunderX2

- Aries interconnect

- Cray Linux Environment based on SUSE Linux


Scalable system framework in cooperation with OpenHPC, designed to work for small clusters to the largest supercomputers

Scale and balance for compute- and data-intensive applications

Strong platform for AI and visualization


AI/ML/DL workloads

Jointly define scope of Lenovo HPC stack using SUSE HPC componentry

LiCO adaptation (Lenovo Intelligent Computing Orchestration)

Barcelona Supercomputing Center

SuperMUC Petascale system runs SUSE on Lenovo ThinkSystem

Geophysicists use earthquake simulation software to investigate seismic waves beneath Earth’s surface

Calculations involved in this kind of simulation are so complex that they push even supercomputers to their limits

Bright Cluster Manager supports SUSE, enabling customers to deploy, manage and monitor SLES clusters using the familiar Bright interface

Bright Cluster Manager lets users monitor and build clusters of any size that are easy to provision, operate, monitor, manage and scale

SUSE continues to work with NVIDIA to enable support for the latest NVIDIA GPU cards – important in HPC modeling and simulation

NVIDIA’s expertise in programmable GPUs has led to breakthroughs in parallel processing which make supercomputing inexpensive and widely accessible

Univa and SUSE together manage containerized HPC and AI workloads on TSUBAME 3.0

Scaling machine learning for SUSE Linux containers, servers, clusters and clouds with Apache Spark and Univa

Altair makes HPC faster, smarter & easy to manage with PBS Works™

Altair provides services for software applications that streamline the workflow management of compute-intensive tasks including solvers, optimization, modeling, visualization and analytics

Why SUSE Linux for HPC?

• Enterprise Linux with Enterprise support

- Incidents such as Spectre and Meltdown highlight the need for a quick response to address system vulnerabilities

• More than just an OS - HPC software included and supported

- SLE HPC includes popular HPC software such as slurm and OpenMPI

• Aggressively priced subscriptions

- SUSE Linux for HPC priced for large and small HPC configurations

• Proven track record in HPC

- 50% of the Top 100 are running SUSE Linux or SLES-based OS


SUSE Linux Enterprise HPC Continuum

• SUSE Linux Enterprise for HPC (X86 and ARM)

- Fully supported by SUSE

• HPC Module (part of SUSE Linux Enterprise HPC)

- Fully supported through your SUSE HPC subscription

- Content inspired by OpenHPC

• PackageHub

- SUSE curated, community supported packages https://packagehub.suse.com/

• openSUSE Leap

- Free, community supported Linux

- Free Developer subscriptions

- SUSE enablement for Azure, AWS Cloud

• Related Products

- SUSE Enterprise Storage

- SUSE Manager

- SUSE OpenStack Cloud

SUSE Linux Enterprise Server for ARM

Offering commercial Linux support for ARM AArch64 since November 2016

[Timeline figure: 2017 to 2018, by quarter]

• SLES 15

- Now available (X86, ARM, Power, System z)

- Bi-modal: traditional and CaaSP

- Simplified management

- Kernel 4.12

- Toolchain gcc 7+

• SLES 12 for ARM (SP2)

- Initial commercial release for AArch64

- SoC: Cavium, Xilinx, AMD, others

- Focus on solution enablement

- Kernel 4.4

- Toolchain gcc 6.2.1

• SLES 12 HPC Module

- Supported HPC packages

- Subset of OpenHPC

- Initially includes 13 packages: slurm, pdsh, hwloc, etc.

• SLES 12 for ARM (SP3)

- Second SUSE release for AArch64

- Additional SoC enablement

- Expand to early adopters

- Kernel 4.4

- Toolchain gcc 6.2.1 -> gcc 7

• SLES 12 HPC Module

- New packages: mpich, hdf5, munge, mvapich2, numpy, papi, openblas, openmpi, netcdf, ScaLAPACK, …

• SUSE Enterprise Storage 5

- Ceph software-defined storage

- X86 and ARM

• SLES for ARM Raspberry Pi

- Commercial support focused on IoT

• SLES 12 SP4

- Additional Arm enablement

• SLES 12 HPC Module

- Additional HPC packages: Nagios, adios, metis, ocr, R, scalasca, …

SUSE Linux Enterprise HPC offerings

• Available for X86 and Arm HPC clusters

• Extended Service Pack Overlap Support (ESPOS)

• Long Term Service Pack Support (LTSS)

• Simple, one price per cluster node

• Significantly reduced list prices

• Support for smaller cluster sizes

• New product – SLE HPC 15

- Separate from general purpose SLES


SUSE Linux HPC Module

MUNGE

ScaLAPACK

genders

• All packages supported by SUSE via SUSE Linux Enterprise HPC

• Available for x86 and Arm-based platforms

• Flexible release schedule

• SLE 12 and SLE HPC 15


SUSE Linux Enterprise HPC Module

Simplifying access to supported HPC software

• All packages supported by SUSE

- Support included in the SLE HPC Subscription

• Easy installation via zypper or YaST

• Available for X86 and ARM platforms beginning with SLES 12 SP2

• Flexible release schedule. Releases are independent of Service Pack schedule

* Note: A separate support agreement is required for Icinga2

Package | HPC Module 1Q17 | HPC Module 4Q17 | HPC Module 1Q18 | HPC Module SLES 12 | HPC Module SLE HPC 15

conman 0.2.7 0.2.8 0.2.8 0.2.8

cpuid (X86) 20151017 20170122 20170122 20170122 20170122

fftw 3.3.6 3.3.6 3.3.6

ganglia 3.7.2 3.7.2 3.7.2

ganglia-web 3.7.2 3.7.2 3.7.2

genders 1.2.2 1.2.2 1.2.2

GCC 6.2.1 7.3.1 7.3.1 7.3.1

hdf5 1.10.1 1.10.1 1.10.1

hwloc 1.11.5 1.11.8 1.11.8 1.11.8

Icinga2* 2.8.2 2.8.2 n/a

lua-lmod 6.5.11 7.6.1 7.6.1 7.6.1

memkind (X86) 1.1.0 1.1.0 1.6.0

mpiP 3.4.1 3.4.1 3.4.1

mrsh 2.12 2.12 2.12

munge 0.5.12 0.5.12 0.5.13

mvapich2 2.2 2.2.13 2.2.13 2.2.13

netcdf 4.4.1.1 4.4.1.1 4.6.1

netcdf-cxx 4.3.0 4.3.0 4.3.0

netcdf-fortran 4.4.4 4.4.4 4.4.4

numpy 1.13.3 1.13.3 1.14.0

openblas 0.2.20 0.2.20 0.2.20

openmpi 1.10.7 1.10.7 2.1.3

papi 5.5.1 5.5.1 5.5.1 5.5.1

pdsh 2.31 2.33 2.33 2.33 2.33

petsc 3.7.6 3.7.6 3.8.3

phdf5 1.10.1 1.10.1 1.10.1

powerman 2.3.24 2.3.24 Base OS

prun 1.0 1.0 1.0

rasdaemon 0.5.7 0.5.7 Base OS

ScaLAPACK 2.0.2 2.0.2 2.0.2

slurm 16.05.8 17.02.09 17.02.10 17.02.10 17.11.5

Note: SLE 15 customers must use the SLE HPC subscription to access the HPC Module packages on SLE 15


Installing the HPC Module
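The deck describes installation as "via zypper or YaST". As a minimal sketch of the zypper route, the helper below only assembles and prints an install command; it assumes the HPC Module is already enabled on the system, and it assumes the RPM names match the package names in the table (slurm, pdsh, hwloc), which should be confirmed against the actual repositories. The function name is hypothetical.

```python
# Sketch: assemble a zypper install command for HPC Module packages.
# Assumptions (not confirmed by the slides): the HPC Module is already
# enabled, and the RPM names match the package names shown in the table.
import shlex


def build_zypper_install(packages, non_interactive=True):
    """Return the argument list for a `zypper install` of the given packages."""
    if not packages:
        raise ValueError("at least one package name is required")
    command = ["zypper"]
    if non_interactive:
        # Global zypper option: answer prompts with their defaults.
        command.append("--non-interactive")
    command.append("install")
    command.extend(packages)
    return command


if __name__ == "__main__":
    # Dry run only: print the command instead of executing it.
    cmd = build_zypper_install(["slurm", "pdsh", "hwloc"])
    print(shlex.join(cmd))
```

To actually install, the returned list could be passed to `subprocess.run` by a user with root privileges; it is left as a dry run here so the example is safe to execute anywhere.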


SUSE PackageHub

Community Supported Packages for SLES

• High-quality, up-to-date packages delivered by openSUSE Factory

• Easy to install via zypper or YaST

• Built and maintained by the community of users

• Approved and curated by SUSE

• No additional charge

About 1000 packages available for X86-64

More than 500 packages available for ARM

[Figure labels: Upstream packages, SUSE Package Hub, Enterprise User]

Package Category

clustershell Administrative

robinhood Administrative

singularity Runtime

TensorFlow ML Framework

Caffe2 Coming soon

SLES HPC lifecycle Roadmap*

[Timeline figure, 2017 to 2025. For each release the chart shows the FCS date, the ”Normal” SP overlap period, and the ESPOS and LTSS periods. HPC Module deliveries are marked along the timeline.]

• SLES 12 HPC SP3: FCS Sept 2017

• SLES 12 HPC SP4: FCS 4Q 2018

• SLES 12 HPC SP5

• SLE HPC 15: FCS Q2 2018

• SLE HPC 15 SP1: FCS Q2 2019

• SLE HPC 15 SP2

*NOTE: All future dates are estimates for illustration purposes and are not intended as committed dates.

Other HPC related SUSE Products

SUSE OpenStack Cloud: Compute nodes for Arm 64 coming

SUSE Enterprise Storage: X86-64 and Arm 64 since early 2017

SUSE Manager: Managed node for Arm 64 available

SUSE Enterprise Storage Solution for HPC

Most Common Use Case as Tier 2 Storage

Low Latency Storage (Lustre, XFS, NFS etc)

HPC Compute Cluster

SUSE Enterprise Storage

• Use Cases:

• Primary Storage (Certain Use Cases)

• Nearline or Archival Storage

• Home Directories

• Certified with HPE Data Management Framework (DMF) and iRODS**: Coming Soon

HPC Top 500 Analysis – Paid OS System Share

SUSE + CLE 59%

bullx 15%

Ubuntu 4%

Red Hat 22%

• Represents 116 supercomputers in the top 500 list

• Over half of the paid Linux OS in the top 500 are SUSE
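The percentage shares on this slide can be converted into approximate system counts out of the 116 supercomputers it covers. This is only a rough sketch: the slide's percentages are already rounded, so the resulting counts are estimates, and the names below are illustrative.

```python
# Approximate system counts implied by the paid-OS shares on the slide.
# The shares are rounded percentages, so the counts are estimates only.

PAID_OS_SHARE_PERCENT = {
    "SUSE + CLE": 59,
    "Red Hat": 22,
    "bullx": 15,
    "Ubuntu": 4,
}
SYSTEMS_REPRESENTED = 116  # supercomputers covered by the chart


def approx_systems(share_percent: float, total: int = SYSTEMS_REPRESENTED) -> int:
    """Estimate how many systems a percentage share corresponds to."""
    return round(total * share_percent / 100)


if __name__ == "__main__":
    for name, share in PAID_OS_SHARE_PERCENT.items():
        print(f"{name}: {share}% is about {approx_systems(share)} systems")
```

Under these assumptions, the 59% share corresponds to roughly 68 of the 116 systems, which is what supports the "over half" statement.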


SUSE High Performance Computing

SLES for HPC Solution

• Comprehensive range of Linux operating system offerings at multiple price points

• Simple, one price per cluster node pricing model

• HPC Module with many supported HPC packages

• Competitive pricing

• Multiple service life options

• Full enablement for X86-64 and ARM based HPC clusters

• Additional open-source packages via PackageHub and openSUSE