ORNL is managed by UT-Battelle for the US Department of Energy
Oak Ridge National Laboratory
Computing and Computational Sciences Directorate
Thomas Naughton, Lawrence Sorrillo, Adam Simpson and Neena Imam
Oak Ridge National Laboratory
Balancing Performance and Portability with Containers in HPC: An OpenSHMEM Example
August 8, 2017
OpenSHMEM 2017 Workshop, Annapolis, MD, USA
Computational Research & Development Programs
UNCLASSIFIED // FOR OFFICIAL USE ONLY
Introduction
• Growing interest in methods to support user-driven software customizations in an HPC environment
  – Leverage isolation mechanisms & container tools
• Reasons
  – Improve productivity of users
  – Improve reproducibility of computational experiments
Containers
• What’s a Container?
  – “Knobs” to tailor the classic UNIX fork/exec model
  – Method for encapsulating an execution environment
  – Leverages operating system (OS) isolation mechanisms
    • Resource namespaces & control groups (cgroups)
    • Example: a process gets an “isolated” view of running processes (PIDs), and a control group that restricts it to 50% of CPU
• Container Runtime
  – Coordinates OS mechanisms
  – Provides streamlined (simplified) access for user
  – Initialization of “container” & interfaces to attach/interact
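The PID-namespace and 50% CPU example can be sketched with standard Linux tools. This is a minimal illustration, not something taken from the slides: it assumes a recent Linux with cgroup v2 mounted at /sys/fs/cgroup and the cpu controller enabled, the helper name cpu_max_for_percent and the cgroup name "demo" are hypothetical, and the privileged steps are left as comments because they need root.

```shell
#!/bin/sh
# Hypothetical helper: compute a cgroup v2 "cpu.max" value, which has the
# form "<quota> <period>" in microseconds. 50% of one CPU with the default
# 100000 us period is "50000 100000".
cpu_max_for_percent() {
    percent=$1
    period=${2:-100000}
    quota=$(( period * percent / 100 ))
    echo "$quota $period"
}

cpu_max_for_percent 50   # prints: 50000 100000

# Illustrative privileged steps (require root; shown as comments only):
#   mkdir /sys/fs/cgroup/demo
#   cpu_max_for_percent 50 > /sys/fs/cgroup/demo/cpu.max
#   echo $$ > /sys/fs/cgroup/demo/cgroup.procs
#   unshare --pid --fork --mount-proc ps -ef   # ps sees only the new PID namespace
```

A container runtime coordinates steps like these on the user's behalf, which is the "streamlined access" the slide refers to.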
Container Motivations
• Compute mobility
  – Build application with software stack and carry to compute
  – Customize execution environment to end-user preferences
• Reproducibility
  – Run experiments with the same software & data
  – Archive experimental setup for reuse (by self or others)
• Packaging & Distribution
  – Configure applications with possibly complex software dependencies using packages that are “best” for user
  – Benefits user and developer productivity
Challenges
• Integration into HPC ecosystem
  – Tools & Container Runtimes for HPC
• Accessing high performance resources
  – Take advantage of advanced hardware capabilities
  – Typically in form of device drivers / user-level interfaces
    • Example: HPC NIC and associated communication libraries
• Methodology and best practices
  – Methods for balancing portability & performance
  – Establish “best practices” for building/assembling/using
Docker
• Container technology comprised of several parts
  – Engine: User interface, Storage Drivers, Network overlays, Container Runtime
  – Registry: Image server (public & private)
  – Swarm: Machine and Control (daemon) overlay
  – Compose: Orchestration interface/specification
• Tools & specifications
  – Same specification format for all layers (consistency)
  – Rich set of tools for image management/specification
    • e.g., DockerHub, Dockerfile, Compose, etc.
https://www.docker.com
Docker Container Runtime
• runc & containerd
  – Low-level runtime for container creation/management
  – Donated as reference implementation for OCI (2015)
    • Open Container Initiative – runtime & image specs.
    • Formerly libcontainer in Docker
  – Used with containerd as of Docker-1.12
[Diagram: containerd and runc; a Container Image (Image-Layer-x, Image-Layer-y, Image-Layer-z) and a Container Instance (container rootfs, container.json)]
Singularity Container Runtime
• Container runtime for “compute mobility”
  – Code developed “from scratch” by Greg Kurtzer
  – Started in early 2016 (http://singularity.lbl.gov)
• Created with HPC site/use awareness
  – Expected practices for HPC resources (e.g., network, launch)
  – Expected behavior for user permissions
• Adding features to gain parity with Docker
  – Importing Docker images
  – Creating a Singularity-Hub
Docker
• Software stack
  – Several components
    • dockerd, swarm, registry, etc.
  – Rich set of tools & capabilities
• Networking
  – Supports full network isolation
  – Supports pass-through “host”
• Storage
  – Full storage abstraction
    • Example: devicemapper, AUFS
Singularity
• Software stack
  – Small code base
  – Follows more traditional HPC patterns
    • User permissions
    • Existing HPC launchers
• Networking
  – No network isolation
• Storage
  – No storage abstraction
Docker
• Images
  – Advanced read/write capabilities (layers)
    • Copy-on-Write (COW) for “container layer”
• Image creation
  – Specification file (Dockerfile)
• Image sharing
  – DockerHub is central location for sharing images
Singularity
• Images
  – Single image “file” (or ‘rootfs’ dir)
    • No COW or container layers
• Image creation
  – Specification file
  – Supports Docker image import
• Image sharing
  – SingularityHub emerging as basis to build/share images
Example: MPI Application
• MPI – Docker
  – Run SSH daemon in containers
  – Span nodes using Docker networking (“virtual networking”)
  – Fully abstracted from host
  – MPI entirely within container
• MPI – Singularity
  – Run SSH daemon on host
  – Span nodes using host network (no network isolation)
  – Support network at host/MPI-RTE layer
  – MPI split between host/container
Evaluation
• Objective: Evaluate viability of using same image on “developer” system and then on “production” system
  – Use an existing OpenSHMEM benchmark
• Container Image
  – Ubuntu 17.04
    • Recent release & not available on production system
  – Select Open MPI’s implementation of OpenSHMEM
    • Directly available from Ubuntu
  – Select Graph500 as demonstration application
    • Using OpenSHMEM port
Adapting Images
• Two general approaches
  a) Customize image with requisite software
  b) Dynamically load requisite software
• Pro/Con
  – Over-customization may break portability
  – Loading at runtime may not be viable in all cases
Image Construction Procedure
• Create Docker image for Graph500
  – Useful for initial testing on development machine
  – Docker not available on production systems
• Create Singularity image for Graph500
  – Directly import Docker image, or
  – Bootstrap image from Singularity definition file
• Customize for production
  – Later we found we also had to add a few directories for bind mounts on the production machine
  – Add a few changes to environment variables via the “/environment” file in Singularity-2.2.1
  – Add ‘munge’ library for authentication
Cray Titan Image Customizations
• (Required) Directories for runtime bind mounts
  – /opt/cray
  – /var/spool/alps
  – /var/opt/cray
  – /lustre/atlas
  – /lustre/atlas1
  – /lustre/atlas2
• Other customizations for our tests
  – apt-get install libmunge2 munge
  – mkdir -p /ccs/home/$MY_TITAN_USERNAME
  – Edits to “/environment” file (next slide)
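The bind-mount directories listed above only need to exist as empty mount points inside the image. A minimal sketch of that step follows; the function name add_titan_mount_points and the IMAGE_ROOT variable are hypothetical, and the slides do not show how the commands were actually applied to the image (definition file or writable session), so the demonstration runs against a scratch directory.

```shell
#!/bin/sh
# Sketch: create the bind-mount target directories under an image root.
# IMAGE_ROOT is a hypothetical path to an unpacked or mounted image tree.
add_titan_mount_points() {
    image_root=$1
    for dir in /opt/cray /var/spool/alps /var/opt/cray \
               /lustre/atlas /lustre/atlas1 /lustre/atlas2; do
        mkdir -p "${image_root}${dir}"
    done
}

# Demonstration against a scratch directory rather than a real image.
IMAGE_ROOT=$(mktemp -d)
add_titan_mount_points "$IMAGE_ROOT"
ls "$IMAGE_ROOT"
```

Because the directories are empty, adding them should not affect use of the same image on the development cluster.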
Singularity /environment
# Appended to /environment file in container image

# On Cray, extend LD_LIBRARY_PATH with host CRAY Libs
if test -n "$CRAY_LD_LIBRARY_PATH"; then
    export PATH=$PATH:/usr/local/cuda/bin
    # Add Cray specific library paths
    export LD_LIBRARY_PATH=$CRAY_LD_LIBRARY_PATH:/opt/cray/sysutils/1.0-1.0502.60492.1.1.gem/lib64:/opt/cray/wlm_detect/1.0-1.0502.64649.2.2.gem/lib64:/usr/local/lib:/lib64:/usr/lib/x86_64-linux-gnu:/usr/local/cuda/lib:/usr/local/cuda/lib64:$LD_LIBRARY_PATH
fi

# On Cray, also add the host OMPI Libs
if test -n "$CRAY_OMPI_LD_LIBRARY_PATH"; then
    # Add OpenMPI/2.0.2 libraries
    export LD_LIBRARY_PATH=$CRAY_OMPI_LD_LIBRARY_PATH:$LD_LIBRARY_PATH
    # Apparently these are needed on Titan
    OMPI_MCA_mpi_leave_pinned=0
    OMPI_MCA_mpi_leave_pinned_pipeline=0
fi
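The `test -n` guards are what keep the image portable: when the Cray variables are unset, as on the development cluster, the library path is left untouched. The sketch below isolates that pattern. The function name extend_ld_path and the example paths are illustrative assumptions, not values from the slides.

```shell
#!/bin/sh
# Sketch of the guard pattern used in the /environment fragment.
# $1: host-provided library path (may be empty)
# $2: library path already set inside the container
extend_ld_path() {
    if test -n "$1"; then
        echo "$1:$2"      # host libraries are searched first
    else
        echo "$2"         # nothing injected; container path unchanged
    fi
}

# Development cluster: no host variable set.
extend_ld_path "" "/usr/lib/x86_64-linux-gnu"
# Production system: a (placeholder) host library directory is prepended.
extend_ld_path "/opt/cray/lib64" "/usr/lib/x86_64-linux-gnu"
```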
Setup
• Software
  – Graph500-OSHMEM
    • D’Azevedo 2015 [1]
  – Singularity 2.2.1
  – Ubuntu 17.04
    • OpenMPI v2.0.2 with OpenSHMEM enabled (using oshmem: spml/yoda)
    • GCC 5.4.3
  – Cray Linux Environment (CLE) 5.2 (Linux 3.0.x)
• Testbed Hardware
  – 4 Node, 64 core testbed
  – 10GigE
• Production Hardware
  – Titan @ OLCF
    • 16 cores per node
    • Used 2-64 nodes in tests
  – Gemini network
[1] E. D'Azevedo and N. Imam, Graph 500 in OpenSHMEM, OpenSHMEM Workshop 2015, http://www.csm.ornl.gov/workshops/openshmem2015/documents/talk5_paper_graph500.pdf
Test Cases
1. “Native” – a non-container case
  • Establish baseline for Graph500 without container stuff
2. “Singularity” – container set up to use host network
  • Leverage host communication libraries (Cray Gemini)
  • Inject host comm. libs into container via LD_LIBRARY_PATH
3. “Singularity-NoHost” – #2 minus host libs
  • Run standard container (self-contained comm. libs)
  • Not likely to have super high performance comm. libs as the containers are built on commodity machines
    – i.e., no Cray Gemini headers/libs in container
Example Invocations
Singularity (container)
IMAGE_FILE=$PROJ_DIR/naughton/images-singularity/graph500-oshmem.img
EXE=/benchmarks/src/graph500-oshmem/mpi/graph500_shmem_one_sided
oshrun -np 128 --map-by ppr:2:node --bind-to core \
    singularity exec $IMAGE_FILE $EXE 20 16

Native (no-container)
EXE=$PROJ_DIR/naughton/graph500/mpi/graph500_shmem_one_sided
oshrun -np 128 --map-by ppr:2:node --bind-to core \
    $EXE 20 16
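In these invocations the process count follows from the node count and the processes-per-node mapping: 64 hosts at 2 per node gives `-np 128`. The sketch below derives the command line from those two numbers. The helper build_oshrun is hypothetical and not part of Open MPI, and the trailing `20 16` is read here as scale and edge factor, matching the figure titles.

```shell
#!/bin/sh
# Sketch: derive the oshrun line from host count and processes per host.
# build_oshrun is a hypothetical helper; it only prints the command.
build_oshrun() {
    hosts=$1; ppr=$2; shift 2
    np=$(( hosts * ppr ))
    echo "oshrun -np $np --map-by ppr:$ppr:node --bind-to core $*"
}

EXE=/benchmarks/src/graph500-oshmem/mpi/graph500_shmem_one_sided
# 64 hosts, 2 processes per host, scale 20, edge factor 16.
# '$IMAGE_FILE' is quoted so the variable name is printed literally.
build_oshrun 64 2 singularity exec '$IMAGE_FILE' "$EXE" 20 16
```

The same helper covers the other plotted configurations, for example `build_oshrun 64 1 ...` for the 64-process, 1 ppr runs.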
Development Cluster – Graph500 BFS
• Roughly same performance
  – Native (no-container) & Singularity
  – 2 hosts @ scale=20
[Figure: UB4, 2 hosts - Graph500 BFS mean_time (scale: 20, edges: 16, skip-validate). X-axis: Num Processes (ProcessPerHost); Y-axis: Time (sec).]
Processes: 1 (1ppr), 2 (1ppr), 4 (2ppr), 8 (4ppr), 16 (8ppr)
ub4-native: 11.9, 9.0, 31.7, 13.5, 5.6
ub4-singularity: 10.8, 8.6, 29.8, 13.4, 4.9
Titan – Graph500 BFS
• Roughly same performance
  – Native (no-container) & Singularity
  – Left: 2 hosts @ scale=16 with uGNI BTL
  – Right: 16 hosts & 64 hosts @ scale=20 with uGNI BTL
[Figure, left: Titan - Graph500 BFS mean_time (scale: 16, edges: 16, skip-validate, uGNI). X-axis: Num Processes (NumHosts, ProcessesPerHost); Y-axis: Time (sec).]
Processes: 1 (2 hosts, 1ppr), 2 (2 hosts, 1ppr), 4 (2 hosts, 2ppr), 8 (2 hosts, 4ppr), 16 (2 hosts, 8ppr)
titan-native: 14.5, 10.2, 8.1, 4.9, 3.1
titan-singularity: 14.0, 10.0, 7.8, 4.7, 3.0
[Figure, right: Titan - Graph500 BFS mean_time (scale: 20, edges: 16, skip-validate, uGNI). X-axis: Num Processes (NumHosts, ProcessesPerHost); Y-axis: Time (sec).]
Processes: 64 (16 hosts, 4ppr), 64 (64 hosts, 1ppr), 128 (64 hosts, 2ppr)
titan-native: 17.6, 23.2, 9.0
titan-singularity: 17.3, 23.9, 8.7
Titan – Graph Construction & Generation
• Graph construction & generation times roughly same
  – Native (no-container), Singularity, and Singularity-noHost
  – 64 hosts @ scale=20 with uGNI BTL
[Figure: Titan - Graph500 graph_generation (scale: 20, edges: 16, skip-validate, uGNI). X-axis: Num Processes (NumHosts, ProcessPerHost); Y-axis: Time (sec).]
Processes: 64 (64 hosts, 1ppr), 128 (64 hosts, 2ppr)
titan-native: 22.5, 25.8
titan-singularity: 22.5, 25.5
titan-singularityNoHost: 22.5, 25.4
[Figure: Titan - Graph500 construction_time (scale: 20, edges: 16, skip-validate, uGNI). X-axis: Num Processes (NumHosts, ProcessPerHost); Y-axis: Time (sec).]
Processes: 64 (64 hosts, 1ppr), 128 (64 hosts, 2ppr)
titan-native: 2.6, 2.9
titan-singularity: 2.6, 2.9
titan-singularityNoHost: 3.0, 3.0
Titan – Graph500 BFS mean_time
• Much worse performance when using host library (uGNI)
  – Native (no-container), Singularity (with uGNI), Singularity-noHost
  – 64 hosts @ scale=20 with uGNI BTL
[Figure: Titan - Graph500 BFS mean_time (scale: 20, edges: 16, skip-validate, uGNI). X-axis: Num Processes (NumHosts, ProcessesPerHost); Y-axis: Time (sec).]
Processes: 64 (64 hosts, 1ppr), 128 (64 hosts, 2ppr)
titan-native: 23.2, 9.0
titan-singularity: 23.9, 8.7
titan-singularityNoHost: 1.1, 1.1
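To put the bar labels on this slide in relative terms, the arithmetic can be worked as follows. This is a small sketch using awk for floating-point math; the helper names pct_diff and ratio are hypothetical, and the inputs are the mean BFS times read from the chart (native 23.2 s and 9.0 s, Singularity 23.9 s and 8.7 s, Singularity-NoHost 1.1 s).

```shell
#!/bin/sh
# Sketch: relative difference and ratio from the plotted mean BFS times.
pct_diff() {   # percent difference of $2 relative to baseline $1
    awk -v base="$1" -v val="$2" 'BEGIN { printf "%.1f\n", (val - base) / base * 100 }'
}
ratio() {      # how many times larger $1 is than $2
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

pct_diff 23.2 23.9   # Singularity vs native, 64 procs (1 ppr): prints 3.0
pct_diff 9.0 8.7     # Singularity vs native, 128 procs (2 ppr): prints -3.3
ratio 23.2 1.1       # native (uGNI) vs Singularity-NoHost, 64 procs: prints 21.1
```

So the container itself moves the time by about 3% in either direction, while the uGNI path is roughly 21 times slower than the self-contained (NoHost) path at 64 processes.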
Titan – Native comparison
• OMPI-oshmem uGNI (spml/yoda) worse than CraySHMEM
  – All native (no containers)
  – Suggests something wrong with our OMPI-oshmem uGNI setup ???
[Figure: Titan - Graph500 BFS mean_time (scale: 20, edges: 16, skip-validate). X-axis: Num Processes (NumHosts, ProcessPerHost); Y-axis: Time (sec).]
Processes: 64 (64 hosts, 1ppr), 128 (64 hosts, 2ppr)
titan-native: 23.2, 9.0
crayshmem-native: 6.9, 4.0
Titan – Graph500 BFS mean_time
• Roughly same when uGNI BTL is disabled
  – Native (no uGNI), Singularity (no uGNI), Singularity-noHost
  – 64 hosts @ scale=20 without uGNI BTL
[Figure: Titan - Graph500 BFS mean_time (scale: 20, edges: 16, skip-validate, TCP/"no uGNI"). X-axis: Num. Processes (NumHosts, ProcessPerHost); Y-axis: Time (sec).]
Processes: 64 (64 hosts, 1ppr), 128 (64 hosts, 2ppr)
titan-native-NOugni: 1.1, 0.9
titan-singularity-NOugni: 1.0, 0.9
titan-singularityNoHost-NOugni: 1.1, 1.2
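The slides do not show how the uGNI BTL was disabled for these runs. One common way in Open MPI is to exclude a BTL component with the `^` prefix on the `btl` MCA parameter, sketched below as an assumption. The helper no_ugni_cmd is hypothetical and only prints the command line, and the executable path is a placeholder.

```shell
#!/bin/sh
# Sketch: prefix an oshrun command with an MCA option that excludes the
# uGNI BTL from component selection. Assumed, not taken from the slides.
no_ugni_cmd() {
    echo "oshrun --mca btl ^ugni $*"
}

no_ugni_cmd -np 64 --map-by ppr:1:node --bind-to core ./graph500_shmem_one_sided 20 16
```

With uGNI excluded, Open MPI falls back to the remaining BTLs it can use, which matches the "TCP" label in the figure title.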
Observations (1)
• Graph500
  – Native & Singularity had roughly same performance
    • Both uGNI & TCP BTLs
  – Singularity-NoHost had better performance in our tests due to problems with OMPI-2.0.2’s OSHMEM with uGNI
• Open MPI’s (v2.0.2) OpenSHMEM
  – Good: Maybe the only OSHMEM included in a Linux distro
  – Good: General TCP BTL was stable for testing and showed decent performance in our Graph500 tests at scale=20 on 64 nodes
  – Bad: Cray Gemini (uGNI) BTL is not stable with OSHMEM interface (using MCA spml/yoda)
Observations (2)
• Singularity in production
  – Performance was consistent between native & Singularity
  – Required customizing the image with dirs for bind mounts
  – Inconvenient to push full image for all edits to the image
    • Example: a change to the ‘/environment’ file requires a full re-upload
    • Note: Singularity-2.3 may have improved this, e.g., no need for ‘sudo’ for “copy”, and env can be set via ‘SINGULARITYENV_xxx’.
  – User-defined bind mounts disabled on older CLE kernel
  – Not able to use Cray SHMEM for container case
    • Note: Cannot create image on system, and doing so would defeat the purpose of a portable container (would not run on devel cluster)
Future Work
• Run with revised OpenSHMEM configuration
  – Use different component in Open MPI’s SPML framework
    • Disable Yoda (spml/yoda)
    • Enable UCX (spml/ucx)
  – Determine root cause of unexpected performance with uGNI
• Scale-up tests on production system
  – Perform larger node/core count MPI and OSHMEM tests on Titan
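Switching SPML components is a launch-time choice in Open MPI, made through the `spml` MCA parameter. The sketch below shows what the revised configuration might look like. It assumes an Open MPI build that includes the UCX component, and the helper spml_cmd and the executable path are hypothetical.

```shell
#!/bin/sh
# Sketch: select an SPML component by name through an MCA option.
# spml_cmd only prints the command; it does not launch anything.
spml_cmd() {
    component=$1; shift
    echo "oshrun --mca spml $component $*"
}

spml_cmd ucx -np 128 --map-by ppr:2:node --bind-to core ./graph500_shmem_one_sided 20 16
```

Passing `yoda` instead of `ucx` would reproduce the component named in the setup slide, which makes side-by-side comparison straightforward.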
Summary
• Containers in HPC
  – Motivations
    • Productivity of users
    • Reproducibility of experiments
  – Overview of two key container runtimes
    • Singularity & Docker
• Evaluation on Titan
  – Graph500 benchmark to investigate performance of OpenSHMEM application with Singularity-based containers on production machine
  – Identified basic set of image edits for use on Titan
  – Consistent performance between Native and Singularity
    • Note: Identified unexpected slowness (unexplained) when using uGNI BTL
Acknowledgements
This work was supported by the United States Department of Defense (DoD) and used resources of the Computational Research and Development Programs at Oak Ridge National Laboratory.
Questions?