Sandia National Laboratories is a multi-mission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525. SAND2018-2551 PE
Emerging Technologies in Productivity for HPC
Rob Hoekstra
PSAAP III Pre-Proposal Conference, March 14, 2018
Complexity UP, Productivity DOWN
• HW is more complex
• Software stack is more complex
• Programming Environment/Model is more complex
• Execution/Operations Environment is more complex
• All these factors can negatively impact PRODUCTIVITY
Productivity has been declining rapidly in the HPC environment
• Dramatic increase in complexity of algorithms and applications, coupled with a dramatic increase in complexity, diversity and scale of HW and execution environments
• AND CS/CSE research on productivity pays little attention to our HPC-specific problems (there are counter-examples such as IDEAS)
Even worse for our Mission Codes
• Complexity, size and dependencies of our codes are well above average even in the HPC community
• Verification/validation requirements create a much higher bar for incorporation of new capability, whether it be physics, algorithms or performance optimization
• And to make it worse, leadership-class platform environments (SW stack, etc.) are often more immature/fragile than average
Bottlenecks
• Code development
• Code correctness/testing
• Platform-specific tuning/optimization
• Problem setup
• Job Execution & Steering
• Analysis & Viz
Code Development
• MPI/Fortran code
• C++, hierarchical parallel constructs, layered dependencies
• Teko: Block Segregated Preconditioning
• Anasazi: Parallel Eigensolvers
• Belos: Parallel Block Iterative Solvers (FT/CA)
• MueLu: Multi-Level Preconditioning
• Ifpack2: ILU Subdomain Preconditioning
• ShyLU: Hybrid MPI/Threaded Linear Solver
• Tpetra: Distributed Linear Algebra
• Zoltan2: Load Balancing & Partitioning
• Kokkos: Heterogeneous Node Parallel Kernels
[Figure: layered software stack. Labels: Applications & Advanced Analysis Components; Anasazi: Eigensolvers; Belos: Linear Solvers; MueLu: MultiGrid Preconditioners; ShyLU/Basker: Direct Solvers; Ifpack2: Subdomain Preconditioners; Tpetra: Scalable Linear Algebra; Zoltan2: Load Balance/Partitioning; Kokkos Sparse Linear Algebra; Kokkos Containers; Kokkos Core/Kernels; Back-ends: OpenMP, pthreads, CUDA, Qthreads, ...]
Testing/Verification
• “Eyeball” Norm
• Large verification test suites, non-reproducibility, etc.
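One way to move from the “eyeball” norm to an automated check is to compare results against a reference with an explicit tolerance. The sketch below is illustrative only: the function names and the default tolerance are assumptions, not taken from any code discussed in the talk. It also shows non-reproducibility in miniature, since floating-point addition is not associative and a changed summation order can alter the last bits of a result.

```python
import math


def relative_l2_error(computed, reference):
    """Relative L2 norm of the difference between two equal-length vectors."""
    if len(computed) != len(reference):
        raise ValueError("vectors must have the same length")
    diff = math.sqrt(sum((c - r) ** 2 for c, r in zip(computed, reference)))
    ref = math.sqrt(sum(r ** 2 for r in reference))
    # Fall back to the absolute error when the reference is identically zero.
    return diff / ref if ref > 0.0 else diff


def passes(computed, reference, rel_tol=1e-12):
    """Quantitative pass/fail criterion; rel_tol is a hypothetical default."""
    return relative_l2_error(computed, reference) <= rel_tol


# The same three numbers summed in two different orders, as a parallel
# reduction might do from run to run.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

bitwise_equal = left_to_right == right_to_left            # False
within_tolerance = passes([left_to_right], [right_to_left])  # True
```

A bitwise comparison would flag the two sums as different, while the tolerance-based check accepts them, which is usually the behavior a verification suite wants when reduction order can vary.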
Performance tuning/optimization
• PRINTF (still fall back to this many times :)
• Performance analysis and “divination”
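A small step up from PRINTF-style timing is to accumulate wall-clock time per named region and report the totals at the end. The following is a minimal sketch using only the Python standard library; `RegionTimer` and the region names are hypothetical and do not refer to any particular profiling tool.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class RegionTimer:
    """Accumulates wall-clock time and call counts per named code region."""

    def __init__(self):
        self.total = defaultdict(float)
        self.calls = defaultdict(int)

    @contextmanager
    def region(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the time even if the timed block raises.
            self.total[name] += time.perf_counter() - start
            self.calls[name] += 1

    def report(self):
        """Return (name, calls, seconds) tuples, most expensive region first."""
        rows = sorted(self.total.items(), key=lambda kv: kv[1], reverse=True)
        return [(name, self.calls[name], seconds) for name, seconds in rows]


timer = RegionTimer()
for _ in range(3):
    with timer.region("assemble"):
        sum(i * i for i in range(10_000))
    with timer.region("solve"):
        sorted(range(10_000), reverse=True)

for name, calls, seconds in timer.report():
    print(f"{name:10s} calls={calls} total={seconds:.6f}s")
```

Sorting the report by total time points at the region worth investigating first, which a scattering of print statements does not do on its own.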
Problem Setup
• Card Deck
• Complex workflow with geometry/meshing, etc.
Cubit Hex Meshing Capability
• Housing geometry has 13 ‘volumes’, decomposed into meshable volumes
• Cubit Journal file: 6200 lines long, manually constructed
• 800+ manually specified webcuts defined
• 1500+ geometry cleanup commands
• 500+ meshing commands
• 13 volumes to 500 webcut volumes
• 1000+ hours of tedium
• Turnaround time: 9 months
Job execution/steering
• C:>Runapp.exe
• Complex workflows of multi-physics, multiple codes, steering, data collection
[Figure: Containerized Workflow. Labels: Setup; Simulation Application; Analytics; Visualization; Machine Learning; Workflow driver; Ensemble or Iterative Workflow; Common Model Inputs; Abstracted Storage Interface; Abstracted System Monitoring; HPC system]
Analysis/Viz
• Quantity = X
• Complex data flows/viz packages/UQ/validation
Areas of opportunity
• What are future HPC “High Productivity” Programming Models?
• What are future HPC “High Productivity” Development Environments?
• What are future HPC “High Productivity” Runtime/Execution Environments?
AND
• Is there a more coherent unification of design time, compile time and runtime environments/tools?
Programming Models
• What are future HPC “High Productivity” Programming Models?
• Portability Abstractions
• Async Multi-Tasking
• DSLs
• Component-based development
RAJA, Legion (A Data-Centric Parallel Programming System), DARMA, Charm++, Uintah
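The asynchronous multi-tasking idea, in which work is expressed as tasks with data dependencies and a runtime decides when each task runs, can be illustrated with a toy scheduler. This is a minimal sketch built on Python's standard `concurrent.futures` module. It is not how RAJA, Legion, DARMA, Charm++, or Uintah are implemented, and `run_task_graph` is a hypothetical name.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_task_graph(tasks, max_workers=4):
    """Run a task graph; tasks maps name -> (function, [dependency names]).

    Each function receives the results of its dependencies as positional
    arguments, in the order the dependencies are listed.
    """
    results = {}
    pending = dict(tasks)   # tasks not yet submitted
    running = {}            # future -> task name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while pending or running:
            # Launch every task whose dependencies have all completed.
            ready = [name for name, (_, deps) in pending.items()
                     if all(dep in results for dep in deps)]
            for name in ready:
                func, deps = pending.pop(name)
                future = pool.submit(func, *(results[dep] for dep in deps))
                running[future] = name
            if not running:
                raise ValueError("dependency cycle or missing task: "
                                 + ", ".join(sorted(pending)))
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                results[running.pop(future)] = future.result()
    return results


graph = {
    "a": (lambda: 2, []),
    "b": (lambda: 3, []),
    "c": (lambda a, b: a * b, ["a", "b"]),
    "d": (lambda c: c + 1, ["c"]),
}
outputs = run_task_graph(graph)
```

Tasks "a" and "b" have no dependencies and may run concurrently; "c" starts only after both finish, and "d" after "c". The caller states dependencies, not an execution order.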
Development Environment
• What are future HPC “High Productivity” Development Environments?
• IDEs
• Auto-tuning
• Higher-level languages/scripting
• Open compiler environments
• Automated testing
• CSE SW Engineering “Best Practices”
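Of the items above, auto-tuning lends itself to a compact illustration: run a kernel with several candidate values of a tuning parameter, time each, and keep the fastest. The sketch below is an empirical search under simple assumptions. The kernel, the candidate block sizes, and the `autotune` helper are all hypothetical, and production auto-tuners search far larger spaces with more careful measurement.

```python
import time


def blocked_sum_of_squares(data, block_size):
    """Toy kernel whose loop structure depends on a tunable block size."""
    total = 0.0
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        total += sum(x * x for x in block)
    return total


def autotune(kernel, data, candidates, repeats=3):
    """Return (best_parameter, best_time) over the candidate parameters."""
    best_param, best_time = None, float("inf")
    for param in candidates:
        # Take the minimum over several repeats to reduce timing noise.
        elapsed = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            kernel(data, param)
            elapsed = min(elapsed, time.perf_counter() - start)
        if elapsed < best_time:
            best_param, best_time = param, elapsed
    return best_param, best_time


data = [float(i) for i in range(1000)]
candidates = [16, 64, 256]
best_block, best_seconds = autotune(blocked_sum_of_squares, data, candidates)
```

The selected block size depends on the machine and the run, so the result is something to cache per platform, not to hard-code.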
Software Productivity for Extreme-scale Science
Methodologies for Software Productivity
Use Cases: Terrestrial Modeling
Extreme-scale Scientific Software Development Kit (xSDK)
IDEAS: Interoperable Design of Extreme-scale Application Software
• Project began in Sept 2014 as ASCR/BER partnership to improve application software productivity, quality, and sustainability
Resources: https://ideas-productivity.org/resources, featuring
• WhatIs and HowTo docs: concise characterizations & best practices
• What is Software Configuration?
• How to Configure Software
• What is CSE Software Testing?
• What is Version Control?
• What is Good Documentation?
• How to Write Good Documentation
• How to Add and Improve Testing in a CSE Software Project
• How to do Version Control with Git in your CSE Project
• More under development
www.ideas-productivity.org
PIs: Michael Heroux (SNL) and Lois Curfman McInnes (ANL)
Co-PIs: David Bernholdt (ORNL), Todd Gamblin (LLNL), Osni Marques (LBNL), David Moulton (LANL), Boyana Norris (Univ of Oregon)
Runtime/Execution Environment
• What are future HPC “High Productivity” Runtime/Execution Environments?
• Workflows
• Tasking
• Machine Learning
• Problem Setup
• Containers
Productivity improvement can be a common thread in PSAAP center activities
• “Focus” on productivity-enhancing technologies that are highly synergistic with other goals
• Workflows
• Programming Models/Environments
• Machine Learning
• Component-based Approaches
• Tell us how your center will leverage research in these areas to have a big positive impact on PRODUCTIVITY.
Questions?
Michael Heroux, 2017 DOE CSGF Meeting
[Figure: Code-and-Fix Development Approach. Percent effort (0 % to 100 %) versus time, divided among Planning, Visible Progress (writing code, computing results), and Recoding and Porting to new Platforms, with Early, Midlife, and Endlife effort profiles. Adapted from Software Project Survival Guide, Steve McConnell.]
[Figure: Simple Planned Development Approach. Percent effort (0 % to 100 %) versus time, divided among Planning, Visible Progress (writing code, computing results), and Recoding and Porting to new Platforms, with Early, Midlife, and Ongoing effort profiles.]