Advanced High Performance Computing Workshop: HPC 201
Charles J Antonelli, Mark Champe, Seth Meyer
LSAIT ARS
September 2014
Roadmap
Flux review
Globus Connect
Advanced PBS: array & dependent scheduling
Tools
GPUs on Flux
Scientific applications: R, Python, MATLAB
Parallel programming
Debugging & profiling
A Flux node
12-40 Intel cores
48 GB – 1 TB RAM
Local disk
8 GPUs (GPU Flux)
Each GPU contains 2,688 GPU cores
Programming Models
Two basic parallel programming models
Message-passing
The application consists of several processes running on different nodes and communicating with each other over the network
Used when the data are too large to fit on a single node, and simple synchronization is adequate
“Coarse parallelism” or “SPMD”
Implemented using MPI (Message Passing Interface) libraries
Multi-threaded
The application consists of a single process containing several parallel threads that communicate with each other using synchronization primitives
Used when the data can fit into a single process, and the communications overhead of the message-passing model is intolerable
“Fine-grained parallelism” or “shared-memory parallelism”
Implemented using OpenMP (Open Multi-Processing) compilers and libraries
Both models can be combined in a single hybrid application (see the sketch below)
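As a rough sketch (not from the original slides), here is how the two models typically map onto PBS resource requests; the core counts and program names are illustrative assumptions, and the same requests appear in the loosely- and tightly-coupled batch scripts later in this deck:

# Message-passing (MPI): cores may be spread across several nodes
#PBS -l procs=12,pmem=1gb,walltime=00:05:00
mpirun ./mpiprogram

# Multi-threaded (OpenMP): all cores on one node, sharing memory
#PBS -l nodes=1:ppn=12,mem=4gb,walltime=00:05:00
export OMP_NUM_THREADS=12
./openmpprogram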
Using Flux
Three basic requirements:
A Flux login account
A Flux allocation
An MToken (or a Software Token)
Logging in to Flux
ssh flux-login.engin.umich.edu
Campus wired or MWireless
Otherwise:
VPN
ssh login.itd.umich.edu first
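For example (a sketch; "uniqname" is a placeholder for your own login name, and the -Y flag enables trusted X11 forwarding, which the xterm and DDT slides later rely on):

ssh -Y [email protected]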
Cluster batch workflow
You create a batch script and submit it to PBS
PBS schedules your job, and it enters the flux queue
When its turn arrives, your job will execute the batch script
Your script has access to all Flux applications and data
When your script completes, anything it sent to standard output and error is saved in files stored in your submission directory
You can ask that email be sent to you when your job starts, ends, or aborts
You can check on the status of your job at any time, or delete it if it's not doing what you want
A short time after your job completes, it disappears from PBS
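A minimal command sequence illustrating this workflow (a sketch; the script name and job id are placeholders, and qsub, qstat, and qdel are standard Torque/PBS commands):

qsub myjob.pbs        # submit the batch script; prints a job id such as 12345
qstat -u uniqname     # check the status of your queued and running jobs
qdel 12345            # delete the job if it's not doing what you want
# after completion, output appears in your submission directory in files
# named after the job, e.g. yourjobname.o12345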
Loosely-coupled batch script
#PBS -N yourjobname
#PBS -V
#PBS -A youralloc_flux
#PBS -l qos=flux
#PBS -q flux
#PBS -l procs=12,pmem=1gb,walltime=00:05:00
#PBS -M youremailaddress
#PBS -m abe
#PBS -j oe

# Your Code Goes Below:
cat $PBS_NODEFILE
cd $PBS_O_WORKDIR
mpirun ./c_ex01
Tightly-coupled batch script
#PBS -N yourjobname
#PBS -V
#PBS -A youralloc_flux
#PBS -l qos=flux
#PBS -q flux
#PBS -l nodes=1:ppn=12,mem=4gb,walltime=00:05:00
#PBS -M youremailaddress
#PBS -m abe
#PBS -j oe

# Your Code Goes Below:
cd $PBS_O_WORKDIR
matlab -nodisplay -r script
GPU batch script
#PBS -N yourjobname
#PBS -V
#PBS -A youralloc_flux
#PBS -l qos=flux
#PBS -q flux
#PBS -l nodes=1:gpus=1,walltime=00:05:00
#PBS -M youremailaddress
#PBS -m abe
#PBS -j oe

# Your Code Goes Below:
cat $PBS_NODEFILE
cd $PBS_O_WORKDIR
matlab -nodisplay -r gpuscript
Copying data
Three ways to copy data to/from Flux
Use scp from the login server:
scp flux-login.engin.umich.edu:hpc201cja/example563.png .
Use scp from the transfer host:
scp flux-xfer.engin.umich.edu:hpc201cja/example563.png .
Use Globus Online
Globus Online
Features
High-speed data transfer, much faster than SCP or SFTP
Reliable & persistent
Minimal client software: Mac OS X, Linux, Windows
GridFTP Endpoints
Gateways through which data flow
Exist for XSEDE, OSG, …
UMich: umich#flux, umich#nyx
Add your own client endpoint!
Add your own server endpoint: contact [email protected]
More information:
http://cac.engin.umich.edu/resources/login-nodes/globus-gridftp
Job Arrays
• Submit copies of identical jobs
• Invoked via qsub -t:
qsub -t array-spec pbsbatch.txt
Where array-spec can be
m-n
a,b,c
m-n%slotlimit
e.g.
qsub -t 1-50%10
Fifty jobs, numbered 1 through 50; only ten can run simultaneously
• $PBS_ARRAYID records array identifier
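For example, a sketch of an array batch script that uses $PBS_ARRAYID to pick a different input file for each copy of the job (the program and file names are placeholders; the PBS directives mirror the earlier examples):

#PBS -N arrayexample
#PBS -V
#PBS -A youralloc_flux
#PBS -l qos=flux
#PBS -q flux
#PBS -l procs=1,pmem=1gb,walltime=00:05:00
#PBS -j oe

cd $PBS_O_WORKDIR
# each copy sees its own index: task 7 reads input.7 and writes output.7
./myprogram input.$PBS_ARRAYID > output.$PBS_ARRAYID

Submitted with qsub -t 1-50%10 arrayexample.pbs, this runs fifty copies, at most ten at a time.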
Dependent scheduling
• Submit job to become eligible for execution at a given time
• Invoked via qsub -a:
qsub -a [[[[CC]YY]MM]DD]hhmm[.SS] …
qsub -a 201412312359 j1.pbs
j1.pbs becomes eligible one minute before New Year's Day 2015
qsub -a 1800 j2.pbs
j2.pbs becomes eligible at six PM today (or tomorrow, if submitted after six PM)
Dependent scheduling
• Submit job to run after specified job(s)
• Invoked via qsub -W:
qsub -W depend=type:jobid[:jobid]…
Where type can be:
after        Schedule this job after jobids have started
afterany     Schedule this job after jobids have finished
afterok      Schedule this job after jobids have finished with no errors
afternotok   Schedule this job after jobids have finished with errors
qsub first.pbs                              # assume it receives jobid 12345
qsub -W depend=afterany:12345 second.pbs
Schedules second.pbs after first.pbs completes
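The job id printed by qsub can also be captured in a shell variable so the dependency does not have to be typed by hand (a sketch; script names follow the example above):

first=$(qsub first.pbs)                      # qsub prints the full job id, e.g. 12345.<server>
qsub -W depend=afterany:$first second.pbs    # second.pbs runs only after first.pbs finishes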
Dependent scheduling
• Submit job to run before specified job(s)
• Requires dependent jobs to be scheduled first
• Invoked via qsub -W:
qsub -W depend=type:jobid[:jobid]…
Where type can be:
before       jobids scheduled after this job starts
beforeany    jobids scheduled after this job completes
beforeok     jobids scheduled after this job completes with no errors
beforenotok  jobids scheduled after this job completes with errors
on:N         wait for N job completions
qsub -W depend=on:1 second.pbs              # assume it receives jobid 12345
qsub -W depend=beforeany:12345 first.pbs
Schedules second.pbs after first.pbs completes
Troubleshooting
showq [-r][-i][-b][-w user=uniq] # running/idle/blocked jobs
qstat -f jobno # full info incl gpu
qstat -n jobno # nodes/cores where job running
diagnose -p # job prio and components
pbsnodes # nodes, states, properties
pbsnodes -l # list nodes marked down
checkjob [-v] jobno # why job jobno not running
mdiag -a # allocs & users (flux)
freenodes # aggregate node/core busy/free
mdiag -u uniq # allocs for uniq (flux)
mdiag -a alloc_flux # cores active, alloc (flux)
Scientific Applications
R (incl snow and multicore)
R with GPU (GpuLm, dist)
SAS, Stata
Python, SciPy, NumPy, BioPy
MATLAB with GPU
CUDA Overview
CUDA C (matrix multiply)
Python
Python software available on Flux
EPD
The Enthought Python Distribution provides scientists with a comprehensive set of tools to perform rigorous data analysis and visualization.
https://www.enthought.com/products/epd/
biopython
Python tools for computational molecular biology
http://biopython.org/wiki/Main_Page
numpy
Fundamental package for scientific computing
http://www.numpy.org/
scipy
Python-based ecosystem of open-source software for mathematics, science, and engineering
http://www.scipy.org/
Debugging with GDB
Command-line debugger
Start programs or attach to running programs
Display source program lines
Display and change variables or memory
Plant breakpoints, watchpoints
Examine stack frames
Excellent tutorial documentation:
http://www.gnu.org/s/gdb/documentation/
Compiling for GDB
Debugging is easier if you ask the compiler to generate extra source-level debugging information
Add the -g flag to your compilation:
icc -g serialprogram.c -o serialprogram
or
mpicc -g mpiprogram.c -o mpiprogram
GDB will work without symbols
But then you need to be fluent in machine instructions and hexadecimal
Be careful using -O with -g
Some compilers won’t optimize code when debugging
Most will, but you sometimes won’t recognize the resulting source code at optimization level -O2 and higher
Use -O0 -g to suppress optimization
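Putting the two flags together (a sketch following the earlier compile lines):

icc -O0 -g serialprogram.c -o serialprogram
mpicc -O0 -g mpiprogram.c -o mpiprogram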
Running GDB
Two ways to invoke GDB:
Debugging a serial program:
gdb ./serialprogram
Debugging an MPI program:
mpirun -np N xterm -e gdb ./mpiprogram
This gives you N separate GDB sessions, each debugging one rank of the program
Remember to use the -X or -Y option to ssh when connecting to Flux, or you can't start xterms there
Useful GDB commands
gdb exec               start gdb on executable exec
gdb exec core          start gdb on executable exec with core file core
l [m,n]                list source
disas                  disassemble function enclosing current instruction
disas func             disassemble function func
b func                 set breakpoint at entry to func
b line#                set breakpoint at source line#
b *0xaddr              set breakpoint at address addr
i b                    show breakpoints
d bp#                  delete breakpoint bp#
r [args]               run program with optional args
bt                     show stack backtrace
c                      continue execution from breakpoint
step                   single-step one source line
next                   single-step, don't step into functions
stepi                  single-step one instruction
p var                  display contents of variable var
p *var                 display value pointed to by var
p &var                 display address of var
p arr[idx]             display element idx of array arr
x 0xaddr               display hex word at addr
x *0xaddr              display hex word pointed to by addr
x/20x 0xaddr           display 20 words in hex starting at addr
i r                    display registers
i r ebp                display register ebp
set var = expression   set variable var to expression
q                      quit gdb
Debugging with DDT
Allinea's Distributed Debugging Tool is a comprehensive graphical debugger designed for the complex task of debugging parallel code
Advantages include:
Provides a GUI interface to debugging
Similar capabilities to, e.g., Eclipse or Visual Studio
Supports parallel debugging of MPI programs
Scales much better than GDB
Running DDT
Compile with -g:
mpicc -g mpiprogram.c -o mpiprogram
Load the DDT module:
module load ddt
Start DDT:
ddt mpiprogram
This starts a DDT session, debugging all ranks concurrently
Remember to use the -X or -Y option to ssh when connecting to Flux, or you can't start ddt there
http://arc.research.umich.edu/flux-and-other-hpc-resources/flux/software-library/
http://content.allinea.com/downloads/userguide.pdf
Application Profiling with MAP
Allinea’s MAP Tool is a statistical application profiler designed for the complex task of profiling parallel code
Advantages include:
Provides a GUI interface to profiling
Observe cumulative results, drill down for details
Supports parallel profiling of MPI programs
Handles most of the details under the covers
Running MAP
Compile with -g:
mpicc -g mpiprogram.c -o mpiprogram
Load the MAP module:
module load ddt
Start MAP:
map mpiprogram
This starts a MAP session
Runs your program, gathers profile data, and displays summary statistics
Remember to use the -X or -Y option to ssh when connecting to Flux, or you can't start MAP there
http://content.allinea.com/downloads/userguide.pdf
Resources
U-M Advanced Research Computing Flux pages:
http://arc.research.umich.edu/flux-and-other-hpc-resources/flux/software-library/
http://arc.research.umich.edu/resources-services/flux/
CAEN HPC Flux pages:
http://cac.engin.umich.edu/
CAEN HPC YouTube channel:
http://www.youtube.com/user/UMCoECAC
For assistance: [email protected]
Read by a team of people including unit support staff
They cannot help with programming questions, but can help with operational Flux and basic usage questions
References
1. Supported Flux software, http://arc.research.umich.edu/flux-and-other-hpc-resources/flux/software-library/ (accessed June 2014).
2. Free Software Foundation, Inc., “GDB User Manual,” http://www.gnu.org/s/gdb/documentation/ (accessed June 2014).
3. Intel C and C++ Compiler 14 User and Reference Guide, https://software.intel.com/en-us/compiler_14.0_ug_c (accessed June 2014).
4. Intel Fortran Compiler 14 User and Reference Guide, https://software.intel.com/en-us/compiler_14.0_ug_f (accessed June 2014).
5. Torque Administrator’s Guide, http://docs.adaptivecomputing.com/torque/4-2-8/torqueAdminGuide-4.2.8.pdf (accessed June 2014).
6. Submitting GPGPU Jobs, https://sites.google.com/a/umich.edu/engin-cac/resources/systems/flux/gpgpus (accessed June 2014).