TRANSCRIPT
© 2013 Regents of the University of Minnesota. All rights reserved.
Minnesota Supercomputing Institute
Introduction to Job Submission and Scheduling
Andrew Gustafson
Interacting with MSI Systems
Connecting to MSI
SSH is the most reliable connection method.
Linux and Mac users can use the terminal command: ssh login.msi.umn.edu
Windows users will need an SSH-capable program, such as PuTTY or Cygwin.
SSH connections must first connect to login.msi.umn.edu. From there you can connect to other systems.
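The two-hop connection described above might look like the following session sketch. The cluster hostname `mesabi.msi.umn.edu` is taken from the cluster list later in this deck; `username` is a placeholder for your MSI account name.

```shell
# First hop: connect to the MSI login node
ssh [email protected]

# Second hop, run from the login node: connect to a cluster (Mesabi here)
ssh mesabi.msi.umn.edu
```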
For graphical connections use NICE: www.nice.msi.umn.edu
MSI Computing Environment
MSI systems are primarily Linux compute clusters running CentOS.
Software is managed via a module system.
Jobs are scheduled via a queueing system.
Home directories are unified across systems.
Machine Architecture: Cluster
Source: http://en.wikipedia.org/wiki/Cluster_%28computing%29
Clusters at MSI

Mesabi
- About 17,700 total cores, on Intel Haswell processors.
- 24 cores and 62 GB per node in the large primary queues.
- Special queues with large memory (up to 1 TB) and GPUs.
- Allows node sharing: good for both small and large jobs.
- mesabi.msi.umn.edu

Itasca
- About 9,000 total cores, on Intel Nehalem processors.
- 8 cores and 22 GB per node in the large primary queue.
- Special queues with larger memory and 16 cores per node.
- itasca.msi.umn.edu

Lab Server
- About 500 total cores, on older hardware.
- For interactive or small single-node jobs.
- 8 cores and 15 GB per node in the primary queue.
- lab.msi.umn.edu
Clusters at MSI
[Diagram: the login node connects out to the Mesabi, Itasca, and Lab clusters.]
First connect to login.msi.umn.edu, then connect to a cluster.
You must be on campus, or using the VPN: https://it.umn.edu/virtual-private-network-vpn
Home Directories
Home directories are unified across all Linux systems.
Each group has a disk quota which can be viewed with the command: groupquota
Panasas ActivStor 14: 3.01PB storage, capable of 30 GB/sec read/write, and 270,000 IOPS
Loading Software
Software modules alter environment variables in order to make software available. MSI has hundreds of software modules.
Module Commands:

Description                      Command          Example
See all available modules        module avail     module avail
Load a module                    module load      module load matlab/2015a
Unload a module                  module unload    module unload matlab/2015a
Unload all modules               module purge     module purge
See what a module does           module show      module show matlab/2015a
List currently loaded modules    module list      module list
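A typical session using the commands in the table above might look like this; the MATLAB module version is just the example from the table.

```shell
module avail                 # list every module MSI provides
module load matlab/2015a     # add MATLAB to the environment
module list                  # confirm what is currently loaded
module show matlab/2015a     # see which variables the module sets
module unload matlab/2015a   # remove just this module
module purge                 # or clear all loaded modules at once
```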
Job Scheduling
On MSI systems, calculations are performed within “jobs”. A job is a planned calculation that will run for a specified length of time on a specified set of hardware.
There are two types of job:
1. Non-interactive (the vast majority)
2. Interactive
The job scheduler front-end is called the Portable Batch System (PBS).
Jobs start in your home directory with no modules loaded.
Job Scripts
To submit a non-interactive job, first make a PBS job script.
Example:

#!/bin/bash -l
#PBS -l walltime=8:00:00,nodes=3:ppn=8,pmem=1000mb
#PBS -m abe
#PBS -M [email protected]

cd ~/program_directory
module load intel
module load ompi/intel
mpirun -np 24 program_name < inputfile > outputfile
Job Submission
To submit a job script use the command:

qsub -q queuename scriptname
A list of queues available on different systems can be found here: https://www.msi.umn.edu/queues
Submit jobs to a queue which is appropriate for the resources needed.
Resources to consider when choosing a queue:
- Walltime
- Total cores and cores per node
- Memory
- Special hardware (GPUs, etc.)
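Putting the pieces together, a submission might look like the sketch below. The queue name `small` and script name `myjob.pbs` are placeholders, not real recommendations; check https://www.msi.umn.edu/queues for the queues that actually exist on each system.

```shell
# Submit myjob.pbs to a hypothetical queue named "small"
qsub -q small myjob.pbs
```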
Job Submission
To view queued jobs use the commands:

qstat -u username
showq -w user=username

For detailed information:

checkjob -v jobnumber

To cancel a submitted job use the command:

qdel jobnumber
Interactive Jobs
Nodes may be requested for interactive use using the command:
qsub -I -X -l walltime=1:00:00,nodes=1:ppn=8,mem=2gb
The job waits in the queue like any other job; when it begins, control returns to your terminal.
Service Units (SUs)
Jobs on the high performance computing (HPC) systems consume Service Units (SUs), which roughly correspond to processor time.
Each research group is given a service unit allocation at the beginning of the year. To view the number of service units remaining use the command: acctinfo
If a group is using service units faster than the "fairshare target", then the group's jobs will have lower queue priority.
Simple Parallelization: Backgrounding
This is most easily done with single-node jobs.
#!/bin/bash -l
#PBS -l walltime=8:00:00,nodes=1:ppn=8,pmem=1000mb
#PBS -m abe
#PBS -M [email protected]

cd ~/job_directory
module load example/1.0
./program1.exe < input1 > output1 &
./program2.exe < input2 > output2 &
./program3.exe < input3 > output3 &
./program4.exe < input4 > output4 &
./program5.exe < input5 > output5 &
./program6.exe < input6 > output6 &
./program7.exe < input7 > output7 &
./program8.exe < input8 > output8 &
wait
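The eight explicit background lines above can equally be written as a loop. A minimal runnable sketch of the same pattern, with `echo` standing in for the real program:

```shell
#!/bin/bash
# Launch 8 tasks in the background, one per input index,
# then block until every one of them has finished.
for i in $(seq 1 8); do
    echo "task $i" > output$i &
done
wait            # without this, the job script would exit while tasks still run
```

The `wait` at the end is the important part: PBS considers the job finished when the script exits, so forgetting `wait` would kill the background processes mid-run.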
Simple Parallelization: Job Arrays
This works best on Mesabi.
Template job script, template.pbs:

#!/bin/bash -l
#PBS -l walltime=8:00:00,nodes=1:ppn=8,pmem=1000mb
#PBS -m abe
#PBS -M [email protected]

cd ~/job_directory
module load example/1.0
./program.exe < input$PBS_ARRAYID > output$PBS_ARRAYID
Submit an array of 10 jobs:

qsub -t 1-10 template.pbs
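What `-t 1-10` does can be sketched locally: the scheduler runs one copy of the template per index, each with `$PBS_ARRAYID` set to that index. The loop below is a runnable stand-in for a 3-element array (no scheduler involved; `PBS_ARRAYID` is set by hand, and `echo` stands in for `program.exe`):

```shell
#!/bin/bash
# Simulate qsub -t 1-3: each iteration is one array element,
# seeing its own value of PBS_ARRAYID, just as each array job would.
for PBS_ARRAYID in 1 2 3; do
    export PBS_ARRAYID
    echo "processing input$PBS_ARRAYID" > output$PBS_ARRAYID
done
```

On the real system the copies run concurrently as separate jobs, which is why this works best on Mesabi, where node sharing lets many small array elements pack onto the cluster.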
Minnesota Supercomputing Institute
The University of Minnesota is an equal opportunity educator and employer. This PowerPoint is available in alternative formats upon request. Direct requests to Minnesota Supercomputing Institute, 599 Walter Library, 117 Pleasant St. SE,
Minneapolis, Minnesota, 55455, 612-624-0528.
Web: www.msi.umn.edu
Email: [email protected]
Telephone: (612) 626-0802