
© 2013 Regents of the University of Minnesota. All rights reserved.

Minnesota Supercomputing Institute


Introduction to Job Submission and Scheduling

Andrew Gustafson


Interacting with MSI Systems


Connecting to MSI

SSH is the most reliable connection method.

Linux and Mac users can use the terminal command: ssh login.msi.umn.edu

Windows users will need an SSH-capable program, such as PuTTY or Cygwin.

SSH connections must first connect to login.msi.umn.edu. From there you can connect to other systems.
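As a sketch, the two-hop connection from a local terminal looks like this (the username is a placeholder; itasca.msi.umn.edu or lab.msi.umn.edu can be substituted for mesabi.msi.umn.edu):

```shell
# From your local machine (on campus or on the VPN), reach the login node first:
ssh username@login.msi.umn.edu

# Then, from the login node's prompt, hop to a cluster:
ssh mesabi.msi.umn.edu
```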

For graphical connections, use NICE: www.nice.msi.umn.edu


MSI Computing Environment

MSI systems are primarily Linux compute clusters running CentOS.

Software is managed via a module system.

Jobs are scheduled via a queueing system.

Home directories are unified across systems.


Machine Architecture: Cluster

Source: http://en.wikipedia.org/wiki/Cluster_%28computing%29


○ Mesabi
  ■ About 17,700 total cores, on Intel Haswell processors.
  ■ 24 cores and 62 GB per node in the large primary queues.
  ■ Special queues with large memory (up to 1 TB), and GPUs.
  ■ Allows node sharing: good for both small and large jobs.
  ■ mesabi.msi.umn.edu

○ Itasca
  ■ About 9,000 total cores, on Intel Nehalem processors.
  ■ 8 cores and 22 GB per node in the large primary queue.
  ■ Special queues with larger memory and 16 cores per node.
  ■ itasca.msi.umn.edu

○ Lab Server
  ■ About 500 total cores, on older hardware.
  ■ For interactive, or small single-node jobs.
  ■ 8 cores and 15 GB per node in the primary queue.
  ■ lab.msi.umn.edu

Clusters at MSI


Clusters at MSI

[Diagram: a login node connects to the Mesabi, Itasca, and Lab clusters.]

First connect to login.msi.umn.edu, then connect to a cluster.

You must be on campus, or using the VPN: https://it.umn.edu/virtual-private-network-vpn


Home Directories

Home directories are unified across all Linux systems.

Each group has a disk quota which can be viewed with the command: groupquota

Panasas ActivStor 14: 3.01 PB of storage, capable of 30 GB/s read/write and 270,000 IOPS.


Loading Software

Software modules alter environment variables in order to make software available. MSI has hundreds of software modules.

Module Commands:

Description                       Command          Example
See all available modules         module avail     module avail
Load a module                     module load      module load matlab/2015a
Unload a module                   module unload    module unload matlab/2015a
Unload all modules                module purge     module purge
See what a module does            module show      module show matlab/2015a
List currently loaded modules     module list      module list
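A typical session, run on an MSI system, might chain these commands as follows (illustrative only; the matlab/2015a module is the example used above):

```shell
module purge               # start from a clean environment
module load matlab/2015a   # make MATLAB 2015a available
module list                # confirm what is currently loaded
```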


Job Scheduling

On MSI systems, calculations are performed within "jobs". A job is a planned calculation that will run for a specified length of time on a specified set of hardware.

There are two types of job:
1. Non-interactive (the vast majority)
2. Interactive

The job scheduler front-end is called the Portable Batch System (PBS).

Jobs start in your home directory with no modules loaded.


Job Scripts

To submit a non-interactive job, first make a PBS job script.

Example:
#!/bin/bash -l
#PBS -l walltime=8:00:00,nodes=3:ppn=8,pmem=1000mb
#PBS -m abe
#PBS -M sample_email@umn.edu

cd ~/program_directory
module load intel
module load ompi/intel
mpirun -np 24 program_name < inputfile > outputfile


Job Submission

To submit a job script, use the command:

qsub -q queuename scriptname

A list of queues available on different systems can be found here: https://www.msi.umn.edu/queues

Submit jobs to a queue which is appropriate for the resources needed.

Resources to consider when choosing a queue:
● Walltime
● Total cores and cores per node
● Memory
● Special hardware (GPUs, etc.)
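For example, the three-node script above might be sent to a batch queue on Mesabi like this (the queue name "batch" and script name are placeholders; check https://www.msi.umn.edu/queues for the actual queue names and their walltime, core, and memory limits):

```shell
# Submit jobscript.pbs to a hypothetical queue named "batch":
qsub -q batch jobscript.pbs
```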


Job Submission

To view queued jobs, use the commands:

qstat -u username
showq -w user=username

For detailed information:

checkjob -v jobnumber

To cancel a submitted job, use the command:

qdel jobnumber


Interactive Jobs

Nodes may be requested for interactive use using the command:

qsub -I -X -l walltime=1:00:00,nodes=1:ppn=8,mem=2gb

The job waits in the queue like all jobs; when it begins, control returns to your terminal.


Service Units (SUs)

Jobs on the high performance computing (HPC) systems consume Service Units (SUs), which roughly correspond to processor time.

Each research group is given a service unit allocation at the beginning of the year. To view the number of service units remaining, use the command: acctinfo

If a group is using service units faster than the "fairshare target", then the group's jobs will have lower queue priority.


Simple Parallelization: Backgrounding

Most easily done with single-node jobs.

#!/bin/bash -l
#PBS -l walltime=8:00:00,nodes=1:ppn=8,pmem=1000mb
#PBS -m abe
#PBS -M sample_email@umn.edu

cd ~/job_directory
module load example/1.0
./program1.exe < input1 > output1 &
./program2.exe < input2 > output2 &
./program3.exe < input3 > output3 &
./program4.exe < input4 > output4 &
./program5.exe < input5 > output5 &
./program6.exe < input6 > output6 &
./program7.exe < input7 > output7 &
./program8.exe < input8 > output8 &
wait


Simple Parallelization: Job Arrays

Works best on Mesabi.

Template job script, template.pbs:
#!/bin/bash -l
#PBS -l walltime=8:00:00,nodes=1:ppn=8,pmem=1000mb
#PBS -m abe
#PBS -M sample_email@umn.edu

cd ~/job_directory
module load example/1.0
./program.exe < input$PBS_ARRAYID > output$PBS_ARRAYID

Submit an array of 10 jobs:
qsub -t 1-10 template.pbs
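The mechanism can be sketched locally: each array task runs the same script, but PBS sets a distinct $PBS_ARRAYID, which the script uses to pick its own input and output files. Below, a plain shell loop simulates three tasks, with tr standing in for program.exe (the loop and file names are illustrative, not part of PBS):

```shell
# Simulate three array tasks; on the cluster, PBS would set PBS_ARRAYID for each.
for PBS_ARRAYID in 1 2 3; do
    # Make a sample input file for this task.
    echo "data $PBS_ARRAYID" > input$PBS_ARRAYID
    # Stand-in for ./program.exe: uppercase the input into the matching output.
    tr 'a-z' 'A-Z' < input$PBS_ARRAYID > output$PBS_ARRAYID
done
cat output2   # → DATA 2
```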


Minnesota Supercomputing Institute

The University of Minnesota is an equal opportunity educator and employer. This PowerPoint is available in alternative formats upon request. Direct requests to Minnesota Supercomputing Institute, 599 Walter Library, 117 Pleasant St. SE,

Minneapolis, Minnesota, 55455, 612-624-0528.

Web: www.msi.umn.edu

Email: help@msi.umn.edu

Telephone: (612) 626-0802
