TRANSCRIPT
Joint High Performance Computing Exchange (JHPCE) Cluster Orientation
http://www.jhpce.jhu.edu/
© 2016, Johns Hopkins University. All rights reserved.
Schedule
- Introductions – who are we, who are you?
- Terminology
- Logging in and account setup
- Basics of running programs on the cluster
- Details – limits and resources
- Examples
Who we are
- JHPCE – Joint High Performance Computing Exchange
- Director: Dr. Fernando J. Pineda
- Technology Manager: Mark Miller
- Systems Engineer: Jiong Yang
- Beyond this class, when you have questions:
  - http://www.jhpce.jhu.edu – lots of good FAQ info
  - [email protected] – system issues (password resets / disk space); monitored by the 3 people above
  - [email protected] – application issues (R/SAS/perl ...); monitored by dozens of application SMEs – all volunteers
  - Others in your lab
  - Web search
Who are you?
- Name
- Department
- How do you plan on using the cluster?
- Will you be accessing the cluster from a Mac or a Windows system?
- What is your experience with Unix?
- Any experience using other clusters?
What is a cluster?
- A collection of many powerful computers that can be shared by many users.
Node (Computer) Components
- Each computer is called a “Node”
- Each node, just like a desktop/laptop, has:
  - RAM
  - CPU
  - Cores
  - Disk space
- Unlike desktop/laptop systems, nodes do not make use of a display/mouse – they are used from a command-line interface
- Nodes range in size from a large pizza box to a long shoe box
The JHPCE cluster components
- Joint High Performance Computing Exchange – JHPCE – pronounced “gypsy”
- Fee for service – nodes purchased by various PIs
- Used for a wide range of biostatistics work – gene sequence analysis, population simulations, medical treatment analysis
- Common applications: R, SAS, Stata, perl, python ...
- 72 nodes – 11 racks of equipment
- Nodes range from having 2 CPUs to 4 CPUs, with between 24 and 64 cores. In total we have nearly 3,000 cores.
- RAM ranges from 128 GB to 512 GB
- Disk space:
  - Locally attached “scratch” disk space from 250 GB to 6 TB
  - Network Attached Storage (NFS) – 4000 TB
How do programs get run on the compute nodes?
- We use a product called “Sun Grid Engine” (SGE) that schedules programs (jobs)
- Jobs are assigned to slots as slots become available and meet the resource requirements of the job
- Jobs are submitted to queues:
  - Shared queue
  - Designated queues
- The cluster nodes can also be used interactively
Why would you use a cluster?
- You need resources not available on your local laptop
- You need to run a program (job) that will run for a very long time
- You need to run a job that can make use of parallel computing
How do you use the cluster?
- The JHPCE cluster is accessed using SSH.
- Use ssh to log in to “jhpce01.jhsph.edu”
- For Mac users, you can use ssh from a Terminal window. You also need XQuartz for X Window applications:
  http://xquartz.macosforge.org/landing/
- For PC users, you need to install an ssh client – such as MobaXterm, or Cygwin, PuTTY, and WinSCP:
  http://mobaxterm.mobatek.net/
  http://www.cygwin.com
  http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html
  http://winscp.net
[Figure: JHPCE user model, 11/2016 – M. Miller, F. Pineda, M. Newhouse]
- Workstations, desktops, and laptops connect (1 Gbps) to the login servers jhpce01/02 (4 cores each), and via HoRNet at 10 Gbps to the transfer server jhpce-transfer01.
- The “compute farm” of compute nodes (hosts), 2904 compute cores in total, sits behind 40 Gbps switches and has direct-attached local disks (internal) ( /tmp, /scratch/temp ).
- The storage farm (NFS/LFS exported file systems):
  - /thumper2 – ZFS/NFS, 16 TB usable (Sunfire X4500)
  - /amber3 – ZFS/NFS, 100 TB usable (Sun 7210, plus expansion)
  - /dcs01 – ZFS/NFS, 1080 TB raw / 800 TB usable (8 SM847E26-RJBOD1 JBODs with 450 WD Red 3 TB disks)
  - /dcl01 – Lustre, 4500 TB raw / 3100 TB usable (20 SM847E26-RJBOD1 JBODs with 450 WD Red 4 TB disks and 450 WD Red 6 TB disks)
  - /users, /legacy, /starter02 – ZFS/NFS, 240 TB raw / 160 TB usable (1 SM847E26-RJBOD1 JBOD with 40 WD Red 6 TB disks and SSDs for L2ARC and SLOG)
Example 1 – Logging in
- Bring up Terminal
- Run: ssh <USER>@jhpce01.jhsph.edu
- 2-factor authentication
- Shell prompt
Lab 1 - Logging In
- For Mac users:
  - Bring up a Terminal
  - Run: ssh <USER>@jhpce01.jhsph.edu
  - Log in with the initial Verification Code and Password that were sent to you
- For PC users:
  - Launch MobaXterm
  - Click on the “Sessions” icon in the upper left corner
  - On the “Session settings” screen, click on “SSH”
  - Enter “jhpce01.jhsph.edu” as the “Remote host”. Click on the “Specify username” checkbox, and enter your JHPCE username in the next field. Then click the “OK” button.
  - Log in with the initial Verification Code and Password that were sent to you.
  - If dialog windows pop up, click "Cancel" when prompted for another Verification Code, or click "No" or “Do not ask this again” when prompted to save your password.
Lab 1 - Logging In - cont.
- Change your password with the “passwd” command. You will be prompted for your current initial password, and then prompted for a new password.
- Set up 2-factor authentication:
  http://jhpce.jhu.edu/knowledge-base/how-to/2-factor-authentication/
  - 1) On your smartphone, bring up the "Google Authenticator" app
  - 2) On the JHPCE cluster, run "auth_util"
  - 3) In "auth_util", use option "5" to display the QR code (you may need to resize your ssh window)
  - 4) Scan the QR code with the Google Authenticator app
  - 5) In "auth_util", use option "6" to exit from "auth_util"
- Log out of the cluster by typing "exit".
- Log in to the cluster again with 2-factor authentication
Lab 1 - Logging In - cont.
- Note "Emergency Scratch Codes"
- Note: 100 GB limit on home directory. Home directories are backed up, but other storage areas may not be.
- (optional) Set up ssh keys
General Linux/Unix Commands
Navigating Unix:
- ls, ls -l, ls -al
- pwd
- cd
- man
Looking at files:
- more
- cat
Changing files with editors:
- nano
- vi / emacs
Commands in example script:
- date, echo, hostname, sleep, control-C, grep, wc
Redirection:
- > (redirect output to a file)
- | (pipe – redirect output to a program)
Good overviews of Linux:
http://korflab.ucdavis.edu/Unix_and_Perl/unix_and_perl_v3.1.1.html
https://www.codecademy.com/learn/learn-the-command-line
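The redirection operators above can be tried directly at a shell prompt. A minimal sketch (the file name demo.txt is just an illustration, not from the labs):

```shell
# '>' creates (or overwrites) a file; '>>' appends to it
printf 'alpha\n' > demo.txt
printf 'beta\nalpha\n' >> demo.txt

# '|' pipes one program's output into another:
# grep selects the matching lines, wc -l counts them
matches=$(grep 'alpha' demo.txt | wc -l | tr -d ' ')
echo "alpha appears on ${matches} lines"
```

The same pattern (`command | grep pattern | wc -l`) works for any command output, not just files.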
Submitting jobs to the queue with Sun Grid Engine (SGE)
- qsub – allows you to submit a batch job to the cluster
- qrsh – allows you to establish an interactive session
- qstat – allows you to see the status of your jobs
Lab 2 - Using the cluster
Example 2a – submitting a batch job:
$ cd class-scripts
$ qsub -cwd script1
$ qstat
- Examine results files

Example 2b – using an interactive session:
$ qrsh
- Note current directory
Never run a job on the login node!
- Always use “qsub” or “qrsh” to make use of the compute nodes
- Jobs that are found running on the login node may be killed at will
- If you are going to be compiling programs, do so on a compute node via qrsh.
- Even something as simple as copying large files should be done via qrsh. The compute nodes have 10 Gbps connections to the storage systems, while jhpce01 only has a 1 Gbps connection.
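One way to honor this rule in your own scripts is to refuse to do heavy work when a login-node hostname is detected. A minimal sketch (the function and its hostname patterns are our own illustration based on the jhpce01/jhpce02 names above, not a JHPCE-provided tool):

```shell
# Return success (0) if the given hostname looks like a JHPCE login node
is_login_node() {
    case "$1" in
        jhpce01*|jhpce02*) return 0 ;;
        *)                 return 1 ;;
    esac
}

# Guard heavy work: only proceed when not on a login node
run_guard() {
    if is_login_node "$1"; then
        echo "on a login node - use qrsh/qsub first"
    else
        echo "ok to run here"
    fi
}

run_guard "jhpce01.jhsph.edu"
run_guard "compute-051"
```

A script could call `run_guard "$(hostname)"` at the top and exit early on a login node.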
Other useful SGE commands
- qstat – shows information about jobs you are running
- qhost – shows information about the nodes
- qdel – deletes your job
- qacct – shows information about completed jobs
A few options for “qsub”
- Run the job in the current working directory (where qsub was executed) rather than the default (home directory):
  -cwd
- Send the standard output (error) stream to a different file:
  -o path/filename
  -e path/filename
- Receive notification via email when your job completes:
  -m e -M [email protected]
F. Pineda, M. Newhouse
File size limitations
- There is a default file size limitation of 10 GB for all jobs.
- If you are going to be creating files larger than 10 GB, you need to use the “h_fsize” option.
- Example:
  $ qsub -l h_fsize=50G myscript.sh
Embedding options in your qsub script
- You can supply options to qsub in 2 ways:
  - On the command line:
    $ qsub -cwd -l h_fsize=100G -m e -M [email protected] script1.sh
  - Setting them up in your batch job script. Lines that start with “#$” are interpreted as options to qsub:
    $ head script1-resource-request.sh
    #$ -l h_fsize=100G
    #$ -cwd
    #$ -m e
    #$ -M [email protected]
    $ qsub script1-resource-request.sh
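Putting the pieces together, a complete submit script might look like the sketch below (the resource values and the commands in the body are illustrative, not from the slides). Because bash treats “#$” lines as ordinary comments, the same file carries both the job's options and its commands:

```shell
#!/bin/bash
# SGE reads the "#$" lines below as qsub options; bash ignores them as comments.
#$ -cwd
#$ -l h_fsize=100G
#$ -m e
#$ -M [email protected]

# The actual work of the job follows the "#$" header
echo "job running on host: $(hostname)"
status="done"
echo "job ${status}"
```

Submitting it is then just `qsub myscript.sh`, with no options needed on the command line.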
Requesting resources in qsub or qrsh
- By default, when you submit a job or run qrsh, you are allotted 1 slot and 2 GB of RAM
- Note the .sge_request file
- You can request more RAM by setting mem_free and h_vmem:
  - mem_free – is used to set the amount of memory your job will need. SGE will place your job on a node that has at least mem_free RAM available.
  - h_vmem – is used to set a high-water mark for your job. If your job uses more than h_vmem RAM, your job will be killed. This is typically set to be the same as mem_free.
- Examples:
  $ qsub -l mem_free=10G,h_vmem=10G job1.sh
  or
  $ qrsh -l mem_free=10G,h_vmem=10G
© 2016, Johns Hopkins University. All rights reserved.
Estimating RAM usage
- No easy formula until you’ve actually run something
- A good place to start is the size of the files you will be reading in
- If you are going to be running many of the same type of job, run one job and use “qacct -j <jobid>” to look at “maxvmem”.
Multiple slots for multi-threaded jobs
- Use the -pe local N option to request multiple slots on a cluster node for a multi-threaded job or session
- The mem_free and h_vmem limits need to be divided by the number of cores/slots requested.
- Examples:
  - A job which will use 8 cores and needs 40 GB of RAM:
    $ qsub -cwd -pe local 8 -l mem_free=5G,h_vmem=5G myscript.sh
  - A job which will use 10 cores and needs 120 GB of RAM:
    $ qsub -cwd -pe local 10 -l mem_free=12G,h_vmem=12G myscript2.sh
- Note: The local parallel environment is a construct name particular to our cluster, and implies that your request is for that many slots on a single cluster node.
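The division rule is easy to get backwards, so it can help to compute the per-slot value explicitly before submitting. A small sketch in shell arithmetic (integer GB only; the variable names are ours):

```shell
# Total RAM the whole job needs, and the slot count for -pe local N
total_gb=120
slots=10

# mem_free and h_vmem are per-slot values: divide the total by the slot count
per_slot=$(( total_gb / slots ))

# Print the resulting qsub command line
echo "qsub -cwd -pe local ${slots} -l mem_free=${per_slot}G,h_vmem=${per_slot}G myscript2.sh"
```

For 120 GB over 10 slots this reproduces the 12G-per-slot request shown in the second example above.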
Types of parallelism
1. Embarrassingly parallel ... http://en.wikipedia.org/wiki/Embarrassingly_parallel
2. Multi-core (or multi-threaded) – a single job using multiple CPU cores via program threads on a single machine (cluster node). Also see the discussion of fine-grained vs coarse-grained parallelism at http://en.wikipedia.org/wiki/Parallel_computing
3. Many CPU cores on many nodes using a Message Passing Interface (MPI) environment. Not used much on the JHPCE Cluster.
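To get a feel for embarrassingly parallel work, independent tasks can be run concurrently even on a single node. The sketch below uses `xargs -P` (ordinary GNU/Linux tooling, not an SGE feature) to run 8 independent tasks at most 4 at a time:

```shell
# Eight independent tasks, up to four running at once; each just reports its id.
# Tasks finish in arbitrary order, so sort the output for a stable view.
results=$(seq 1 8 | xargs -P 4 -I{} sh -c 'echo "task {} done"' | sort)
echo "$results"

count=$(echo "$results" | wc -l | tr -d ' ')
echo "completed ${count} tasks"
```

On the cluster, the same idea scales out by submitting each independent task as its own qsub job instead of a local process.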
© 2016, Johns Hopkins University. All rights reserved.
“How many jobs can I submit?”
- As many as you want ... but this should be less than 10,000.
- BUT, only a limited number of “slots” will run. The rest will have the queue wait state ‘qw’ and will run as your other jobs finish. In SGE, a slot generally corresponds to a single CPU core.
- Currently, the maximum number of standard queue slots per user is 200. There is also a 512 GB RAM limit per user.
- The maximum number of slots per user may change depending on the availability of cluster resources or special needs and requests.
- There are dedicated queues for stakeholders, which may have custom configurations.
Lab 3 – Running R on the cluster
- In $HOME/class-scripts/R-demo, note 2 files – a script file and an R file
- Submit the script file:
  $ qsub -cwd plot1.sh
- Run R commands interactively
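The slides do not show the contents of plot1.sh; a plausible minimal sketch, assuming the R code lives in a file named plot1.R (the file name and the module usage are our assumptions, not from the slides), would be:

```shell
#!/bin/bash
#$ -cwd

# Hypothetical wrapper: load the cluster's R module (see the Modules section),
# then run the R script non-interactively; console output lands in plot1.Rout
module load R
R CMD BATCH plot1.R
```

Submitting this with `qsub -cwd plot1.sh` runs the R script on a compute node rather than the login node.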
Lab 4 – Transferring files to the cluster
- Transfer results back:
  $ sftp [email protected]
  Or use WinSCP, FileZilla, MobaXterm ...
- Look at the resulting PDF file using a viewer on your laptop
Modules
Modules for R, SAS, Mathematica ...
- module list
- module avail
- module load
- module unload
Lab 5 – Running RStudio
- X Windows setup:
  - For Windows, MobaXterm has an X server built into it
  - For Mac, you need to have the XQuartz program installed (which requires a reboot), and you need to add the "-X" option to ssh:
    $ ssh -X [email protected]
- Start up RStudio:
  $ qrsh
  $ module load rstudio
  $ rstudio
- Look at the pdf from example 4 via xpdf
Other queues
- “shared” queue – the default queue
- “download” queues:
  $ qrsh -l rnet
  (jhpce-transfer01.jhsph.edu)
- “math” queue:
  $ qrsh -l math
- “sas” queue:
  $ qrsh -l sas
Lab 6 – Running Stata and SAS
- SAS example:
  Batch:
  $ cd $HOME/class-scripts/SAS-demo
  $ ls
  $ cat sas-demo1.sh
  $ cat class-info.sas
  $ qsub sas-demo1.sh
  Interactive:
  $ qrsh -l sas
  $ sas
- Stata example:
  Batch:
  $ cd $HOME/class-scripts/stata-demo
  $ ls
  $ cat stata-demo1.sh
  $ cat stata-demo1.do
  $ qsub stata-demo1.sh
  Interactive:
  $ qrsh
  $ stata
  or
  $ xstata
- Note – The name of the program and script need not be the same, but it is good practice to keep them the same when possible.
- Note – Extensions sometimes are meaningful. SAS doesn't care, but Stata programs need to have ".do" as the extension. It is good practice for human readability.
Summary
- Review
  - Get familiar with Linux
  - Use ssh to connect to the JHPCE cluster
  - Use qsub and qrsh to submit jobs
  - Never run jobs on the login node
- Helpful resources
  - http://www.jhpce.jhu.edu/
  - [email protected] – system issues
  - [email protected] – application issues
- What to do next
  - Make note of your Google Authenticator scratch codes (option 2 in "auth_util")
  - Set up ssh keys if you will be accessing the cluster frequently:
    https://jhpce.jhu.edu/knowledge-base/authentication/login/
  - Play nice with others – this is a shared, community-supported system.
Thanks for attending! Questions?
General Linux/Unix Commands
- ls, ls -l, ls -al
- pwd
- cd
- * (wildcard)
- date, echo, hostname, sleep, control-C
- cp, rm, mv
- mkdir, rmdir
- more, cat
- nano, vi (editors)
- grep, wc, who, man
- Redirection:
  > (redirect output to a file)
  | (pipe – redirect output to a program)
Good overviews of Linux:
http://korflab.ucdavis.edu/Unix_and_Perl/unix_and_perl_v3.1.1.html
https://www.codecademy.com/learn/learn-the-command-line
Example of staging data to $TMPDIR

#####################
# stage the input data
#####################
# (1) copy 628MB of compressed data
mkdir $TMPDIR/data
cp ~/data/test.tar.Z $TMPDIR/data/.
cd $TMPDIR/data

# (2) decompress and untar the data
tar -xzvf test.tar.Z
du -ch $TMPDIR/data | grep total

#####################
# Do the processing
#####################
# in this case just move the data to a
# temporary output directory and tar it again
mkdir $TMPDIR/output
mv $TMPDIR/data $TMPDIR/output
cd $TMPDIR/output
tar -czf results.tar.Z data

#####################
# save the results of the computation and quit
#####################
cp results.tar.Z $HOME/.
exit 0