Using the CMIP Analysis Platform
i.e. running scripts on Yellowstone’s DAV nodes, aka Geyser and Caldera
Brief survey
● Who has already used Yellowstone?
● Another supercomputer?
● Who knows at least one of LSF, Slurm, or PBS?
● Who is familiar with ssh, X tunneling, and/or VNC?
● Who is familiar with the DRS syntax?
● Who is familiar with environment modules (i.e. lmod or “module load x”)?
The speaker
Davide Del Vento, PhD in Physics
User Services Section, Software Engineer
http://www2.cisl.ucar.edu/user-support/user-services-section
Best way to reach me is via ExtraView tickets, which you can create by sending email to
The CMIP Analysis Platform (aka CMIPAP)
● The Platform is made of various High Performance Computing resources available at NCAR, most notably:
○ High-performance disks mounted as a parallel filesystem
○ Computing and visualization nodes (48 of them)
■ often referred to as DAV (Data Analysis and Visualization) nodes
■ 16 nodes (caldera) equipped with large memory (128GB/node) and two GPUs
■ 16 nodes (geyser) equipped with huge memory (1TB/node) and one GPU
■ 16 nodes (pronghorn) equipped with large memory (128GB/node) and no GPUs
○ High-speed IB (InfiniBand) interconnect between nodes
○ High-speed network to the outside world (two 40Gb links)
Preamble - getting an account
● Everybody here should already have an account (yubikey)
● If you don’t have an account, you can request one specifically for the CMIPAP
● Everybody who has an account can use CMIPAP with their existing allocation
● or, they can request a specific CMIPAP allocation
● To request a NEW dataset to be added, you need to have a CMIPAP allocation
Accounts? Allocations? What??
● An account is your ability to log in to a machine. It is personal, just yours, and tied to the yubikey.
● The main production machine we have at NCAR is the Yellowstone supercomputer, which has some DAV nodes, named Geyser and Caldera. The account does not distinguish among ys, gy & ca.
● An allocation (sometimes also referred to as a project number) is your ability to submit and run jobs on a machine. The allocation does distinguish between ys and DAV nodes -- it is often shared by a research team.
Logging in
● Access to any HPC resource at NCAR requires ssh
● MindTerm, OpenSSH, PuTTY are some ssh clients
● If you run Microsoft Windows, you may also need an X server, such as Xming or Cygwin/X - or TurboVNC
● If you run Mac OS X, you may also need XQuartz - or TurboVNC
● Linux laptops almost always come with a good X server preconfigured
Try logging in
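As a minimal sketch, a login with X tunneling from a Linux or Mac terminal might look like the following; the host name is the assumed Yellowstone login host, and "username" is a placeholder to replace with your own:

```shell
# Sketch of the login command; host name is an assumption, "username" is a placeholder.
HOST=yellowstone.ucar.edu   # assumed login host for the platform
USER_NAME=username          # replace with your own account name
# -X enables X tunneling so graphical programs display on your machine.
echo ssh -X "$USER_NAME@$HOST"   # drop the leading "echo" to actually connect
```

Windows users would issue the equivalent connection from PuTTY (with X11 forwarding enabled) instead.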
Available datasets
● Datasets are continuously added
● Check if what you need is there
● Feel free to request if not
● We are learning how to best serve you!!
● Currently organized with the DRS syntax (plans to change)
Directory structure guided by the DRS Syntax
● Once logged into CMIPAP, all the CMIP data is in the directory
/glade/p/CMIP
● Which for now contains only CMIP5 data, organized like the website table as:
/glade/p/CMIP/<activity>/<product>/<institute>/<model>/<experiment>/<frequency>/<realm>
● And then further down the path
/glade/p/CMIP/…/<realm>/<MIP-table>/<ensemble-member>/<version-num>/<variable-name>
DRS what? An example
/glade/p/CMIP/<activity>/<product>/<institute>/<model>/<experiment>/<frequency>/<realm>
<activity> = CMIP5
<product> = output1
<institute> = NCAR
<model> = CCSM4
<experiment> = rcp85 # RCP 8.5
<frequency> = mon
<realm> = atmos
…/<realm>/<MIP-table>/<ensemble-member>/<version-num>/<variable-name>
<MIP-table> = Amon
<ensemble-member> = r1i1p1
<version-num> = v20130830
<variable-name> = pr
/glade/p/CMIP/CMIP5/output1/NCAR/CCSM4/rcp85/mon/atmos/Amon/r1i1p1/v20130830/pr/<CMORfilename>
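The full path above can be assembled mechanically from its DRS components. This shell sketch, using the example values from this slide, builds everything up to the variable directory:

```shell
# Build the DRS directory path from its components (example values from the slide above).
ROOT=/glade/p/CMIP
activity=CMIP5; product=output1; institute=NCAR; model=CCSM4
experiment=rcp85; frequency=mon; realm=atmos
mip_table=Amon; member=r1i1p1; version=v20130830; variable=pr

DRS_PATH=$ROOT/$activity/$product/$institute/$model/$experiment/$frequency/$realm/$mip_table/$member/$version/$variable
echo "$DRS_PATH"
# → /glade/p/CMIP/CMIP5/output1/NCAR/CCSM4/rcp85/mon/atmos/Amon/r1i1p1/v20130830/pr
```

The CMOR-named data files then sit inside that last directory, so on the platform a simple `ls "$DRS_PATH"` lists them.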
Feedback sought
● The CMIP AP is a work in progress
● This exercise is to get ready for CMIP6, when data volumes will be prohibitively large
● Help us help you, by giving feedback
● E.g. use ETH naming conventions
Proposed ETH-inspired re-organization
● Currently <CMOR-files> are contained in a directory such as:
/glade/p/CMIP/CMIP5/output1/NCAR/CCSM4/rcp85/mon/atmos/Amon/r1i1p1/v20130830/pr/
● We have some datasets taken from ETH organized as:
/glade/p/CMIP/CMIP5/ETH/cmip5/rcp85/Amon/pr/CCSM4/r1i1p1
● i.e. besides the prefix, the directory structure is
<experiment>/<MIP table>/<variable-name>/<product>/<model-and-institute>/<ensemble-member>/
Environment
● Default shell is tcsh, you can switch to bash from SAM
● USS maintains an extensive list of software on CMIPAP, including several versions of most packages
● Everything in the environment is decided by “loading modules”, e.g.:
○ choosing which compiler to use
○ enabling optional software, e.g. python, matlab, nco, R, vapor, and many more
module list
module av
module load
● Switching compiler may change other modules too, e.g. netcdf
Running jobs
● Once logged into Yellowstone, you are using one of the “login nodes”, shared with hundreds of other users
● You may NOT run anything that is “heavy” on these nodes (and if you do our watchdog will kill your session)
● You should run anything in the Yellowstone compute nodes, or DAV nodes
● To do so, you will use a queuing system named LSF, and the bsub command
● A job run this way can be interactive or batch
Compiling
● Make sure you are on an adequate resource
○ Login node is OK, but using geyser is recommended
● Load the appropriate compiler module
● Load the appropriate libraries you need (e.g. netcdf)
● Remove the unnecessary modules you may have loaded
● Proceed as usual
Running serial, interactive jobs on CMIPAP: shortcuts
● For running serial (1-core) jobs you may use these scripts
● Use gogeyser or gocaldera if you would like to specify the allocation (project #)
● Use execgy or execca if you would like to use the default allocation
● In both cases you may need to re-source your dotfile
Running jobs, the complete story
● Manually call the bsub command
● bsub can take arguments from the command line or a file
● For example, execca is equivalent to
bsub -Is -q caldera -n1 -PSCSG0001 -W24:00 "$SHELL"
● Let’s see what it looks like on the command line
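To make the argument list concrete, here is a sketch that assembles the same command line from named pieces (SCSG0001 is the example project number from the slide; substitute your own):

```shell
# Assemble the interactive bsub command shown above from named parts.
QUEUE=caldera        # target queue (geyser, caldera, yellowstone, small)
NTASKS=1             # 1 task = serial job
PROJECT=SCSG0001     # example project number from the slide; use your own
WALL=24:00           # wallclock limit (hrs:min)

BSUB_CMD="bsub -Is -q $QUEUE -n$NTASKS -P$PROJECT -W$WALL \"\$SHELL\""
echo "$BSUB_CMD"     # prints the command; paste it on a login node to run it
# → bsub -Is -q caldera -n1 -PSCSG0001 -W24:00 "$SHELL"
```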
Running jobs, the complete story
● Most important arguments
bsub -Is -q caldera -n1 -PSCSG0001 -W24:00 "$SHELL"
-Is = Interactive job (omit for non-interactive)
-q caldera = Queue name (geyser, caldera, yellowstone, small)
-n1 = Number of tasks (1 = serial, 16 or 32 for exclusive jobs)
-PSCSG0001 = Project number
-W24:00 = Wallclock limit
"$SHELL" = The command you want to run
Running jobs, LSF file
#!/bin/tcsh
#BSUB -P project_code # project code
#BSUB -W 01:00 # wall-clock time (hrs:mins)
#BSUB -n 64 # number of tasks in job
#BSUB -R "span[ptile=16]" # run 16 MPI tasks per node
#BSUB -J job_name # job name
#BSUB -o job_name.%J.out # output file name, note %J
#BSUB -e job_name.%J.err # error file name, note %J
#BSUB -q queue_name # queue
mpirun.lsf ./myjob.exe
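Once saved (say, as myjob.lsf, a name chosen here for illustration), such a script is submitted by redirecting it into bsub; bjobs and bkill are the standard LSF commands for monitoring and cancelling:

```shell
# Typical LSF batch workflow, run from a login node (script name is illustrative):
bsub < myjob.lsf    # submit the batch job; LSF prints the assigned job ID
bjobs               # list your pending and running jobs
bkill 123456        # cancel job 123456 (the ID reported by bsub), if needed
```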
VNC
● Desktop “sharing” program, used to see the CMIPAP desktop in a window on your machine
● Requires TurboVNC installation on your machine
● Plus the setup of an ssh tunnel
● Run the vncserver_submit -P project_code command and follow the instructions
Some of the Analysis and Visualization Software
● CDAT - Climate Data Analysis Tools
● CDO - Climate Data Operators
● Ferret
● GEMPAK - GEneral Meteorology PAcKage
● Gnuplot
● GrADS - Grid Analysis and Display System
● IDL - Interactive Data Language
● ImageMagick
● Mathematica
● MATLAB and toolboxes
● NCAR Graphics
● NCL - NCAR Command Language
● Ncview
● Octave
● OpenGrADS
● ParaView
● R
● VAPOR
● Vis5D
● VisIt
● VTK - Visualization ToolKit
● Ghostscript
● NCO - netCDF Operators
● WGRIB / WGRIB2
● XDiff
● Xxdiff
● Python
ipython notebook v2.1.0
● Jupyter notebooks not available yet, but they will be...