Using the CMIP Analysis Platform
i.e. running scripts on Yellowstone’s DAV nodes, aka Geyser and Caldera
Brief survey
● Who has already used Yellowstone?
● Another supercomputer?
● Who knows at least one of LSF, Slurm, or PBS?
● Who is familiar with ssh, X tunneling, and/or VNC?
● Who is familiar with the DRS syntax?
● Who is familiar with environment modules (i.e. lmod or “module load x”)?
The speaker
Davide Del Vento, PhD in Physics
User Services Section, Software Engineer
http://www2.cisl.ucar.edu/user-support/user-services-section
Best way to reach me is via ExtraView tickets, which you can create by sending email to
The CMIP Analysis Platform (aka CMIPAP)
● The Platform is made of various High Performance Computing resources available at NCAR, most notably:
○ High-performance disks mounted as a parallel filesystem
○ Computing and visualization nodes (48 of them)
■ often referred to as DAV (Data Analysis and Visualization) nodes
■ 16 nodes (caldera) equipped with large memory (128GB/node) and two GPUs
■ 16 nodes (geyser) equipped with huge memory (1TB/node) and one GPU
■ 16 nodes (pronghorn) equipped with large memory (128GB/node) and no GPUs
○ High-speed IB (InfiniBand) interconnect between nodes
○ High-speed network to the outside world (two 40Gb links)
Preamble - getting an account
● Everybody here should already have an account (yubikey)
● If you don’t have an account, you can request one specifically for the CMIPAP
● Everybody who has an account can use CMIPAP with their existing allocation
● or, they can request a specific CMIPAP allocation
● To request a NEW dataset to be added, you need to have a CMIPAP allocation
Accounts? Allocations? What??
● An account is your ability to log in to a machine. It is personal, just yours, and tied to the yubikey.
● The main production machine we have at NCAR is the Yellowstone supercomputer, which has some DAV nodes, named Geyser and Caldera. The account does not distinguish among ys, gy & ca.
● An allocation (sometimes also referred to as a project number) is your ability to submit and run jobs on a machine. The allocation does distinguish between ys and DAV nodes -- it is often shared by a research team.
Logging in
● Access to any HPC resource at NCAR requires ssh
● MindTerm, OpenSSH, PuTTY are some ssh clients
● If you run Microsoft Windows, you may also need an X server, such as Xming or Cygwin/X - or TurboVNC
● If you run Mac OS X, you may also need XQuartz - or TurboVNC
● Linux laptops almost always come with a good X server preconfigured
Try logging in
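As a minimal sketch, a login with X tunneling from a Linux or Mac terminal might look like the following; the host name is the assumed Yellowstone login host, and "username" is a placeholder to replace with your own:

```shell
# Sketch of the login command; host name is an assumption, "username" is a placeholder.
HOST=yellowstone.ucar.edu   # assumed login host for the platform
USER_NAME=username          # replace with your own account name
# -X enables X tunneling so graphical programs display on your machine.
echo ssh -X "$USER_NAME@$HOST"   # drop the leading "echo" to actually connect
```

Windows users would issue the equivalent connection from PuTTY (with X11 forwarding enabled) instead.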
Available datasets
● Datasets are continuously added
● Check if what you need is there
● Feel free to request if not
● We are learning how to best serve you!!
● Currently organized with the DRS syntax (plans to change)
Directory structure guided by the DRS Syntax
● Once logged into CMIPAP, all the CMIP data is in the directory
/glade/p/CMIP
● Which for now contains only CMIP5 data, organized like the website table as:
/glade/p/CMIP/<activity>/<product>/<institute>/<model>/<experiment>/<frequency>/<realm>
● And then further down the path
/glade/p/CMIP/…/<realm>/<MIP-table>/<ensemble-member>/<version-num>/<variable-name>
DRS what? An example
/glade/p/CMIP/<activity>/<product>/<institute>/<model>/<experiment>/<frequency>/<realm>
<activity> = CMIP5
<product> = output1
<institute> = NCAR
<model> = CCSM4
<experiment> = rcp85 # RCP 8.5
<frequency> = mon
<realm> = atmos
…/<realm>/<MIP-table>/<ensemble-member>/<version-num>/<variable-name>
<MIP-table> = Amon
<ensemble-member> = r1i1p1
<version-num> = v20130830
<variable-name> = pr
/glade/p/CMIP/CMIP5/output1/NCAR/CCSM4/rcp85/mon/atmos/Amon/r1i1p1/v20130830/pr/<CMORfilename>
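The full path above can be assembled mechanically from its DRS components. This shell sketch, using the example values from this slide, builds everything up to the variable directory:

```shell
# Build the DRS directory path from its components (example values from the slide above).
ROOT=/glade/p/CMIP
activity=CMIP5; product=output1; institute=NCAR; model=CCSM4
experiment=rcp85; frequency=mon; realm=atmos
mip_table=Amon; member=r1i1p1; version=v20130830; variable=pr

DRS_PATH=$ROOT/$activity/$product/$institute/$model/$experiment/$frequency/$realm/$mip_table/$member/$version/$variable
echo "$DRS_PATH"
# → /glade/p/CMIP/CMIP5/output1/NCAR/CCSM4/rcp85/mon/atmos/Amon/r1i1p1/v20130830/pr
```

The CMOR-named data files then sit inside that last directory, so on the platform a simple `ls "$DRS_PATH"` lists them.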
Feedback sought
● The CMIP AP is a work in progress
● This exercise is to get ready for CMIP6, when data volumes will be prohibitively large
● Help us help you, by giving feedback
● E.g. use ETH naming conventions
Proposed ETH-inspired re-organization
● Currently <CMOR-files> are contained in a directory such as:
/glade/p/CMIP/CMIP5/output1/NCAR/CCSM4/rcp85/mon/atmos/Amon/r1i1p1/v20130830/pr/
● We have some datasets taken from ETH organized as:
/glade/p/CMIP/CMIP5/ETH/cmip5/rcp85/Amon/pr/CCSM4/r1i1p1
● i.e. besides the prefix, the directory structure is
<experiment>/<MIP table>/<variable-name>/<product>/<model-and-institute>/<ensemble-member>/
Environment
● Default shell is tcsh, you can switch to bash from SAM
● USS maintains an extensive list of software on CMIPAP, including several versions of most packages
● Everything in the environment is decided by “loading modules”, e.g.:
○ choosing which compiler to use
○ enabling optional software, e.g. python, matlab, nco, R, vapor, and many more
module list
module av
module load
● Switching compiler may change other modules too, e.g. netcdf
Running jobs
● Once logged into Yellowstone, you are using one of the “login nodes”, shared with hundreds of other users
● You may NOT run anything that is “heavy” on these nodes (and if you do our watchdog will kill your session)
● You should run anything in the Yellowstone compute nodes, or DAV nodes
● To do so, you will use a queuing system named LSF, and the bsub command
● A job run this way can be interactive or batch
Compiling
● Make sure you are on an adequate resource
○ Login node is OK, but using geyser is recommended
● Load the appropriate compiler module
● Load the appropriate libraries you need (e.g. netcdf)
● Remove the unnecessary modules you may have loaded
● Proceed as usual
Running serial, interactive jobs on CMIPAP: shortcuts
● For running serial (1-core) jobs you may use these scripts
● Use gogeyser or gocaldera if you would like to specify the allocation (project #)
● Use execgy or execca if you would like to use the default allocation
● In both cases you may need to re-source your dotfile
Running jobs, the complete story
● Manually call the bsub command
● bsub can take arguments from the command line or a file
● For example, execca is equivalent to
bsub -Is -q caldera -n1 -PSCSG0001 -W24:00 "$SHELL"
● Let’s see what it looks like on the command line
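To make the argument list concrete, here is a sketch that assembles the same command line from named pieces (SCSG0001 is the example project number from the slide; substitute your own):

```shell
# Assemble the interactive bsub command shown above from named parts.
QUEUE=caldera        # target queue (geyser, caldera, yellowstone, small)
NTASKS=1             # 1 task = serial job
PROJECT=SCSG0001     # example project number from the slide; use your own
WALL=24:00           # wallclock limit (hrs:min)

BSUB_CMD="bsub -Is -q $QUEUE -n$NTASKS -P$PROJECT -W$WALL \"\$SHELL\""
echo "$BSUB_CMD"     # prints the command; paste it on a login node to run it
# → bsub -Is -q caldera -n1 -PSCSG0001 -W24:00 "$SHELL"
```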
Running jobs, the complete story
● Most important arguments
bsub -Is -q caldera -n1 -PSCSG0001 -W24:00 "$SHELL"
-Is = Interactive job (omit for non-interactive)
-q caldera = Queue name (geyser, caldera, yellowstone, small)
-n1 = Number of tasks (1 = serial, 16 or 32 for exclusive jobs)
-PSCSG0001 = Project number
-W24:00 = Wallclock limit
"$SHELL" = The command you want to run
Running jobs, LSF file
#!/bin/tcsh
#BSUB -P project_code # project code
#BSUB -W 01:00 # wall-clock time (hrs:mins)
#BSUB -n 64 # number of tasks in job
#BSUB -R "span[ptile=16]" # run 16 MPI tasks per node
#BSUB -J job_name # job name
#BSUB -o job_name.%J.out # output file name, note %J
#BSUB -e job_name.%J.err # error file name, note %J
#BSUB -q queue_name # queue
mpirun.lsf ./myjob.exe
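Once saved (say, as myjob.lsf, a name chosen here for illustration), such a script is submitted by redirecting it into bsub; bjobs and bkill are the standard LSF commands for monitoring and cancelling:

```shell
# Typical LSF batch workflow, run from a login node (script name is illustrative):
bsub < myjob.lsf    # submit the batch job; LSF prints the assigned job ID
bjobs               # list your pending and running jobs
bkill 123456        # cancel job 123456 (the ID reported by bsub), if needed
```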
VNC
● Desktop “sharing” program, used to see the CMIPAP desktop in a window on your machine
● Requires TurboVNC installation on your machine
● Plus the setup of an ssh tunnel
● Run the vncserver_submit -P project_code command and follow the instructions
Some of the Analysis and Visualization Software
● CDAT - Climate Data Analysis Tools
● CDO - Climate Data Operators
● Ferret
● GEMPAK - GEneral Meteorology PAcKage
● Gnuplot
● GrADS - Grid Analysis and Display System
● IDL - Interactive Data Language
● ImageMagick
● Mathematica
● MATLAB and toolboxes
● NCAR Graphics
● NCL - NCAR Command Language
● Ncview
● Octave
● OpenGrADS
● ParaView
● R
● VAPOR
● Vis5D
● VisIt
● VTK - Visualization ToolKit
● Ghostscript
● NCO - netCDF Operators
● WGRIB / WGRIB2
● XDiff
● Xxdiff
● Python
ipython notebook v2.1.0
● Jupyter notebooks not available yet, but they will be...