
Linux Clusters Institute: 3rd Party Software Management

Instructor: Timothy Bouvet

Title: System Engineer NCSA

Email: [email protected]

3rd Party Software Management

Topics:

• Best Practices
• Software Request
• Environment Modules

Advanced Topics:

• Software Management Tools
• Software Maintenance

4-8 August 2014 2


Best Practices

• Logbook - Maintain on your management node.

• Documentation - Maintain user pages and separate admin-only pages.

• Test System - Create a test node or VM.

• Installs - User vs Root.

• 3rd Party Software - Deploy on a global filesystem.

• Sources - Keep in a common source directory.

• Build Directories – keep around for reference.


Online Logbook


User Documentation


Admin Documentation


Software Request

• Software Request
1. Staff or user requests a package.

• Evaluate the request
1. Useful for many users or a few?
2. Used where? Compute, login, MOM, or a combination?
3. Do we want to deploy and support this software?
4. Yes - forward to admin. No - assist user deployment.

• Admin - rpm vs source install
1. Is an rpm available from repos or web search?
2. Yes - Update/install the rpm (easiest to maintain).
3. No - Source install in a global file system.

Note: Develop a process for your site or institution


Environment Modules

• What are modules? - A dynamic way to support multiple versions of software, such as compilers.

• Why use modules? - The easiest way to select software versions and set the default user environment.

• Who uses modules? - Cray, large HPC centers, and DOE, to name a few.

• Uses Tcl - Tool Command Language for modulefiles.

• Managed software - Software installed in non-default locations (source packages, relocatable rpms, srpms).

• Installing - Source vs RPM.


Source Install

As root: mkdir -m 775 /global; chgrp GROUP /global (GROUP is your $USER's group)

As $USER (safer): mkdir /global/src; cd /global/src/

wget -O modules-3.2.10.tar.gz http://sourceforge.net/projects/modules/files/latest/download (the -O name assumes the latest release is still 3.2.10, as it was for these slides)

tar -xf modules-3.2.10.tar.gz; cd modules-3.2.10

./configure --prefix=/global --with-etc-path=/global/Modules/etc

make; make install (does not copy all needed files to the prefix)

mkdir /global/Modules/etc; mkdir /global/modulefiles

cp /global/Modules/3.2.10/modulefiles/null /global/modulefiles/

cp /global/src/modules-3.2.10/etc/global/profile.modules /global/src/modules-3.2.10/etc/global/csh.modules /global/Modules/etc/ (location for testing)


Source Install Cont.

cd /global/Modules/; ln -sf 3.2.10 default (Required to work)

To test add to your .bashrc or .cshrc:

$HOME/.bashrc: . /global/Modules/etc/profile.modules

$HOME/.cshrc: source /global/Modules/etc/csh.modules

Enable for all as root: (On all systems and images, note rename)

cp /global/Modules/etc/profile.modules /etc/profile.d/modules.sh

cp /global/Modules/etc/csh.modules /etc/profile.d/modules.csh

Note: The INSTALL guide says to run add.modules as each user to set up that user's dot files, and as root to set up the default /etc/skel dot files that enable modules. Ignore that step: modifying user dot files is bad practice.


Source Modules.sh

#----------------------------------------------------------------------#
# system-wide profile.modules #
# Initialize modules for all sh-derivative shells #
#----------------------------------------------------------------------#
trap "" 1 2 3
# Exempt root env from global file system
if [ `id -u` != 0 ] ; then
  case "$0" in
    -bash|bash|*/bash) . /global/Modules/default/init/bash ;;
    -ksh|ksh|*/ksh) . /global/Modules/default/init/ksh ;;
    -zsh|zsh|*/zsh) . /global/Modules/default/init/zsh ;;
    *) . /global/Modules/default/init/sh ;; # sh and default for scripts
  esac
  module use /global/modulefiles # add our search path
  module load modules # add module man pages etc.
  module load null # add default modules
fi
trap 1 2 3
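The case statement in the script picks an init file from the value of $0, which is "-bash" for a login shell and a path or bare name otherwise. The following is a minimal sketch of that matching alone; pick_init is a hypothetical helper written for illustration, not part of the Modules package.

```shell
#!/bin/sh
# pick_init is a hypothetical helper that isolates the case statement from
# the profile script so its pattern matching can be tried by hand.
pick_init() {   # usage: pick_init VALUE_OF_DOLLAR_ZERO
    case "$1" in
        -bash|bash|*/bash) echo bash ;;
        -ksh|ksh|*/ksh)    echo ksh ;;
        -zsh|zsh|*/zsh)    echo zsh ;;
        *)                 echo sh ;;   # sh and default for scripts
    esac
}

# Try a login shell name, two full paths, a bare name, and a script name.
for name in -bash /bin/bash /usr/bin/zsh ksh ./myscript.sh; do
    printf '%s -> init/%s\n' "$name" "$(pick_init "$name")"
done
```

Anything that does not look like bash, ksh, or zsh falls through to the generic sh init file, which is why scripts still get a working module command.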


Source Modules.csh

#----------------------------------------------------------------------#
# system-wide csh.modules #
# Initialize modules for all csh-derivative shells #
#----------------------------------------------------------------------#
# Exempt root env from global file system
if ("`id -u`" != "0") then
  if ($?tcsh) then
    source /global/Modules/default/init/tcsh
  else
    source /global/Modules/default/init/csh
  endif
  module use /global/modulefiles # add our search path
  module load modules # add module man pages etc.
  module load null # add default modules
endif


Source Module Versioning


RPM Install

Check your repos for an environment-modules or Modules rpm.

• Install the rpm on all systems and images.

• The rpm installs /etc/profile.d/modules.sh and /etc/profile.d/modules.csh.

• On an rpm upgrade, your modified modules.sh and modules.csh should be saved as *.rpmsave in /etc/profile.d.

• Default module search path file: /usr/share/Modules/init/.modulespath (do not edit it; changes are lost on rpm update).

• mkdir /global/modulefiles - for our modulefiles

• cp /usr/share/Modules/modulefiles/null /global/modulefiles/


RPM Modules.sh

#----------------------------------------------------------------------#
# system-wide profile.modules #
# Initialize modules for all sh-derivative shells #
#----------------------------------------------------------------------#
trap "" 1 2 3
case "$0" in
  -bash|bash|*/bash) . /usr/share/Modules/init/bash ;;
  -ksh|ksh|*/ksh) . /usr/share/Modules/init/ksh ;;
  -zsh|zsh|*/zsh) . /usr/share/Modules/init/zsh ;;
  *) . /usr/share/Modules/init/sh ;; # sh and default for scripts
esac
# Exempt root env from global file system
if [ `id -u` != 0 ] ; then
  module use /global/modulefiles # add our search path
  module load null # add default modules
fi
trap 1 2 3


RPM Modules.csh

#----------------------------------------------------------------------#
# system-wide csh.modules #
# Initialize modules for all csh-derivative shells #
#----------------------------------------------------------------------#
if ($?tcsh) then
  source /usr/share/Modules/init/tcsh
else
  source /usr/share/Modules/init/csh
endif
# Exempt root env from global file system
if ("`id -u`" != "0") then
  module use /global/modulefiles # add our search path
  module load null # add default modules
endif


How does the module command work?

The command "module" is an alias or shell function for something like:

sh: module () { eval `/some/path/modulecmd sh $*` ; }

csh: eval `/some/path/modulecmd csh !*`

Here modulecmd writes valid shell commands to stdout, and the eval runs them so that they change the current shell's environment. Any text that is meant to be seen by the user must be sent to stderr.
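The eval mechanism can be tried without a Modules install. This is a minimal sketch under stated assumptions: fake_modulecmd and demo_module are hypothetical stand-ins written for illustration, and DEMO_LOADED is an invented variable; the real modulecmd emits far more than one assignment.

```shell
#!/bin/sh
# fake_modulecmd is a hypothetical stand-in for the real modulecmd binary.
# It prints shell code on stdout and human-readable text on stderr.
fake_modulecmd() {
    # $1 = shell type, $2 = subcommand, $3 = module name
    echo "fake_modulecmd: $2 $3 (for $1)" >&2
    if [ "$2" = load ]; then
        echo "DEMO_LOADED='$3'; export DEMO_LOADED;"
    fi
}

# Wrapper in the same shape as the real "module" function: eval whatever
# the command printed on stdout so that it changes the current shell.
demo_module() {
    eval "$(fake_modulecmd sh "$@")"
}

demo_module load gcc/4.8.2
echo "DEMO_LOADED=$DEMO_LOADED"
```

Because only stdout is evaluated, the message written to stderr reaches the terminal untouched while the assignment takes effect in the calling shell.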


Source: http://sourceforge.net/p/modules/wiki/FAQ/#how-do-i-capture-the-module-command-output

Module Commands

• add/load - adds a module to your shell

• rm/unload - removes a module from your shell

• avail - lists available modules

• list - lists currently loaded modules

• display/show - shows what a module will do to your shell

• help - shows modulefile or module help information

• switch/swap - swaps one module for another

• use/unuse - adds/removes a modulefile search path


How to capture the module command output?

Since the module command is essentially an eval, any output meant for the screen must be sent to stderr.

Output to file:

sh: module avail 2> spoolfile

csh: module avail >&! spoolfile

Pipe output:

sh: module avail 2>&1 | grep gcc

csh: ( module avail ) |& grep gcc

NOTE: "module avail gcc" lists only the available gcc versions.
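The effect of the sh redirection can be checked with a stand-in command. This is a minimal sketch; fake_avail is a hypothetical function that, like module avail, writes its listing to stderr, and the module names in it are only sample data.

```shell
#!/bin/sh
# fake_avail is a hypothetical stand-in for "module avail": like the real
# command, it writes its listing to stderr, not stdout.
fake_avail() {
    printf '%s\n' 'gcc/4.7.3' 'gcc/4.8.2' 'java/jdk1.7.0_45' >&2
}

# Without 2>&1 the pipe carries only stdout, which is empty. stderr is
# discarded here only to keep the demo output tidy.
plain=$(fake_avail 2>/dev/null | grep -c gcc || true)

# Redirecting stderr into stdout first makes the listing visible to grep.
merged=$(fake_avail 2>&1 | grep -c gcc || true)

echo "matches without 2>&1: $plain"
echo "matches with 2>&1:    $merged"
```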


Source: http://sourceforge.net/p/modules/wiki/FAQ/#how-do-i-capture-the-module-command-output

Module Load


tbouvet@h2ologin4:~> which gcc

/usr/bin/gcc (native gcc)

tbouvet@h2ologin4:~> module load gcc/4.8.2

tbouvet@h2ologin4:~> which gcc

/opt/gcc/4.8.2/bin/gcc

tbouvet@h2ologin4:~> echo $PATH

/opt/gcc/4.8.2/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/X11R6/bin:/usr/games:/usr/lib64/jvm/jre/bin:/usr/lib/mit/bin:/usr/lib/mit/sbin:/sbin:/usr/sbin

NOTE: Loads in front of shell search path if the modulefile uses prepend-path directives.


Module Unload


tbouvet@h2ologin4:~> which gcc

/opt/gcc/4.8.2/bin/gcc

tbouvet@h2ologin4:~> module unload gcc/4.8.2

tbouvet@h2ologin4:~> echo $PATH

/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/X11R6/bin:/usr/games:/usr/lib64/jvm/jre/bin:/usr/lib/mit/bin:/usr/lib/mit/sbin:/sbin:/usr/sbin

tbouvet@h2ologin4:~> which gcc

/usr/bin/gcc (native gcc)

NOTE: When a modulefile is unloaded, append-path and prepend-path become remove-path.
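What load and unload do to a colon-separated variable can be sketched in plain shell. The helpers path_prepend and path_remove below are hypothetical illustrations of the idea, not the code Environment Modules runs.

```shell
#!/bin/sh
# Hypothetical helpers that mimic prepend-path and its inverse on a
# colon-separated list. They are not part of Environment Modules.
path_prepend() {   # usage: path_prepend LIST DIR
    if [ -n "$1" ]; then echo "$2:$1"; else echo "$2"; fi
}

path_remove() {    # usage: path_remove LIST DIR
    result=""
    old_ifs=$IFS; IFS=:
    for entry in $1; do              # split LIST on ":"
        [ "$entry" = "$2" ] && continue
        result="${result:+$result:}$entry"
    done
    IFS=$old_ifs
    echo "$result"
}

base="/usr/local/bin:/usr/bin:/bin"
loaded=$(path_prepend "$base" /opt/gcc/4.8.2/bin)      # like module load
unloaded=$(path_remove "$loaded" /opt/gcc/4.8.2/bin)   # like module unload

echo "after load:   $loaded"
echo "after unload: $unloaded"
```

After the remove step the list is identical to the starting value, which matches the PATH shown before and after in the transcripts above.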

Module Avail


Module List: Currently loaded modules


Module Show


tbouvet@h2ologin4:~> module show gcc

-------------------------------------------------------------------

/opt/modulefiles/gcc/4.8.2:

conflict gcc

prepend-path PATH /opt/gcc/4.8.2/bin

prepend-path MANPATH /opt/gcc/4.8.2/snos/share/man

prepend-path INFOPATH /opt/gcc/4.8.2/snos/share/info

prepend-path LD_LIBRARY_PATH /opt/gcc/4.8.2/snos/lib64

prepend-path LD_LIBRARY_PATH /opt/gcc/gmp/4.3.2/lib

prepend-path LD_LIBRARY_PATH /opt/gcc/mpfr/2.4.2/lib

prepend-path LD_LIBRARY_PATH /opt/gcc/mpc/0.8.1/lib

setenv GCC_PATH /opt/gcc/4.8.2

setenv GCC_VERSION 4.8.2

setenv GNU_VERSION 4.8.2

-------------------------------------------------------------------


Module Help


tbouvet@bwbsles:~> module help modules

----------- Module Specific Help for 'modules' --------------

modules - loads the modules software & application environment

This adds /global/Modules/3.2.10/* to several of the environment variables.

Version 3.2.10

Module Swap


tbouvet@h2ologin4:~> module list

Currently Loaded Modulefiles:

1) java/jdk1.7.0_45 3) swtools 5) gsissh/6.2p2

2) globus/5.2.4 4) gcc/4.8.2

tbouvet@h2ologin4:~> module swap gcc/4.8.2 gcc/4.9.0

tbouvet@h2ologin4:~> module list

Currently Loaded Modulefiles:

1) java/jdk1.7.0_45 3) swtools 5) gsissh/6.2p2

2) globus/5.2.4 4) gcc/4.9.0


Module Use


tbouvet@bwbsles:~> env|grep MODULEPATH

MODULEPATH=/usr/share/Modules:/usr/share/Modules/modulefiles

tbouvet@bwbsles:~> module use /global/modulefiles

tbouvet@bwbsles:~> env|grep MODULEPATH

MODULEPATH=/global/modulefiles:/usr/share/Modules:/usr/share/Modules/modulefiles

tbouvet@bwbsles:~> module avail

------------------------ /global/modulefiles ---------------------

null test2/2.4.3(default) test2/2.4.4

NOTE: Adds to $MODULEPATH search path

Module Unuse


tbouvet@bwbsles:~> env|grep MODULEPATH

MODULEPATH=/global/modulefiles:/usr/share/Modules:/usr/share/Modules/modulefiles

tbouvet@bwbsles:~> module unuse /global/modulefiles

tbouvet@bwbsles:~> env|grep MODULEPATH

MODULEPATH=/usr/share/Modules:/usr/share/Modules/modulefiles

NOTE: Removes from $MODULEPATH search path


Modulefile Structure


Modulefile GCC example from the web


#%Module1.0

proc ModulesHelp { } {

global dotversion

puts stderr "\tGCC 4.6.2 (gcc, g++, gfortran)"

}

module-whatis "GCC 4.6.2 (gcc, g++, gfortran)"

# conflict: if another gcc module is already loaded, this modulefile will not load
conflict gcc

prepend-path PATH /packages/gcc/4.6.2/bin

prepend-path LD_LIBRARY_PATH /packages/gcc/4.6.2/lib64

prepend-path LIBRARY_PATH /packages/gcc/4.6.2/lib64

prepend-path MANPATH /packages/gcc/4.6.2/man

setenv CC gcc

setenv CXX g++

setenv FC gfortran

setenv F77 gfortran

setenv F90 gfortran


Writing Modulefiles


• Use the modules modulefile as a template.

• Include the build's directories in the modulefile: /bin and /sbin in $PATH; /man and /share/man in $MANPATH; /lib and /lib64 in $LD_LIBRARY_PATH and $LD_RUN_PATH; /share/info in $INFOPATH.

• man modulefile - a helpful man page.

• Web search “writing modulefiles”.

• Use a system that already has modulefiles as a source of examples.

• Test the modulefile for deployment functionality.

Module load use.own


tbouvet@h2ologin4:~> module load use.own

tbouvet@h2ologin4:~> module avail null

--------------- /u/staff/tbouvet/privatemodules ------------------------

null

NOTE: Loading use.own creates the privatemodules directory and adds the null modulefile to it. An empty modulefile directory will not show up in module avail. Use this directory to create and test modulefiles.
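A first test modulefile for the private directory might be created as follows. This is a sketch under stated assumptions: the package "hello/1.0" and its /global/hello/1.0 prefix are hypothetical, and a temporary directory stands in for $HOME/privatemodules so nothing real is touched.

```shell
#!/bin/sh
# Assumption: PRIVATE_DIR stands in for $HOME/privatemodules; a temporary
# directory is used so the sketch does not touch a real home directory.
PRIVATE_DIR=$(mktemp -d)

# "hello/1.0" and /global/hello/1.0 are hypothetical names for illustration.
mkdir -p "$PRIVATE_DIR/hello"
cat > "$PRIVATE_DIR/hello/1.0" <<'EOF'
#%Module1.0
proc ModulesHelp { } {
    puts stderr "\thello 1.0 (hypothetical example package)"
}
module-whatis "hello 1.0 (hypothetical example package)"
conflict hello
prepend-path PATH    /global/hello/1.0/bin
prepend-path MANPATH /global/hello/1.0/share/man
EOF

# The #%Module cookie on the first line marks the file as a modulefile.
head -n 1 "$PRIVATE_DIR/hello/1.0"
```

With the real directory in place of the temporary one, "module load use.own" followed by "module show hello/1.0" should display these directives before the modulefile is deployed globally.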


Hiding Modulefiles


Incorrect Method:

h2ologin4:/opt/cray/modulefiles/perftools-lite # chmod 750 6.1.0

tbouvet@h2ologin4:~> module avail perftools-lite

utility.c(2173):ERROR:50: Cannot open file '/opt/cray/modulefiles/perftools-lite/6.1.0' for 'reading'

Correct Method:

h2ologin4:/opt/cray/modulefiles/perftools-lite # mv 6.1.0 .6.1.0

tbouvet@h2ologin4:~> module avail perftools-lite

----------------------------------- /opt/cray/modulefiles -----------------------------

perftools-lite/6.1.1 perftools-lite/6.1.3(default) perftools-lite/6.2.0

perftools-lite/6.1.2 perftools-lite/6.1.4

NOTE: perftools-lite/6.1.0 is now invisible
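The rename can be rehearsed on a scratch tree before touching a live one. This is a minimal sketch; a temporary directory stands in for the modulefile tree, and plain ls is used only as an analogy for how dot files drop out of the listing.

```shell
#!/bin/sh
# Assumption: a temporary directory stands in for a modulefile tree such as
# /opt/cray/modulefiles; the version names are copied from the slide.
MODROOT=$(mktemp -d)
mkdir -p "$MODROOT/perftools-lite"
for v in 6.1.0 6.1.1 6.1.2; do
    printf '#%%Module1.0\n' > "$MODROOT/perftools-lite/$v"
done

# Hide 6.1.0 by renaming it with a leading dot, as in the correct method.
mv "$MODROOT/perftools-lite/6.1.0" "$MODROOT/perftools-lite/.6.1.0"

# Plain ls skips dot files, analogous to what module avail shows.
visible=$(ls "$MODROOT/perftools-lite" | xargs)
echo "visible: $visible"
```

The hidden file stays on disk, so the version can be restored by renaming it back.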

Modulefile Default


tbouvet@h2ologin4:~> module avail gcc

------------------ /opt/modulefiles -----------------------------

gcc/4.6.3 gcc/4.7.3 gcc/4.8.2(default) gcc/4.9.0

tbouvet@h2ologin4:~> cat /opt/modulefiles/gcc/.version

#%Module

set ModulesVersion "4.8.2"

NOTE: The .version file in modulefiles/gcc/ controls which gcc version "module load gcc" loads by default. Edit the .version file to change the default. As a good practice, place a .version file in every modulefile deployment directory.
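Changing the default amounts to rewriting that two-line file. This is a minimal sketch; set_default is a hypothetical helper, and a temporary directory stands in for /opt/modulefiles.

```shell
#!/bin/sh
# Assumption: a temporary directory stands in for /opt/modulefiles.
MODROOT=$(mktemp -d)
mkdir -p "$MODROOT/gcc"

# set_default is a hypothetical helper: it rewrites the .version file in
# the same two-line format shown in the slide.
set_default() {   # usage: set_default MODULE_DIR VERSION
    printf '#%%Module\nset ModulesVersion "%s"\n' "$2" > "$1/.version"
}

set_default "$MODROOT/gcc" 4.9.0
cat "$MODROOT/gcc/.version"
```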


Metamodules: Loads a suite of modules


Module Questions?


Software Management Tools

Tools that give structure to 3rd party software installs and rebuilds and that work with environment modules.

• Swtools - https://www.olcf.ornl.gov/center-projects/swtools/
Product of the National Center for Computational Sciences (NCCS) at Oak Ridge National Laboratory (ORNL). Used by NCSA Blue Waters for 3rd party software management.

• Smithy - https://github.com/AnthonyDiGirolamo/smithy
Product of the National Center for Computational Sciences (NCCS) at Oak Ridge National Laboratory (ORNL). Based on swtools and the Homebrew package management system for Mac OS X. Generates modulefiles for your builds.

• EasyBuild - https://pypi.python.org/pypi/easybuild
Software build and installation framework written in Python that allows you to install software in a structured, repeatable, and robust way. Generates modulefiles for your builds.


SWTools

Python code developed to manage third-party software installations and rebuilds in a structured manner.

Used at large sites with many software installers

For global installs by $USER in a common group

Ensures consistency of installations

Generates HTML files suitable for web pages

Provides for quick rebuild or upgrade of package by group

Requires modulefiles to be generated by hand for builds


Software Maintenance

• Recommend maintaining five versions of each package: one production, one testing, and three down-rev.

• Need to determine 3rd party software usage? Poll users with a questionnaire, or install the altd library tracking utility (ORNL): http://sourceforge.net/projects/xalt/files/altd.tar.gz/download

Analyze usage for licensing, removal, and upgrade decisions.

Focus user support on widely used modules.

• Use MOTD or email to notify users of software changes.


ALTD

• It is a wrapper around the linker and job launcher.

• Tracks library usage by job and user.

• We "module load altd" to put it in users' PATH.

• It populates a MySQL database.

• A Python script mines the MySQL data for reports.

• Blue Waters uses ALTD to monitor module usage.

• Mainly used on Cray systems at HPC centers.

• Can be installed on any machine.


Altd Report

ALTD Usage from 2013-04-01 to 2013-04-31

=======

Top 10 Library usage, ranked by number of linking instances

• mpt/5.6.1 19004
• libsci/12.0.00 17950
• cudatoolkit/5.0.35.102 17630
• fftw/3.3.0.1 14568
• papi/5.0.1 14125
• hdf5/1.8.8 4073
• acml/5.0.0 4037
• tpsl/1.3.00 4028
• mpt/5.4.5 351
• hdf5-parallel/1.8.9 305

=======

Top 10 Compiler usage, ranked by number of linking instances

• gcc/4.7.2 27919
• gcc/4.4.4 4209
• cce/8.1.4 4102
• gcc/4.6.3 1229
• gcc/4.6.2 1066
• pgi/12.10.0 586
• pgi/11.10.0 77
• cce/8.1.6 32
• pgi/13.3.0 14
• pgi/12.9.0 14
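A report like this can be rolled up further when deciding which compilers to keep. The sketch below sums the compiler rows by family with awk; it is one possible analysis step, not part of ALTD, and it works on the rows copied from the report rather than on the MySQL database.

```shell
#!/bin/sh
# Compiler rows copied from the report above: "name/version count".
report='gcc/4.7.2 27919
gcc/4.4.4 4209
cce/8.1.4 4102
gcc/4.6.3 1229
gcc/4.6.2 1066
pgi/12.10.0 586
pgi/11.10.0 77
cce/8.1.6 32
pgi/13.3.0 14
pgi/12.9.0 14'

# Sum the counts per compiler family and sort by total, largest first.
summary=$(printf '%s\n' "$report" |
    awk '{ split($1, part, "/"); total[part[1]] += $2 }
         END { for (name in total) print name, total[name] }' |
    sort -k2,2nr)

echo "$summary"
```

For these rows the totals are gcc 34423, cce 4134, and pgi 691 linking instances.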


Questions? Thank You!

[email protected]
