Modules

Overview:

  • Teaching: 10 min
  • Exercises: 5 min

Questions

  • How is software managed on an HPC system?
  • How do I access the managed software?

Objectives

  • Be able to use modules to access software.

module

On a high-performance computing system, it is often the case that no software is loaded by default. If we want to use a software package, we will need to “load” it ourselves.

Before we start using individual software packages, however, we should understand the reasoning behind this approach. The three biggest factors are:

  • software incompatibilities;
  • versioning;
  • dependencies.

Software incompatibility is a major headache for programmers. Sometimes the presence (or absence) of a software package will break others that depend on it. Two of the most famous examples are Python 2 versus Python 3, and compiler versions. Python 3 can provide a python command that conflicts with the one provided by Python 2. Likewise, software compiled against a newer version of the C and C++ runtime libraries and then run where those libraries are not present will fail with a nasty error such as 'GLIBCXX_3.4.20' not found.
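The clash between two packages that provide the same command name can be reproduced in miniature. The sketch below is illustrative only: the temporary directories and the command name mytool are made up, standing in for two installed versions of the same program. Whichever directory comes first in $PATH decides which version runs.

```shell
# Create two throwaway "installations" that both provide a command called mytool.
# (mytool, v2 and v3 are hypothetical names used only for this illustration.)
demo=$(mktemp -d)
mkdir -p "$demo/v2/bin" "$demo/v3/bin"
printf '#!/bin/sh\necho "mytool version 2"\n' > "$demo/v2/bin/mytool"
printf '#!/bin/sh\necho "mytool version 3"\n' > "$demo/v3/bin/mytool"
chmod +x "$demo/v2/bin/mytool" "$demo/v3/bin/mytool"

# The shell runs the first match found in $PATH, so the order decides the version.
# Each $(...) is a subshell, so the modified PATH does not leak out.
first=$(PATH="$demo/v2/bin:$demo/v3/bin:$PATH"; mytool)
second=$(PATH="$demo/v3/bin:$demo/v2/bin:$PATH"; mytool)
echo "$first"     # mytool version 2
echo "$second"    # mytool version 3

rm -rf "$demo"
```

With both versions visible at once, the result depends entirely on the ordering of the search path, which is easy to get wrong by hand.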

Software versioning is another common issue. A team might depend on a certain package version for their research project - if the software version were to change (for instance, if a package was updated), it might affect their results. Having access to multiple software versions allows a set of researchers to prevent software versioning issues from affecting their results.

Dependencies are where a particular software package (or even a particular version) depends on having access to another software package (or even a particular version of another software package). For example, the VASP materials science software may depend on having a particular version of the FFTW (Fastest Fourier Transform in the West) software library available for it to work.

modules

Environment modules are the solution to these problems. A module is a self-contained description of a software package - it contains the settings required to run a software package and, usually, encodes required dependencies on other software packages.
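To make this concrete, the sketch below imitates in plain shell the kind of information a module typically records: which directories to add to the search paths, and which other packages must be set up first. It is not how the module command is implemented, and every name in it (the /opt/sw prefix, load_pkg, load_vasp, and the version numbers) is hypothetical.

```shell
# Hypothetical install prefix; real systems choose their own layout.
SW_ROOT=/opt/sw

# load_pkg NAME: prepend the package's bin and lib directories to the search paths.
load_pkg() {
    PATH="$SW_ROOT/$1/bin:$PATH"
    LD_LIBRARY_PATH="$SW_ROOT/$1/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export PATH LD_LIBRARY_PATH
}

# A stand-in for a module describing an application that needs a specific
# FFTW version: the dependency is set up first, then the application itself.
# (The version numbers are made up for the example.)
load_vasp() {
    load_pkg fftw/3.3.10
    load_pkg vasp/6.4
}

load_vasp
# Show the first two entries of the updated search path.
echo "$PATH" | tr ':' '\n' | head -n 2
```

The point of the sketch is that the user asks for one thing and the description takes care of both the settings and the dependency.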

On most HPC systems there will be software built and managed by the sysadmin team, and typically this will be done with software modules.

We can interact with the software modules with the module command (again man module will give you a full description of the options).

listing the loaded modules

From the module manual (man module) find the command to list the currently loaded modules.

Solution

jupyter-user:$ module list

Listing available modules

In order to see the modules available we can use the command:

jupyter-user:$ module avail
--------------------------------------------- /usr/share/Modules/modulefiles ----------------------------------------------
dot  module-git  module-info  modules  null  use.own

------------------------------------------------- /usr/share/modulefiles --------------------------------------------------
mpi/openmpi-x86_64  pmi/pmix-x86_64

Only a few modules are available on this cluster container (don't worry, Nimbus has plenty).

Loading modules

We can see what a module contains with the following command:

jupyter-user:$ module whatis mpi/openmpi-x86_64
mpi/openmpi-x86_64: Description: The Open MPI Project is an open source MPI-3 implementation.
mpi/openmpi-x86_64: Homepage: https://www.open-mpi.org/
mpi/openmpi-x86_64: URL: https://www.open-mpi.org/

We see the mpi/openmpi-x86_64 module contains the OpenMPI implementation of the Message Passing Interface (MPI) standard, with the libraries and MPI compiler wrappers, such as mpif90 and mpicc.

To see the effect of loading the module, let's first use the which command to see if it can find the MPI Fortran compiler wrapper mpif90:

jupyter-user:$ which mpif90
/usr/bin/which: no mpif90 in (/data/jupyter-user/.local/bin:/data/jupyter-user/bin:/usr/share/Modules/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin)

We can see which has looked in all the standard places in our $PATH and has been unable to find the mpif90 compiler.
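The search that which performs can be sketched in a few lines of shell. This is a simplified illustration, not the actual implementation of which; the helper name find_in_path is made up for the example.

```shell
# find_in_path NAME: print the first executable called NAME found in $PATH.
# (find_in_path is a hypothetical helper written for this illustration.)
find_in_path() {
    old_ifs=$IFS
    IFS=:                      # split $PATH on colons
    for dir in $PATH; do
        if [ -x "$dir/$1" ] && [ ! -d "$dir/$1" ]; then
            IFS=$old_ifs
            echo "$dir/$1"     # first match wins
            return 0
        fi
    done
    IFS=$old_ifs
    echo "no $1 in ($PATH)" >&2
    return 1
}

find_in_path sh
find_in_path mpif90 || echo "mpif90 is not on the search path yet"
```

Each directory is tried in order, and the search fails only after every entry has been checked, which is why the error message lists the whole of $PATH.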

Now let's load the module and try again:

jupyter-user:$ module load mpi/openmpi-x86_64
jupyter-user:$ which mpif90
/usr/lib64/openmpi/bin/mpif90

Now we can see which can find the wrapper without any issues. Loading the module has updated our environment so that the location of the compiler (and the dependencies) is included in the paths searched.

If we issue the command

jupyter-user:$ echo $PATH

we can now see the directory paths being searched. By comparing the output to our error message above, we can see that /usr/lib64/openmpi/bin has been added.
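Comparing two long search paths by eye is error-prone, so the comparison can be scripted. In the sketch below, before is the search path printed in the error message above. The value of after is an assumption (the new directory placed at the front); on a real system you would capture it with after=$PATH once the module is loaded.

```shell
# Search path before loading the module, as shown in the which error message.
before="/data/jupyter-user/.local/bin:/data/jupyter-user/bin:/usr/share/Modules/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
# Assumed search path after loading; on a real system use: after=$PATH
after="/usr/lib64/openmpi/bin:$before"

# Print every entry of $after that does not appear in $before.
added=$(printf '%s\n' "$after" | tr ':' '\n' | while read -r dir; do
    case ":$before:" in
        *":$dir:"*) ;;          # already present before loading
        *) echo "$dir" ;;       # new entry
    esac
done)
echo "$added"    # /usr/lib64/openmpi/bin
```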

Unloading modules

In order to remove a module we can issue the command:

jupyter-user:$ module unload mpi/openmpi-x86_64
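Conceptually, unloading reverses the changes that loading made to the environment. The sketch below shows that idea for a single colon-separated search path. It is an illustration rather than the module system's own logic, and remove_from_path and the shortened example path are hypothetical.

```shell
# remove_from_path DIR LIST: print LIST (colon-separated) with every DIR entry removed.
# (remove_from_path is a hypothetical helper, not part of the module system.)
remove_from_path() {
    printf '%s\n' "$2" | tr ':' '\n' | grep -vxF -- "$1" | paste -sd: -
}

# A shortened example search path with the OpenMPI directory at the front.
loaded="/usr/lib64/openmpi/bin:/usr/local/bin:/usr/bin:/bin"
remove_from_path /usr/lib64/openmpi/bin "$loaded"    # /usr/local/bin:/usr/bin:/bin
```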

unload

Try unloading the mpi module now and inspect the $PATH environment variable with the echo command again. How has it changed? What is the outcome of which mpif90 now?

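Solution

After unloading the module, /usr/lib64/openmpi/bin should no longer appear in $PATH, and which mpif90 should once again report that it cannot find mpif90.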

Purge

If we wished to remove all loaded modules we could issue the command:

jupyter-user:$ module purge

This is useful if there are lots of dependencies for a particular piece of software also stored in modules (which is typically the case on Nimbus). This way you don't have to type every module out in a module unload statement before loading a new piece of software: just purge them all and do a fresh module load of the new software.
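The purge-then-load pattern can be wrapped in a small shell function. This is a sketch only: fresh_load and the MODULE_CMD variable are hypothetical names, and the only module subcommands it relies on are purge, load and list, as used in this lesson. Setting MODULE_CMD to echo gives a dry run that prints the steps without touching the environment.

```shell
# fresh_load MODULE...: purge everything, load the named modules, then list them.
# fresh_load and MODULE_CMD are hypothetical names used for this sketch only.
# MODULE_CMD defaults to the real module command; set it to echo for a dry run.
fresh_load() {
    _cmd="${MODULE_CMD:-module}"
    "$_cmd" purge || return 1              # unload everything currently loaded
    for _name in "$@"; do
        "$_cmd" load "$_name" || return 1  # load each requested module in turn
    done
    "$_cmd" list                           # confirm what is now loaded
}

# Dry run: print the steps instead of changing the environment.
MODULE_CMD=echo
fresh_load mpi/openmpi-x86_64
```

On a real system, leave MODULE_CMD unset and define the function in your current shell (for example by sourcing the file), so that the changes made by module load apply to the shell you are working in.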

purge

Try loading several of the available modules, and purge them. Confirm they are all removed by listing the currently loaded modules.

Key Points

  • List loaded software with module list
  • See what modules are available with module avail
  • Load software with module load <name>
  • Unload modules with module unload <name>
  • Purge all modules with module purge
  • The module system helps you manage software versions, dependencies and package conflicts.