module
On a high-performance computing system, it is often the case that no software is loaded by default. If we want to use a software package, we will need to “load” it ourselves.
Before we start using individual software packages, however, we should understand the reasoning behind this approach. The three biggest factors are:
Software incompatibility is a major headache for programmers. Sometimes the presence (or absence) of a software package will break others that depend on it. Two of the most famous examples are Python 2 and 3 and C compiler versions. Python 3 famously provides a python command that conflicts with the one provided by Python 2. Likewise, software compiled against a newer version of the compiler's runtime libraries and then run on a system where they are not present will fail with a nasty 'GLIBCXX_3.4.20' not found error.
Software versioning is another common issue. A team might depend on a certain package version for their research project - if the software version were to change (for instance, if a package was updated), it might affect their results. Having access to multiple software versions allows a set of researchers to prevent software versioning issues from affecting their results.
Dependencies arise when a particular software package (or even a particular version) depends on having access to another software package (or even a particular version of another software package). For example, the VASP materials science software may depend on having a particular version of the FFTW (Fastest Fourier Transform in the West) software library available for it to work.
modules
Environment modules are the solution to these problems. A module is a self-contained description of a software package - it contains the settings required to run a software package and, usually, encodes required dependencies on other software packages.
On most HPC systems there will be software built and managed by the sysadmin team, and typically this will be done with software modules.
We can interact with the software modules using the module command (again, man module will give you a full description of the options).
In order to see the modules available we can use the command:
jupyter-user:$ module avail
--------------------------------------------- /usr/share/Modules/modulefiles ----------------------------------------------
dot module-git module-info modules null use.own
------------------------------------------------- /usr/share/modulefiles --------------------------------------------------
mpi/openmpi-x86_64 pmi/pmix-x86_64
There are only a few modules available on this cluster container (don't worry, Nimbus has plenty).
We can see what a module contains with the following command:
jupyter-user:$ module whatis mpi/openmpi-x86_64
mpi/openmpi-x86_64: Description: The Open MPI Project is an open source MPI-3 implementation.
mpi/openmpi-x86_64: Homepage: https://www.open-mpi.org/
mpi/openmpi-x86_64: URL: https://www.open-mpi.org/
We see that the mpi/openmpi-x86_64 module contains the Open MPI implementation of the Message Passing Interface (MPI) standard, with the libraries and MPI compiler wrappers, such as mpif90 and mpicc.
To see the effect of loading the module, let's first use the which command to see if it can find the MPI Fortran compiler wrapper mpif90:
jupyter-user:$ which mpif90
/usr/bin/which: no mpif90 in (/data/jupyter-user/.local/bin:/data/jupyter-user/bin:/usr/share/Modules/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin)
We can see that the which command has looked in all the standard places in our $PATH and has been unable to find the mpif90 compiler.
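The colon-separated list in the error message above is simply the contents of $PATH. As a small sketch, assuming a POSIX shell with the standard tr utility, we can print each directory on its own line to make the search order easier to read. The variable name path_entries is introduced only for this illustration.

```shell
# Split $PATH on ':' so that each searched directory is printed on its own
# line, in the order in which the shell searches them.
path_entries=$(printf '%s\n' "$PATH" | tr ':' '\n')
printf '%s\n' "$path_entries"
```

The first line printed is the first place the shell looks, so a directory nearer the top wins when two directories contain a command with the same name.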
Now let's load the module and try again:
jupyter-user:$ module load mpi/openmpi-x86_64
jupyter-user:$ which mpif90
/usr/lib64/openmpi/bin/mpif90
Now the which command can find the wrapper without any issues. Loading the module has updated our environment so that the location of the compiler wrappers (and their dependencies) is included in the paths searched.
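To make the mechanism concrete, here is a rough sketch of the kind of PATH change that loading a module performs, done by hand. It does not use the module command at all: the temporary directory and the stand-in hello_tool script are hypothetical names created only for this illustration, and a real modulefile may set other environment variables as well.

```shell
# Create a scratch directory holding a stand-in executable (hypothetical name).
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho "hello from the demo tool"\n' > "$demo_dir/hello_tool"
chmod +x "$demo_dir/hello_tool"

# Before: the shell cannot find the tool because its directory is not in PATH.
before=$(command -v hello_tool || echo "not found")

# "Load": prepend the directory to PATH, much as a modulefile would.
old_path=$PATH
PATH="$demo_dir:$PATH"
after=$(command -v hello_tool)

# "Unload": restore the original PATH.
PATH=$old_path
restored=$(command -v hello_tool || echo "not found")

echo "before:   $before"
echo "after:    $after"
echo "restored: $restored"

# Clean up the scratch directory.
rm -rf "$demo_dir"
```

The tool is found only while its directory is part of PATH, which is the same behaviour we saw for mpif90 before and after loading the module.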
If we issue the command
jupyter-user:$ echo $PATH
we can now see the directory paths being searched. Comparing with the error message above, we can see that /usr/lib64/openmpi/bin has been added.
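Rather than comparing the two lists by eye, we could save PATH before the change and then print only the new entries. The sketch below assumes a POSIX shell with tr. So that it runs anywhere, it simulates the load by prepending /usr/lib64/openmpi/bin by hand; on the cluster you would run module load mpi/openmpi-x86_64 at that point instead. The variable names are illustrative.

```shell
# Remember the search path as it was before the change.
path_before=$PATH

# Stand-in for "module load mpi/openmpi-x86_64": prepend the directory by hand.
PATH="/usr/lib64/openmpi/bin:$PATH"

# Print every entry of the new PATH that was not in the old one.
added=$(printf '%s\n' "$PATH" | tr ':' '\n' | while IFS= read -r dir; do
  case ":$path_before:" in
    (*":$dir:"*) ;;                    # already searched before the change
    (*) printf '%s\n' "$dir" ;;        # newly added directory
  esac
done)
echo "added to PATH: $added"

# Put PATH back so the sketch leaves the shell unchanged.
PATH=$path_before
```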
In order to remove a module we can issue the command:
jupyter-user:$ module unload mpi/openmpi-x86_64
If we wished to remove all loaded modules we could issue the command:
jupyter-user:$ module purge
This is useful if a particular piece of software has lots of dependencies that are also stored in modules (which is typically the case on Nimbus). This way you don't have to type out every module in a module unload statement before loading a new piece of software; just purge them all and do a fresh module load of the new software.
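The purge-then-load pattern is often placed at the top of a script so that it always starts from a known environment. The following is a minimal sketch, assuming the mpi/openmpi-x86_64 module shown earlier. The check for the module command and the status variable are additions for illustration, so that the snippet also runs harmlessly on a machine without environment modules.

```shell
# Start from a clean slate, then load only what this script needs.
if command -v module >/dev/null 2>&1; then
  module purge                      # unload everything currently loaded
  module load mpi/openmpi-x86_64    # load the one module we want
  module list                       # show what is loaded now
  status="modules reset"
else
  status="module command not available"
fi
echo "$status"
```

Depending on how modules are set up on a given system, the module command may not be defined inside non-interactive scripts, in which case this sketch takes the second branch.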
module list
module load
module load softwareName
module unload <name>
module purge