MPSD Software manager
=====================

.. contents::

This repository provides the ``mpsd-software`` tool, which is used to install
package sets and toolchains on the `MPSD HPC cluster
<https://computational-science.mpsd.mpg.de/docs/mpsd-hpc.html>`__.

It can also be used to install the software on other machines, such as Linux
laptops and desktops. This is useful for working on a local machine with the
same software environment, for example to debug a problem.

Note that this tool is under development, and the recommended way to install
and use it, as well as the user interface, may change. This document will be
kept up to date.

Quick start
-----------

To install, for example, the ``foss2022a-serial`` toolchain:

1. Install this mpsd-software-manager Python package. The recommended way is
   to use ``pipx``, so that this tool is available independently of any other
   Python environments::

     $ pipx install git+https://gitlab.gwdg.de/mpsd-cs/mpsd-software-manager

2. Navigate to the location in your file system where you would like to store
   your "MPSD software instance" that contains the compiled software. Once
   compiled, the location cannot be changed. For example::

     $ cd /home/user/mpsd-software

3. Initiate the installation at this location using::

     $ mpsd-software init

   Future calls of the ``mpsd-software`` command need to be executed from this
   "mpsd-software-root" directory or from one of its subdirectories.

   (The above command creates a hidden file ``.mpsd-software-root`` to tag the
   location as the root of the installation. All compiled files, logs etc.
   are written in or below this directory.)

4. From the same directory, run the command to install the
   ``foss2022a-serial`` toolchain::

     $ mpsd-software install dev-23a foss2022a-serial

   This will take some time (up to several hours depending on hardware).

5. To see the installation status, and the required ``module use`` command
   line to activate the created modules, try the ``status`` command::

     $ mpsd-software status dev-23a
     Installed toolchains (dev-23a):

     - cascadelake
         foss2022a-serial
         [module use /home/user/mpsd-software/dev-23a/cascadelake/lmod/Core]

6. To compile Octopus, source the provided configure script, for example
   ``foss2022a-serial-config.sh``, as `explained here
   <https://computational-science.mpsd.mpg.de/docs/mpsd-hpc.html#loading-a-toolchain-to-compile-octopus>`__.
   The configure scripts are located in
   ``dev-23a/spack-environments/octopus``::

     $ ls -1 dev-23a/spack-environments/octopus
     foss2021a-cuda-mpi-config.sh
     foss2021a-mpi-config.sh
     foss2021a-serial-config.sh
     foss2022a-cuda-mpi-config.sh
     foss2022a-mpi-config.sh

Documentation
-------------

This section provides more detailed documentation that goes beyond the
`Quick start`_ section.

Package sets and toolchains
~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Package sets are a combination of particular versions of multiple software
  packages (such as anaconda3, or gcc and fftw). In the way the SSU
  Computational Science provides software on the MPSD HPC cluster, and for the
  Octopus continuous integration services, these package sets are compiled
  together (using Spack).

- Toolchains are a particular type of package set:

  - the choice of software packages (typically a compiler and scientific
    computing libraries) and their versions follows the Easybuild toolchains
    (such as the `FOSS toolchains
    <https://docs.easybuild.io/common-toolchains/#common_toolchains_foss>`__).

  - all packages grouped together in a toolchain can be loaded together using
    the ``module load`` command.

    Example: the ``foss2022a-serial`` toolchain provides (in Spack notation)::

      - gcc@11.3.0
      - binutils@2.38+headers+ld
      - fftw@3.3.10+openmp~~mpi
      - openblas@0.3.20

  - in addition to the Easybuild-driven choice of packages, each toolchain
    includes additional packages which support the build of Octopus within
    that toolchain. For ``foss2022a-serial`` these packages include::

      - libxc@5.2.3
      # octopus-dependencies:
      - gsl@2.7.1
      - sparskit@develop # 2021.06.01
      - nlopt@2.7.0
      - libgd@2.2.4 # 2.3.1
      - libvdwxc@0.4.0~~mpi
      - nfft@3.2.4
      - berkeleygw@2.1~~mpi~scalapack
      - python@3.9.5
      - cgal@5.0.3 # 5.2
      - hdf5@1.12.2~mpi
      - etsf-io@1.0.4

MPSD software releases
~~~~~~~~~~~~~~~~~~~~~~

As `explained in the MPSD HPC documentation
<https://computational-science.mpsd.mpg.de/docs/mpsd-hpc.html#software>`__, we
label software releases available on the HPC cluster using a naming scheme of
the year (such as ``23``) followed by a letter starting from ``a``. As an
exception, the first available software release is ``dev-23a`` (the ``dev-``
prefix indicates that this was a development prototype).

At the moment (June 2023), there is only one release (``dev-23a``).

For each MPSD software release, multiple toolchains and package sets are
available::

  $ mpsd-software available dev-23a
  MPSD software release dev-23a, AVAILABLE for installation are
  Toolchains:
      foss2021a-cuda-mpi
      foss2021a-mpi
      foss2021a-serial
      foss2022a-cuda-mpi
      foss2022a-mpi
      foss2022a-serial
  Package sets:
      global (octopus@12.1, octopus@12.1)
      global_generic (anaconda3@2022.10)

Prerequisites
~~~~~~~~~~~~~

What needs to be installed for the installation to succeed?

- The ``mpsd-software-manager`` Python package.

  - This needs a recent Python (3.9 or later).

  - Install via pip or pipx. Pipx commands are:

    - to install: ``pipx install git+https://gitlab.gwdg.de/mpsd-cs/mpsd-software-manager``
    - to update: ``pipx upgrade mpsd-software-manager``
    - to uninstall: ``pipx uninstall mpsd-software-manager``

- Requirements to be able to run `spack <https://spack.readthedocs.io>`__

  - Please check
    https://spack.readthedocs.io/en/latest/getting_started.html#system-prerequisites

- The installation is only expected to work for x86 architectures at the
  moment.

- The installation is only expected to work on Linux at the moment (i.e. not
  on OSX).

Requirements for particular toolchains and package sets
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- ``foss*-serial`` should compile with the dependencies outlined above.

- ``foss*-mpi`` currently needs the Linux header files installed (to compile
  the ``knem`` package).

- ``foss*-cuda-mpi`` probably has the same requirements as ``foss*-mpi``
  (needs testing, TODO).

Finding the Octopus configure wrapper
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For each Octopus toolchain, there is an Octopus configure wrapper available.
The wrapper essentially calls the configure script with the right parameters
and library locations for the current toolchain.

Once the toolchain is loaded, the variable ``$MPSD_OCTOPUS_CONFIGURE``
contains the path to the wrapper. The path can also be seen using the
``module show TOOLCHAIN_NAME`` command. For example::

  $ mpsd-software install dev-23a foss2022a-mpi
  $ module use ~/mpsd-software/dev-23a/cascadelake/lmod/Core
  $ module show toolchains/foss2022a-mpi
  ...
  depends_on("cgal/5.0.3")
  depends_on("hdf5/1.12.2")
  setenv("MPSD_OCTOPUS_CONFIGURE","~/mpsd-software/dev-23a/spack-environments/octopus/foss2022a-mpi-config.sh")
  $ module load toolchains/foss2022a-mpi
  $ echo $MPSD_OCTOPUS_CONFIGURE
  ~/mpsd-software/dev-23a/spack-environments/octopus/foss2022a-mpi-config.sh

Working example
~~~~~~~~~~~~~~~

There is an `example
<https://github.com/mpsd-computational-science/octopus-with-mpsd-software>`__
compilation that shows the complete compilation cycle (including compilation
of Octopus) using the ``foss2022a-serial`` toolchain.

Frequently asked questions
--------------------------

- Can I install the ``mpsd-software-manager`` package in a Python virtual
  environment?

  Yes. ``pipx`` is probably more convenient, but you can create your own
  Python virtual environment and install ``mpsd-software-manager`` in it as a
  regular Python package::

    python3 -m venv venv
    . venv/bin/activate
    pip install git+https://gitlab.gwdg.de/mpsd-cs/mpsd-software-manager

  You need to activate that Python virtual environment before you can use the
  tool.

- Does the command write anything outside the mpsd-software-root directory?

  No. All changes to disk take place in and below the mpsd-software-root
  directory (which is the one in which ``mpsd-software init`` was called).

- How can I uninstall the mpsd-software?

  For now, the easiest way is to delete the ``mpsd-software-root`` directory.

  You can probably delete just a release subdirectory (such as ``dev-23a``)
  if you have multiple release subdirectories installed and you only want to
  delete one. (Untested.)

- How long does the compilation take?

  This depends on the hardware. A few hours are typical per toolchain. If a
  second toolchain is compiled in the same MPSD software instance and the
  same MPSD release, it is likely to be faster, in particular if the same
  compiler is used (and thus the compiler does not need to be re-compiled for
  the second toolchain).

- How much disk storage do I need?

  A toolchain needs of the order of 5 GB on disk. The second or third
  toolchain (in the same MPSD software instance) will use less additional
  space, as libraries and tools are re-used where possible.

- Can I have more than one MPSD software instance?

  Yes. We call "MPSD software instance" all the compiled software that is
  stored in and below a "mpsd-software-root" directory (see instructions
  above). It is possible to install multiple MPSD software instances on the
  same computer, as long as they are in different (not nested) directories.
  This makes it possible to experiment with toolchains etc.

Development
-----------

Developer documentation is available in ``development.rst``.
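
For orientation, the root lookup described in the `Quick start`_ (commands must be run from the directory containing the hidden ``.mpsd-software-root`` file, or from one of its subdirectories) can be sketched as follows. This is a minimal illustration and not the tool's actual implementation: the function name ``find_root`` and the upward search strategy are assumptions made for this example; only the marker file name is taken from this document.

```python
from pathlib import Path
from typing import Optional

# Marker file name as given in the Quick start section.
MARKER = ".mpsd-software-root"


def find_root(start: Path) -> Optional[Path]:
    """Return the closest directory at or above `start` that contains MARKER.

    Hypothetical helper: walks from `start` up to the file system root and
    returns None if no directory on the way contains the marker file.
    """
    start = Path(start).resolve()
    for candidate in (start, *start.parents):
        if (candidate / MARKER).is_file():
            return candidate
    return None


if __name__ == "__main__":
    root = find_root(Path.cwd())
    print(root if root else "not inside an MPSD software instance")
```

Searching upwards from the current directory is one simple way to let a command work from any subdirectory of an instance. It would also explain why MPSD software instances should not be nested: a nested marker would shadow the outer one.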