Commit 01c0a408 authored by Ashwin Kumar Karnad

Merge branch 'add-deploy-instructions' into 'main'

Add deploy instructions

See merge request !124
Setting the flags for all loaded modules should work via: ::

   export LDFLAGS=`echo ${LIBRARY_PATH:+:$LIBRARY_PATH} | sed -e 's/:/ -Wl,-rpath=/g'`

*Note* An Intel toolchain may at some point also show up in the list. It might, however, not be functional yet, so expect that loading that module may fail.
Deployment
----------

Instructions to deploy the toolchains on the MPSD HPC cluster are available in ``deployment.rst``.
Overview
========

The toolchains are configured to be used as a replacement for
`EasyBuild <https://easybuild.readthedocs.io/en/latest/>`__ toolchains.
They are used for the `octopus
buildbot <https://octopus-code.org/buildbot>`__ and the `MPSD HPC
cluster <https://computational-science.mpsd.mpg.de/docs/mpsd-hpc.html>`__.
The heterogeneous HPC cluster contains a mix of microarchitectures, and
the toolchains must be compiled for each of them. The following table
lists the node types and their microarchitectures:
+----------------------+---------------------------------+----------------------------+------------------------------------+
| partition | microarchitecture | example node | toolchains to compile |
+======================+=================================+============================+====================================+
| public | sandybridge | mpsd-hpc-ibm-022 | (foss/intel)-(serial/mpi) |
+----------------------+---------------------------------+----------------------------+------------------------------------+
| bigmem | broadwell | mpsd-hpc-hp-002 | (foss/intel)-(serial/mpi) |
+----------------------+---------------------------------+----------------------------+------------------------------------+
| gpu | skylake_avx512 (cuda_arch = 70) | mpsd-hpc-gpu-002 | (foss/intel)-(serial/mpi/cuda-mpi) |
+----------------------+---------------------------------+----------------------------+------------------------------------+
| accelerated-tentacle | cascadelake (cuda_arch = 75) | mpsd-accelerated-tentacle1 | compile only on gpu-ayyer |
+----------------------+---------------------------------+----------------------------+------------------------------------+
| gpu-ayyer | cascadelake (cuda_arch = 70) | mpsd-hpc-gpu-004 | (foss/intel)-(serial/mpi/cuda-mpi) |
+----------------------+---------------------------------+----------------------------+------------------------------------+
| powerpc | power8le | mpsd-srv-ppc-001 | (foss)-(serial/mpi) |
+----------------------+---------------------------------+----------------------------+------------------------------------+
| valgrind | westmere | mpsd-srv-ibm-001 | (foss)-(serial) |
+----------------------+---------------------------------+----------------------------+------------------------------------+
.. note::

   The powerpc, accelerated-tentacle and valgrind partitions are not in Slurm;
   the remaining partition names correspond to the Slurm partitions of the same name.
.. warning::

   The CUDA toolchain for ``accelerated-tentacle`` must be compiled on the
   ``gpu-ayyer`` partition, as the CUDA architecture of ``accelerated-tentacle``
   is newer than that of the ``gpu-ayyer`` partition.
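To find out which microarchitecture a node reports, ``archspec`` (the same
tool spack uses internally) can be run on the node. This is a sketch and
assumes the ``archspec`` command is available there:

.. code:: bash

   # print the microarchitecture of the current host, e.g. "skylake_avx512"
   $ archspec cpu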
Steps to compile toolchains
===========================
All the steps are performed using the functional account ``mpsddeb``.
Prepare the system
------------------
Building the toolchains requires some system packages to be installed.
These packages are usually already installed on the MPSD HPC via FAI.
Here are some of the packages that are required:

.. code:: bash

   $ sudo apt install build-essential autoconf automake libtool
   # linux kernel headers
   $ sudo apt install linux-headers-$(uname -r)  # might not work on some systems and may need to be inspected manually
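If the ``uname -r`` based package name does not resolve, one can check the
running kernel release and search for a matching header package manually; a
minimal sketch:

.. code:: bash

   # show the kernel release the headers must match
   $ uname -r
   # search the header packages apt knows about and pick a matching one
   $ apt-cache search linux-headers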
Block the node
--------------
You are advised to block the node in Slurm while you are compiling the
toolchains, to prevent other users from scheduling jobs on it. You can
do this via the command:

::

   salloc -N 1 -p <partition> --exclusive --time=12:00:00

where ``<partition>`` is the partition. Remember to unblock the node after you
are done compiling the toolchains. The maximum time you can block the
node is 12 hours. If you need more time, you can request it via a simple
submission script (`for
example <https://computational-science.mpsd.mpg.de/docs/mpsd-hpc.html#example-batch-scripts>`__).
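A minimal sketch of such a submission script; the partition, time limit and
the build command are placeholders to adapt:

.. code:: bash

   #!/bin/bash
   #SBATCH --nodes=1
   #SBATCH --partition=<partition>  # placeholder: partition to build on
   #SBATCH --exclusive
   #SBATCH --time=24:00:00          # longer than the 12 h salloc limit
   #SBATCH --job-name=toolchain-build

   # placeholder build command; adapt release and toolchains as needed
   mpsd-software install dev-23a foss2022a-mpi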
Install the Toolchains
----------------------
1) Set the location of the toolchain installation directory:

   You must specify the location of the toolchain installation directory.
   On the HPC this is usually ``/opt_mpsd/$MPSD_OS``.
   This step is only required once to set the location of the toolchain.

   .. code:: shell

      $ cd /opt_mpsd/linux-debian11
      $ mpsd-software init  # mark the current directory as a software installation directory (only for the first time)
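   On the HPC nodes, the current value of ``$MPSD_OS`` can be inspected
   before changing into the directory; this assumes the variable is set in
   the environment (as the path above suggests):

   .. code:: shell

      $ echo $MPSD_OS
      linux-debian11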
2) List the available software releases (upstream) and the available toolchains for a chosen release:

   .. code:: shell

      $ mpsd-software available  # list the available software releases
      Available MPSD software releases:
        dev-23a
      $ mpsd-software available dev-23a
      MPSD software release dev-23a, AVAILABLE for installation are
      Toolchains:
        foss2021a-cuda-mpi
        foss2021a-mpi
        foss2021a-serial
        foss2022a-cuda-mpi
        foss2022a-mpi
        foss2022a-serial
      Package sets:
        global (octopus@12.1, octopus@12.1)
        global_generic (anaconda3@2022.10)
   Then install the chosen toolchains for the release:

   .. code:: shell

      $ mpsd-software install dev-23a foss2022a-mpi foss2022a-serial
      Release dev-23a is prepared in /opt_mpsd/linux-debian11/dev-23a
      ##### Setting up spack
      Cloning into 'spack'...
      .
      .
      .
      ##### Creating toolchain module 'foss2022a-mpi'
      ##### Installation finished in 01:51:20 ( setup: 00:01:08, buildcache: 00:00:02, install: 01:49:52, lmod: 00:00:17)
      ##### 'module use /opt_mpsd/linux-debian11/dev-23a/skylake_avx512/lmod/Core' can be used to get the new environments (note that lmod does **not** allow a trailing '/')
3) Check the status of the installation:

   .. code:: shell

      $ mpsd-software status
      Available MPSD software releases:
        dev-23a
      $ mpsd-software status dev-23a
      Installed toolchains (dev-23a):
      - skylake_avx512
          foss2022a-mpi
          [module use /opt_mpsd/linux-debian11/dev-23a/skylake_avx512/lmod/Core]
4) Load the toolchain:

   .. code:: shell

      $ module use /opt_mpsd/linux-debian11/dev-23a/skylake_avx512/lmod/Core
      $ module avail
      ------------------ /opt_mpsd/linux-debian11/dev-23a/skylake_avx512/lmod/Core ------------------
         gcc/10.3.0    toolchains/foss2022a-mpi
      $ module load toolchains/foss2022a-mpi
      $ which mpicc
      /opt_mpsd/linux-debian11/dev-23a/skylake_avx512/spack/opt/spack/linux-debian11-skylake_avx512/gcc-10.3.0/foss2022a-mpi-4.1.0/bin/mpicc
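To verify that the loaded toolchain works end to end, a small MPI
hello-world can be compiled with the ``mpicc`` found above and run; a
minimal sketch (the file name is arbitrary, and the output order may vary):

.. code:: bash

   # write a tiny MPI test program
   $ cat > hello_mpi.c <<'EOF'
   #include <mpi.h>
   #include <stdio.h>

   int main(int argc, char **argv) {
       MPI_Init(&argc, &argv);
       int rank, size;
       MPI_Comm_rank(MPI_COMM_WORLD, &rank);
       MPI_Comm_size(MPI_COMM_WORLD, &size);
       printf("Hello from rank %d of %d\n", rank, size);
       MPI_Finalize();
       return 0;
   }
   EOF
   # compile and run on two ranks
   $ mpicc hello_mpi.c -o hello_mpi
   $ mpirun -np 2 ./hello_mpi
   Hello from rank 0 of 2
   Hello from rank 1 of 2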