Tips and tricks
If an experiment spans several restart cycles and is expected to take several days to finish, the run should be monitored closely.
A simple cron job, executed every five minutes, checks that the simulation is still queued or running and resubmits it if it is not:
*/5 * * * * ssh -i /home/mpim/m300408/.ssh/id_rsa_breeze3_20221127 levante.dkrz.de "bash /work/mh0010/m300408/DVC-test/EUREC4A-ICON/EUREC4A/run/submit.cron" >/dev/null 2>&1
submit.cron:
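#!/bin/bash
# Resubmit the run script unless a job with the same name is already queued or running.
cd /work/mh0010/m300408/DVC-test/EUREC4A-ICON/EUREC4A/run/ || exit 1
# -h drops the squeue header line, so the count covers matching jobs only.
if [ "$(squeue -h -u m300408 -n exp.DOM01+DOM02.run | wc -l)" -ne 0 ]; then
    echo "Still running or queued"
else
    echo "Not running; resubmitting"
    sbatch -d singleton exp.DOM01+DOM02.run
fi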
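
Because the cron entry redirects all output to /dev/null, the two echo messages are never seen. One possible way to keep a timestamped record of each decision is sketched below. The names LOGFILE and log_msg are assumptions made for illustration and are not part of the original setup. The squeue and sbatch calls are left out so that the snippet runs on any machine.

```shell
#!/bin/bash
# Sketch only: LOGFILE and log_msg are hypothetical names, not part of submit.cron.
# The default location is an assumption; point LOGFILE at any writable path.
LOGFILE="${LOGFILE:-${TMPDIR:-/tmp}/submit.cron.log}"

log_msg() {
    # Append the message to LOGFILE, prefixed with the current date and time.
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >> "$LOGFILE"
}

# In submit.cron, calls like this one would take the place of the echo lines.
log_msg "Still running or queued"
```

With such a log in place, the history of resubmissions can be inspected after the fact, for example with tail on the log file, which helps to spot a job that fails immediately and is resubmitted every five minutes.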