Deployment and Maintenance

Puppet
For server configuration and setup, we use Puppet on all FE-maintained servers. The main Puppet code is hosted on GitLab at https://gitlab.gwdg.de/dariah-de-puppet. Please see the project's README file for detailed information.
The DARIAH-DE and TextGrid Repository module (dhrep) is hosted on GitHub at https://github.com/DARIAH-DE/puppetmodule-dhrep.
A task force in FE/GWDG maintains the Puppet code and handles related issues. It plans and runs sprints to implement and document current Puppet work, such as major updates, and to discuss module usage.

Monitoring

Icinga
We use the DARIAH-DE monitoring system for our RDD server; it is currently installed at https://icinga.de.dariah.eu/icinga.
Different kinds of probes are implemented, mostly HTTP checks (for external monitoring of HTTP functionality) and NRPE checks (for internal server probes such as memory checks, ping, etc.). NRPE probes are configured using Puppet.
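To illustrate what an external HTTP check does, here is a minimal Python sketch that follows the usual Nagios/Icinga plugin convention of exit codes (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN). It is not the probe used in our Icinga configuration; the function names and the mapping of 4xx to WARNING and 5xx to CRITICAL are assumptions made for this example.

```python
import urllib.error
import urllib.request

# Conventional Nagios/Icinga plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify_status(http_status):
    """Map an HTTP status code to (exit_code, label).

    Assumed mapping for this sketch: 2xx/3xx is OK, 4xx is WARNING,
    everything else is CRITICAL.
    """
    if 200 <= http_status < 400:
        return OK, "OK"
    if 400 <= http_status < 500:
        return WARNING, "WARNING"
    return CRITICAL, "CRITICAL"


def check_http(url, timeout=10.0):
    """Request `url` and return (exit_code, one-line status message)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx/5xx).
        status = err.code
    except (urllib.error.URLError, OSError) as err:
        # No usable answer at all: DNS failure, refused connection, timeout.
        return CRITICAL, f"HTTP CRITICAL - {err}"
    code, label = classify_status(status)
    return code, f"HTTP {label} - status {status}"
```

A wrapper script would print the returned message and pass the returned code to `sys.exit()`, which is how a monitoring server reads the result of a plugin.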
The configuration is available at https://projects.gwdg.de/projects/dariah-nagios/repository and can be accessed by all RDD developers (the RDD group settings are updated regularly).
The configuration is built every 30 minutes at https://icinga.de.dariah.eu/jenkins/, so a newly pushed contribution is picked up by the next build. Please check the outcome of your configuration changes there every time you contribute. For this, you have to be a member of the DARIAH LDAP group “jenkins-admin”; please use the DARIAH AAI at https://auth.de.dariah.eu to obtain membership.
Icinga's configuration is roughly built on hosts, service probes, contacts, and contact groups. For example, the members of the contact group “sub-fe-geobrowser-notify” are notified by email if a service probe assigned to this contact group fails. The service and host groups are mainly used for visualization.
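The relationship between service probes and contact groups can be sketched as a small Python model. This is only an illustration of the notification logic described above, not Icinga's actual configuration format; the class names, the host name, and the email addresses are hypothetical, and only the group name “sub-fe-geobrowser-notify” comes from our setup.

```python
from dataclasses import dataclass, field


@dataclass
class ContactGroup:
    name: str
    members: list = field(default_factory=list)  # email addresses


@dataclass
class ServiceProbe:
    host: str
    description: str
    contact_groups: list = field(default_factory=list)  # contact group names


def recipients_for(probe, groups):
    """Return the sorted, de-duplicated addresses to notify when `probe` fails.

    `groups` maps a contact group name to its ContactGroup. Group names
    that are not defined are skipped.
    """
    addresses = set()
    for group_name in probe.contact_groups:
        group = groups.get(group_name)
        if group is not None:
            addresses.update(group.members)
    return sorted(addresses)


# Hypothetical example data.
groups = {
    "sub-fe-geobrowser-notify": ContactGroup(
        name="sub-fe-geobrowser-notify",
        members=["bob@example.org", "alice@example.org"],
    ),
}
probe = ServiceProbe(
    host="geobrowser.example.org",
    description="HTTP check",
    contact_groups=["sub-fe-geobrowser-notify"],
)
notified = recipients_for(probe, groups)
```

In this model, a contact receives mail only through membership in a group that is assigned to the failing probe, which is why adding someone to the right contact group is enough to subscribe them.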
The migration to Icinga2 and the CLARIN-D monitoring at https://monitoring.clarin.eu is already in progress, and the new setup will soon take over (planned for fall 2020).

Real-time statistics / metrics
To view real-time metrics from our servers or applications, we use Grafana, which is available at https://metrics.gwdg.de inside GoeNet.
Grafana retrieves its data from InfluxDB. Telegraf collects metrics on the servers and stores them in that database; it is easy to enable on Puppet-configured servers.
Some system statistics monitored by Telegraf in our current Puppet setup:

CPU
Memory
Space
Apache
Tomcat usage
...

Telegraf collects statistics with input plugins. A list of available input plugins can be found in the Telegraf documentation.
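Metrics are written to InfluxDB as points in the InfluxDB line protocol (measurement, optional tags, fields, optional timestamp). The following Python sketch shows roughly what such a point looks like when formatted by hand. It is a simplified illustration, not Telegraf's implementation; the function names and the sample measurement, host, and values are hypothetical.

```python
def escape_key(text):
    """Escape commas, equals signs, and spaces in tag keys, tag values, and field keys."""
    return text.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def escape_measurement(text):
    """Escape commas and spaces in a measurement name."""
    return text.replace(",", "\\,").replace(" ", "\\ ")


def format_field(value):
    """Render a field value: booleans, integers (with 'i' suffix), floats, or quoted strings."""
    if isinstance(value, bool):  # must be tested before int, since bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line_protocol(measurement, tags, fields, timestamp_ns=None):
    """Format one point as a line-protocol string."""
    if not fields:
        raise ValueError("at least one field is required")
    line = escape_measurement(measurement)
    for key in sorted(tags):  # tags are sorted by key
        line += f",{escape_key(key)}={escape_key(str(tags[key]))}"
    line += " " + ",".join(
        f"{escape_key(key)}={format_field(value)}" for key, value in fields.items()
    )
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


# Hypothetical memory metric for a host named "server01".
point = to_line_protocol(
    "mem",
    {"host": "server01"},
    {"used_percent": 42.5, "free": 1024},
    timestamp_ns=1600000000000000000,
)
```

Tags such as `host` are what make it possible to filter and group dashboards per server in Grafana, while the fields carry the measured values.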

Release Management