Commit e1c80454 authored by Stefan E. Funk
Merge branch '73-ops-and-maintenance-icinga' into 'main'

fix(monitoring): Update Icinga section

Closes #73

See merge request !73
parents e2ad7514 0c9a22d9
Pipeline #285009 passed
@@ -34,26 +34,23 @@ to be aggregated. This means, configure the agent for level "INFO" or above if t
#### Icinga
We use the DARIAH-DE Monitoring system for our RDD server, which is currently installed at <https://icinga.de.dariah.eu/icinga>.
Different kinds of probes are implemented, mostly HTTP checks (for external HTTP function monitoring) and NRPE checks
(for internal server probes such as memory checks, ping, etc.).
NRPE probes are configured using Puppet.
The configuration is available at <https://projects.gwdg.de/projects/dariah-nagios/repository> to all RDD developers.
RDD group settings are updated regularly.
If a new contribution to the config is pushed, the configuration will build every 30 minutes at <https://icinga.de.dariah.eu/jenkins/>.
Please check the outcome of your configuration changes every time you contribute.
You have to be a member of the DARIAH LDAP group “jenkins-admin”.
Please use the DARIAH AAI <https://auth.de.dariah.eu> to apply for membership.
Icinga's configuration is roughly built on hosts, service probes, contacts, and contact groups.
The members of a contact group will be notified via email if a service probe associated with this group fails.
The service and host groups are mainly used for visualizing.
The migration to Icinga2 and the CLARIN-D monitoring at <https://monitoring.clarin.eu> is already in progress.
It will replace the current monitoring service.
We use the CLARIN-D / CLARIAH-DE / DARIAH-DE monitoring system (Icinga2) for our RDD servers, which is currently
installed at <https://monitoring.clarin.eu/dashboard>.
Different kinds of probes are implemented, mostly HTTP checks for external HTTP function monitoring,
and NRPE checks for internal server probes such as memory checks, ping, or certificate validity.
NRPE probes are configured on the servers using Puppet.
The configuration and deployment workflow is implemented across several repositories using Travis CI and GitLab CI.
The main repository is located at <https://github.com/clarin-eric/monitoring>.
The repository for the DARIAH-DE configuration is located at <https://github.com/clarin-eric/monitoring-dariah>;
its state is validated via [GitLab CI](https://gitlab.gwdg.de/dariah-de/monitoring-dariah).
The DARIAH-DE configuration repository is checked out from the main repository every hour, and deployed as soon
as a merge to the main branch has been successfully tested.
More detailed information concerning administration and monitoring workflow can be found in the
[FE-develop Wiki](https://wiki.de.dariah.eu/pages/viewpage.action?pageId=115195756) and in the
[monitoring-dariah GitHub repository](https://github.com/clarin-eric/monitoring-dariah).
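
To illustrate what an external HTTP check does, here is a minimal Python sketch. It assumes the standard Nagios plugin exit-code convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), which Icinga2 also accepts. This is not the check deployed for the RDD servers: the function names, the thresholds, and the script name in the usage message are hypothetical.

```python
import sys
import time
import urllib.error
import urllib.request

# Standard Nagios plugin exit codes, which Icinga2 also understands.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(status, elapsed, warn_after=2.0, crit_after=5.0):
    """Map an HTTP status code and a response time in seconds to a plugin state.

    The thresholds are illustrative defaults, not values from the real setup.
    """
    if status is None:  # no HTTP response at all (DNS, TLS, timeout, ...)
        return CRITICAL
    if status >= 400 or elapsed >= crit_after:
        return CRITICAL
    if elapsed >= warn_after:
        return WARNING
    return OK


def probe(url, timeout=10.0):
    """Fetch the URL once and return (status code or None, elapsed seconds)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        status = err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        status = None  # the server could not be reached
    return status, time.monotonic() - start


def main(argv):
    """Run one check and return the exit code the monitoring system expects."""
    if len(argv) != 2:
        print("UNKNOWN - usage: check_http_sketch.py URL")
        return UNKNOWN
    status, elapsed = probe(argv[1])
    state = classify(status, elapsed)
    print(f"{LABELS[state]} - status={status} time={elapsed:.3f}s")
    return state
```

A wrapper script would end with `sys.exit(main(sys.argv))`, so that the monitoring system can read the state from the process exit code and the summary from the first line of output.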
#### Real time statistics / metrics