Commit 59bd9e4c authored by jbierma

Merge branch '13-deployment-and-maintenance' into 'master'

Add monitoring infos for Icinga usage

Closes #13

See merge request !33
parents f34592ad 3f8130e4
Pipeline #145553 passed with warnings
@@ -9,15 +9,25 @@ For server configuration and setup we use puppet for most servers. The main pupp
### Monitoring
- Icinga probes for DARIAH-DE services <https://icinga.de.dariah.eu/icinga>
#### Icinga
We use the DARIAH-DE monitoring system for our RDD server; it is currently installed at <https://icinga.de.dariah.eu/icinga>.
Different kinds of probes are implemented, mostly HTTP checks (for external HTTP function monitoring) and NRPE checks (for internal server probes such as memory checks, ping, etc.). NRPE probes are configured using Puppet.
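To illustrate what an external HTTP check does, here is a minimal sketch in Python that follows the usual Nagios/Icinga plugin convention of exit codes (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN). It is not the probe actually deployed on the DARIAH-DE Icinga. The function names, the 2-second warning threshold and the timeout are assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request

# Conventional Nagios/Icinga plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify(status, elapsed, warn_seconds=2.0):
    """Map an HTTP status and response time to (exit_code, message).

    `warn_seconds` is a hypothetical threshold, not taken from the real config.
    """
    if status is None:
        return CRITICAL, "HTTP CRITICAL - no response"
    if status >= 400:
        return CRITICAL, f"HTTP CRITICAL - status {status}"
    if elapsed > warn_seconds:
        return WARNING, f"HTTP WARNING - status {status}, {elapsed:.2f}s"
    return OK, f"HTTP OK - status {status}, {elapsed:.2f}s"


def check_http(url, timeout=10.0):
    """Fetch `url` once and classify the result."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx/5xx).
        status = err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, ...
        status = None
    return classify(status, time.monotonic() - start)


def main(argv):
    """Plugin entry point: print one status line, return the exit code."""
    if len(argv) != 2:
        print("usage: check_http_sketch.py URL")
        return UNKNOWN
    code, message = check_http(argv[1])
    print(message)
    return code
```

To use it as a plugin, a script would end with `sys.exit(main(sys.argv))`. Icinga reads the exit code to decide the service state and shows the printed line as status text.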
The configuration is available at <https://projects.gwdg.de/projects/dariah-nagios/repository> and can be accessed by all RDD developers (the RDD group settings are updated regularly).
When a new contribution to the configuration is pushed, it is picked up by the build at <https://icinga.de.dariah.eu/jenkins/>, which runs every 30 minutes. Please check the outcome of your configuration changes there every time you contribute. You have to be a member of the DARIAH LDAP group “jenkins-admin”; please use the DARIAH AAI <https://auth.de.dariah.eu> to obtain membership.
Icinga's configuration is roughly built on hosts, service probes, contacts, and contact groups. For example, the members of the contact group “sub-fe-geobrowser-notify” are notified via email if a service probe assigned to this contact group fails. The service and host groups are mainly used for visualization.
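The relationship between service probes, contact groups and contacts can be pictured with a small Python model. This is only a sketch of the lookup described above, not Icinga's configuration syntax or implementation. The class names, the host name and the e-mail addresses are hypothetical; only the group name “sub-fe-geobrowser-notify” comes from the real setup.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ContactGroup:
    name: str
    members: List[str] = field(default_factory=list)  # e-mail addresses


@dataclass
class Service:
    host: str
    description: str
    contact_groups: List[str] = field(default_factory=list)


def recipients(service: Service, groups: Dict[str, ContactGroup]) -> List[str]:
    """Return the sorted, de-duplicated addresses to notify when `service` fails."""
    addresses = set()
    for group_name in service.contact_groups:
        group = groups.get(group_name)
        if group is not None:  # unknown group names are skipped in this sketch
            addresses.update(group.members)
    return sorted(addresses)


# Hypothetical members and host; only the group name is taken from the text.
groups = {
    "sub-fe-geobrowser-notify": ContactGroup(
        "sub-fe-geobrowser-notify",
        ["alice@example.org", "bob@example.org"],
    )
}
probe = Service("geobrowser.example.org", "HTTP", ["sub-fe-geobrowser-notify"])
print(recipients(probe, groups))
```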
The migration to Icinga2 and the CLARIN-D monitoring at <https://monitoring.clarin.eu> is already in progress; the new setup will soon take over (planned for fall 2020).
#### Real time statistics / metrics
To view real time metrics from our servers or applications we use [Grafana](https://grafana.com/), which is available at <https://metrics.gwdg.de> inside GoeNet.
Grafana retrieves its data from influxdb. [Telegraf](https://github.com/influxdata/telegraf) can be used to store data from the servers in that database. It is easy to enable telegraf on puppet configured servers. Telegraf stores metrics from the server in the influxdb.
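InfluxDB accepts data points in its line protocol (`measurement,tags fields timestamp`). The Python sketch below formats one metric that way, to show roughly what a stored data point looks like. It is not how Telegraf itself is implemented. The measurement, tag and field names and the values are made up for illustration.

```python
def _escape(text, chars):
    """Backslash-escape each character in `chars` (line protocol rule)."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _format_field(value):
    """Render a field value: booleans, integers (`i` suffix), floats, strings."""
    if isinstance(value, bool):  # must be tested before int, bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line(measurement, tags, fields, timestamp_ns):
    """Build one line-protocol record; tags are emitted sorted by key."""
    if not fields:
        raise ValueError("at least one field is required")
    head = _escape(measurement, ", ")
    for key in sorted(tags):
        head += "," + _escape(key, ",= ") + "=" + _escape(str(tags[key]), ",= ")
    body = ",".join(
        _escape(key, ",= ") + "=" + _format_field(value)
        for key, value in fields.items()
    )
    return f"{head} {body} {timestamp_ns}"


# Hypothetical host name and values.
line = to_line(
    "mem",
    {"host": "rdd-server"},
    {"used_percent": 42.5, "total": 8},
    1600000000000000000,
)
print(line)  # mem,host=rdd-server used_percent=42.5,total=8i 1600000000000000000
```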
Some system statistics monitored by telegraf in our current puppet setup:
@@ -25,6 +35,7 @@ Some system statistics monitored by telegraf in our current puppet setup:
- Memory
- Space
- Apache
- Tomcat usage
- ...
Telegraf collects statistics with input plugins. A list of plugins is