Commit e1c80454 authored 2 years ago by Stefan E. Funk
Merge branch '73-ops-and-maintenance-icinga' into 'main'
fix(monitoring): Update Icinga section
Closes #73
See merge request !73
Parents: e2ad7514 0c9a22d9
Pipeline #285009 passed 2 years ago (stages: build, test, compile, release)
Showing 1 changed file: chapters/maintenance.md (17 additions, 20 deletions)
...
...
@@ -34,26 +34,23 @@ to be aggregated. This means, configure the agent for level "INFO" or above if t
#### Icinga
-We use the DARIAH-DE Monitoring system for our RDD server, which is currently installed at <https://icinga.de.dariah.eu/icinga>.
-Different kinds of probes are implemented, mostly HTTP checks (for external HTTP function monitoring) and NRPE checks
-(for internal server probes such as memory checks, ping, etc.).
-NRPE probes are configured using Puppet.
-The configuration is available at <https://projects.gwdg.de/projects/dariah-nagios/repository> to all RDD developers.
-RDD group settings are updated regularly.
-If a new contribution to the config is pushed, the configuration will build every 30 minutes at <https://icinga.de.dariah.eu/jenkins/>.
-Please check the outcome of your configuration changes every time you contribute.
-You have to be a member of the DARIAH LDAP group “jenkins-admin“.
-Please use the DARIAH AAI <https://auth.de.dariah.eu> to apply for membership.
-Icinga's configuration is roughly built on hosts, service probes, contacts, and contact groups.
-The members of a contact group will be notified via email if a service probe associated with this group fails.
-The service and host groups are mainly used for visualizing.
-The migration to Icinga2 and the CLARIN-D monitoring at <https://monitoring.clarin.eu> is already in progress.
-It will replace the current monitoring service.
+We use the CLARIN-D / CLARIAH-DE / DARIAH-DE monitoring system (Icinga2) for our RDD servers, which is currently
+installed at <https://monitoring.clarin.eu/dashboard>.
+Different kinds of probes are implemented, mostly HTTP checks for external HTTP function monitoring,
+and NRPE checks for internal server probes such as memory checks, ping, or certificate validity.
+NRPE probes are configured on the servers using Puppet.
+The workflow for configuration and deployment is realized in different repositories using Travis and Gitlab CI.
+The main repository is located at <https://github.com/clarin-eric/monitoring>.
+The repository for the DARIAH-DE configuration is located at <https://github.com/clarin-eric/monitoring-dariah>, its state is validated via [Gitlab CI](https://gitlab.gwdg.de/dariah-de/monitoring-dariah).
+The DARIAH-DE configuration repository is checked out from the main repository every hour, and deployed as soon
+as a merge to the main branch has been successfully tested.
+More detailed information concerning administration and monitoring workflow can be found in the [FE-develop Wiki](https://wiki.de.dariah.eu/pages/viewpage.action?pageId=115195756) and in the [monitoring-dariah Github repository](https://github.com/clarin-eric/monitoring-dariah).
#### Real time statistics / metrics
...
...
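For readers unfamiliar with how probes such as the certificate-validity check mentioned in the updated Icinga section report their state, here is a minimal Python sketch. It is not taken from the monitoring or monitoring-dariah repositories; the function names, the 30-day and 7-day thresholds, and the message wording are all assumptions chosen for illustration. The only convention it relies on is the standard monitoring-plugin exit code scheme (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) that Icinga2 and NRPE checks use.

```python
"""Illustrative certificate-validity probe in monitoring-plugin style.

Hypothetical sketch: names and thresholds are assumptions, not the
checks actually deployed for the RDD servers.
"""
import socket
import ssl
import time

# Exit codes used by Nagios/Icinga-compatible plugins.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify_days_left(days_left, warn_days=30, crit_days=7):
    """Map the remaining certificate lifetime to (status, message)."""
    if days_left < crit_days:
        return CRITICAL, f"CRITICAL - certificate expires in {days_left} days"
    if days_left < warn_days:
        return WARNING, f"WARNING - certificate expires in {days_left} days"
    return OK, f"OK - certificate valid for {days_left} days"


def days_until_expiry(host, port=443, timeout=10.0):
    """Connect via TLS and return the whole days left until notAfter."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            not_after = tls.getpeercert()["notAfter"]
    expires = ssl.cert_time_to_seconds(not_after)
    return int((expires - time.time()) // 86400)


def check_certificate(host):
    """Run the probe, print one status line, return the exit code."""
    try:
        status, message = classify_days_left(days_until_expiry(host))
    except ssl.SSLCertVerificationError as exc:
        # An already expired or otherwise invalid certificate fails the
        # handshake, so it is reported as CRITICAL here.
        status, message = CRITICAL, f"CRITICAL - verification failed: {exc}"
    except OSError as exc:
        # Network errors: the certificate state could not be determined.
        status, message = UNKNOWN, f"UNKNOWN - {exc}"
    print(message)
    return status
```

A wrapper script would typically end with `sys.exit(check_certificate(host))`, so that the monitoring agent can read the state from the process exit code and the first line of output. Keeping the threshold logic in `classify_days_left` separate from the network call makes it testable without a live server.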