Merge pull request #82 from ARGOeu/devel
Version 0.4.0
themiszamani authored Sep 1, 2022
2 parents c536b31 + 09f44e5 commit 6965c07
Showing 23 changed files with 258 additions and 125 deletions.
11 changes: 9 additions & 2 deletions CHANGELOG.md
@@ -1,5 +1,12 @@
# Changelog

## [0.4.0] - 2022-09-1

### Changed

* ARGO-3754 Build two RPMS, Nagios and Sensu with appropriate runtime permission settings
* ARGO-3825 List requires explicitly for each ams-publisher package

## [0.3.9] - 2021-02-01

### Changed
@@ -11,7 +18,7 @@

### Fixed

* remove leftovers from erroneous SIGHUP handling
* remove leftovers from erroneous SIGHUP handling

## [0.3.7] - 2020-07-08

@@ -135,7 +142,7 @@

## [0.1.1] - 2017-03-14

### Added
### Added

* ARGO-732 Structure the body of alarm message as JSON object

4 changes: 2 additions & 2 deletions Jenkinsfile
@@ -1,10 +1,10 @@
pipeline {
agent any
options {
checkoutToSubdirectory('argo-nagios-ams-publisher')
checkoutToSubdirectory('ams-publisher')
}
environment {
PROJECT_DIR="argo-nagios-ams-publisher"
PROJECT_DIR="ams-publisher"
GIT_COMMIT=sh(script: "cd ${WORKSPACE}/$PROJECT_DIR && git log -1 --format=\"%H\"",returnStdout: true).trim()
GIT_COMMIT_HASH=sh(script: "cd ${WORKSPACE}/$PROJECT_DIR && git log -1 --format=\"%H\" | cut -c1-7",returnStdout: true).trim()
GIT_COMMIT_DATE=sh(script: "date -d \"\$(cd ${WORKSPACE}/$PROJECT_DIR && git show -s --format=%ci ${GIT_COMMIT_HASH})\" \"+%Y%m%d%H%M%S\"",returnStdout: true).trim()
7 changes: 4 additions & 3 deletions MANIFEST.in
Original file line number Diff line number Diff line change
@@ -1,8 +1,9 @@
include config/*
include bin/*
include pymod/*
include helpers/*
include init/*
include argo-nagios-ams-publisher.spec
include init/ams-publisher-nagios.service
include init/ams-publisher-sensu.service
include ams-publisher.spec
include setup.py

recursive-exclude pymodule *.pyc *.pyo
2 changes: 1 addition & 1 deletion Makefile
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
PKGNAME=argo-nagios-ams-publisher
PKGNAME=ams-publisher
SPECFILE=${PKGNAME}.spec

PKGVERSION=$(shell grep -s '^Version:' $(SPECFILE) | sed -e 's/Version: *//')
52 changes: 26 additions & 26 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,14 +1,14 @@
# argo-nagios-ams-publisher
# ams-publisher

## Description
## Description

`argo-nagios-ams-publisher` is a component acting as bridge from Nagios to ARGO Messaging system. It's essential part of software stack running on ARGO monitoring instance and is responsible for forming and dispatching messages that wrap up results of Nagios probes/tests. It is running as a unix daemon and it consists of two subsystems:
- queueing mechanism
`ams-publisher` is a component acting as bridge from Nagios to ARGO Messaging system. It's essential part of software stack running on ARGO monitoring instance and is responsible for forming and dispatching messages that wrap up results of Nagios probes/tests. It is running as a unix daemon and it consists of two subsystems:
- queueing mechanism
- publishing/dispatching part

Messages are cached in local directory queue with the help of OCSP Nagios commands and each queue is being monitored and consumed by the daemon. After configurable amount of accumulated messages, publisher that is associated to queue sends them to ARGO Messaging system and drains the queue. `argo-nagios-ams-publisher` is written in multiprocessing manner so there is support for multiple (consume, publish) pairs where for each, new worker process will be spawned.
Messages are cached in local directory queue with the help of OCSP Nagios commands and each queue is being monitored and consumed by the daemon. After configurable amount of accumulated messages, publisher that is associated to queue sends them to ARGO Messaging system and drains the queue. `ams-publisher` is written in multiprocessing manner so there is support for multiple (consume, publish) pairs where for each, new worker process will be spawned.

Filling and draining of directory queue is asynchronous. Nagios delivers results on its own constant rate while `argo-nagios-ams-publisher` consume and publish them on its own configurable constant rate. It's important to keep the two rates close enough so that the results don't pile up in the queue and leave it early. Component has a mechanism of inspection of rates and trends over time to keep the constants in sync. Also it's resilient to network issues so it will retry configurable number of times to send a messages to ARGO Messaging system. It's also important to note that consume and publish of the queue is a serial process so if publish is stopped, consume part of the worker will be also stopped. That could lead to pile up of results in the queue and since every result is represented as a one file on the file system, easily exhaustion of free inodes and therefore unusable monitoring instance.
Filling and draining of directory queue is asynchronous. Nagios delivers results on its own constant rate while `ams-publisher` consume and publish them on its own configurable constant rate. It's important to keep the two rates close enough so that the results don't pile up in the queue and leave it early. Component has a mechanism of inspection of rates and trends over time to keep the constants in sync. Also it's resilient to network issues so it will retry configurable number of times to send a messages to ARGO Messaging system. It's also important to note that consume and publish of the queue is a serial process so if publish is stopped, consume part of the worker will be also stopped. That could lead to pile up of results in the queue and since every result is represented as a one file on the file system, easily exhaustion of free inodes and therefore unusable monitoring instance.
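The README paragraph above warns that a mismatch between the Nagios delivery rate and the publisher's consume rate lets results pile up, one file per result. A rough sketch of that arithmetic follows, assuming both rates are constant; the function name `backlog_after` and the sample rates are hypothetical and not part of the project.

```python
def backlog_after(arrival_per_sec, drain_per_sec, seconds, start=0):
    """Estimate how many results sit in the directory queue after `seconds`.

    Assumes constant arrival and drain rates (an illustration only).
    Each queued result is one file, so the return value is also a rough
    count of the inodes held by the queue.
    """
    growth = (arrival_per_sec - drain_per_sec) * seconds
    return max(0, start + growth)


if __name__ == "__main__":
    # Nagios delivers 10 results/s while the worker drains only 8 results/s.
    print(backlog_after(10, 8, 3600))
    # The publish side is stopped entirely, so nothing is drained.
    print(backlog_after(10, 0, 3600))
```

With a stopped publisher the backlog grows at the full delivery rate, which is the inode exhaustion scenario the README describes.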

More about [Directory queue design](https://dirq.readthedocs.io/en/latest/queuesimple.html#directory-structure)

@@ -23,56 +23,56 @@ Complete list of features are:
- configurable bulk of messages sent to ARGO Messaging system
- configurable retry attempts in case of network connection problems
- purger that will keep queue only with sound data
- message rate inspection of each worker for monitoring purposes
- message rate inspection of each worker for monitoring purposes

## Installation

Component is supported on CentOS 6 and CentOS 7. RPM packages and all needed dependencies are available in ARGO repositories so installation of component simply narrows down to installing a package:
Component is supported on CentOS 7. RPM packages and all needed dependencies are available in ARGO repositories so installation of component simply narrows down to installing a package:

yum install -y argo-nagios-ams-publisher
yum install -y ams-publisher

Component relies on:
- `argo-ams-library` - interaction with ARGO Messaging
- `argo-ams-library` - interaction with ARGO Messaging
- `avro` - avro serialization of messages' payload
- `python-argparse` - ease build and parse of command line arguments
- `python-daemon` - ease daemonizing of component
- `python-messaging` - CERN's library for directory based caching/queueing
- `python-daemon` - ease daemonizing of component
- `python-messaging` - CERN's library for directory based caching/queueing
- `pytz` - timezone manipulation


| File Types | Destination |
|-------------------|----------------------------------------------------|
| Configuration | `/etc/argo-nagios-ams-publisher/ams-publisher.conf`|
| Configuration | `/etc/ams-publisher/ams-publisher.conf`|
| Daemon component | `/usr/bin/ams-publisherd` |
| Cache delivery | `/usr/bin/ams-alarm-to-queue, ams-metric-to-queue` |
| Init script (C6) | `/etc/init.d/ams-publisher` |
| SystemD Unit (C7) | `/usr/lib/systemd/system/ams-publisher.service` |
| Local caches | `/var/spool/argo-nagios-ams-publisher/` |
| Inspection socket | `/var/run/argo-nagios-ams-publisher/sock` |
| Log files | `/var/log/argo-nagios-ams-publisher/` |
| Local caches | `/var/spool/ams-publisher/` |
| Inspection socket | `/var/run/ams-publisher/sock` |
| Log files | `/var/log/ams-publisher/` |

## Configuration

Central configuration is in `ams-publisher.conf`. Configuration consists of `[General]` section and `[Queue_<workername>], [Topic_<workername>]` section pairs.
Central configuration is in `ams-publisher.conf`. Configuration consists of `[General]` section and `[Queue_<workername>], [Topic_<workername>]` section pairs.

### General section

```
[General]
[General]
Host = NAGIOS.FQDN.EXAMPLE.COM
RunAsUser = nagios
StatsEveryHour = 24
PublishMsgFile = False
PublishMsgFileDir = /published
PublishArgoMessaging = True
TimeZone = UTC
StatSocket = /var/run/argo-nagios-ams-publisher/sock
StatSocket = /var/run/ams-publisher/sock
```

* `Host` - FQDN of ARGO Monitoring instance that will be part of formed messages dispatched to ARGO Messaging system
* `RunAsUser` - component will run with effective UID and GID of given user, usually `nagios`
* `StatsEveryHour` - write periodic report in system logs. Example is given in [Running](Running)
* `PublishMsgFile`, `PublishMsgFileDir` - "file publisher" that is actually only for testing purposes. If enabled, messages will not be dispatched to ARGO Messaging System, instead it will just be appended to plain text file
* `StatsEveryHour` - write periodic report in system logs. Example is given in [Running](Running)
* `PublishMsgFile`, `PublishMsgFileDir` - "file publisher" that is actually only for testing purposes. If enabled, messages will not be dispatched to ARGO Messaging System, instead it will just be appended to plain text file
* `TimeZone` - construct timestamp of messages within specified timezone
* `StatsSocket` - query socket that is used for inspection of rates of each worker. It used by the `ams-publisher` Nagios probe.
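The `[General]` options listed above use INI syntax, so they can be read with Python's standard `configparser`. The sketch below is only an illustration of how the values might be typed; the helper `read_general` is hypothetical and is not the project's own `parse_config`, whose interface is not shown in this diff. The sample text copies the `[General]` block from the README, including its `StatSocket` spelling.

```python
import configparser

# Sample copied from the [General] block in the README diff.
SAMPLE_CONF = """\
[General]
Host = NAGIOS.FQDN.EXAMPLE.COM
RunAsUser = nagios
StatsEveryHour = 24
PublishMsgFile = False
PublishMsgFileDir = /published
PublishArgoMessaging = True
TimeZone = UTC
StatSocket = /var/run/ams-publisher/sock
"""


def read_general(text):
    """Hypothetical helper: return typed values from the [General] section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    general = parser["General"]
    return {
        "host": general.get("Host"),
        "run_as_user": general.get("RunAsUser"),
        "stats_every_hour": general.getint("StatsEveryHour"),
        "publish_msg_file": general.getboolean("PublishMsgFile"),
        "publish_argo_messaging": general.getboolean("PublishArgoMessaging"),
        "timezone": general.get("TimeZone"),
        "stat_socket": general.get("StatSocket"),
    }


if __name__ == "__main__":
    for key, value in read_general(SAMPLE_CONF).items():
        print(f"{key} = {value!r}")
```

`configparser` matches option names case-insensitively by default, so `Host` and `host` refer to the same option.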

@@ -83,7 +83,7 @@ Eachs `(queue, topic)` section pair designates one worker process. Two sections
Example of one such pair:
```
[Queue_Metrics]
Directory = /var/spool/argo-nagios-ams-publisher/metrics/
Directory = /var/spool/ams-publisher/metrics/
Rate = 10
Purge = True
PurgeEverySec = 300
@@ -99,15 +99,15 @@ Topic = metric_data
Bulksize = 100
MsgType = metric_data
Avro = True
AvroSchema = /etc/argo-nagios-ams-publisher/metric_data.avsc
AvroSchema = /etc/ams-publisher/metric_data.avsc
Retry = 5
Timeout = 60
SleepRetry = 300
SleepRetry = 300
```

* `[Queue_Metrics].Directory` - path of directory queue on the filesystem where local cache delivery tools write results of Nagios tests/probes.
* `[Queue_Metrics].Rate` - local cache inspection rate. 10 means that cache will be inspected 10 times at a second because thats the number of status results expected from Nagios that will be picked up verly early. For low volume tenants this could be a lower number.
* `[Queue_Metrics].Purge,PurgeEverySec,MaxTemp,MaxLock` - purge the staled elements of directory queue every `PurgeEverySec` seconds. It cleans the empty intermediate directories below directory queue path, temporary results that exceeded `MaxTemp` time and locked results that exceeded `MaxLock`.
* `[Queue_Metrics].Purge,PurgeEverySec,MaxTemp,MaxLock` - purge the staled elements of directory queue every `PurgeEverySec` seconds. It cleans the empty intermediate directories below directory queue path, temporary results that exceeded `MaxTemp` time and locked results that exceeded `MaxLock`.
> It is advisable to leave `MaxLock = 0` which skips every result that have been transformed into a message and added into in-memory queue, but had not yet been dispatched.
* `[Queue_Metrics].Granularity` - new intermediate directory in the toplevel directory queue path is created every `Granularity` seconds
* `[Topic_Metrics].Host,Key,Project,Topic` - options needed for delivering of messages to ARGO Messaging system. `Host` designates the FQDN, `Key` is authorization token, `Project` represents a tenant name and `Topic` is final destination scoped to tenant
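Since each `(queue, topic)` section pair designates one worker, the pairing can be pictured as grouping sections by the suffix after `Queue_` or `Topic_`. The sketch below assumes that naming rule only; `worker_pairs` is a hypothetical helper, the sample keeps just the options visible in this diff, and the `[Queue_Orphan]` section with its path is invented to show an unpaired section being skipped.

```python
import configparser

# Options taken from the README diff; [Queue_Orphan] is a made-up example.
SAMPLE_CONF = """\
[Queue_Metrics]
Directory = /var/spool/ams-publisher/metrics/
Rate = 10

[Topic_Metrics]
Topic = metric_data
Bulksize = 100

[Queue_Orphan]
Directory = /tmp/orphan/
"""


def worker_pairs(text):
    """Hypothetical helper: group Queue_<name>/Topic_<name> sections by name."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    found = {}
    for section in parser.sections():
        kind, _, worker = section.partition("_")
        if kind in ("Queue", "Topic") and worker:
            found.setdefault(worker, {})[kind] = dict(parser[section])
    # Keep only workers that have both halves of the pair.
    return {
        worker: parts
        for worker, parts in found.items()
        if "Queue" in parts and "Topic" in parts
    }


if __name__ == "__main__":
    for worker, parts in worker_pairs(SAMPLE_CONF).items():
        print(worker, parts["Queue"]["rate"], parts["Topic"]["bulksize"])
```

Option values come back as strings with lowercased keys; a real loader would convert `Rate` and `Bulksize` to integers.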
Expand All @@ -130,7 +130,7 @@ CentOS 7:
systemctl start ams-publisher.service
```

Component periodically reports rates of each worker in system logs. It does so every `StatsEveryHour`.
Component periodically reports rates of each worker in system logs. It does so every `StatsEveryHour`.
```
2020-04-08 08:53:42 ams-publisher[963]: INFO - Periodic report (every 6.0h)
2020-04-08 08:53:42 ams-publisher[983]: INFO - ConsumerQueue metrics: consumed 45787 msgs in 6.00 hours
136 changes: 92 additions & 44 deletions argo-nagios-ams-publisher.spec → ams-publisher.spec
@@ -4,10 +4,10 @@
%define stripc() %(echo %1 | sed 's/el7.centos/el7/')
%define mydist %{stripc %{dist}}

Name: argo-nagios-ams-publisher
Version: 0.3.9
Name: ams-publisher
Summary: Bridge from Sensu/Nagios to the ARGO Messaging system
Version: 0.4.0
Release: 1%{mydist}
Summary: Bridge from Nagios to the ARGO Messaging system

Group: Network/Monitoring
License: ASL 2.0
@@ -16,65 +16,113 @@ Source0: %{name}-%{version}.tar.gz

BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-root-%(%{__id_u} -n)
BuildArch: noarch
BuildRequires: python3-devel
Requires: python3-argo-ams-library
Requires: python3-avro
Requires: python3-dirq
Requires: python3-messaging
Requires: python36-pytz

Requires(post): systemd
Requires(preun): systemd
Requires(postun): systemd


%description
Bridge from Nagios to the ARGO Messaging system

%prep
%setup -q

%description
Bridge from Sensu/Nagios to the ARGO Messaging system

%build
%{py3_build}

%install
rm -rf $RPM_BUILD_ROOT
%{py3_install "--record=INSTALLED_FILES"}
install --directory --mode 755 $RPM_BUILD_ROOT/%{_sysconfdir}/%{name}/
install --directory --mode 755 $RPM_BUILD_ROOT/%{_localstatedir}/log/%{name}/
install --directory --mode 755 $RPM_BUILD_ROOT/%{_localstatedir}/spool/%{name}/metrics/
install --directory --mode 755 $RPM_BUILD_ROOT/%{_localstatedir}/spool/%{name}/alarms/
%{py3_install}
install --directory --mode 755 $RPM_BUILD_ROOT/%{_sysconfdir}/ams-publisher/
install --directory --mode 755 $RPM_BUILD_ROOT/%{_localstatedir}/log/ams-publisher/
install --directory --mode 755 $RPM_BUILD_ROOT/%{_localstatedir}/spool/ams-publisher/


%package -n argo-nagios-ams-publisher
Summary: Bridge from Nagios to the ARGO Messaging system
Conflicts: argo-sensu-ams-publisher

BuildRequires: python3-devel
Requires: nagios
Requires: python3-argo-ams-library
Requires: python3-avro
Requires: python3-dirq
Requires: python3-messaging
Requires: python36-pytz
Requires(post): systemd
Requires(preun): systemd
Requires(postun): systemd

%description -n argo-nagios-ams-publisher
Bridge from Nagios to the ARGO Messaging system

%files -f INSTALLED_FILES
%files -n argo-nagios-ams-publisher
%defattr(-,root,root,-)
%config(noreplace) %{_sysconfdir}/%{name}/ams-publisher.conf
%config(noreplace) %{_sysconfdir}/%{name}/metric_data.avsc
%dir %{python3_sitelib}/%{underscore %{name}}
%{python3_sitelib}/%{underscore %{name}}/*.py
%{_bindir}/ams-alarm-to-queue
%{_bindir}/ams-metric-to-queue
%{_bindir}/ams-publisherd
%config(noreplace) %{_sysconfdir}/ams-publisher/ams-publisher-nagios.conf
%config(noreplace) %{_sysconfdir}/ams-publisher/metric_data.avsc
%dir %{python3_sitelib}/ams_publisher
%{python3_sitelib}/ams_publisher/*.py
%{python3_sitelib}/ams_publisher/__pycache__/
%{python3_sitelib}/*.egg-info
%{_unitdir}/ams-publisher-nagios.service
%defattr(-,nagios,nagios,-)
%dir %{_localstatedir}/log/%{name}/
%dir %{_localstatedir}/spool/%{name}/
%dir %{_localstatedir}/log/ams-publisher/
%dir %{_localstatedir}/spool/ams-publisher/

%post
%systemd_postun_with_restart ams-publisher.service
%post -n argo-nagios-ams-publisher
%systemd_postun_with_restart ams-publisher-nagios.service

%clean
rm -rf $RPM_BUILD_ROOT
%preun -n argo-nagios-ams-publisher
%systemd_preun ams-publisher-nagios.service


%package -n argo-sensu-ams-publisher
Summary: Bridge from Sensu to the ARGO Messaging system
Conflicts: argo-nagios-ams-publisher

BuildRequires: python3-devel
Requires: sensu-go-backend
Requires: python3-argo-ams-library
Requires: python3-avro
Requires: python3-dirq
Requires: python3-messaging
Requires: python36-pytz
Requires(post): systemd
Requires(preun): systemd
Requires(postun): systemd

%preun
%systemd_preun ams-publisher.service
%description -n argo-sensu-ams-publisher
Bridge from Sensu to the ARGO Messaging system

%files -n argo-sensu-ams-publisher
%defattr(-,root,root,-)
%{_bindir}/ams-alarm-to-queue
%{_bindir}/ams-metric-to-queue
%{_bindir}/ams-publisherd
%config(noreplace) %{_sysconfdir}/ams-publisher/ams-publisher-sensu.conf
%config(noreplace) %{_sysconfdir}/ams-publisher/metric_data.avsc
%dir %{python3_sitelib}/ams_publisher
%{python3_sitelib}/ams_publisher/*.py
%{python3_sitelib}/ams_publisher/__pycache__/
%{python3_sitelib}/*.egg-info
%{_unitdir}/ams-publisher-sensu.service
%defattr(-,sensu,sensu,-)
%dir %{_localstatedir}/log/ams-publisher/
%dir %{_localstatedir}/spool/ams-publisher/

%post -n argo-sensu-ams-publisher
%systemd_postun_with_restart ams-publisher-sensu.service

%preun -n argo-sensu-ams-publisher
%systemd_preun ams-publisher-sensu.service

%pre
if ! /usr/bin/id nagios &>/dev/null; then
/usr/sbin/useradd -r -m -d /var/log/nagios -s /bin/sh -c "nagios" nagios || \
logger -t nagios/rpm "Unexpected error adding user \"nagios\". Aborting installation."
fi
if ! /usr/bin/getent group nagiocmd &>/dev/null; then
/usr/sbin/groupadd nagiocmd &>/dev/null || \
logger -t nagios/rpm "Unexpected error adding group \"nagiocmd\". Aborting installation."
fi

%clean
rm -rf $RPM_BUILD_ROOT

%changelog
* Thu Sep 1 2022 Daniel Vrcic <dvrcic@srce.hr> - 0.4.0-1%{?dist}
- ARGO-3754 Build two RPMS, Nagios and Sensu with appropriate runtime permission settings
- ARGO-3825 List requires explicitly for each ams-publisher package
* Mon Feb 1 2021 Daniel Vrcic <dvrcic@srce.hr> - 0.3.9-1%{?dist}
- ARGO-2855 ams-publisher py3 switch
- ARGO-2929 Let systemd handle runtime directory
2 changes: 1 addition & 1 deletion bin/ams-alarm-to-queue
@@ -1,5 +1,5 @@
#!/usr/bin/python3

from argo_nagios_ams_publisher import alarmtoqueue
from ams_publisher import alarmtoqueue

alarmtoqueue.main()
2 changes: 1 addition & 1 deletion bin/ams-metric-to-queue
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
#!/usr/bin/python3

from argo_nagios_ams_publisher import metrictoqueue
from ams_publisher import metrictoqueue

metrictoqueue.main()
10 changes: 5 additions & 5 deletions bin/ams-publisherd
@@ -5,11 +5,11 @@ import pwd
import signal
import sys

from argo_nagios_ams_publisher.config import parse_config
from argo_nagios_ams_publisher.log import Logger
from argo_nagios_ams_publisher.run import init_dirq_consume
from argo_nagios_ams_publisher.stats import query_stats, setup_statssocket
from argo_nagios_ams_publisher.shared import Shared
from ams_publisher.config import parse_config
from ams_publisher.log import Logger
from ams_publisher.run import init_dirq_consume
from ams_publisher.stats import query_stats, setup_statssocket
from ams_publisher.shared import Shared


def get_userids(user):