A collectd plugin that checks whether the given systemd services are in the
"running" state and sends Graphite metrics with a value of 1.0 (running) or
0.0 (not running).
The plugin is particularly useful together with Grafana's alerting.
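The check described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual source: the function names are hypothetical, and it assumes that the "running" state corresponds to the SubState property that systemd exposes over D-Bus, and that a bare service name such as sshd refers to the unit sshd.service.

```python
def gauge_for_state(sub_state):
    """Map a systemd state string to the metric value: 1.0 only for "running"."""
    return 1.0 if sub_state == "running" else 0.0


def read_sub_state(service_name):
    """Ask systemd over D-Bus for a unit's SubState (for example "running" or "dead")."""
    # Imported lazily so gauge_for_state() stays usable without the D-Bus bindings.
    import dbus

    # Assumption: names without a suffix refer to .service units.
    unit_name = service_name if "." in service_name else service_name + ".service"

    bus = dbus.SystemBus()
    systemd = bus.get_object("org.freedesktop.systemd1", "/org/freedesktop/systemd1")
    manager = dbus.Interface(systemd, "org.freedesktop.systemd1.Manager")
    unit_path = manager.LoadUnit(unit_name)
    unit = bus.get_object("org.freedesktop.systemd1", unit_path)
    props = dbus.Interface(unit, "org.freedesktop.DBus.Properties")
    return str(props.Get("org.freedesktop.systemd1.Unit", "SubState"))
```

On a host with systemd and the D-Bus bindings installed, gauge_for_state(read_sub_state("sshd")) would give the value to report for sshd.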
Make sure the Python dbus bindings are installed on your system:
- Debian/Ubuntu:
sudo apt-get install python-dbus
- Fedora/CentOS:
sudo yum install dbus-python
Copy collectd_systemd.py to the collectd Python plugin directory
(usually /usr/lib64/collectd/python/ or /usr/lib/collectd/python/).
Add the following snippet to /etc/collectd.conf:
    LoadPlugin python
    <Plugin python>
        ModulePath "/usr/lib64/collectd/python"
        Import "collectd_systemd"
        <Module collectd_systemd>
            Service sshd nginx postgresql
        </Module>
    </Plugin>
Restart the collectd daemon and open the Grafana web UI. Add a new graph with the following query:
aliasSub(collectd.*.systemd-*.gauge-running, '.+systemd-(.+)\..+', '\1')
You should see all configured systemd services in the graph. Now it is
enough to add an alert for values lower than 1.0 to be paged when
services are down.
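To see what the aliasSub call in the query does to a series name, the same pattern and replacement can be tried with Python's re module. This is only a sketch: the host name web01 is made up, and the metric path is assumed to follow the collectd.<host>.systemd-<service>.gauge-running layout implied by the query.

```python
import re

# Pattern and replacement copied from the aliasSub() call in the query.
PATTERN = r'.+systemd-(.+)\..+'
REPLACEMENT = r'\1'


def alias_for(metric_path):
    """Return the graph legend that the substitution produces for a metric path."""
    return re.sub(PATTERN, REPLACEMENT, metric_path)


# "web01" is a hypothetical host name used only for illustration.
print(alias_for("collectd.web01.systemd-sshd.gauge-running"))  # prints "sshd"
```

Each series is therefore labelled with just the service name, which keeps the graph legend and alert messages readable.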
The following configuration options are supported:
- Service: one or more systemd services to monitor. Separate multiple services with spaces.
- Interval: check interval. It's OK to keep the default (60 seconds).
- Verbose: enable verbose logging (off by default).
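For readers curious how options like these typically reach a collectd Python plugin: collectd hands the plugin's config callback a tree of nodes, each with a key and a tuple of values. The sketch below shows one way the options listed above could be read from such nodes. It is an illustration under assumptions, not the plugin's code: Node is a hypothetical stand-in for collectd's config objects, and parse_options is an invented name.

```python
from collections import namedtuple

# Hypothetical stand-in for the nodes collectd passes to a config callback;
# only the two attributes used here (key, values) are modelled.
Node = namedtuple("Node", ["key", "values"])


def parse_options(children):
    """Collect the Service, Interval and Verbose settings into a dict."""
    # Defaults follow the option list: 60-second interval, verbose logging off.
    options = {"services": [], "interval": 60.0, "verbose": False}
    for node in children:
        key = node.key.lower()
        if key == "service":
            # "Service sshd nginx" arrives as several values on one node.
            options["services"].extend(str(v) for v in node.values)
        elif key == "interval":
            options["interval"] = float(node.values[0])
        elif key == "verbose":
            value = node.values[0]
            options["verbose"] = value is True or str(value).lower() in ("true", "on", "yes")
    return options


example = [Node("Service", ("sshd", "nginx", "postgresql")), Node("Verbose", (True,))]
print(parse_options(example))
```

Keeping the parsing in a plain function like this makes it testable outside collectd, since the collectd module itself is only importable inside the daemon.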
Install tox using pip or your Linux package manager.
Type tox to run the tests.