After installing Telegraf and hooking everything up to InfluxDB, I was still missing the status of my backups. Every system here creates encrypted backups every night and stores them on a central NAS and off-site. But I want to see statistics about the backups, and notice when something is not working.
I’m using Restic for the backups (will blog about this another time). However, Telegraf does not support Restic directly, so I need a few workarounds. This blog post is not directly about monitoring the backups, though, but about how to write your own plugin for Telegraf.
Telegraf allows you to include external plugins using the [Exec Input Plugin](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/exec). The theory is easy:
- Telegraf will read the `STDOUT` coming from the plugin
- The plugin can send data in different formats (`CSV`, `JSON`, or InfluxDB line protocol)
Since I’m using InfluxDB, I opted for the line protocol. Every data set is a single line:
- The first word is the measurement (`backup` in my case); this ends up being the table name in InfluxDB
- A comma (not a space) if tags follow
- The second set are the tags, in `key=value` format; multiple tags are comma-separated (example: `host=sunlight,name=home`)
- A space
- The third set is the actual data, again in `key=value` format (example: `size=123456i,status=true,age=98765i`)
- A space
- The timestamp in nanoseconds (usually it’s ok to take the Unix timestamp in seconds and multiply it by `1000000000`)
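To make the layout concrete, here is a minimal sketch of how such a line could be assembled in Python. The function name `build_line` is my own choice for illustration, the field values are expected to be formatted already, and no escaping of special characters (spaces, commas, equal signs) is attempted:

```python
import time


def build_line(measurement, tags, fields, timestamp_ns=None):
    """Assemble one line of InfluxDB line protocol (illustrative helper).

    tags:   dict of tag key -> tag value (plain strings, no escaping done)
    fields: dict of field key -> already formatted value, e.g. "123456i"
    """
    if timestamp_ns is None:
        # time.time_ns() returns the Unix time in nanoseconds (Python 3.7+)
        timestamp_ns = time.time_ns()
    # Tags are attached to the measurement with a comma, not a space
    tag_part = ",".join(f"{key}={value}" for key, value in sorted(tags.items()))
    field_part = ",".join(f"{key}={value}" for key, value in fields.items())
    head = f"{measurement},{tag_part}" if tag_part else measurement
    # Measurement+tags, fields, and timestamp are separated by single spaces
    return f"{head} {field_part} {timestamp_ns}"


if __name__ == "__main__":
    print(build_line(
        "backup",
        {"host": "sunlight", "name": "home"},
        {"size": "123456i", "status": "true", "age": "98765i"},
        1556813561 * 1000000000,  # Unix timestamp in seconds, times 10^9
    ))
```

The tags are sorted by key here, which is optional but keeps the output stable between runs.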
The data fields use different data types, and the type is determined by how the value is written:
- Integer: `size=123456i`
- Float (the default): `temperature=27`
- String: `path="/home"` (string values must be enclosed in double quotes)
- Boolean: `status=true` or `status=false`
Even if a field value looks like an integer, without the `i` suffix InfluxDB will assume it’s a floating point number.
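The type rules map quite naturally onto Python types. The following is only a sketch: `format_field_value` is a name I made up for this example, and special float values such as NaN or infinity are not handled:

```python
def format_field_value(value):
    """Render a Python value as a line protocol field value (illustrative)."""
    # bool must be checked before int, because bool is a subclass of int
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        # The trailing "i" marks the value as an integer
        return f"{value}i"
    if isinstance(value, float):
        # No suffix: InfluxDB treats the value as a float
        return repr(value)
    if isinstance(value, str):
        # Strings are double-quoted; backslashes and quotes get escaped
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    raise TypeError(f"unsupported field type: {type(value).__name__}")


if __name__ == "__main__":
    for sample in (123456, 27.0, "/home", True, False):
        print(repr(sample), "->", format_field_value(sample))
```

Keeping the type consistent matters: once a field has been written as a float, later writes of the same field as an integer are likely to be rejected, so it is worth deciding on the type up front.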
This is coded into a Python script, in `/usr/bin/restic-telegraf.py`. Now this must be hooked up into Telegraf.
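The real script is not shown here, so the following is just a rough skeleton of what such a plugin script might look like. `collect_backups` is a hypothetical stand-in that returns fixed sample data instead of asking Restic, and the field names are taken from the examples above:

```python
import sys
import time


def collect_backups():
    """Hypothetical stand-in for querying Restic; returns fixed sample data."""
    return [
        {"host": "sunlight", "name": "home",
         "size": 123456, "status": True, "age": 98765},
    ]


def render(backup, timestamp_ns):
    """Turn one backup record into a single line protocol line."""
    status = "true" if backup["status"] else "false"
    return (
        f"backup,host={backup['host']},name={backup['name']} "
        f"size={backup['size']}i,status={status},age={backup['age']}i "
        f"{timestamp_ns}"
    )


def main(out=sys.stdout):
    now = time.time_ns()  # one timestamp for all lines of this run
    for backup in collect_backups():
        # Telegraf parses STDOUT, so only data lines are written there
        out.write(render(backup, now) + "\n")
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

Anything that is not a data line (warnings, debug output) should go to `STDERR` instead, so that it does not end up in the output Telegraf tries to parse.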
Telegraf Configuration
As mentioned in my previous blog post about Telegraf, the Playbook I’m using can re-create the `telegraf.conf` from scratch, and it configures the Input and Output plugins. I have to add the `exec` Input plugin to the list in `telegraf-config.yml`, and I introduce a new flag `monitor_backup`, because multiple scripts can use the `exec` plugin.
The `exec` plugin uses the `[[inputs.exec]]` section in the configuration, however the `commands` list is pre-populated with 3 demo scripts. That’s a problem, because the Ansible `ini_file` module can’t replace this, and would produce an invalid config file. I’m already loading the content of the config file into `$etc_telegraf_telegraf_conf`, and use this to check if the demo scripts are still in the file. If that is true, I replace the entire `commands` section with the `replace` module before moving on to the actual configuration.
After that problem is out of the way, I can configure the plugin. The data format is changed to `influx`, and I increase the interval to `5 minutes`: I don’t really need to check the backup every `10 seconds`.
The entire block is wrapped into a `when` condition which checks if the backup monitoring is enabled.
That’s it, the backup status data is now fed into InfluxDB.