After installing Telegraf and hooking everything up to InfluxDB, I was still missing the status of my backups. Every system here creates encrypted backups every night and stores them on a central NAS and off-site. But I want to see statistics about the backups, and notice if something is not working.
I’m using Restic for the backups (I will blog about this another time). However, Telegraf does not support Restic directly, so I need a few workarounds. This blog post is not really about monitoring the backups, though, but about how to write your own plugin for Telegraf.
Telegraf allows you to include external plugins using the [Exec Input Plugin](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/exec). The theory is easy:
- Telegraf will read the `STDOUT` coming from the plugin
- The plugin can send data in different formats (`JSON`, or InfluxDB line protocol)
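To make the contract concrete, here is a minimal sketch of such a plugin that emits JSON. The function name and the metric values are made up for illustration; a real plugin would collect them from the backup tool.

```python
#!/usr/bin/env python3
import json
import sys


def collect_json() -> str:
    """Return one JSON document with numeric metrics."""
    # Hypothetical placeholder values, not real backup data.
    metrics = {"snapshots": 3, "size_bytes": 1048576}
    return json.dumps(metrics)


def main() -> None:
    # Telegraf only reads what the script writes to STDOUT.
    sys.stdout.write(collect_json() + "\n")


if __name__ == "__main__":
    main()
```

Diagnostics are better sent to `STDERR`, so they do not get mixed into the data Telegraf parses.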
Since I’m using InfluxDB, I opted for the line protocol. Every data set is a single line:
- The first word is the measurement (`backup` in my case) - this ends up being the table name in InfluxDB
- A comma (not a space: the tags attach directly to the measurement)
- The second set are the tags, in `key=value` format; multiple tags are comma-separated
- A space
- The third set is the actual data, again in `key=value` format
- A space
- The timestamp in nanoseconds (usually it’s ok to take the Unix timestamp, the Epoch, and multiply it by `1000000000`)
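A small sketch of how such a line could be assembled in Python. Note that in line protocol the tag set is joined to the measurement with a comma, and only the field set and the timestamp are separated by spaces. The helper name `format_line` and the tag and field names are assumptions for illustration, and the sketch does no escaping, so it assumes keys and values contain no spaces, commas or equals signs.

```python
import time


def format_line(measurement, tags, fields, timestamp_ns):
    """Build one line: measurement,tag_set field_set timestamp."""
    # Tags are sorted by key, which InfluxDB recommends for performance.
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={v}" for k, v in fields.items())
    # Without tags, the measurement stands alone before the first space.
    head = f"{measurement},{tag_part}" if tag_part else measurement
    return f"{head} {field_part} {timestamp_ns}"


# Unix timestamp in seconds, scaled to nanoseconds.
timestamp_ns = int(time.time()) * 1_000_000_000

# Hypothetical tags and fields; field values are already formatted strings.
print(format_line("backup", {"host": "nas"}, {"snapshots": "12i"}, timestamp_ns))
```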
The data field uses different data types, and the value is followed by a type indicator:
- Float (the default):
Even if the data field looks like an integer but has no `i` suffix, InfluxDB will assume it’s a floating point number.
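As a sketch of how a script might pick the right notation, the hypothetical helper below maps Python types to line protocol field values: integers get the `i` suffix, floats are written bare, strings are double-quoted, and booleans become `true` or `false`. It is an illustration, not the code from my script.

```python
def format_field_value(value):
    """Render a Python value as a line protocol field value."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    if isinstance(value, str):
        # Backslashes and double quotes must be escaped inside string fields.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    raise TypeError(f"unsupported field type: {type(value).__name__}")


print(format_field_value(42))    # integer, written with the i suffix
print(format_field_value(42.0))  # float, written without a suffix
```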
This is coded into a Python script, in `/usr/bin/restic-telegraf.py`. Now this must be hooked up into Telegraf.
As mentioned in my previous blog post about Telegraf, the Playbook I’m using can re-create the `telegraf.conf` from scratch, and it configures the Input and Output plugins. I have to add the `exec` Input plugin to the list in `telegraf-config.yml`, and I introduce a new flag `monitor_backup`, because multiple scripts can use the `exec` plugin.
The `exec` plugin uses the `[[inputs.exec]]` section in the configuration; however, the `command` list is pre-populated with 3 demo scripts. That’s a problem, because the Ansible `ini_file` module can’t replace this and will produce an invalid config file. I’m already loading the content of the config file into `$etc_telegraf_telegraf_conf`, and use this to check if the demo scripts are still in the file. If that is true, I replace the entire `command` section with the `replace` module before moving on to the actual configuration.
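The Playbook does this with Ansible, but the logic is easy to illustrate in Python. The sketch below is not the Ansible task: it assumes the demo entries look like the ones in Telegraf's sample configuration (`/tmp/test.sh` and friends) and that the list is not commented out. The function name and the marker constant are hypothetical.

```python
import re

# Assumption: one of the demo scripts from Telegraf's sample configuration.
DEMO_MARKER = "/tmp/test.sh"


def replace_demo_commands(conf_text, script_path):
    """Replace the multi-line demo command list with a single script."""
    if DEMO_MARKER not in conf_text:
        # Demo scripts are already gone, leave the file untouched.
        return conf_text
    new_list = f'commands = ["{script_path}"]'
    # DOTALL lets ".*?" span the line breaks inside the list.
    pattern = r"commands\s*=\s*\[.*?\]"
    return re.sub(pattern, lambda m: new_list, conf_text, count=1, flags=re.DOTALL)


sample = '''[[inputs.exec]]
  commands = [
    "/tmp/test.sh",
    "/usr/bin/mycollector --foo=bar",
    "/tmp/collect_*.sh"
  ]
'''
print(replace_demo_commands(sample, "/usr/bin/restic-telegraf.py"))
```

Checking for the marker first keeps the operation idempotent: a second run finds no demo scripts and changes nothing.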
After that problem is moved out of the way, I can configure the plugin:
The format is changed to `influx`, and I increase the interval time to `5 minutes`: I don’t really need to check the backup status more often than that.
The entire block is wrapped into a `when` condition which checks if the backup monitoring is enabled.
That’s it, the backup status data is now fed into InfluxDB.