Skip to content

Restic backup

Was asked quite a few times how I do my backups with Restic.

For more than 10 years I was using "Duplicity" for backups, but in 2019 I changed to Restic. The main reason for the change was that Duplicity still can't handle "Big Data", as in: larger directories. In 2009 someone opened an issue on the Duplicity bugtracker, and this problem still exists as of today. For about two years I was shifting around the problem, excluding files, trying to make the sigfile smaller. But at some point I decided that it is enough and I need to change the tool.

Duplicity knows two backup modes: "full backup" and "incremental backup". Once in a while you take a full backup, and then you add incremental backups to that full backup. In order to restore a certain backup you need the full backup and the incremental backups. Therefore my go-to mode was to always have two full backups and a couple incremental backups in-between. Even if something goes wrong with the latest full backup, I can still go back to the previous full backup (of course with some changes lost, but that's still better than nothing). When taking a new full backup, the oldest one is only deleted when the new one is completed. Accordingly when a new incremental backup is created, it's a new set of files. Removing the backup removes all the files from this incremental backup. That worked well, but needed scheduling. Over time I wrote a wrapper script around Duplicity, which did schedule new full and incremental backups.

Restic works in a different way. There is no concept of "full backup" and "incremental backup". Basically every backup is a full backup, and Restic figures out which files changed, got deleted, or added. Also it does deduplication: if files are moved around, or appear multiple times, they are not added multiple times into the backup. Deduplication is something which Duplicity can't do. But because Restic can do deduplication, there is no common set of files which belong to a single snapshot. Data blobs from one backup can stay in the repository forever, removing snapshots might not remove any files at all.

Restic on the other hand needs "prune" to remove old data. A snapshot can be removed according to the policy specified, but this does not remove the data from the backup directory. A "prune" run will go over the data and remove any block which is no longer needed.

My first question - after figuring out which other backup tool to use: shall I replicate the wrapper script, or try something else? Given that the backup doesn't need complex scheduling, I decided against writing a complex wrapper. And since I am now deploying all devices with Ansible, I decided to integrate this into my Playbooks, and deploy a set of shell scripts. The goal was to have a small number of dedicated scripts doing the daily backup work, and another set of "helper" scripts which I can use to inspect the backup, modify it, or restore something.

My main goals for this: "small number of programs/scripts" (Unix style: each tool does one job), "rapid development" (don't spend weeks writing another scheduler), "rapid deployment" (re-run Playbooks and let Ansible deploy this to all devices).

 

Continue reading "Restic backup"

Restic upgrade on Debian Buster

A while ago I switched backups from "Duplicity" to "Restic". About time: I was using Duplicity for many years (I think I started using it around 2010, long before "Restic" became available) and it served me well. But recently I ran into more and more issues, especially with archives getting larger and larger. There is an 11 years old open bug in the Duplicity bugtracker, which describes a showstopper for backing up larger archives. And it doesn't look like this will be solved anytime soon. Therefore it was time for something new.

Since I'm rolling out my backups with Ansible, it was relatively easy to create a set of scripts for Restic which use almost the same infrastructure as the old Duplicity backups. That works as expected on all our laptops. But the Raspberry Pi, which does the fileserver backups, seem to had a problem. Backups took way longer than before, jumped from 30-60 minutes (depending on the amount of changes) to constantly around 10 hours.

After some investigation (means: --verbose --verbose --verbose debugging), it turns out that Restic identifies most of the files as new, even though they did not change at all. Some background information: the Raspberry mounts the QNAP fileserver using the SMB3 protocol. The "mount -t cifs" uses the "serverino" option, but apparently that is not enough to provide a stable inode number. And if the inode for a file changes, Restic assumes it is a new file.

On the bright side, because the content of the files do not change, the deduplication still works, and no additional content is added to the backup. The size of the backup does not increase. Still, Restic fetches all the data from the server, and that takes a long time.

 

Continue reading "Restic upgrade on Debian Buster"