
QNAP: exclude files and directories from rsync

I'm moving files from one QNAP system to another, and I'm using rsync for this. It's preinstalled on a QNAP system. So far, so good.

To sync entire shared volumes, I want to exclude the '@Recently-Snapshot' and '@Recycle' entries - I don't want to sync the trash bin and I also don't want to sync entire snapshots.

The usual approach when using rsync is to just use the --exclude option.

rsync --exclude='@Recently-Snapshot' --exclude='@Recycle'

To my surprise, this does not work. rsync on the QNAP does not complain, but it also does not skip the entries. Escaping the '@' doesn't help either. Ok, which rsync version is this anyway?

[~] # rsync --version
rsync  version 3.0.7  protocol version 30
Copyright (C) 1996-2009 by Andrew Tridgell, Wayne Davison, and others.

Ouch, that is old. Very old. Released December 2009. Pretty sure QNAP did not fix all the bugs in there.

But according to the documentation, and the --help output, it accepts the --exclude option. Still not working though.

Ok, there is one more option: --exclude-from

I create a text file and add the two entries in there:

@Recently-Snapshot
@Recycle

And then I use the --exclude-from option to skip entries in these two directories:

rsync --exclude-from=/tmp/pattern.txt

This option works. At least something.
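For reference, a complete invocation could look roughly like this - the source and destination paths are placeholders here, not my actual setup:

rsync -av --exclude-from=/tmp/pattern.txt /share/MyShare/ admin@other-nas:/share/MyShare/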

Summary: The rsync on a QNAP system silently ignores the --exclude parameter, but the --exclude-from parameter works.

And for everyone asking why I don't use the integrated file copy: it skips certain files, but I want my dot files copied over as well.

PGSQL Phriday #002: PostgreSQL Backup and Restore

Thanks to Ryan Booz we now have the #PGSQLPhriday blogging series. The second edition is about backups. And of course also about restore.

Backups? Do you need that?

If your data is not important, you certainly don't need to bother with backups. Then again, why do you even store the data in the first place?

For anyone else, especially everyone who runs business critical or mission critical applications, having a backup is important. Equally important, but sometimes ignored, is the ability to restore the backup.

The requirements for backups are manifold:

  • How quickly must you be able to restore the backup (SLA)?
  • Do you need to be able to restore every transaction (catch all changes) or is a snapshot (backup frequency: hourly, daily, weekly, monthly) sufficient?
  • How can the backup integrity be verified? 
  • Where do you store the backup, and how do you make sure that a disaster (for example: loss of a data center) does not affect the backup?
  • Who can restore the backup, what is the plan for that?
  • ...

Your #PGSQLPhriday task

Describe how you do backups for your PostgreSQL databases.

Which tool(s) are you using, where do you store backups, how often do you do backups?
Are there any recommendations you can give the reader how to improve their backups?
Any lesser known features in your favorite backup tool?
Any new and cool features in a recently released version?

Bonus question: Is pg_dump a backup tool?

Continue reading "PGSQL Phriday #002: PostgreSQL Backup and Restore"

Changes to the public schema in PostgreSQL 15 and how to handle upgrades

In September 2021, a patch for the upcoming PostgreSQL version 15 was committed which introduces a visible change for users: the CREATE privilege on the public schema is no longer granted to all users by default. This follows a recommendation from CVE-2018-1058.

What does that mean for the average (non-superuser) user?

In PostgreSQL 14:

postgres=# SELECT version();
                                                version                                                
-------------------------------------------------------------------------------------------------------
 PostgreSQL 14.5 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, 64-bit
(1 row)

postgres=# CREATE ROLE unprivileged WITH LOGIN;
CREATE ROLE

postgres=# CREATE DATABASE priv_test;
CREATE DATABASE

postgres=# \c priv_test 
You are now connected to database "priv_test" as user "ads".

priv_test=# \dn+ public
                       List of schemas
  Name  | Owner | Access privileges |      Description       
--------+-------+-------------------+------------------------
 public | ads   | ads=UC/ads       +| standard public schema
        |       | =UC/ads           | 
(1 row)

We see that PUBLIC (the second line in Access privileges) has USAGE (U) and CREATE (C) privileges. A regular user can create a table in the public schema:

priv_test=# SET SESSION ROLE unprivileged;
SET
priv_test=> SHOW search_path;
   search_path   
-----------------
 "$user", public
(1 row)

priv_test=> CREATE TABLE priv_test (id INT);
CREATE TABLE
priv_test=> \dp priv_test 
                               Access privileges
 Schema |   Name    | Type  | Access privileges | Column privileges | Policies 
--------+-----------+-------+-------------------+-------------------+----------
 public | priv_test | table |                   |                   | 
(1 row)

 

This is how the public schema privileges look in PostgreSQL 15:

postgres=# SELECT version();
                                                 version                                                  
----------------------------------------------------------------------------------------------------------
 PostgreSQL 15beta3 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, 64-bit
(1 row)

postgres=# CREATE ROLE unprivileged WITH LOGIN;
CREATE ROLE

postgres=# CREATE DATABASE priv_test;
CREATE DATABASE

postgres=# \c priv_test 
You are now connected to database "priv_test" as user "ads".

priv_test=# \dn+ public
                                       List of schemas
  Name  |       Owner       |           Access privileges            |      Description       
--------+-------------------+----------------------------------------+------------------------
 public | pg_database_owner | pg_database_owner=UC/pg_database_owner+| standard public schema
        |                   | =U/pg_database_owner                   | 
(1 row)

The C (CREATE) is missing. And CREATE TABLE for a regular user no longer works by default:

priv_test=# SET SESSION ROLE unprivileged;
SET
priv_test=> SHOW search_path;
   search_path   
-----------------
 "$user", public
(1 row)

priv_test=> CREATE TABLE priv_test (id INT);
ERROR:  permission denied for schema public
LINE 1: CREATE TABLE priv_test (id INT);
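
If an application really needs the old behavior, the privilege can be granted back explicitly. This is only an illustration; handing CREATE back to all users re-opens the attack surface described in CVE-2018-1058:

-- restore the old, permissive behavior (run as the database owner or a superuser)
GRANT CREATE ON SCHEMA public TO PUBLIC;

-- or, more selectively, allow only a specific role to create objects
GRANT CREATE ON SCHEMA public TO unprivileged;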

But how is this change handled in upgrades?

 

 

Continue reading "Changes to the public schema in PostgreSQL 15 and how to handle upgrades"

PostgreSQL Upgrades are hard!

Together with Lætitia Avrot and Nikolay Samokhvalov, I was invited to participate in a Community Panel (YouTube video) about PostgreSQL Upgradability at Postgres Vision 2022. The panel was moderated by Bruce Momjian and initiated and organized by Jimmy Angelakos. Bruce talked with each of us beforehand, which helped a lot to guide the discussion in the right direction. The recording of the panel discussion is available on the Postgres Vision website.

During this panel, each of us provided examples of how easy or complicated PostgreSQL upgrades still are.
 

 

Continue reading "PostgreSQL Upgrades are hard!"

Delete directories recursive on Hetzner Storage Box

Among other external solutions, I store some data on Storage Boxes from Hetzner. The Storage Box allows you to have sub-accounts, so for every server and system storing data there, I use a separate account. For each sub-account, one can select a subdirectory where the data is stored, and the sub-account can then only see this data. The admin account can see all data and all directories.

The usual way I access the Storage Box from other systems is by using the sftp protocol from ssh (don't confuse this with the "other" sftp). That's all good, until I remove a sub-account and want to delete the subdirectory with the data. The server doesn't support "rm -r" for recursive deletion, which means I have to traverse into every directory, delete all files, then delete the empty directories. And the encrypted backup I'm using creates plenty of subdirectories.

Or I find a better tool.
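
Just to illustrate the difference (usernames and paths are made up, and this is not necessarily the tool from the rest of the post): with plain sftp every directory has to be emptied by hand, while a client like lftp understands recursive deletion over the same sftp endpoint:

# plain sftp: empty and remove each directory individually
sftp u123456-sub1@u123456.your-storagebox.de
sftp> rm backup/dir/subdir/*
sftp> rmdir backup/dir/subdir
sftp> rmdir backup/dir

# lftp: recursive delete in one command
lftp sftp://u123456-sub1@u123456.your-storagebox.de -e 'rm -r backup; quit'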

 

Continue reading "Delete directories recursive on Hetzner Storage Box"

Restic backup

I was asked quite a few times how I do my backups with Restic.

For more than 10 years I was using "Duplicity" for backups, but in 2019 I changed to Restic. The main reason for the change was that Duplicity still can't handle "Big Data", as in: larger directories. In 2009 someone opened an issue on the Duplicity bugtracker, and this problem still exists as of today. For about two years I worked around the problem, excluding files, trying to make the sigfile smaller. But at some point I decided that it was enough and I needed to change the tool.

Duplicity knows two backup modes: "full backup" and "incremental backup". Once in a while you take a full backup, and then you add incremental backups on top of that full backup. In order to restore a certain backup, you need the full backup and the incremental backups. Therefore my go-to mode was to always have two full backups and a couple of incremental backups in between. Even if something goes wrong with the latest full backup, I can still go back to the previous full backup (of course with some changes lost, but that's still better than nothing). When taking a new full backup, the oldest one is only deleted once the new one is complete. Accordingly, each new incremental backup is a new set of files, and removing a backup removes all the files belonging to that incremental backup. That worked well, but needed scheduling. Over time I wrote a wrapper script around Duplicity, which scheduled new full and incremental backups.
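
As a rough sketch (the target URL and paths are placeholders, not my actual configuration), the scheduling described above boils down to commands like these:

# once in a while: a new full backup
duplicity full /home sftp://backup@storage.example.com/duplicity/home

# in between: incremental backups on top of the latest full backup
duplicity incremental /home sftp://backup@storage.example.com/duplicity/home

# after a new full backup completed: keep only the last two full chains
duplicity remove-all-but-n-full 2 --force sftp://backup@storage.example.com/duplicity/home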

Restic works in a different way. There is no concept of "full backup" and "incremental backup". Basically, every backup is a full backup, and Restic figures out which files were changed, deleted, or added. It also does deduplication: if files are moved around, or appear multiple times, they are not added to the backup multiple times. Deduplication is something which Duplicity can't do. But because Restic does deduplication, there is no common set of files which belongs to a single snapshot. Data blobs from one backup can stay in the repository forever; removing snapshots might not remove any files at all.

Restic on the other hand needs "prune" to remove old data. A snapshot can be removed according to the policy specified, but this does not remove the data from the backup directory. A "prune" run will go over the data and remove any block which is no longer needed.
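
As a sketch (the retention values are placeholders, and repository and password are assumed to come from the environment), removing old data is a two-step affair, or one step with --prune:

# forget snapshots according to a retention policy ...
restic forget --keep-daily 7 --keep-weekly 5 --keep-monthly 12

# ... then actually remove the data blobs which are no longer referenced
restic prune

# or both in one go
restic forget --keep-daily 7 --keep-weekly 5 --keep-monthly 12 --prune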

My first question - after figuring out which other backup tool to use - was: shall I replicate the wrapper script, or try something else? Given that the backup doesn't need complex scheduling, I decided against writing a complex wrapper. And since I am now deploying all devices with Ansible, I decided to integrate this into my Playbooks and deploy a set of shell scripts. The goal was to have a small number of dedicated scripts doing the daily backup work, and another set of "helper" scripts which I can use to inspect the backup, modify it, or restore something.

My main goals for this: "small number of programs/scripts" (Unix style: each tool does one job), "rapid development" (don't spend weeks writing another scheduler), "rapid deployment" (re-run Playbooks and let Ansible deploy this to all devices).
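
A minimal sketch of what one of the dedicated daily scripts can look like - repository, password file and paths are placeholders here, the real scripts are generated and deployed by Ansible:

#!/bin/bash
set -e

export RESTIC_REPOSITORY=sftp:backup@storage.example.com:/restic-repo
export RESTIC_PASSWORD_FILE=/etc/restic/password

# every run creates a full snapshot, Restic deduplicates against existing data
restic backup /home /etc --exclude-file=/etc/restic/excludes.txt

# verify that the repository is still consistent
restic check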

 

Continue reading "Restic backup"

Restic upgrade on Debian Buster

A while ago I switched backups from "Duplicity" to "Restic". About time: I had been using Duplicity for many years (I think I started using it around 2010, long before "Restic" became available) and it served me well. But recently I ran into more and more issues, especially with archives getting larger and larger. There is an 11-year-old open bug in the Duplicity bugtracker which describes a showstopper for backing up larger archives. And it doesn't look like this will be solved anytime soon. Therefore it was time for something new.

Since I'm rolling out my backups with Ansible, it was relatively easy to create a set of scripts for Restic which use almost the same infrastructure as the old Duplicity backups. That works as expected on all our laptops. But the Raspberry Pi, which does the fileserver backups, seemed to have a problem. Backups took way longer than before, jumping from 30-60 minutes (depending on the amount of changes) to a constant 10 hours or so.

After some investigation (meaning: --verbose --verbose --verbose debugging), it turned out that Restic identifies most of the files as new, even though they did not change at all. Some background information: the Raspberry Pi mounts the QNAP fileserver using the SMB3 protocol. The "mount -t cifs" uses the "serverino" option, but apparently that is not enough to provide stable inode numbers. And if the inode of a file changes, Restic assumes it is a new file.

On the bright side, because the content of the files does not change, the deduplication still works, and no additional content is added to the backup. The size of the backup does not increase. Still, Restic fetches all the data from the server, and that takes a long time.
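
For illustration only - this is not necessarily the fix described in the rest of the post - newer Restic versions have a backup option which tells Restic not to treat a changed inode number as a modified file, which is exactly the situation an unstable CIFS inode creates:

# the share is mounted via CIFS, roughly like this (credentials file is a placeholder):
# mount -t cifs //qnap/share /mnt/share -o serverino,credentials=/etc/smb-credentials

# ignore inode changes when deciding whether a file was modified
restic backup /mnt/share --ignore-inode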

 

Continue reading "Restic upgrade on Debian Buster"

How to reset your KDE (without deleting everything else)

I'm a (more or less) happy KDE user, ever since the KDE 3 days. Before that, I used fvwm2 for a long time, but that is a different story. It also happens that I never really reinstalled my home directory - the oldest files I can find are from 1997, and that is pretty much when I switched from an old Slackware system with self-compiled updates to something with a more modern distribution. That means that all the time from 1997 to today, I have carried the same /home/ads across my computers. The home directory grew from a few MB to 133 GB today (maybe I should clean it up, but then again it's cheaper to buy a bigger hard disk).

It also means that I never deleted my KDE config, even when upgrading to KDE 4 or Plasma.

 

Continue reading "How to reset your KDE (without deleting everything else)"

How I do backup of my mobile devices

Recently I was asked how I do backups of my mobile devices. The discussion started when I mentioned that I "survived" a bricked device without data loss.

Disclaimer: I work (well, was working, before we got spun off into a subsidiary) for EMC. Part of EMC's portfolio is "backup", and for sure I learned a lot from my employer. All my own devices and servers are backed up, some of them multiple times. Backups, devices and communication are encrypted. I'm not using public services (Dropbox, Drive, ...) for the backups.

So, what happened? One Friday afternoon, my Samsung S3 bricked itself. It was connected to the charger on my desk, the display flashed for a moment, and then the device was dead. Reset, reboot, removing the battery - nothing helped. No access to the data on the device.


Continue reading "How I do backup of my mobile devices"

The cloud did it again

In my last posting about problems with centralized services, I overlooked one provider of exactly such services: RIM - Research in Motion. The company that makes the BlackBerry, the device that promises access to your own e-mails from anywhere.

To make this possible, RIM has made itself practically indispensable: the BlackBerry's communication runs through RIM's servers, where the messages are prepared and pushed to the device (push, instead of the otherwise usual fetching/polling of messages).

Now, such a concept has one big weak point: the provider. Too bad if there isn't enough capacity available right there, or some servers fail. Then you are offline, despite all the big promises of the vendor.

The same happened to T-Mobile in the USA, and the same happened to Palm with the super-great backup concept of the Palm Pre. And now, once again, it has hit RIM, and the users of these mobile devices were left holding a piece of rather useless electronic scrap. Luckily, the device also has a mobile phone built in, so at least you can still make calls.

This shows once again that providers of centralized services may want to make good money with them, but customers should steer well clear of such concepts.

Problems in the cloud (aka: Backup in the Cloud)

As a long-time Palm user, I was very pleased to read the announcements about the Palm Pre a while ago. A new device, the looks were good, and on top of that it was supposed to run Linux underneath. What more could you want?

Shortly before the release, rumors about details of the bundled software started to pile up. For example, syncing to a local computer is not provided at all; instead, the Pre backs up all data daily to Palm's servers - in the USA. In addition, the Pre insisted on registering itself with Apple's iTunes software, and it plays very nicely with various Google applications as well as with Facebook.

For a phone for daily use in business life, however, I'm much more interested in how I get to my appointments and e-mails - preferably without an intermediate provider and without the data taking a detour through the States, halfway around the world. I also don't want my business data to sit in a backup somewhere else, where it can be read, searched, and analyzed.

Those are my wishes, and everyone may think differently about the importance and privacy of their data. What really bothers me about this "backup in the cloud", though: if the provider screws up, the data is gone. And you can't even do anything about it yourself (a backup to your own computer is not wanted). Palm is not the exception here, but rather the rule:

Customers of Microsoft and T-Mobile USA, as users of the Sidekick phone, had a similar experience not too long ago. Something went wrong during a server update, and as a result all the data of various customers was gone. The data loss then replicated from the backup servers to the mobile devices. The provider needed two weeks to restore at least part of the data. Meanwhile, both Microsoft and T-Mobile issued press releases saying that they were not responsible at all, but rather the company Danger as the provider of the service. As a customer, at that moment I don't care who is not responsible because the service was outsourced. I simply want my data back - the data that is supposedly so safe in the cloud.

Palm's mandatory backup showed similar problems today: for some users, the data was simply gone after the backup. Who, as a user, do you want to hold liable now? The company Palm is based in the USA. The mandatory backup "service", however, is not part of the contract with the service provider in Germany.

Two questions remain for me:

1) What is a Palm Pre still good for? Given the backup situation, the lack of a local sync, the paternalism towards users, and the only hesitant compliance with the GPL terms, Palm is no longer an option for me.

2) Which phone these days can do things like calendars following the CalDAV standard or proper IMAP, or even offers a terminal for working with ssh?

Copying a Linux installation

If you want to copy a Linux installation from one disk to another, this can be done in a few simple steps. The SuSE support database has a nice entry on this.

In short, here as well:

mkdir /old /new
mount /dev/alte_disk /old
mount /dev/neue_disk /new
cd /old
tar -cSp --numeric-owner --atime-preserve -f - . | ( cd /new && tar -xSpv --atime-preserve -f - )
edit /new/etc/fstab
Don't forget: install lilo or grub on the boot block of the new disk (see the sketch after these steps).
cd /
umount /old
umount /new
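
A sketch of that last bootloader step with grub, done before the umount and not part of the original SuSE entry - device names are placeholders, and lilo users would run lilo with an adjusted /etc/lilo.conf instead:

# make the new installation usable as a chroot
mount --bind /dev /new/dev
mount --bind /proc /new/proc
mount --bind /sys /new/sys

# install grub on the new disk from inside the copied system
chroot /new grub-install /dev/neue_disk

umount /new/sys /new/proc /new/dev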

Backups ... properly

Today I found this little script which does backups on a server:

----- cut -----

#!/bin/bash

cd /home/Backup/

DATE=`/bin/date "+%d.%m.%y"`

WEB="web-($DATE).tar"
SQL="sql-($DATE).tar"

tar -cvf "$WEB" /var/lib/mysql
tar -cvf "$SQL" /var/www

exit 0

----- cut -----

Ok, let me count:

1) no check if the cd fails

2) no compression used

3) no error checking at all for the tar (disk full, as an example)

4) copying the open database files is always a bad idea[tm], mysqldump exists

5) and now check which variable name is used for which backup ...
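
For comparison, a sketch of how those points could be addressed - the paths, the database access and the compression choice are placeholders, not a drop-in replacement:

#!/bin/bash
# abort on any error, unset variable, or failure inside a pipe (points 1 and 3)
set -euo pipefail

BACKUPDIR=/home/Backup
DATE=$(date "+%d.%m.%y")

WEB="$BACKUPDIR/web-$DATE.tar.gz"
SQL="$BACKUPDIR/sql-$DATE.sql.gz"

cd "$BACKUPDIR"

# point 4: dump the database instead of copying open files, point 2: compress the output
mysqldump --all-databases --single-transaction | gzip > "$SQL"

# point 5: the WEB variable now really contains the web data
tar -czf "$WEB" /var/www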