gmuslera 6 hours ago

One thing that worked for me, with one particularly big database, was to use borg instead of restic. Most of the database was historical data that rarely changes, so each mysqldump file is almost identical to the previous one except for the new and recently modified rows. That is where borg's deduplication and compression shine: the new dump shares most of its blocks with the old one, so I could keep several days of backups while using very little extra space in the borg repository. I could then rclone that borg repository to S3 with the Intelligent-Tiering storage class, which let me keep long-term backups cheaply, since most of the data ends up in Glacier-tier storage transparently.

Of course, it is not a general solution, but knowing what the data is and how it changes may let you take more efficient approaches than what is usually recommended.
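
Roughly, the whole pipeline is just a few commands. Here is a sketch in Python (paths, names and flags are placeholders, and the rclone remote is assumed to be configured for the S3 Intelligent-Tiering storage class):

    import subprocess
    from datetime import date

    # Placeholder names -- adjust to your environment.
    DB_NAME = "mydb"
    DUMP_PATH = "/backups/mydb.sql"
    BORG_REPO = "/backups/borg-repo"                # created earlier with: borg init --encryption=repokey ...
    RCLONE_REMOTE = "s3remote:my-bucket/borg-repo"  # rclone remote set to the Intelligent-Tiering storage class

    def run(cmd, **kwargs):
        return subprocess.run(cmd, check=True, **kwargs)

    # 1. Dump the database to a plain SQL file. Since most of the data is
    #    historical, consecutive dumps are nearly identical.
    with open(DUMP_PATH, "w") as f:
        run(["mysqldump", "--single-transaction", DB_NAME], stdout=f)

    # 2. Archive the dump in the borg repo; deduplication stores only the
    #    blocks that differ from previous archives, with compression on top.
    run(["borg", "create", "--compression", "zstd",
         BORG_REPO + "::mydb-" + date.today().isoformat(), DUMP_PATH])

    # 3. Mirror the repo to S3; rarely-touched chunks drift down to colder
    #    tiers on their own under Intelligent-Tiering.
    run(["rclone", "sync", BORG_REPO, RCLONE_REMOTE])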

  • conception 5 hours ago

    It was kopia to b2 for me. It takes about 6 minutes to do an incremental backup of a few terabytes on a remote share (there's no filesystem snapshot diffing available, so it's a raw scan every time).

timwis 3 hours ago

Thanks for sharing - definitely learnt a few things reading it!

sam_goody 2 hours ago

1. There are tools designed specifically for backing up database data, such as Percona XtraBackup for MySQL and pgBackRest for Postgres.

2. If you use mysqldump, it may make sense to export the data as CSV and back up the DB structure and the data separately.

CSV is obviously less compact than a binary export, but on the other hand it contains only the data, and comparing the old and new CSV exports is much simpler - and it is human-readable/diffable, if that matters to you.

To the point that I had a script doing a daily export/commit-to-git/push as a secondary DB backup, and found it quite efficient and useful.
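
Something along these lines, as a rough Python sketch (database name, repo path and file layout are placeholders; it assumes mysql/mysqldump pick up credentials from ~/.my.cnf and that the git repo already exists with a remote; the data comes out tab-separated rather than comma-separated, which diffs just as well):

    import subprocess
    from pathlib import Path

    # Placeholder names -- adjust to your setup.
    DB = "mydb"
    REPO = Path("/backups/mydb-git")   # an existing git repo with a remote configured

    def run(cmd, **kwargs):
        return subprocess.run(cmd, check=True, **kwargs)

    # 1. Structure only, in one file.
    with open(REPO / "schema.sql", "w") as f:
        run(["mysqldump", "--no-data", DB], stdout=f)

    # 2. Data per table as tab-separated text.
    #    -N drops column headers, -B gives plain tab-separated output.
    tables = run(["mysql", "-N", "-B", "-e", "SHOW TABLES", DB],
                 capture_output=True, text=True).stdout.split()
    for table in tables:
        with open(REPO / f"{table}.tsv", "w") as f:
            run(["mysql", "-N", "-B", "-e", f"SELECT * FROM `{table}`", DB], stdout=f)

    # 3. Commit and push; git stores only the lines that changed since the last export.
    run(["git", "-C", str(REPO), "add", "-A"])
    # commit exits non-zero when nothing changed, so don't treat that as fatal
    subprocess.run(["git", "-C", str(REPO), "commit", "-m", "nightly export"])
    run(["git", "-C", str(REPO), "push"])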

Of course, that can be combined with restic, but worth knowing.