Restic – Backups Done Right (restic.net)
573 points by IceWreck on Nov 13, 2021 | 286 comments



Oh yeah, restic is awesome

  _backup_prepare () {
    export $(sudo cat <protected_credentials_file> | xargs)
  }

  _backup_remove_old_snapshots () {
    restic forget -r <repo_name> --keep-weekly 10
  }

  _backup_verify () {
    restic check -r <repo_name>
  }

  backup () {
    echo "---------------Scheduled backup time---------------"
    echo ""
    _backup_prepare
    restic -r <repo_name> --verbose --exclude="$HOME/snap" --exclude="$HOME/Android" --exclude="$HOME/.android" --exclude="$HOME/ApkProjects" backup ~/
    echo ""
    echo "---------------Backup done, removing old snapshots---------------"
    echo ""
    _backup_remove_old_snapshots
    echo ""
    echo "---------------Old snapshots removed, verifying the data is restorable---------------"
    echo ""
    _backup_verify
    echo ""
    echo "---------------Backup done and verified!---------------"
  }
Then in cron I just have this entry:

  0 20 * * * DISPLAY=:0 kitty -- /bin/zsh --login -c 'source ~/.zshrc; backup; read'
So every time my PC is on at 20:00, a shell window pops up, asks me for the password, and runs the backup :). Since the backups are incremental, it takes maybe 10-15 minutes tops.


Note that `restic check` only verifies that the repository metadata is correct, and doesn't detect, say, bit swaps in actual packfiles, which would render your backup unrestorable. You might be interested in the `--read-data` or `--read-data-subset` flags to help double check your backups!
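
For example, something along these lines should work to spot-check a slice of the actual pack data on each run (a rough sketch; the subset syntax may vary a bit between restic versions, and <repo_name> is the same placeholder as above):

    # verify repository structure, plus actually read back 1/5 of the pack data
    restic check -r <repo_name> --read-data-subset=1/5

and an occasional full `restic check -r <repo_name> --read-data` to verify everything.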


Ooh I just checked this and you're right! Thanks for the heads up, I'll have to incorporate this into the script.

Relevant manual page in case anyone's interested: https://restic.readthedocs.io/en/latest/045_working_with_rep...


Love seeing a code example. It's one thing to hear "restic is fantastic, super easy to set up", it is another to see an example of HOW simple it is. Thank you for sharing.


You might also be interested in how I use restic to back up PostgreSQL and other data onto Backblaze for security and to save cloud costs, as cloud providers charge exorbitant fees for backups[1].

[1] https://abishekmuthian.com/backup-postgresql-to-cloud/


Suggestion: You can use pass to securely store and pipe your password, no superuser required.
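
Something like this, for example (a rough sketch; `backup/restic-repo` is just whatever entry name you gave the password in pass):

    # restic runs this command itself whenever it needs the repo password
    export RESTIC_PASSWORD_COMMAND="pass show backup/restic-repo"
    restic -r <repo_name> backup ~/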


Are you sure you want to remove old backups regardless of whether the backup itself succeeded?
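
E.g. chaining the steps with `&&` (a rough sketch reusing the functions above) would make the forget/check steps run only if the backup itself succeeded:

    backup () {
      _backup_prepare \
        && restic -r <repo_name> --verbose backup ~/ \
        && _backup_remove_old_snapshots \
        && _backup_verify
    }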


I've never seen/used xargs (/export) like:

    export $(sudo cat <protected_credentials_file> | xargs)

Is that the same as sudo cat f | xargs export?

If so, I think an important difference is that mine won't hide the exit code, and you can then answer a sibling commenter (on forgetting without checking) with 'I omitted set -e at the top'.


export is a shell builtin -- it can't be executed by xargs and even if it could, wouldn't be in the context of the shell you want.


True. `sh -c export` then, with `set -o allexport` ;)


As the sibling comment said, it looks like in this example it's used simply to trim whitespace. E.g.

    $ echo '  hello  ' | xargs
    hello


Not sure if sarcasm...


My eyes immediately glazed over seeing the code, tbh. Not what I expected from a super simple backup solution.


The relevant restic part is:

restic -r <repo_name> --verbose --exclude="$HOME/snap" --exclude="$HOME/Android" --exclude="$HOME/.android" --exclude="$HOME/ApkProjects" backup ~/


I never mentioned it's super simple. It's a DIY setup that I stitched together over 1-2 hrs of incremental upgrades, and it's been running like this for months. To me it's simple enough, but YMMV.


Sorry it wasn't meant as a criticism of your work. I just had hoped for something simpler from the "Backups Done Right" headline.

To me if it requires complicated shell commands it is not really "done right".


> complicated

The snippet GP mentioned includes only shell functions, invocations of restic, and echo commands. How could it be simpler?


simple/fun vs. complicated/scary depends on the target user.

Simple for someone who is comfortable with the terminal. Scary for those who are otherwise familiar with GUIs only.


Sure, but if I'm not comfortable with the terminal, I probably shouldn't be too loud with my criticisms of shell scripts.


It just seems a pity that knowledge of shell scripts is necessary to do backups.


Yeah I bet there are good GUI tools out there, but I always want to go for the stuff I can script myself, so I can hook desktop notifications into it and such.

It makes it hard for me to recommend backup tools to the non-technical people in my life, because they're looking for GUI solutions with the corners rounded off, and I want something crunchy and scriptable.
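
As a rough sketch of the kind of hook I mean (assuming a desktop with notify-send available, and the repo/credentials set up as elsewhere in this thread):

    if restic -r <repo_name> backup ~/; then
      notify-send "restic" "Backup finished successfully"
    else
      notify-send -u critical "restic" "Backup FAILED"
    fi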


With a GUI. I think it could also be less error prone if the user interface guides you through the configuration.


You won't find much support for an opinion like that on a site named "hacker" news, haha.

A lot of backup services (tarsnap comes to mind) prioritize scriptability, as I imagine many people run backups from cron or a systemd timer.


Then it’s less scriptable. Many of us would prefer a CLI-based solution for that reason


It doesn't have to be one or the other. The GUI would be for one-off or intermittent usage, and the CLI program would still come in as the primary tool for scripting or recurring use.


Would it really be simpler to have 100 lines of GUI setup and tear down, obscuring the most important part (the commands being run)?


Both sounds good, as opposed to one instead of the other.

GUIs for simple stuff and 99% of the population, CLIs for hackermans who need the advanced stuff.


Depending on the GUI, it seems possible. You can also verify input in a GUI


I'd personally put the threshold for "complicated" at "do you need a keyboard to be able to use it". So it's pretty complicated.


It does seem a bit obtuse. Especially compared with

# tar cvf /dev/st0 .


It's a single shell command.

restic -r reponame backup /path-to-files-to-be-backed-up

Everything else in the mentioned script is optional stuff for scheduling, removing old backups, etc.


It kind of is: simple but effective. For comparison, take a look at my over-engineered backup script: https://github.com/danisztls/arbie/blob/main/arbie


Well, to be fair, yours does a bit more than the simple solution above. If you back up system configs and such with restic, it wouldn't be much shorter than what you do here.


Wow :-)


Restic is a great backup program. I use it (via rclone) to back up my laptop and it truly saved my bacon when I attached my main hard disk to a VM I was playing with and accidentally formatted it!

Restic can use rclone to access cloud providers it can't reach natively, and that's something we worked on together.

I use rclone directly for backing up media and files which don't change much, but restic nails that incremental backup at the cost of no longer mapping one file to one object on the storage which is where rclone shines.
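
For anyone curious, pointing restic at an rclone remote is just a matter of the repository string (a sketch; "mycloud" is whatever name you gave the remote in `rclone config`):

    restic -r rclone:mycloud:restic-repo init
    restic -r rclone:mycloud:restic-repo backup ~/Documents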

(rclone author)


Will there be any issue if an rclone crypt remote is selected to be the backend for restic?

Is there any interaction between these two layers of encryption, or are they completely independent?


I do something similar but using Borg (backup), gocryptfs (encryption) and rclone (sync) for cloud backup. No cascade.


Not OP, but why would you want to use rclone crypt with restic if restic already does the encryption for you?


There are sometimes vulnerabilities in software. For instance, both the popular rclone and tarsnap have had fatal vulnerabilities in the past.

Also, I have a crypt set up for other stuff and would rather deal with one remote.

Otherwise, cascading encryption is generally not recommended.


Do you have any links about the tarsnap vulnerability?

I've been using it for a few years and was not aware of it so would be great to know if any backups were affected.



Thanks Colin!


Past related threads:

Saving a restic backup the hard way - https://news.ycombinator.com/item?id=28438430 - Sept 2021 (2 comments)

Restic Cryptography (2017) - https://news.ycombinator.com/item?id=27471549 - June 2021 (5 comments)

Backups with restic and systemd and prometheus export - https://news.ycombinator.com/item?id=27411713 - June 2021 (1 comment)

Restic – Backups Done Right - https://news.ycombinator.com/item?id=21410833 - Oct 2019 (177 comments)

Append-only backups with restic and rclone - https://news.ycombinator.com/item?id=19347188 - March 2019 (42 comments)

Restic Cryptography - https://news.ycombinator.com/item?id=15131310 - Aug 2017 (36 comments)

A performance comparison of Duplicacy, restic, Attic, and duplicity - https://news.ycombinator.com/item?id=14796936 - July 2017 (49 comments)

Restic – Backups done right - https://news.ycombinator.com/item?id=10135430 - Aug 2015 (1 comment)


Seems great, but I don't find the "Quickstart" section helpful. A demo video that shows 2-3 unknown commands without context is not a guide, and the same goes for links to the full documentation. Both make sense on the landing page, but neither tells me how to get started quickly.

Edit: interestingly the restic repo's Readme is better in that regard. It's the demo video but in text form with explanations: https://github.com/restic/restic#quick-start


Quickstart:

    repo=/mnt/mybackupdrive/somefolder
    restic init -r $repo
    restic backup /my/files -r $repo
    restic mount /mnt/backedup -r $repo
    cp -r /mnt/backedup/snapshots/latest/accidentally-removed-directory ~/restored
Where the repository can also be sftp or various other things.

There are a ton of other commands and options, but to literally just get started, this is all you need for making and restoring data from a backup.


> unknown commands without context is not a guide

This quickstart assumes some context that not every person starting to use Restic has. It could be improved by offering some more context for each line:

    # RESTIC QUICKSTART FOR LINUX

    # Choose where you want to save your backups

    MYREPO=/mnt/mybackupdrive/somefolder

    # Initialize a new restic repo at your chosen backup location

    restic init -r $MYREPO

    # Backup your files to the newly created repo

    restic backup /my/files -r $MYREPO

    # To restore a backup, first mount the repo

    restic mount /mnt/backedup -r $MYREPO

    # Browse the latest backup at /mnt/backedup/snapshots/latest

    ls -la /mnt/backedup/snapshots/latest

    # Copy the files you want to restore out of the repo

    cp -r /mnt/backedup/snapshots/latest/directory-to-restore ~/restored


> > unknown commands without context is not a guide

Quick start != guide, to me. A guide will walk you through things at a slow pace; a quick start is the quickest way to get something running/started. And the commands seem fairly self-explanatory, but I could be suffering from the so-called curse of knowledge.


That's fine, but 'quick' can still tell you what you're doing.

Fwiw, I've used restic for a few years, and (still, if perhaps less) agree the docs are not great.


I don't remember really having that issue, though I don't remember the very first use. I think the only time where I remember being a bit confused, was with the 'forget' command, and I would guess that's just because of its nature rather than due to lacking docs. Testing what I thought was correct with --dry-run solved that problem.

Mostly I just use restic --help or restic subcommand --help anyhow, rather than the docs. Perhaps I'm just not doing fancy enough things with it and that's why I didn't run into edge cases yet where it lacks documentation?


Same goes for S3 support, working great. Been using MinIO and SeaweedFS as targets; works identically to native AWS.
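
For reference, the setup is roughly this (a sketch; the endpoint, bucket and keys are placeholders for your own MinIO/SeaweedFS install):

    export AWS_ACCESS_KEY_ID=<access-key>
    export AWS_SECRET_ACCESS_KEY=<secret-key>
    restic -r s3:http://minio.example.local:9000/restic-bucket init
    restic -r s3:http://minio.example.local:9000/restic-bucket backup ~/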


It is never encouraging when I see that a project is at least 6 years old and has unapproachable documentation like you describe, and language about how they'll stop changing the backup format "when 1.0.0 is released."

Like...maybe it's time to stop working on new features, and focus on a release.

Duplicati has a similar problem. They're endlessly tinkering with new features instead of squashing bugs and meeting user expectations. I think it still chokes and permanently corrupts backup archives if you interrupt it during the initial backup.

Like: guys. People expect backup software to be able to handle interruptions and disconnects. It's actually one of the things I liked the most about Crashplan, and it could handle being interrupted without issue...nearly a decade ago.


Really looking forward to the day I can ditch borg backup and switch over but really can't until https://github.com/restic/restic/issues/21 is addressed. I have to pay for cloud storage and the lack of compression would easily double my costs based on my testing.


Restic looked interesting - I would never have guessed it didn't support such a fundamental feature as compression?! So thanks for mentioning this and saving me from wasting my time.

Note: I'm not trying to be horrible or disrespectful to the Restic devs; it's just that backup without compression is a complete show-stopper, especially if you want to use cloud storage (where storage space, storage operation, and bandwidth all eat money).


It does block-wise de-duplication (in restic's sense: file chunks, not disk blocks), so it's not as bad as you are perhaps imagining.


In my experience, the deduplication on SQL dumps is way inferior to borg, even without the compression.


There can be security issues when using encryption on top of compression. These attacks have been used against HTTPS.

I don't know if they apply to this specific case, but I assumed that the restic developers were being extra paranoid.


Known-plaintext attacks aren't very relevant in a backup context.


Sure, that's intrinsic when combining encryption with compression - but it shouldn't be a reason not to make compression optional!


Why are you looking forward to ditching borg? I've started to use it as my backup solution and it works wonders.


My main gripe with borg, personally, is that it's push only. I want to be able to back up my VPS without having to have it ssh into my home network.


Do I have the horrible hack for you!

It is possible to do pull backups with borg, with some gruesome ssh hackery.

On the backup client side, you need to have a /root/.ssh/authorized_keys line like this (edit borg options to suit):

  command="BORG_PASSPHRASE=$(cat /root/.borg-passphrase) borg create --rsh 'ssh -o \"ProxyCommand socat - UNIX-CLIENT:/root/.socket/borg-socket\" ' --compression auto,zstd -s -x ssh://borg-backup@localhost/<repo location>::<backup name> <backup dirs>" ssh-ed25519 <keydata> <keyname>
Then, on your borg server, you need an authorized_keys file like this:

  command="borg serve --append-only --restrict-to-repo <repo location>",restrict ssh-ed25519 <keydata> <keyname>
Finally, run a small shell script like this from the borg server to trigger a pull backup:

  #!/bin/bash
  eval $(ssh-agent) > /dev/null
  ssh-add -q /home/borg-backup/.ssh/<keyname>
  ssh -A -R /root/.socket/borg-socket:localhost:22 -i .ssh/<keyname> root@borg-client
  ssh-agent -k
I trigger this using cron every night, but systemd timers will work too.

The first neat thing about this setup is that the client never even sees the private key that it uses to authenticate to the borg server - the key stays on the server & authentication is tunnelled between client & server via ssh-agent. You don't even need to be able to make a tcp connection from the client to the server - so long as the borg server can make an outgoing tcp connection to the client then everything just works. The client connects back to the server via a socat connection through a unix socket created by the outgoing ssh connection that tunnels any tcp connection made through it back to the sshd on the server. (You could probably tunnel the repo passphrase through as well, if you really wanted to.)

The second neat thing is the use of authorized_keys commands which are tied to an ssh keypair means that you're giving the minimal possible access - each ssh connection can only trigger that specific command & no other. You can issue ssh keys on a per-host basis & revoke them individually if necessary.

You have to use socat as a proxy program for the return ssh connection as ssh doesn't know how to connect to a unix socket & this setup requires

  StreamLocalBindUnlink yes
in the .ssh/config on the client (possibly both client + server?), as otherwise the unix socket doesn't get cleaned up after the connection ends & the whole thing only works once before you have to remove the socket by hand. I'm not sure why this isn't the default for ssh to be honest.

This method is outlined in the ssh-agent section of https://borgbackup.readthedocs.io/en/stable/deployment/pull-... but the docs don't really call it out as a method of getting pull backups working properly. It's a bit convoluted, but it does work!

(If your client can make a direct tcp connection to the server you can skip the whole song + dance with the unix sockets of course.)


Wow, I only just saw this now but ... my mind is boggled. I don't know if I'd trust this for general use, but it's super cool. Thanks for sharing!


If you strip out the unix socket stuff (which I need for oddball network config reasons...) it’s just standard ssh authorised keys configs & ssh-agent working exactly as designed. It’s quite elegant really!

It’s the unix socket dance that introduces the gruesome hackery (imo at least!).


I mirror data on VPSs to my local storage array with rsync/unison, and then back up the whole thing with borg.


Ah true, that's a good point. I can't use it to back up my Android phone, for instance.


I do back up my Android phone using borg installed via Termux.


Holy crap that's just `pkg install borgbackup`. I had no idea (my phone is already rooted anyhow, so this will also be able to get data folders). This changes everything. There is also `pkg install restic` btw. Based on the problems with append-only in borg and the lack of those in restic's implementation (I did a short audit on that part of the `restic/rest-server` code, looks solid but don't take my word for it), I might go with the latter but this is a great tip regardless.


The reason I chose borg over restic is that there are at least two commercial providers (useful for an offsite backup): borgbase.com and rsync.net, iirc.


That's an interesting solution. Do you have any more details or a blog post to share?


I never wrote a blog post about it, but it is triggered when I plug in my charger and the phone is on Wifi. There are hooks in termux to do so. Thanks for the suggestion to write a blog post about it ;)


Quid pro quo: I've been using Titanium Backup[1] to make backups of all my apps, however it does not work properly with Android 11 and seems to perhaps be abandoned. So I'm now also using OAndBackupX[2] as well, which seems to be doing the job.

I then use FolderSync[3] to SFTP synchronise those two backup folders across to my server regularly when the phone is on the home wifi. (I also two-way sync my photos folder which is really quite handy.)

I used to also occasionally do a full sync of my phone contents to my server using FTP[4], although since upgrading, Android 11 has clobbered access to the Android/data folder, making that problematic.

Using Termux + Borg (or Restic) to push full backups looks attractive. Never seen Termux before. Thanks.

[1] https://play.google.com/store/apps/details?id=com.keramidas....

[2] https://github.com/machiav3lli/oandbackupx

[3] https://play.google.com/store/apps/details?id=dk.tacit.andro...

[4] https://play.google.com/store/apps/details?id=com.theolivetr...


Man, don't tell me Titanium Backup is abandoned... I've been using it for almost 10 years now!


Also a long time user. I'm only speculating on abandonment because: it hasn't had an update since Nov 2019; I believe the fix for Android 11 would be a fairly simple permissions change; and, from the comments, no one has had a response from the author on the issue.

It is a shame, it has been a mainstay for me, restoring apps and data across at least three phones now. I'm hoping OAndBackupX works out, but have not really battle-tested it yet.


tailscale


Kopia supports compression: https://kopia.io/


Borg locally in combination with rclone to AWS (or other) works very well, compression and deblocking.


What is deblocking? Did you mean deduplication?


Yes, dedup of blocks :)


You might like Snebu then, it has always had compression and deduplication, now does public key encryption. No direct cloud support, although you can keep your repository sync'd to a bucket if you want. (Disclosure -- I'm the author)


"Once version 1.0.0 is released, we guarantee backward compatibility of all repositories within one major version; as long as we do not increment the major version, data can be read and restored. We strive to be fully backward compatible to all prior versions.

During initial development (versions prior to 1.0.0), maintainers and developers will do their utmost to keep backwards compatibility and stability, although there might be breaking changes without increasing the major version. "

they are on 0.12 now


For what it's worth, it has a good track record, but yes you're totally right that restic is not exactly commercial-grade software with proper guarantees.

In one of the early talks at a local hackerspace, the author did also demo decrypting the data manually if something got somehow broken. The tool is just a layer on top of relatively straightforward cryptography. From what I remember (I looked at this in 2018 so forgive any errors), you'd have to write a script that iterates over the index where it says which blocks are in which files (since it's deduplicated) and decrypt it with some standard AES library. Perhaps as a security consultant this seems easier to me than it does to others, though, but it's not as if you're without hope if the tool did break, or as if you couldn't just download a previous version from the GitHub releases page.


One of the reasons I prefer restic to borg is it is trivial to maintain a standalone copy of the executable. Wherever I put backups, I keep a copy of the restic executable used to generate the dump.

For the extra paranoid, you could clone the restic source tree (with vendored dependencies). Go's backwards compatibility is such that I should always be able to rebuild it and read my data.
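
Rebuilding it later is fairly painless too; roughly (assuming a Go toolchain and the cloned tree, at least as of the versions current when this was written):

    git clone https://github.com/restic/restic
    cd restic
    go run build.go    # produces a standalone ./restic binary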


Hmm, anyone know if they fixed some of the exponential time issues it was having? I was really excited about some of the features and tried it a couple years ago, and it died before getting very far sync'ing my NAS to a usb3 jbod I plugged directly in to prime it for a rpi4 network attached backup. For something like that I would expect the backup speed to be a few hundred MB/sec and it quickly was just running in the 10's of MB/sec and getting slower.

I've noted before that my NAS has a tendency to kill a lot of backup applications that haven't really been stressed. It's about 50-something TB of data, made up of a fair number of compressed video files (family movie kinds of things) and about ~7T of source files. The combination of a crapton of tiny files and 50T of data kills most of these more recent open source backup applications, which seem to be released with "it can back up my laptop" levels of testing.

Also, I see someone mentions it still doesn't have a compressed block option, which I really want because most of those source files compress to about 1/3rd of their space, which is important since my upload speeds are crummy (thanks Spectrum!). I wonder how much of that deficiency is just the lack of good Go compression routines that can run at a few hundred MB/sec.


My main dataset that I back up with restic is around 11 million files or so, and fluctuates between 8-15TB of source data. I think I've run into similar issues testing out various backup tools too, but have settled on restic for now since it's been the most reliable.

I'd definitely recommend trying restic again if it's been a couple years. Somewhat recently they made some nice speed improvements. It used to take me several days to do a forget+prune on my restic repo. Now it takes only a few hours (less than an actual backup takes).

How many files are you backing up? I found the biggest issue for me was that almost every tool tries to keep the list of files in memory, so once you get into millions of files - it starts to require a lot more memory and can crash on low-resource machines.


I've always liked the look of Restic. I should really start using it to backup my Linux servers.

However for desktop use, I've always really struggled with the idea of not having a UI for my backup client. I'm not afraid of the command line, but the idea of browsing backup archives without a GUI feels awkward to me.

I wonder if there is room for some sort of add-on GUI for Restic for those that are more visual (unless such a thing already exists?)


You can use `restic mount` to mount a repository and browse and restore backed up files using your file browser GUI of choice. Works quite well.
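
Roughly (a sketch; the mountpoint is arbitrary):

    mkdir -p ~/restic-mnt
    restic -r <repo_name> mount ~/restic-mnt
    # then browse ~/restic-mnt/snapshots/latest/ in any file manager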


Wow, that's awesome.

I think many more people would use Restic if this was called out prominently on the home page.

I've known about Restic for years and would likely be using it by now had I realised that!

EDIT: Ah looks like it may not work on Windows. Part of the appeal of Restic, for me, would be being able to cover all of my Windows, Mac and Linux machines with the same system.


I'm guessing it would work in WSL.


Ooh, didn’t know. Sounds very useful!


Restic doesn’t have a GUI for backup/restore configuration and actions (yet), but I found that Borg has a GUI called Vorta [1] for macOS and Linux. This was mentioned on a HN comment several days ago by @crossroadsguy here. [2]

[1]: https://vorta.borgbase.com

[2]: https://news.ycombinator.com/item?id=29089947


I really like that tool. I mentioned it on this post as well


Indeed, I feel the same way. I used to be a big user of CrashPlan, largely because it had a straightforward UI: highlight a bunch of directories; choose a backup destination; choose a backup frequency.

I do like the idea of being able to tune and configure things from the CLI. It was frustrating configuring CrashPlan on a remote computer.

With that said, I feel like both should be possible. Even just a basic wrapper GUI would be a start.

Edit: and some basic searching has led me to lots of options! Time to do some more research.


I also used (and loved) CrashPlan for years before they went in an odd direction!


I am a restic user. I use it to back up to B2 and Scaleway. It is not without hiccups, but once set up, and as long as I don't back up anything from those protected Mac folders, it has worked smoothly so far.

However, I also acutely feel the lack of a standalone GUI, so that I can get rid of the scripts and custom setup, or at least have that as an option. (There's a commercial third-party UI, I think, which is a subscription.)

https://vorta.borgbase.com (a third party qt GUI for borg backup) has been really awesome.

Another tool I’m looking at closely is https://kopia.io. It comes with a UI by default (Electron I guess). Though its UI and logo have quite some work left.


I trialed restic, kopia and borg and ended up with kopia, backing up to backblaze + opportunistic external hdd. Costs me cents a month for tb+ between all our devices. Agreed on the kopia UI (and logo, was only thinking that the other day!), it's pretty basic .. but to be honest it's all you need. I use it on my family's machines as well, including wife and parents (same bblaze, dif HDDs), and it means they can pretty much manage it without me. I liked restic and probably would have ended up with it if there was a decent oss/stock UI. Borg was significantly slower on my machine for backup and restore.


I’ve thought a lot about this. One alternative solution could be this: Instead of backing the laptop up directly, sync it to a NAS. (Using rsync or a similar tool.) Then run Restic on the NAS and back up the data from the NAS to S3 or similar.


Check out Kopia. Seems similar to Restic.


Previously: 176 comments, 2 years ago https://news.ycombinator.com/item?id=21410833

Related: "Restic Cryptography" by FiloSottile (2017), 36 comments https://news.ycombinator.com/item?id=15131310


Restic is amazing. I like how Restic can be automated.

I usually work in multiple Linux virtual machines and I have a Bash script to set up regular backups of all my many $HOMEs.

I don't yet have a script for verification (I did it manually just to be sure), but the rest is here if anyone is interested: https://github.com/senotrusov/sopkafile/blob/main/lib/ubuntu...


Cool repo! You've now sent me down a deep Sopka rabbit hole..


I'm really glad that you find it interesting; it was my solo project for a long time. Feel free to contact me on discord stan#9673 if you need any help or just for general shell-scripting related chat :)


I've been using Restic for at least 4 years to store encrypted, incremental backups on Backblaze. Before that I used duplicity.

I pay around 0.40$ a month to have a backup of around 70 GB.


Backblaze is that cheap? Or am I just out of touch with storage prices?


Most likely Backblaze B2


B2 is super affordable and restic supports it ootb
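
Roughly, for anyone who hasn't tried it (a sketch; bucket name, path and keys are placeholders from your own B2 account):

    export B2_ACCOUNT_ID=<account-id>
    export B2_ACCOUNT_KEY=<application-key>
    restic -r b2:my-restic-bucket:laptop init
    restic -r b2:my-restic-bucket:laptop backup ~/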


I'm still waiting for a backup tool that uses asymmetric encryption (data encrypted using a public key, decrypted using a private key) and has a write-only server mode (so a bad actor can't remove backups).


At the risk of sounding like a shill, tarsnap does that and explicitly supports write-only keys[1]

[1] https://www.tarsnap.com/tips.html#write-only-keys


> At the risk of sounding like a shill

For the record, this is not a shill. I don't even know who Jenny is.


I do use Tarsnap for this reason.

That said, Tarsnap isn't free software, and doesn't come with the ability for me to self-host backups. Thus, I am somewhat at the mercy of Amazon.

Still, the advantages of Tarsnap currently outweigh the disadvantages in my opinion.


Indeed, this is one of the major advantages of tarsnap, though a friend also mentioned borg can do this apparently. I should really look into borg (again).


The append-only mode can be implemented using https://github.com/restic/rest-server or services like rsync.net that offer read-only zfs snapshots. Doesn’t solve the asymmetric crypto of course.
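
A minimal sketch of that rest-server setup (hostname, port and paths are placeholders; in practice you'd also want TLS and authentication in front of it):

    # on the backup host
    rest-server --path /srv/restic --append-only

    # on the machine being backed up
    restic -r rest:http://backuphost:8000/myrepo backup ~/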


http://duplicity.nongnu.org/ at least can use PGP public keys. I've used it for a long time and not seen any particular reason to change.
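
E.g. something along these lines (a sketch; the key id and destination are placeholders):

    # encrypts to the public key; the private key is only needed to restore
    duplicity --encrypt-key ABCD1234 ~/documents sftp://user@backuphost//srv/backups/documents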


(I make Arq, a backup app that supports S3's object-lock API for immutable backups). I can't see how to do incremental backups without using the private key to read the previous backup record. Can you explain how that would work?


Hmm. If anybody can encrypt backups indistinguishably, and you want write-only so that bad guys can't remove stuff, surely you can incur unlimited costs as bad guys fill it with gibberish and you can't stop them?


You would run out of remote storage space quickly and then find out you have been compromised. But the backups that you made up to the point of being compromised are still intact, which seems to be the best you can hope for.


> incur unlimited costs as bad guys fill it with gibberish

That's always an option, no matter if it's set to append-only or not. If you don't want to pay infinite costs for storage, you will need to limit it using software on the server.


The remote backup store can easily choose to only store things you actually encrypted, but this isn't possible for a simple public key setup. If you wanted to get fancy, you could use a sign + encrypt setup with separate keys so the store can tell if this is a real backup from you, and not allow things to get stored unless they've got such a signature, yet it still can't actually decrypt the backups it has been given.
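
To make that sign + encrypt idea concrete, a hand-wavy sketch with GnuPG (the key names are made up, and a real backup tool would do this per chunk rather than per tarball):

    # sign with a per-host key the server can verify before accepting the upload;
    # only the offline archive key can decrypt the contents
    gpg --sign --local-user host-signing-key \
        --encrypt --recipient backup-archive-key \
        --output backup.tar.gpg backup.tar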

As a proof of concept, take a look at a Certificate Transparency log server. Most CT logs are configured to only accept certificates meeting certain criteria. They'll log any such certificates (their SLAs only apply to contractual users, but you don't need an SLA you're probably just writing one certificate to the log to see it works) but you cannot fill them with garbage because you can't make any certificates they'd accept, only the legitimate CAs can do that†

† The CAs have their own reasons not to let you produce heaps of garbage, even Let's Encrypt has finite resources and so it imposes rate limits.


But if your box gets compromised, then whatever it was using to prove to the server that it's legit is also going to be available to the attacker. I guess you're thinking of a scenario where backups are not automated and the user either unerringly knows whether their box is compromised (and doesn't type in said proof when that happens) or uses some 2FA hardware device.

Given the chance, there certainly will be people that use 2FA when making an append-only regular backup, but even among command line restic users I expect this will be the exception rather than the rule.


Incremental and differential backups aren't possible that way.

Also Borg has an append only mode.


Why would that not be possible? Especially since you say borg can do it, and borg is said to be incremental and deduplicated?

Note that write-only or append-only is a bit of a misnomer, since reading the files is fine (you need the decryption key anyhow before they're of any use). It's about not being able to overwrite or remove backup data without some verification that you're not ransomware or similar.


Tarsnap provides incremental and differential backups with public key cryptography. So it must be possible.


Only if the backup side is unreadable (which is different from being append-only).


In the GP's setup, it's not readable without the private key.

You could do deduplication on the encrypted blocks I suppose (is that secure?)


For the longest time I used to use borg. These days, however, I am using kopia[0]

https://kopia.io/

Quite happy with it.


Seems like an interesting project, what are the differences from borg?


Well it is written in Go instead of Python. If I remember correctly, borg is single threaded and quite slow. Kopia is really fast.

Other than performance, the biggest benefit is that multiple clients can write to one repository at once. There is a designated maintenance user, though.


When it comes to backup software, do yourself a favor and do extensive research on their stability and reliability.

From what I learnt, borg has the fewest problems of all the open source backup software with the most-wanted features (encryption, dedup, compression and rotation), and the others all have their quirks, be it huge memory usage, data loss, or at worst corruption of the backup repository.

Things may have improved since I last looked, but you want user feedback saying so.

It's too late to find out that your backup software has been failing when you need it, as the original data is already unavailable, so choosing by the look of a landing page is a bad approach when it comes to backups.

kopia is still new and, last time I used it about a year ago, it still had basic problems, so I would never use it in place of borg.

restic, duplicati, duplicity and duplicacy all have some sort of problems, especially when the repo gets large, but of course there are cases where things work fine.

https://forum.rclone.org/t/rclone-as-destination-for-borgbac... (Restic issues)

https://www.reddit.com/r/Backup/comments/opu1ep/comment/h67n... (Kopia issues)

https://forum.duplicacy.com/t/memory-usage/623/24 (Duplicacy doing a big rewrite of the engine to fix memory issues.)

https://forum.duplicati.com/t/is-duplicati-2-ready-for-produ... (duplicati stability issues)

https://www.reddit.com/r/unRAID/comments/eg0zpe/duplicati_se... (Another duplicati issues)

borg also has changelogs with any major reliability problems mentioned up front, which gives you more confidence than serious bugs buried in GitHub issues like with other tools.

https://borgbackup.readthedocs.io/en/stable/changes.html

The only downside of borg is that it can only target an ssh host natively, but there are services like rsync.net (with special borg pricing) and borgbase, or you could run borg locally and rclone the whole repository to anywhere you want.


So my issue with borg is that it is slow and verification takes forever on large repositories. Will you not use borg now as well?

Every piece of software has its issues. That's why it's important to test restores regularly.


There are a bunch of options to borg check depending on what you want. A full check with --verify-data is indeed rather slow, because it checks everything (essentially equivalent to seeing if all archives can be extracted plus a bunch of extra checks). If you only want to detect e.g. bit-rot, --repository-only will be sufficient in most cases and will be I/O limited.
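
Roughly (a sketch):

    borg check --repository-only /path/to/repo   # fast: checks repo consistency only
    borg check --verify-data /path/to/repo       # slow: reads and verifies every chunk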


There's a certain priority to what you treat as issues.

Slowness is a lesser problem than data loss or repo corruption, and perhaps you can somewhat mitigate it by splitting the repo across different locations or disks.


You mention duplicity but do not give a link to an issue. Do you know if there is something wrong with it? Would really appreciate you pointing me towards that if so.


Sorry I didn't link to an issue, but the concern with duplicity is more about its implementation: incremental backups depend on the base full backup (the one you make the first time), which means that if you want to prune that base backup later on, you need to take a new full backup for later incrementals to depend on, instead of taking incremental backups indefinitely.

https://duplicity.gitlab.io/duplicity-web/vers8/duplicity.1....

> When restoring, duplicity applies patches in order, so deleting, for instance, a full backup set may make related incremental backup sets unusable.

The other software mentioned makes each incremental backup independent of the others, so you can prune any of them and the latest one is still retrievable.


I have a Makefile to invoke restic on just about every Linux machine I have (even Raspberry Pis), usually to do incremental backups to Azure blob storage. It is a great, no-fuss, “forever” tool I’ve relied upon for years, and has made it trivial to clone or restore some pretty weird setups.

Just mind you exclude node_modules and the like from it :)


Do you also run things like 'restic check' on it from time to time? I once managed to corrupt a restic repository after ctrl+c'ing the program and pulling a disk before it was apparently done syncing the write cache (I wanted to leave and catch a bus but noticed that this was still running), not sure if there are other conditions where that might happen. Better to know ahead of time to make a new backup before the primary copy breaks.


I've used restic for over four years. It's been rock solid on all platforms (mac/windows/linux). I've set it up on everything from beefy linux servers to 10 year old Windows machines (scheduled job running for 3+ years, pruning on a separate machine, never had any problems). The support for B2 and S3 compatible backends as well as rclone makes it a breeze to set up. The community and maintainers are also very friendly and helpful. Highly recommended!


How does this compare to Borg, which I am currently quite happy with?


borg supports compression, restic doesn't (and there is no way to add it without breaking backwards compatibility because of how the file format was designed). That's all I need to know for my use cases.


I used to use borg and I migrated away from it to Restic when I somehow corrupted my backup archive. I dunno what I did, but I started getting "non-utf-8 filename" python errors every time I tried to access it. It might have had something to do with the archive being on a removable disk.

Anyway! I'm happier with restic now. It's never crashed for me, and it has native cloud backends. But it's ultimately just another backup application.


And I managed to corrupt a Restic archive, so I switched to Borg.


So the lesson is, they can all be corrupted. I'm a borg user btw, it's been working fine for several years now so I have no plans on switching.

I use it over NFS.


> So the lesson is, they can all be corrupted.

To which the conclusion is: test your backups!

Still, some might corrupt more easily than others. People just confuse first-hand problems with statistical significance.


Is there any statistically-significant data on which backup applications are the most reliable? I'm not married to restic, but I'll judge it by first-hand experience in the absence of anything else.


Not really, no. It usually goes like in this thread: Someone had a problem with software X and switched to software Y. Someone else had the opposite experience. It's worth pointing out that Borg and other hash-deduplicating backup tools regularly find faulty hardware where other backup tools wouldn't notice the data getting corrupted (e.g. many people advocate for "plain" backup tools like rsnapshot or just having an rsync cronjob, but all of these are unable to check the integrity of backups). Sometimes, users point to the backup tool (sometimes they're right and it's a bug, but usually it's a bad stick of RAM or a hard drive losing a few bits here and there).


I switched from Borg to Restic because it nicely integrates with cloud backup services. I've been using restic+backblaze happily for the last 2 years.


And it's just a single Go binary: I just throw it on a win/bsd/linux machine, create a key, and start the backup. I love the simplicity of it. However, for more complex plans I use git-annex.


Does restic provide time travel like Borg does?


If by time travel you mean viewing/mounting your repo at various checkpoints, then yes.


Borg is much older, has seen plenty of production use, and has had the bugs worked out. Iirc restic is still sub-1.0. Not ideal for backup software.


Borg sequentially scans your filesystem for changes and only then starts backing up changed data. (tbf: a lot of backup tools seem to do this)

Restic scans your filesystem for changes, and then also starts backing up the changes it finds in parallel while it is still scanning for more changes.

When you have millions of files, this makes a huge difference.


This is wrong. The difference between the two is that Restic uses multi-threading and Borg currently doesn't. Both just scan the filesystem and add files to the backup set as they go.


Hmm, maybe it has changed or I'm remembering wrong. It's been years since I tried borg, but I remember it taking something like 10 hours to scan for changes, and then another 4 hours or so to actually backup the data.

With restic, it still took around 10 hours to scan for changes, but it was also already done backing up all the data by the time the scan finished.


Speaking of which, borg in Debian is maintained by the Debian Borg Collective, and the nickname of one of the maintainers is Locutus of Borg: https://tracker.debian.org/teams/borg/


And yet again: A list of alternatives.

attic (python) - https://github.com/jborg/attic

borg (c) - https://github.com/borgbackup/borg

bupstash (rust) - https://github.com/andrewchambers/bupstash

duplicacy (go) - https://github.com/gilbertchen/duplicacy

duplicati (c#) - https://github.com/duplicati/duplicati

duplicity (python) - https://github.com/henrysher/duplicity

kopia (go) - https://github.com/kopia/kopia

nfreezer (python) - https://github.com/josephernest/nfreezer

rdedup (rust) - https://github.com/dpc/rdedup

restic (go) - https://github.com/restic/restic

rclone (go) - https://github.com/rclone/rclone

rsnapshot (perl) - https://github.com/rsnapshot/rsnapshot

snebu (c) - https://github.com/derekp7/snebu

tarsnap (c) - https://github.com/Tarsnap/tarsnap

I think there are many more out there (https://github.com/restic/others) - I personally use

  restic
while technology-wise (speed, only restore needs a password) I would prefer

  rdedup
which is an impressive piece of software but unfortunately without file iterator... :-)


My full-system backups are done with btrfs snapshots synced to an external disk (actually two disks to have two backup locations). It's nice because you can keep the snapshot on your system and don't need the external disk as long as you have enough space, and both filesystems are in almost exactly the same state which makes it easy to mount it for copying a single file or to even boot your system from the external disk.
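
In case it helps anyone, the core of such a setup is only a couple of commands (a sketch; paths are examples, and incremental sends need the previous snapshot passed via -p):

    # read-only snapshot of the subvolume
    btrfs subvolume snapshot -r /home /home/.snapshots/home-2021-11-13

    # replicate it to the external disk
    btrfs send /home/.snapshots/home-2021-11-13 | btrfs receive /mnt/external/snapshots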


Same here. Btrfs send/receive made me forget about backup programs. I trigger periodic snapshots with Anacron and never worry about corruption.

I do miss Obnam, nevertheless. The interface was the best.


Might add

zpaq (C++) - http://mattmahoney.net/dc/zpaq.html

to the list.


and especially lzip and lrzip then too.

But zpaq is pure magic...a slow one but with massive compression.


Bup (Python + C) - https://github.com/bup/bup - deduplicated and compressed. The storage format is a git repository, an interesting choice that lets you restore using just git tools, "cat" and some effort.


So bup has learned to store file metadata (permissions etc), that's neat.


Do you know which of these support a full Windows 10/11 system backup and restore?

I’m trying to avoid the need to reinstall and configure my system (for example, the registry, custom installed and tweaked programs) in case of complete data loss or a migration to new hardware.


My understanding is that full system backups on Windows require the tool to create VSS snapshots and back up from the snapshot. Any tool that just copies files on the disk won't work.

I use Veeam Agent for this purpose (free, but not open source). It can do full system backups and supports both restoring to the same hardware and new hardware. Restores are done via a bootable WinPE-based image that the tool creates.

One cool thing about it I haven't seen in other backup software is that incremental backups work via a driver that tracks which disk blocks are changed as the system is running. It avoids the need to rescan the disk to detect what has been changed (though it will still do that if the filesystem is modified outside of Windows, eg. if dual booting).

The biggest downside is Veeam's website. It's pretty "enterprisey" and they want you to register to be able to download. I install via the Chocolatey package manager to avoid this. Chocolatey's package source has a direct link to the official installer [0].

There are no ads, nagging, nor upselling in the software itself. I have not seen it making any network connections outside of connecting to my backup target host and the auto-updates server.

I've been looking for open source alternative with a similar feature set, but haven't had too much luck. There's Bacula, but that seems to very much be designed for an enterprise use case.

[0] https://github.com/sbaerlocher/chocolatey.veeam-agent/blob/m...


Windows system backup requires support for correctly handling pretty much every NTFS feature, even (especially) the most obscure ones. While a generic file backup tool works fine for Linux and BSD system backups, it's hopeless for Windows. You need a tool that's specifically designed to do that.


Correct me if I'm wrong, but isn't attic just an older release of Borg?


borg is a fork of attic. Different authors.


attic is python and c

borg is python and c

bupstash is rust and c

Maybe update your list a bit ;)


There are a few downsides to using restic:

No support for compression yet

No support for deleting data from snapshots

No support for continuous backups (restic walks the directory tree for each backup).

No support for resilience from disk errors using par2 or similar.

No directory chunking, so backing up millions of files in one directory uses a lot of memory.


A simple duplicity then seems superior, unfortunately. Duplicity seems solid but a bit complicated, possibly fragile (with so many components fitted together, like par2 stacking on top of any other storage).


Backups done right? That's Tarsnap iirc.

Previously on HN:

https://news.ycombinator.com/item?id=21410833


That's like comparing rsync to google drive. One is an open source tool where you can use whatever back-end you want, the other is a service. (Which is fine, just different kinds of things.)

However, in this case it's the open source tool that has a much easier user interface (I am actually proficient with tar, but still my tarsnap experience is like comparing 'restic backup /my/files --repo /mnt/backupdisk' with https://xkcd.com/1168/)


> One is an open source tool where you can use whatever back-end you want, the other is a service.

What is your definition of "service" that makes tarsnap - a company that asks you to pay them over time to provide an, uh, service - not one?


They probably meant this article is the open source tool and tarsnap is the service


Indeed. Restic is just something you apt install and nobody provides you any service (you have to organise your storage space yourself); tarsnap is not simply free to use for yourself with your own storage. (Not saying it has to be free, but that's what makes it the definition of a service you have to purchase.)


tarsnap is solid but has two major weak spots — it’s not price competitive, at least for the b2c product retailed from the site, and it is very slow.


Is tarsnap too expensive now?

For years we had a running discussion here, where patio11 writes a nice article called something like:

fake patio11> "10 reasons why tarsnap must raise the price and stop using funny units"

And a week later cperciva writes another article called something like:

fake cperviva> "Nah. Amazon reduced the storage price 10%, so I'm reducing the price of tarsnap in 1 picodollar/byte"


Tarsnap works out as ~6$/GB/year for me. That’s for a mostly managed backup service. The only thing missing is snapshot pruning, which is slow and a bit of a pain due to the way tarsnap’s cache works. Restore is on par with restoring from tape — reliable but slow, but who can really complain about how fast disaster recovery is?

Raw managed storage with rsync.net is 0.18$/GB/year.

Do it all yourself, with the associated peril and time sink that entails, and the disks will cost you 0.04$/GB/replica.

Tarsnap has its place and I’m still a happy customer, but it’s one small part of a wider strategy that includes bulk storage elsewhere — rsync.net with borgbackup and plain rsync, on premises ZFS dumpsters, and offsite drives used like they are tapes.


on https://rsync.net/pricing.html it says 2.5 cents per GB per month. so 0.30$/GB/year.

i have 700G so it would cost me ~ $210/year (and you have to buy a min of 400G)

microsoft onedrive is $70 for 1TB and you don't pay for bandwidth. can use rclone.

you get some other goodies too (office) which i don't use but i'd imagine it's a nice to have for some people.


Is Microsoft OneDrive at $70 for 1TB a good option for the truly paranoid, or will Microsoft share the content if they get a court order or someone with a tank at their front door?


use restic.


I want to live in a world where tarsnap exists and is priced in picodollars.

I would be sad if either of those ceased.


Tarsnap is not expensive at all for its target audience: folks with highly compressible data. For any data that's not very compressible, it's super costly.

For example, if you are into photography it's not uncommon to generate hundreds of GBs of files _per year_. In 2020 alone, I generated over 200GB of photographs. Putting that on Tarsnap would cost me about $60/month. In 5 years' time, I could be paying up to $4000/yr. Tiered services like B2 would cost an order of magnitude less.


Why on earth would someone pay 10x-20x the price of the alternatives for encryption that these days is available in high-quality free open source software such as Restic, Borg or Duplicacy?


For me the one to beat is IBM’s backup utility (known for years as ADSM (ADSTAR Storage Manager) and later called TSM (Tivoli Storage Manager)). I’ve never seen a commercial or open source program that comes close.

I’ve also never seen successful backups anyplace that did not have TSM. Usually the backups are corrupt, or nobody knows how to do a restore, or you need 1000x the storage capacity in order to restore the initial backup and all of the incremental backups until you reach a specific point in time.

At places I worked with TSM it was so simple that individual users could fire up a gui and pull files out of the backup pool.

On the backend we had massive IBM tape libraries and it was hypnotic to watch the robot jet around moving tapes in and out of drives and the storage slots. It never stopped moving either: when backups or restores were not happening it was busy consolidating the tapes, making copies of data from tapes that had been used too many times, or preparing copies to be sent off site. It was a full-time job for someone to load new blanks when TSM requested them and remove the offsite tapes and put them into a box for FedEx to pick up. (The one thing that has not changed is that it’s still quicker to send massive amounts of data by FedEx than it is to send it over the public internet.)


I have used TSM (or ADSM or Spectrum Protect or whatever IBM calls it this week) quite a bit. The basic functionality and performance are not too bad. However, it clearly shows that the software originates in the 1980s. The client is written in C++ and really likes to leak memory. This becomes problematic when backing up more than a few million files. The official "fix" suggested by IBM is to configure a cronjob that restarts the scheduler once a day (seriously).

TSM also has no support for deduplication, so good luck backing up large variable binary files such as VM images or project files (video, CAD, etc).


I’m pretty sure it did originate in the 80s, it had an earlier name than ADSM, then was rebranded back when IBM was going to split itself into “baby blues”, then Lou Gerstner took over and stopped the split-up. Despite its faults it’s still the best I’ve ever encountered.


Using the chance to ask: what is currently the cheapest, most efficient place for home PC backups (on linux)? Backblaze B2? Something else?


€0.40/TB/month or €120 per 5 years for 5TB: Old phone or a raspberry pi with a hard drive attached located at a friend's or family member's house.

Any cloud service is going to be 5-10 times more at least, last I checked (2019), and that's already a lot better than five years before that (as a ratio of self-hosted to a managed service, so independent of raw storage prices).

B2 and Glacier seem to be some of the cheaper options these days. Backblaze (the backup software) doesn't run on Linux and is closed source but they pinky promise to support "unlimited" backups for $5/month which is a really good deal if you both trust them and run Windows. Tarsnap is S3 pricing plus markup, but what you get is linux support, open source clients from a person the community trusts, write-only keys, and the hosting part is off your hands, and pay-as-you-go, which is quite a unique combination.

Edit: B2 is $5.7/TB/month according to https://news.ycombinator.com/item?id=29209665 (`0.4/70*1000`)


B2 doesn't require any closed-source platform-specific software to use; you can just use restic. And I'm not sure why I should care about either software or hardware on their end; they should basically be an external hard drive to me, shouldn't they?

About the "raspberry pi" thing... this is the kind of answer you immediately regret not having preemptively dismissed in the question. I mean, it's hard to even decide if the person saying this is serious or not. Like, setting up a backup server at "your friend's house"? Really? Is this seriously something that everybody but me does? Should I do that too? Is it considered normal practice in their cultures? Or is it just something that they say because they like giving advice they don't follow? To me, that sounds just crazy.

Paying under $100/year to know that all of my junk is safely kept somewhere doesn't sound crazy at all, on the other hand. But which is the best option, and whether they really keep their promises, I don't know, of course; that's exactly why I'm asking. Maybe there's some problem in disguise, maybe hardly anybody even uses their services and I shouldn't trust them. I don't know.


> B2 doesn't require any closed-source platform specific software to use

I never said it did, unless you are confusing Backblaze (the name of their backup solution, https://www.backblaze.com/cloud-backup.html) with their separate and much newer B2 service.

> this is kind of answer you immediately regret you didn't preemptively dismiss in the question

Well I'm sorry.

For what it's worth, this is literally the solution I use so I thought I could be helpful by at least including that as a base price point.

> which is the best option and if they really keep their promises I don't know, of course, that's exactly why I'm asking.

I don't understand, has anyone shared stories of paying Google/Amazon/Microsoft/Backblaze/Dropbox for X amount of storage and them not keeping their end of the contract? I understand your question even less now than I thought I did before.


Ok, sorry, I didn't mean to be rude. I'm sure you had the best intentions in mind when writing your answer. The thing is, for me the main reason to ask questions on forums like HN is to work against "unknown unknowns". Things like B2 storage cost can be easily googled if you don't know them, so it makes much more sense to just google them instead of asking here. In fact, that's what is usually expected from anybody asking for advice on forums. But the problem is that in reality there's always more to it than the basic specifications.

Take B2 for example. I know their storage pricing (that's very easy to find on their website), and I know for a fact it's super affordable compared to other similar services. But I also know that the "fine print" in their case is that the upload speed is the bottleneck, which will prevent most users from backing up too much. And the fact that they have only 4 facilities. Is the latter a real problem? Well, I don't know. I haven't heard any stories about them losing user data, but that might be just it: I haven't heard them. That's why asking such questions on places like HN has value, in my opinion.

Similarly with HDD cost. It's kinda obvious that this is the most affordable solution, so if the person asking doesn't know that, it means he didn't do his homework. I don't use my friends' houses for that (that really sounds super awkward), but in fact something similar is my current "solution" as well. But it feels like something I should be advised to stop rather than start doing. A backup service backend needs maintenance too, and maintaining it doesn't seem like a fun hobby (that's part of the reason why asking a friend to do this for you seems very weird). Backups are something you want to be reliable, by definition. And HDDs are not.

(As a side-note, I also sometimes contemplate whether backing up rarely changing info to more reliable storage, like tapes and optical discs, is a viable option. It still seems like no, unfortunately. But keeping everything on HDDs that I personally own makes me feel uneasy as hell. Despite being something I do, this is basically a strategy equivalent to "just hoping that everything will be ok". I don't do any real work to make sure that's the case. I have no idea when they are likely to fail. I just hope that they don't all break at the same time, and that's it. I have no idea what the actual probability of that is.)


Fair enough, thanks for explaining! I think this sort of thing helps me read the intention better in the future.


> €0.40/TB/month or €120 per 5 years for 5TB: Old phone or a raspberry pi with a hard drive attached located at a friend's or family member's house.

I like to leave an offline backup in my car, too


rsync.net is awesome. You get a Linux filesystem you can access via SSH. Many tools support it (like git-annex), and since it's ssh-able, you can script anything that doesn't work out of the box.

They also have optional replication, and you can choose the location of both of the servers that host your files. Great support from actual engineers, used by huge companies for their offsite backups, and never been breached (to my knowledge).

EDIT: their features page is a good read. https://www.rsync.net/cloudstorage.html


Sounds very enticing; however, $25/month for 1 TB seems rather pricey. Does replication double that, or is it included?


I'm not the GP. There's a slightly cheaper "borg backup" plan at rsync.net.

[1]: https://www.rsync.net/products/borg.html


Hard drives I guess (happy if someone can correct me on that)


Doing sysadmin/devops for over 20 years now and restic is the easiest and most robust backup tool I've seen.


Robust in what way? If you mean user error then I would agree, it's super easy to manage backups and not do stupid things (I say this without much experience with other tools, fwiw). But as for file corruption, is it much different from the other standard solutions out there like borg or duplicity?


Any opinions on Restic vs. Borg Backup (or Borgmatic), for encrypted backups (over SSH and to USB drives)?


Why not use both? Recommendations are to have two backups anyway, one local and one remote. I use borg to make local backups to a USB HDD, and restic for remote backups in the cloud. Using different software guards against implementation bugs.


Using Borg for more than a year I haven't had an issue yet. In the past I used a similar tool that had error-correction capability but the tool was buggy and slow when compared to Borg.


Both failed me with recovery at some point (corrupt indexes)


I had Borg fail me the same way with index corruption. Since moving to Restic I haven't had any issues backing up or restoring, and it seems quicker too.


both are fantastic. i started on borg, and moved part of my shit to restic so i use both. borg ui is a bit better, but you can normalize that away with a bunch of scripts.


Restic is a great piece of software. I've been using it for more than two years now to do encrypted backups of my home server to Backblaze B2. It took minutes to set up with a couple of lines of script and has worked like a charm since then. Highly recommended!


Same here! Really happy with the setup. Got an Odroid HC2 doing daily backups to Backblaze B2. The thing doesn't even sweat scanning around 1TB of data. I find the automated pruning of old snapshots also pretty sleek:

  restic forget --keep-daily 7 --keep-weekly 5 --keep-monthly 12 --keep-yearly 75


Just a heads up in case you aren't already aware: 'restic forget' does not automatically prune data. You also need to pass --prune or run 'restic prune' afterwards. Otherwise, your snapshots are dropped from the index but the data they used still exists in the repo.
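
For reference, a minimal sketch of the two equivalent ways to do that (the repo name and retention policy are placeholders):

  # prune in the same step as forgetting snapshots
  restic -r <repo_name> forget --keep-weekly 10 --prune

  # or forget first and prune as a separate step
  restic -r <repo_name> forget --keep-weekly 10
  restic -r <repo_name> prune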


Like others here, I'm a big fan of restic. I use it to backup to backblaze B2, and have systemd timer units to run it daily. I use it with pass (https://www.passwordstore.org/) for secrets management, my wrapper script is at https://github.com/wfleming/dotfiles/blob/arch-linux/home_no... if it's useful for anyone.
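
Not the linked script, but a rough sketch of that kind of pass integration (the pass entry name and repository are placeholders): restic can fetch the repository password itself via RESTIC_PASSWORD_COMMAND, so no secret has to live in the wrapper.

  # tell restic how to obtain the repo password (pass entry name is hypothetical)
  export RESTIC_PASSWORD_COMMAND="pass show backups/restic"
  export RESTIC_REPOSITORY="<repo_name>"

  restic backup --verbose ~/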


Autorestic wraps restic in YAML config files, and for that I am very grateful.

https://github.com/cupcakearmy/autorestic


> Once version 1.0.0 is released, we guarantee backward compatibility of all repositories within one major version; as long as we do not increment the major version, data can be read and restored. We strive to be fully backward compatible to all prior versions.

> During initial development (versions prior to 1.0.0), maintainers and developers will do their utmost to keep backwards compatibility and stability, although there might be breaking changes without increasing the major version.

Hmm not sure I would like to try on a backup tool that might introduce breaking changes that easily


I used to religiously make backups, and I still rsync my homedir onto a server occasionally.

That said, I keep ~/Documents and ~/dev these days in Syncthing directories, and one of my syncthing nodes is an Ubuntu LTS server with zfs, with the zfs-auto-snapshot package installed.

I still run my old backup system periodically (once or twice a month), but I now think Syncthing is at a point of reliability where realtime cross-machine sync is my primary safety net wrt "the machine in front of me has turned to entropy", versus some point-in-time backup.


From the Syncthing FAQ (https://docs.syncthing.net/users/faq.html#is-syncthing-my-id...):

> Is Syncthing my ideal backup application?

> No. Syncthing is not a great backup application because all changes to your files (modifications, deletions, etc.) will be propagated to all your devices. You can enable versioning, but we encourage you to use other tools to keep your data safe from your (or our) mistakes.


I would argue that in conjunction with ZFS snapshots (which can be taken every minute), it's safe and the caveat no longer applies.


If you accidentally reformat the ZFS drive, that will delete all your files and snapshots and then Syncthing will replicate those deletions.

I'm really not super familiar with Syncthing, but it sounds like its niche is availability as opposed to durability.


Even if you format your drive and Syncthing propagates the deletions, you will still have a history of snapshots on the other host running ZFS snapshots.


What if you accidentally format the drive on the host with the ZFS snapshots? Won't that result in synchronizing the deletion of those files to all your other devices?


Yes, for his setup (if I understand right). I'm suggesting having two hosts with ZFS snapshots running (independent of each other) with Syncthing between. So if you delete the files on one host and they get Syncthing'd across there will still be a ZFS snapshot history on the other host. I'd also have a cold storage backup on hand that is also ZFS. Having your backup as a regular filesystem is a very nice feature once it comes to recovery.


syncthing has send-only and receive-only folders. You can set a receive-only folder on your backup device and send-only on your phone or whatever. With incremental backups it's not an issue. Syncthing is just the tool to get your data to the place it needs to be. Just like rsync or cp

Anyone taking backups seriously already knows the 3-2-1 rule anyway.


Syncthing frequently stops syncing on my phone and requires me to delete everything and resync. I'm using receive-only folders. I think I'm an exception but it's been a pain in the ass.


If you accidentally delete your data, your data will be deleted.

This is the curse of sharp tools. Don't run unix if you don't want rm to rm.

I still also have backups, and zfs snapshots on a different independent machine.


I'm coming to the same conclusion. I was running Freenas and my file server died and then my files were locked up on that host. Luckily I was using Syncthing to the central file server but I just added new links between individual devices and then it didn't matter that my file server died (except for files that were not on any non-server device). File servers are annoying for disconnected usage. I was using duplicacy for backups but the storage format is annoying (not just regular files) and nothing is more reliable than regular file systems. So now with my new setup I'm using a Rpi4b with NixOS and ZFS data drive over a USB3 dual drive docking station with UASP. I can boot from a USB3 thumb drive (which I do) or an SSD in the docking station and still have another SATA port for the data drive. (The downside of this is RAID doesn't work over USB.)

I'll have two of the same RPi servers in different locations with all software running (like a hot swap) with Syncthing keeping them in sync while ZFS snapshots keeps a history. I can plug in a cold storage drive in the second dock slot once in a while too. I'll have a spare RPi4b on the shelf in case it dies. If my server dies I can take the off site hot backup home and reconfig the network and then it is my primary server. With remote duplicacy backup I'm days away from getting going again. So ZFS snapshots + Syncthing and cold storage is where I'm going (for home use). Also I want to stick with Linux because I set up Freenas 5 years ago and now I forget how to admin it so I'd rather just keep with ZFS and Linux (the zfs send from freenas to zfs recv on Linux works perfectly).


I built our backup service for our users on top of restic! Works great :)


What kind of service, is it a commercial thing I can use or do you mean in a company or so?


It's part of kubesail.com, which helps folks host at home (but works with any kubernetes cluster). We use restic as part of the backup service which launches a container on your machine to locally encrypt and then upload data. Restic works great to de-dupe files and minimize the size of the backup and restore.


One of my favorite things about restic is not about the software, it's the community. People in the forums and bug reports are respectful and decent, and it's fantastic to see.


Restic is great, I use it to back up my Nextcloud data from my raspberry pi to two locations, a separate USB drive and a remote DigitalOcean space. Very easy to use and all encrypted. I wrote a blog post about it actually! https://compileandrun.com/2021-01-31-nextcloud-traefik-resti...


Does restic have a user interface on windows/Mac?


I was wondering how this compared to Bacula, which I setup for a company with significant amounts of data to backup (and which worked great, especially after I wrote a retention management + pruning tool).

Here's an old discussion: https://news.ycombinator.com/item?id=19485783


Can this be used to incrementally backup the whole / (root) directory? Or is this only for non-system folders with data?


> Can this be used to incrementally backup the whole / (root) directory?

That's how I use it (with --one-file-system), but I haven't tried restoring that. I just check that my data is there (e.g. open some pictures) and call it a day. For me, if I need to reinstall the system anyhow (super rare), I'm also happy to just reinstall the OS, do a bunch of apt installs, restore a few files in /etc perhaps, and put my homedir back. So I can't vouch for whether it will store all special attributes on system files, but I would expect that common things like owner/group/mode are there.

Fwiw, I once rsync'd a remote root partition to the current machine and that worked. Would not recommend to depend on this, but it apparently doesn't take much to be able to restore the root and boot from it :). You can also quite easily test this in a VM, if you'd like to make sure it works.
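
For reference, a minimal sketch of that kind of whole-root invocation (the repo name is a placeholder; run as root so everything is readable):

  # back up only the root filesystem; other mounted filesystems are skipped
  sudo restic -r <repo_name> backup --one-file-system --verbose /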


Restic makes it trivial to use the very cost effective Backblaze B2

  # B2 credentials and repository settings for restic (placeholder values)
  export \
    B2_ACCOUNT_ID=123456 \
    B2_ACCOUNT_KEY=DEADBEEF \
    RESTIC_REPOSITORY=b2:myname-restic-myhost \
    RESTIC_PASSWORD=CAFEBABE

  # back up the listed paths, excluding /not_this
  restic backup --verbose --host myhost \
        --exclude /not_this \
        /yes_this \
        /and_this


Duplicity is excellent.

Furthermore, Duplicity is written in Python so you get tracebacks and quick fixes are really easy to do.

How does Restic compare to it?


And how does it compare to duply, which let me boil my duplicity setup down to 8 lines of config, 3 lines of globs (2 includes, 1 global exclude), and this line in my crontab:

  30 4 * * * /usr/bin/duply myjob backup_purge_purgeFull --force > /tmp/duply_myjob.log
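
And for anyone who hasn't seen duply, a rough sketch of what such a profile might look like (all hosts, paths, and values below are made up, not the parent's actual setup):

  # ~/.duply/myjob/conf (illustrative)
  GPG_KEY='disabled'           # or a key id, for encrypted backups
  TARGET='sftp://backupuser@backup-host/duply/myjob'
  SOURCE='/'
  MAX_AGE=6M                   # purge backups older than this
  MAX_FULLBKP_AGE=1M           # force a new full backup monthly
  DUPL_PARAMS="$DUPL_PARAMS --full-if-older-than $MAX_FULLBKP_AGE"

  # ~/.duply/myjob/exclude (duplicity glob list: includes first, exclude the rest)
  + /home
  + /etc
  - **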


I'm a bit confused. For all the description of different convoluted processes below, why not simply handle backup with something even simpler like Backblaze, SpiderOak, or maybe sync.com? Seems a lot simpler than the Restic, Rclone, and other methods described here.


I need a backup system in which files are never deleted, even if they are deleted locally. These files should also be easy and efficient to search for; you shouldn't have to go through all previous snapshots, trying to guess which snapshot a file might be in.

Does restic do this?


My backup script runs a 'restic diff' command after every snapshot it takes and puts it in a log file. It's basically a list of every file added, removed, or modified in that snapshot.

Since I have a directory full of these logs, I can search for deleted files by doing something like 'grep "^- " restic-diff-*.log'

I'm not sure if there's a better way to do this, but it works pretty well for me.
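
A rough sketch of what that diff-logging step could look like (not the actual script; it assumes the repo and password are already in the environment, that snapshots are listed oldest-first, and that jq is available):

  # compare the two most recent snapshots and keep the result as a log
  ids=$(restic snapshots --json | jq -r '.[-2:] | .[].short_id')
  # $ids is intentionally unquoted so the two IDs become two arguments
  restic diff $ids > "restic-diff-$(date +%F).log"

  # later: search all logs for deleted files
  grep '^-' restic-diff-*.log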


Thanks, this might work.


Yes on deletes, backups are incremental. Not sure about how easy to find files but I'd be surprised if there weren't commands for doing this.


I'm not sure, but seems to be an expensive operation on snapshot based architectures - have to search through all snapshots.


If you know the path, it would be

    $ restic mount --repo /mnt/backupdrive/myrepo /mnt/backedup
    $ echo /mnt/backedup/snapshots/*/path/to/file
And it would print which snapshots it is contained in.

If you don't know the path, I don't know how long `find -type f` takes so you might be right about that being inefficient. It is certainly not a use-case that I think restic ever had in mind.

I also don't know of "backup" software to solve this, it seems a bit out of scope for most of them (they are meant to have a backup copy of your disk, not be a file manager with history). You might have more luck with tools like rsync (or Toucan, from the good old days where I used Windows and Portable Apps), those can certainly create only new files and never delete deleted files, but then you typically don't get encryption and deduplication.


Thanks, I’ve also thought that something built on rclone or rsync might be more suitable.

I haven’t yet moved to restic, so it might be that find -f is good enough for me.


It's a backup tool, so it won't delete anything unless you tell it to.


Yes, I should have specified that the emphasis is on the use-case of finding and restoring old files.

Yes, a backup tool never deletes by itself, but the standard way of using such tools is that you keep the last n snapshots - this is just delayed sync.

I don't know anything about backup architectures, but is it possible to be more efficient with storage and indexing (for search) if your use-case includes finding files from 5 years ago, but not necessarily recreating the whole filesystem as it was on some date?


> Yes, a backup tool never deletes by itself, but the standard way of using such tools is that you keep the last n snapshots - this is just delayed sync.

Restic, Kopia, and most other modern backup solutions allow you to define the retention policy, usually by specifying how many hourly/daily/weekly/monthly/yearly snapshots to keep. They don't just remove all snapshots that are older than n.

They also usually let you restore any given file, from any given snapshot, without having to restore everything. And as long as there's a functional index, this shouldn't be terribly slow either.


Thanks.

If a file existed for 3 days and was then deleted, it might make it to 3 daily snapshots, and no weekly, monthly or yearly snapshot.

Even if you’re lucky and the snapshots are not deleted, if you don’t remember when the file existed, currently you need to search through all snapshots. There’s no index or any data structure to speed this search up.

I agree, restoring is fast once you’ve found the file.

Currently I use backblaze, and it does have infinite retention and snapshots, and restoring is fast. But I’m looking to migrate to something that gives me search as well.


My current backup system involves having all my data on ZFS and then using sanoid for snapshots and syncoid for syncing those snapshots to my server that also runs ZFS. Would I gain anything significant from switching to something like Restic for backups?
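
For readers unfamiliar with that stack, a rough sketch of what such a setup looks like (dataset names, retention counts, and hosts are made up):

  # /etc/sanoid/sanoid.conf -- periodic local snapshots with pruning
  [tank/data]
          use_template = production
  [template_production]
          hourly = 24
          daily = 30
          monthly = 12
          autosnap = yes
          autoprune = yes

  # root's crontab: take/prune snapshots, then replicate them to the backup server
  */15 * * * * sanoid --cron
  0 * * * * syncoid tank/data backupuser@backup-host:backup/data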


I use syncoid to sync to a server that only has 1 open port, ssh, which only allows one user, and that non root user runs rbash and can only do zfs receive, not zfs destroy. Hopefully that stumps the ransomware people trying to delete backups.


Somewhat offtopic but this reminds me: what's the best way to backup a Raspberry Pi? Mine has a 128GB SD card and that makes backing up the whole thing kind of silly. There has to be a better way.

I've lost a few months of data twice because of my ignorance


I like rdiff-backup. For me, it's a good middle-ground between rsync and a full backup solution.

https://rdiff-backup.net/docs/features.html
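
For anyone who hasn't used it, a minimal sketch of typical usage with the classic CLI (paths and host are made up):

  # mirror /home to the backup host, keeping reverse diffs for history
  rdiff-backup /home backup-host::/backups/home

  # restore a file as it existed three days ago
  rdiff-backup --restore-as-of 3D backup-host::/backups/home/user/notes.txt /tmp/notes.txt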


rdiff is good but if I had to pick between the two I'd still use rsnapshot.


Not unless they've improved it substantially in the past decade.

A startup I worked for [0] shipped a backup/"DR" appliance using rdiff-backup behind the scenes that I had the pleasure of inheriting ownership of.

What rdiff-backup was good for was creating the false impression that you had working backups you could restore from. But once your available backup disk space filled up (which is kind of the whole goal of a backup system: accumulate as many revisions, going back as far as you have space for), the thing paints itself into a corner you can't recover from without first creating potentially huge amounts of free space.

Here's why:

1. The backup tree is modified in-place in the course of performing a backup. If the backup is prevented from finishing for any reason (admin/user cancelled, ENOSPC, power loss, backup source became unavailable, etc), the backup tree is left partially in the new revision and partially in the previous revision. Any subsequent operation, restore or backup, must first restore the backup tree to its previous version, using the same primitive restore algorithm rdiff-backup uses for general restores.

2. Unless you're restoring from the latest revision, requiring no reassembly from differentials, the restore algorithm requires enough free space to store up to two additional copies of any given file with changes that it's reassembling. This doesn't even include the final destination file; if restoring into the backup filesystem (as it does when recovering from an interrupted backup, mentioned in #1), you need space for a third copy too.

Maybe they've fixed these problems since my time dealing with this, it's been years.

I ended up writing a compatible replacement for my employer at the time which used hard link farms to facilitate transactional backups requiring no recovery process when interrupted. This also enabled remote replication to always have a consistent tree to copy offsite while backups were in-progress, something rdiff-backup's in-place modification interfered with. As-is you'd end up just propagating a partially updated backup offsite if it happened to overlap with an ongoing backup.

My replacement also didn't require any temporary space for reassembling arbitrary versions of files from the differentials. So it could always perform a restore, even with no free space available. I even built a FUSE interface and versioned backup fs virtualization shim for QEMU+qcow2 atop those algorithms. But it was all proprietary and some of the stuff got patented unfortunately.

I wouldn't consider rdiff-backup usable if it didn't at least have the ability to restore without free space yet. At least then it might still be able to do its rollback process when ~full, assuming it's still doing the in-place modification of the backup data.

Edit:

In case it's not clear from the above; it's particularly nefarious the way rdiff-backup would fail, since it was typically unattended automated backups that would fill the disks, leaving the backup tree in a rollback-required state to either run another backup OR restore. The customers usually discovered this situation when they urgently needed to restore something, and rdiff-backup couldn't perform any restore without first doing the rollback, which it couldn't do because there was no space available. Not that it could even perform a differential restore without free space, but the rollback-required state almost guaranteed a differential restore was required just to do the rollback.

Back when I was implementing the replacement it was such an urgent crisis that I was logging into customer appliances to manually restore files from sets of differentials without needing temporary space, using unfinished test programs before I had even started on the integration glue to streamline that process.

[0] https://www.crunchbase.com/organization/axcient


Basically every backup system does incremental backups, because indeed uploading or copying everything every time is kind of silly, as you say. It's easy to reply here with "restic does that", but basically every other tool does that as well, so what "the best way" is depends on what else you need, not on incrementality.

Personally I would use restic in your situation because I'm familiar with it already and it does what I need (I particularly like the encryption aspect), but that's not to say that borg, bup, rsync, or other tools couldn't also fit your needs.


For small systems in my home lab, I use rsync from the small devices (e.g., Rpi, Intel NUC, etc) to the big system with a couple of 6TB drives.


I do this too. And you can combine rsync with gocryptfs (in reverse mode) to get strong encryption of your backups as well. This is especially important if you are storing the backups on a remote/untrusted device.

https://wiki.archlinux.org/title/Gocryptfs#Usage
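
A rough sketch of that combination (paths and host are placeholders):

  # one-time: store the reverse-mode config alongside the plaintext data
  gocryptfs -init -reverse /home/user

  # mount a read-only, encrypted view of the plaintext directory
  mkdir -p /tmp/crypt-view
  gocryptfs -reverse /home/user /tmp/crypt-view

  # ship only ciphertext to the untrusted destination
  rsync -a --delete /tmp/crypt-view/ backup-host:/backups/user/

  fusermount -u /tmp/crypt-view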


I use rsync to backup my desktop (which is mostly used for development) pretty much just rsync /home to a big enough external LUKS drive.

Has saved my bacon a few times. I don't care about incremental snapshots or deltas (though I've used both rsnapshot and rdiff successfully in the past); I'm covering the "if the SSD blows up, how long to chuck a new one in and be back up" case, not the "I might need that file from 6 months ago" one.

I also have syncthing setup via /home/<user>/Shared/{Personal,Work}/ on every machine and important stuff I just chuck in Shared/ and forget about/it's available wherever I need it at any point.

I've had really ornate, bulletproof snapshot-based backups, but honestly for my particular use case they were more hassle than they were worth; rsync does what I want every time and has never let me down.


Use Borg, and copy the whole image. Borg will compress and deduplicate it, so you only backup the changes each time. You'll get point in time historical backups, so if you accidentally delete stuff, you can find it again. Useful because data loss isn't always just hardware failure!
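
One way that could look, as a sketch (device, repo path, and archive name are made up; read the card from another machine or while the Pi is offline so the image is consistent):

  # one-time: create the borg repository
  borg init --encryption=repokey /backups/borg-repo

  # image the SD card, then let borg deduplicate and compress it
  dd if=/dev/mmcblk0 of=/backups/pi.img bs=4M status=progress
  borg create --compression zstd --stats /backups/borg-repo::pi-{now} /backups/pi.img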


A while ago, I wrote a blog post on how I prepared an automatic backup solution: https://kavela.ch/article/backups-on-linux.html


Truly surprised not to see burp mentioned anywhere in these comments. Super reliable, laptop friendly, and with lots of nice features.

https://github.com/grke/burp


I've used it for the last three years. Good so far!


What if I don't want to encrypt? I would like tarsnap/restic/whatever (but unlike tarsnap, I want to be able to self host) and I want it without encryption.


Use a default password, e.g., 123.


Restic is great. I used to use Duplicati because of its GUI, but Restic is far faster and more reliable, and restoring files via its FUSE mount is extremely easy.


I’ve been using Backblaze to back up my mac for the last four years.

Would there be any advantage for me to look into a self-hosted alternative like Restic?


Emborg with borg is my favorite combo right now.


Vorta is another nice Borg frontend.


I use Emborg on servers with no GUI.


Right. I didn't mean to suggest Vorta as a 'competitor' to Emborg, just as another nice interface to Borg.


Care to elaborate on how it compares and what emborg is (I haven't seen that mentioned before)?


It’s just a front end for borg. Makes things a bit more user friendly.

https://emborg.readthedocs.io/en/latest/


I did not see anywhere mention of deduplication - is this implemented in Restic?

(I use Borg and the feature is an absolute must for me)


Yes. There's deduplication. It uses a rolling hash for chunking and then deduplicates those chunks. So every snapshot is always a full backup.


To be clear, "every snapshot [is] always a full backup" does not mean that you have to upload your whole terabytes-large drive every time. I guess what you mean is that each backup stands on its own, referring to the chunks it needs. I don't think that's what people mean when they say that each snapshot is a new full backup but rather that this is the definition of it being incremental.


Correct. Thanks for clarifying.


IIRC restic operates at the file level on the source side and then stores data in blobs, and unchanged blobs are deduplicated (it operates as a content addressed store, kind of like git), which means iterative backups are possible.

EDIT: see "blob" https://restic.readthedocs.io/en/latest/100_references.html#... and "efficient" https://github.com/restic/restic#design-principles


Yes, it deduplicates but does not compress. Since most large files (mp3, jpg, mp4, docx, etc.) are already compressed, this is not a big issue but it's not ideal either.

If I remember correctly, another tool either does both or does compression instead of deduplication, this might have been borg or bup. In case that's something you care about.


Typical borg run:

                     Original size   Compressed size   Deduplicated size
  This archive:          167.99 GB         136.78 GB            53.85 MB
  All archives:            2.93 TB           2.38 TB           133.66 GB

Sorry about the formatting. But the compression is not completely irrelevant. Dedup of blocks between files and backups is of course the absolutely most crucial part though.


Thanks for the real-world stats! I must say it took me a bit to read it though, let me format it for others (hopefully also narrow enough for mobile):

                    compress
               data |    dedup
    1 backup:  168G 137G  54M
    Σ backups: 2.9T 2.4T 134G
Please correct me if 'one backup' and 'all backups' are an incorrect interpretation of 'this archive' and 'all archives'. I wasn't entirely sure what you meant by that, but I think I get the point.

So in conclusion, adding compression saves about 17% (a $10 monthly bill would be $8.28 instead, if you pay per GB) for you.


Another friend has a stat of 30%, by the way.

Even if I got the better of the two values, I don't have enough systems for a backup that's 2/3rds of the original data size to reduce the number of backup disks I need to buy, but it's not insignificant either.


Compression may help, but that would be 10-30% in the best case (which may or may not be worth it, depending on the use case).

Deduplication means ~20x smaller backups so this is absolutely key for me.


I use restic to back up my 1TB external HDD to Backblaze for 5$/mo. It's great.


What about compression?

What does restic use under the hood?


Can you backup your C-drive?


Everything works fine in WSL2 too, including the FUSE mounts. Go golang!


Why wouldn't you use tarsnap instead?

https://www.tarsnap.com/


Well, for one Tarsnap is incredibly expensive. I get 6TB of cloud storage for roughly $65/year with unlimited transfer. Tarsnap would be in excess of $1500 per month.


Holy crap. Yeah. 3TB, $750/month?!?


The tarsnap subthread is here: https://news.ycombinator.com/item?id=29209658



