The Changelog: Software Development, Open Source - Restic has your backup (Interview)

Episode Date: April 2, 2021

This week Alexander Neumann takes Jerod on a tour of Restic, the world-class backup solution that's fast, secure, and cross-platform. We discuss why he created Restic in the first place, how (and why) you should use it, some of its more interesting technical bits, lessons learned over the years building and maintaining a community, and more of course.

Transcript
Starting point is 00:00:00 This week on The Changelog, we're joined by Alex Neumann. Jerod's talking about RESTIC with him, the world-class backup solution that's fast, secure, and cross-platform. We discuss why he created RESTIC in the first place, how and why you should use it, some of its more interesting technical bits, and lessons learned over years of building and maintaining a community. Huge thanks to our partners Linode, Fastly, and LaunchDarkly. We love Linode because they keep it fast and simple. Check them out at linode.com/changelog. Our bandwidth is provided by Fastly.
Starting point is 00:00:30 Learn more at fastly.com. And get your feature flags powered by LaunchDarkly. Check them out at launchdarkly.com. What up, friends? You might not be aware, but we've been partnering with Linode since 2016. That's a long time ago. Way back when we first launched our open source platform that you now see at changelog.com, Linode was there to help us, and we are so grateful.
Starting point is 00:00:57 Fast forward several years now, and Linode is still in our corner, behind the scenes helping us to ensure we're running on the very best cloud infrastructure out there. We trust Linode. They keep it fast and they keep it simple. Check them out at linode.com/changelog. I'm joined by Alex Neumann, who is the maintainer of the RESTIC program. Alex, thanks for coming on The Changelog. Yeah, thanks for having me. It's an honor to be here. It's an honor to have you.
Starting point is 00:01:45 I should say we're having you back, sort of, because you were on Go Time. Yeah, that's right. Episode number 48. You haven't been on The Changelog before. But I have to tell you that that episode, it resonated with me because you said something on that episode, which I think I've quoted half a dozen times since then, probably on The Changelog, maybe even without attribution. So I'm here to give you your due credit today,
Starting point is 00:02:06 which you said, nobody wants backup. Everybody wants restore. Yeah, that's a great quote. It's not by me. It's by the Admin Zen, which is a collection of sysadmin-related things that you should do. And this resonates with me quite well
Starting point is 00:02:23 because really a backup isn't worth anything if you cannot restore. So you need to practice this a bit and do it regularly to make sure that, in the event you need something, you can really restore it. Not only that, but backups are kind of a pain in the butt, right? Oh yeah. It's kind of like the saying, what's the saying? Code is liability, features are assets, or something like this. It's like the code is actually a problem that you're going to have to deal with and maintain, et cetera. The feature is what brings value.
Starting point is 00:02:53 And the same thing, like the restore. The restore is the value. But the backup itself is kind of a liability at the end of the day, isn't it? Yeah, and you never know if the backup will restore correctly until you really do it. And most people tend to only do that when they need it, and this might be too late already. So yeah, we're trying to change that with the project. Yeah, so you've been doing this project for a very long
Starting point is 00:03:17 time. Like I said, you were on Go Time number 48, and I think they're in the 200s now, or close to it. So that was 2017, years ago, and you were already working on RESTIC for a while at the time. Yes. So it's been around. Tell us about the genesis of this project and why you decided,
Starting point is 00:03:33 you already gave a little bit of the why, but what was going on when you decided I'm going to solve this problem in the open source world? Initially, I tried to avoid having to solve this problem for myself because I cannot really do everything. So I had to look around.
Starting point is 00:03:49 And at the time, I was in need of a backup program, like really like a few years before. I started RESTIC in 2014. And I've been thinking about doing it like since back in 2012 or something like that. Wow. Because at that time, we threw money together with a few friends and bought like a small server and hosted it in a data center to use as a backup box.
Starting point is 00:04:13 But the thing was that there were several administrative users on that system. So it was a bunch of friends, but some of these friends I knew better, and some people I didn't know that well. So it was always a concern: when I leave my data there, will it be secure? Because everybody with administrative privileges could obviously delete it. But on the other hand, it was my personal data, like financial statements, whatever. And I was concerned that whenever there's another administrative user
Starting point is 00:04:46 that they can access my files. So I had a look at what other backup solutions were around, and they basically fell into two categories. The one category was the enterprisey thing, which means that there was a backup daemon and a distributed system of agents, and they are meant to back up servers. But I would like to just back up my working directory on my personal mobile machine and so on. And this was too overblown. And they also tend to trust the central server
Starting point is 00:05:19 with everything: the data is just collected by the agents, and the central server will just collect the data and store it somewhere. But the threat model for these implementations does not include another administrative user on the central machine who is potentially trying to access data. And on the other end of the spectrum were tools like, I think Obnam was pretty popular at the time, which is a backup program that encrypts everything before sending it to some storage location. But it depends on GPG, and that was really, really slow. So even at the time, I had a fast machine, but it was unable to saturate my upstream bandwidth. So the problem with backups
Starting point is 00:06:06 is that they need to be, in my opinion, really, really fast and not disturb any operation. Because whenever there's something that makes backups harder or increases the amount of friction that I need to go through in order to do a backup, then I won't do it at some point. And then I will not have the version of the file that I need right now. So the backup process itself, saving data somewhere, needs to be as frictionless as possible.
Starting point is 00:06:34 And some of these tools did not satisfy my threat model, and other tools were just too slow. Because Obnam, I think, at the time had a great design, but it depended on calling the GPG binary every time a file was to be encrypted. And this was not fast enough for me. So it's worth noting, you're saying things like threat models and administrative privileges and multi-users.
Starting point is 00:06:58 And so you have a security background. You're a penetration tester, just letting the listener in on that. And so I assume RESTIC has a lot of security things baked into it, or at least that mindset is part of what RESTIC was from the very start. Is that fair to say? Oh yeah. I'm working as a penetration tester, so I'm used to breaking things at work, and in the evening I'm a recreational programmer, so I'm trying to build things whenever
Starting point is 00:07:26 I'm not on the clock, so to say. But RESTIC's design decisions are heavily influenced by what I see at work every day, and I also took my colleagues at work into discussions about the design and so on. So RESTIC has an explicit threat model, for example, because it's always very important to let users know what RESTIC takes into consideration in terms of the threat model: what it can guarantee, but also what it maybe cannot guarantee. That's also very important. So it's very intentional about that. Why don't you go ahead and give a few of the other things that RESTIC tries to do right out of the box that you think are core to what makes RESTIC RESTIC? Yeah, what makes RESTIC RESTIC, that's a great question, actually. What makes RESTIC RESTIC is
Starting point is 00:08:10 that it's really fast. It tries to make maximal use of all the resources that are available, but it tries to do that without shutting the machine down. We did that initially by accident; we can talk about that in a bit if you like. The other thing is that RESTIC must be easy to use. That's really important because, as I already said, whenever there's friction, when I have to look something up in the man page and I'm not able to find it... The command line of tar, for example, is really awful to new users. And whenever you need, for example, to restore an important file and your boss is on your back and breathing down your neck, and then you have to look up what the entire command line is, that is just not going to work with backups. So it must be really easy to use, and we're
Starting point is 00:08:54 still using this to improve the workflow. Whenever we add a feature or correct something, we make sure to ask: what does this feature look like for a new user? Are they able to understand it, or is it too complicated? And for each option or each flag that we add to the RESTIC program, we always talk about whether it is really necessary to expose this flag. Do we really need that, or do we want to keep it internal? And if there are some power users who want to change this constant, for example, then they can just easily rebuild the program. And this is the second important thing. The other thing is that RESTIC does not make many assumptions
Starting point is 00:09:34 about the location at which data is stored. So this may be used, for example, on a shared machine somewhere, on a virtual machine where somebody else is the system administrator or so. And I cannot, as the program, prevent somebody from deleting my data, but I must be able, as a user, to detect that data is missing. This goal cannot be reached completely whenever somebody else is the system administrator, but I need to know whenever there's a modification that would prevent me from properly restoring files. And this is the deliberate attacker model. On the other hand, we've built several layers of fail-safes into RESTIC so that users are able to recognize whenever, for example, their machine has a problem. And we have a dedicated label on GitHub for whenever RESTIC discovers that somebody is using faulty hardware.
Starting point is 00:10:32 And I think the last time I looked at the list, there were seven or eight cases where we discovered broken hardware in the wild. So RESTIC has several layers of fail-safes that users can check themselves. And you can also, for example, order RESTIC to download and check everything from the repository and see if everything is in order. Very cool. So I do want to hear that story. But first, what does it look like to use this program? So it runs on Linux, BSD, Mac, and Windows, a cross-platform thing. It's written in Go, so it has that single binary, that single executable. In terms of installation and getting RESTIC onto other machines, it's very simple in that way. It just has to be executable and on your path, and
Starting point is 00:11:15 you can run it. But let's just say I wanted to back up my laptop's home directory. What would that look like if I had nothing, I hadn't done anything yet? Yeah, it's basically a two-step process. The first thing is similar to Git, which RESTIC takes heavy inspiration from: you have to initialize a repository. We try to use the same or similar words in the same or similar meanings so that people don't have to learn a completely new vocabulary. And the first thing is that you need to tell RESTIC where to store the files. For example, you would like to use a folder on a USB drive or something like that. And then you would call restic --repository, --repo I think it's called,
Starting point is 00:11:57 and tell it to store the files at /srv/external_data or something like that. And then you tell it to run the subcommand init, and that's about it. Then it initializes the repository and asks you for a password twice. And this password is very important. You can also give it to the program via, for example, a password file, or call an external program to get the password, or use an environment variable or something like that. And this password is the most important thing about the repository, because if you ever lose it, then you won't be able to access any data. I made sure that this assumption holds.
Starting point is 00:12:37 And yeah, this was the first part. And the second part is to just call restic --repo with the path again, and then tell it to backup, which is the other subcommand that's important here, /home/jerod, for example. And then it will just start working. Then you go ahead and just run that on a schedule and... Yeah, basically.
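As a rough illustration of the two steps Alex describes, the sketch below builds the `init` and `backup` invocations in Python and shows one way to hand over the password through a password file. The repository path and home directory are the examples from the conversation. The helper names `restic_cmd` and `run_restic` and the password file location are made up for this sketch, and it assumes restic's `RESTIC_PASSWORD_FILE` environment variable is used to point at the password file.

```python
import os
import subprocess


def restic_cmd(repo, *args):
    """Build the argument list for a restic call against one repository."""
    return ["restic", "--repo", repo, *args]


def run_restic(repo, password_file, *args):
    """Run restic non-interactively (hypothetical helper, not part of restic).

    The password is read from a file named in RESTIC_PASSWORD_FILE so that
    a scheduled job does not have to prompt for it.
    """
    env = dict(os.environ, RESTIC_PASSWORD_FILE=password_file)
    return subprocess.run(restic_cmd(repo, *args), env=env, check=True)


REPO = "/srv/external_data"  # example path mentioned in the conversation

# Step 1: initialize the repository (done once).
init_cmd = restic_cmd(REPO, "init")

# Step 2: back up a directory (run on a schedule).
backup_cmd = restic_cmd(REPO, "backup", "/home/jerod")

print(" ".join(init_cmd))
print(" ".join(backup_cmd))
```

Calling `run_restic(REPO, "/root/restic-password", "backup", "/home/jerod")` from a scheduled job would then perform the second step; the password file path here is only a placeholder.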
Starting point is 00:12:55 You have to supply it with the location of the repository where the file should be stored and the password that the repository uses. And that's it. Nice. So what about destinations? Are there lots of different ways? Maybe you can SSH to another machine. Maybe you can connect to a cloud account.
Starting point is 00:13:12 I'm sure there's got to be concerns for those kind of things. Like where can I back up to? Got my NAS on my home network, etc. Yeah, exactly. We have a list of built-in backends that we support. There's the local file system, which is the easiest one. And the other one that I had in mind when I started the project was SFTP, which means that you can use, for example, your local NAS
Starting point is 00:13:34 or some virtual machine somewhere in a data center, as long as you're able to SSH into it and run an SFTP server. There's no need for like a server-side component or anything like that. So other backup solutions, we can talk about interesting alternatives later maybe. Other backup solutions require to install a server-side component
Starting point is 00:13:55 in the roughly correct version to be fast or responsive, and RESTIC doesn't need that. So we have a list of built-in things. We've talked about the local folder and the SFTP server and, I think, four or five different mainline cloud storage backends like Backblaze B2 or Google Cloud Storage, Amazon S3, and Microsoft Azure. These are the built-in ones. And we also support using another popular open source program
Starting point is 00:14:26 called Rclone, which you can just use as a backend for RESTIC. It has a special built-in mode to be used as a RESTIC backend. And then you can basically use any cloud storage service that is supported by this program, even obscure ones like FTP or something like that. So these arcane protocols are still supported via Rclone. So that's interesting that you mentioned some of the cloud providers,
Starting point is 00:14:49 because when I see things like RESTIC, and let me just say, I've never spoken to a RESTIC user who hasn't just raved about it. I mean, it's beloved by its users. And I always think like, how can we bring this to the 99%, you know, because you have like the 1% of people who understand a terminal and these things,
Starting point is 00:15:08 you know, and maybe it's more than 1% of the humanity, but probably not more than 2%. And it's like, this is such a valuable thing, you know, like a reliable, securable or secure,
Starting point is 00:15:20 et cetera, et cetera, well-written, cared-about program that's going to back up your files and really do it well. How can we get this to more people? And then I go to thinking, well, most of the people who want that but don't have the skills are using something like a Backblaze or insert-your-cloud-provider-here, but they're paying them to do that. And so I'm curious, when you say you have a Backblaze interface or Backblaze adapter, what exactly does that do? And why would I want to plug into Backblaze? Don't they back up everything
Starting point is 00:15:49 for me anyways, via their built-in tools? Backblaze the company has different services, as far as I know. One is the popular backup program, and another one is just a simple blob storage. They are using their high-availability, multi-location storage thing, and you can just use it to store files. And this is what RESTIC uses; it has nothing to do with the Backblaze backup service. Gotcha. So it's kind of like a power user feature of Backblaze. Yes, something like that. It's like, we just want their cloud storage, I don't want all their other things. Cool. And there is even a commercial offering from somebody who's written a very nice local web UI for RESTIC, which is, as far as I know, even suitable for end users who are not
Starting point is 00:16:38 that technically savvy. It's called Relica, and it's written by Matt Holt, who's also written the Caddy web server. They are offering it as a subscription model, and what they also added was, yeah, distributed backup storage among a group of friends, so that you can join and buy the subscription, I think, and then you can share your files with all of your friends and you store some files for your friends. So you have a distributed backup in your local cloud of friends, so to say. Right, which is kind of where you started with this, right? When you were back in 2012,
Starting point is 00:17:15 you wanted a server that you and your friends could all share and could back up to there. Exactly, but with the subscription, I don't think you need a server. It'll just distribute the files among your peers. But they also offer upload to cloud services and so on. So that's cool. Have you ever considered that for yourself and RESTic?
Starting point is 00:17:35 I mean, we're going to talk about some of the open source, community, sustainability, etc., which ultimately goes to like, hey, valuable thing. Why can't you extract some value when you're putting all this value into the world? And here we have some other open source people doing that with RESTIC, which is awesome. Have you considered anything like that for yourself? Or are you just happy to hack on it, nights-and-weekends style, and keep your day job? What are your thoughts around that? I think it's twofold. The one thing is that as soon as I'm doing RESTIC as like a full time
Starting point is 00:18:02 job, then it may be not so interesting anymore to work on it because it's a job. And at the moment I can decide that, okay, in this evening or this week, I don't have any time for RESTIC and just let other people, which we have right now, take care of the project. And on the other end, IT security pays really well.
Starting point is 00:18:21 And I really like my day job and the company I'm working for. So this is not an option for me right now. It also seems like I'm not the entrepreneur kind of person. I'm more of an engineer, I think. Yeah. And I'm happy that there are other services who offer RESTIC support and so on. Yeah, that's really cool. I think it's well said. I mean, penetration testing is really kind of like, I've done it some. I used to do it back in college and right out of college,
Starting point is 00:18:50 I did some penetration testing on contract. And it is really kind of like a game. I mean, it can be fun. It also can be a grind, right? Yeah, you have to write a report at the end, right? Yeah, exactly. The report at the end. If you can get someone else to write that for you,
Starting point is 00:19:02 then it's just fun. The whole thing is fun. But if you enjoy your day job, and like you said, it's good pay, you can live a quality life off that salary and keep it fun and free and a hobby. You don't risk ruining it by making it your job. Exactly. Yeah, I also like explaining things to people, and usually our customers are very interested in what we find. So this is a very satisfying job, at least for me. in production at any scale, here's how it works. LaunchDarkly enables development teams and operation teams to deploy code at any time, even if a feature isn't ready to be released to users.
Starting point is 00:19:50 Wrapping code with feature flags gives you the safety to test new features and infrastructure in your production environments without impacting the wrong end users. When you're ready to release more widely, update the flag status and the changes are made instantaneously by the real-time streaming architecture. Eliminate risk, deliver value, get started for free today at LaunchDarkly.com. Again, LaunchDarkly.com. So Alex, you described to me how you use RESTIC. We haven't talked much about how RESTIC accomplishes what it does. We talked about it's written in Go.
Starting point is 00:20:36 It's a single binary at the end of the day, so distribution is somewhat simple. But how does it work on the inside? Explain to us a few of the internals that are interesting. Yeah, let's do that. The first thing that RESTIC does whenever it sees a new file that it hasn't seen before is that it will just open the file and read all the data. It will then try to cut the file into so-called chunks, which are data blobs between 512K and, I think, four or eight megabytes, which is the largest size. And it will recognize these chunks and store them separately,
Starting point is 00:21:13 independent of the file. So whenever you have a file that is a copy of another file, it will recognize that. And it will also recognize it when, for example, there is a log file, and a day later RESTIC sees the same file, but there have been 100 megabytes appended to it. Then it will not store the first part of the file again, but it will see that these chunks have already been stored. So in this case it will do a so-called deduplication of all the data that is stored in such a repository. So this is really interesting, because the algorithm that we're using to cut parts is able to recognize parts,
Starting point is 00:21:55 even if some data has been inserted at the beginning of the file. Then you will just have changes in the first chunk, but all the others will still properly deduplicate. So this is different from most other backup solutions, which work on either complete files or strict one-megabyte boundaries for these chunks. And when RESTIC reads a file, it will see which chunks the file consists of, and it will upload only the new chunks that haven't been seen before. So you have a global deduplication within one repository, which is very space efficient, and
Starting point is 00:22:32 basically, if you have a backup which contains not much changed data, then you will only have to store the differences to the previous version. In contrast to the other backup programs that have been out there for a long time, RESTIC doesn't distinguish between a full backup and an incremental backup. In this case, every backup RESTIC stores is independent of all the others, so that you can just restore the backup, because it just consists of a list of files and a list of chunks that the files consist of. So you need to do an operation called prune, which does a bit of garbage collection. Whenever you remove a snapshot according to a
Starting point is 00:23:10 policy, for example, then it needs to look up all the chunks that are still in the repository and remove the ones that aren't used anymore. Interesting. So with a RESTIC repository backup, you have snapshots effectively each time that RESTIC is scheduled to run. So if you schedule it once a day, it'll have 31 at the end of January, right? Snapshots of what your files look like on disk at the time it ran.
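The chunking and deduplication behavior Alex describes above can be sketched in a few lines of Python. This is a toy under stated assumptions, not RESTIC's implementation: it swaps in a simple "gear" rolling hash to pick cut points from the content, uses much smaller chunk sizes than the 512K-to-several-megabytes range mentioned in the conversation, and identifies chunks by their SHA-256 digest. The names `split`, `store`, `BITS`, `MIN_SIZE`, and `MAX_SIZE` are invented for this sketch.

```python
import hashlib
import random

BITS = 13               # cut when the top 13 hash bits are zero (toy value)
MIN_SIZE = 2 * 1024     # toy minimum chunk size
MAX_SIZE = 64 * 1024    # toy maximum chunk size
U64 = (1 << 64) - 1
_rng = random.Random(1)
GEAR = [_rng.getrandbits(64) for _ in range(256)]  # one random value per byte


def split(data):
    """Cut data into chunks at positions chosen by the content itself."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Rolling hash: older bytes are shifted out after 64 steps, so the
        # value depends only on the most recent bytes, not on the offset.
        h = ((h << 1) + GEAR[byte]) & U64
        size = i - start + 1
        if size < MIN_SIZE:
            continue
        if size >= MAX_SIZE or (h >> (64 - BITS)) == 0:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # whatever is left becomes the last chunk
    return chunks


def store(repo, data):
    """Store data in repo (a dict); return (chunk ids, newly stored count)."""
    ids, new = [], 0
    for chunk in split(data):
        cid = hashlib.sha256(chunk).hexdigest()
        if cid not in repo:      # already known chunks are not stored again
            repo[cid] = chunk
            new += 1
        ids.append(cid)
    return ids, new


size = 200_000
original = random.Random(2).getrandbits(8 * size).to_bytes(size, "big")

repo = {}
ids_a, new_a = store(repo, original)
ids_b, new_b = store(repo, b"inserted at the front" + original)
ids_c, new_c = store(repo, original + b"appended log line\n")

print("first backup stored", new_a, "chunks")
print("after inserting at the front:", new_b, "new of", len(ids_b))
print("after appending at the end:", new_c, "new of", len(ids_c))
```

Because the cut points depend on the bytes near them rather than on fixed offsets, an insertion at the front typically disturbs only the first chunk or two, and an append only touches the tail. With fixed one-megabyte boundaries, the same insertion would shift every boundary and nothing after it would deduplicate.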
Starting point is 00:23:39 Exactly. But it does do incremental insofar as it's saving differences between those snapshots. But maybe I didn't track. You said it's not incremental, but it sounds like it is. Yeah, maybe explain it again. It sounds like it is, because let's say you created a picture on the first of January, maybe New Year's Eve celebrations, something like that, and you store it in a folder that is saved by RESTIC. So on the first of January, RESTIC will read the file, split it into chunks, and store these, like, five chunks somewhere in
Starting point is 00:24:11 the repository. Let's talk about that in a bit. And on the subsequent days, whenever it sees the file, it will first recognize that it has seen this exact file before. So it will not store it again, but just use the list of blobs the file consists of from the previous backup done on the first of January. And whenever you modify the file, for example, let's say you add a fancy border around it or something like that and save it again, it will recognize that the file has been changed and it will read the file again. A picture is not a good example here, because let's say only the beginning of the file and the end of the file have been modified, and in the
Starting point is 00:24:51 middle of the 20-megabyte file, the 15 megabytes in the middle are completely unchanged. Then RESTIC will read the file and see that just the first few chunks at the beginning and the last few chunks at the end have been modified. So it will make a new list of chunks the file consists of, and it will only store the new chunks that haven't been stored before in the repository. And each of these snapshots is completely independent, which means that RESTIC stores all the metadata information
Starting point is 00:25:22 for the file on the 1st of January, which means the file name and the modification timestamp and the list of chunks it consists of. And when you change it on the 5th of January, it will also store this metadata information, which means the file name, the new modification timestamp, and the new list of chunks it consists of. Gotcha.
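To make the "every snapshot is independent" idea and the prune step concrete, here is a minimal in-memory model. It is a sketch under stated assumptions rather than RESTIC's repository format: the `Repo` class, its method names, and the fixed four-byte chunks are all invented to keep the example short, whereas the real program cuts chunks based on content and stores everything encrypted on disk.

```python
import hashlib


class Repo:
    """Toy repository: shared chunks plus self-contained snapshots."""

    def __init__(self):
        self.chunks = {}     # chunk id -> chunk bytes, shared by all snapshots
        self.snapshots = {}  # snapshot name -> {path: file metadata}

    def backup(self, name, files, chunk_size=4):
        """files maps path -> (mtime, content)."""
        tree = {}
        for path, (mtime, content) in files.items():
            ids = []
            for off in range(0, len(content), chunk_size):
                piece = content[off:off + chunk_size]
                cid = hashlib.sha256(piece).hexdigest()
                self.chunks.setdefault(cid, piece)  # store only unseen chunks
                ids.append(cid)
            # Each snapshot records the complete chunk list for every file,
            # so it never depends on an earlier snapshot.
            tree[path] = {"mtime": mtime, "chunks": ids}
        self.snapshots[name] = tree

    def restore(self, name):
        """Rebuild every file of one snapshot from its chunk lists."""
        return {
            path: b"".join(self.chunks[cid] for cid in meta["chunks"])
            for path, meta in self.snapshots[name].items()
        }

    def forget(self, name):
        """Drop a snapshot; its chunks stay until prune runs."""
        del self.snapshots[name]

    def prune(self):
        """Garbage-collect chunks that no remaining snapshot references."""
        used = {
            cid
            for tree in self.snapshots.values()
            for meta in tree.values()
            for cid in meta["chunks"]
        }
        unused = set(self.chunks) - used
        for cid in unused:
            del self.chunks[cid]
        return len(unused)


repo = Repo()
repo.backup("jan-01", {"photo.raw": (1, b"AAAABBBBCCCC")})
repo.backup("jan-05", {"photo.raw": (5, b"XXXXBBBBYYYY")})  # middle unchanged
stored_after_two_backups = len(repo.chunks)

first = repo.restore("jan-01")
repo.forget("jan-01")
removed = repo.prune()
second = repo.restore("jan-05")
```

The second backup only adds the two changed chunks, yet removing the first snapshot and pruning leaves the second one fully restorable, which is the property that distinguishes this model from a chain of full and incremental backups.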
Starting point is 00:25:41 So the RESTIC repository really stores a few different things. The first is the arbitrary number of chunks, the other is metadata information for files and folders, and the third is the snapshot information: when was a snapshot made and which metadata does it consist of. Yeah, so you kind of get the best of both worlds, because with an incremental backup, generally you have your last full backup, right? And then you have 95 incrementals since then, or however many there were. And in order to get to number 94,
Starting point is 00:26:14 you have to have the full, and then you have to also be able to run, usually, in order, those incrementals to get to the point where it is. But this has the advantages of incremental insofar as you're storing incremental changes, or you're storing just the new chunks, or the changed chunks,
Starting point is 00:26:30 but the incremental backup itself is not incremental because it has the metadata it needs. That's pretty neat. Now I track you. Yeah, and whenever you need, at least on macOS and Linux and BSD, whenever you need a file, but you're not really sure which version of it that you need,
Starting point is 00:26:50 then you can just use RESTIC mount and have a FUSE-mounted virtual file system. And you can browse all the snapshots and all the files in there. And you can use your regular shell functions like find and ls and du and so on to get to the file version that you need. And it will only fetch the data
Starting point is 00:27:09 from the possibly remote repository that it needs at that time. So it's really fast. It also has a local cache of metadata. But whenever you open a file, it will only pull down and download the chunks that are needed to fulfill the user's request and show them the picture, for example. Gotcha. So the backup repository is very much
Starting point is 00:27:34 a RESTIC thing, in terms of, you're not mirroring a file system onto a backup. Yeah. In other words, you need RESTIC to restore. Like, if RESTIC backs it up, you need RESTIC to restore. Yes, that's right. Right now, by the way, we've achieved complete version compatibility, even with the first released versions of RESTIC. So you can still use a very old version of RESTIC to restore a repository that has been created recently with a recent version, and vice versa. And what we also have, which was really helpful because people started to re-implement the repository algorithms that we use in other programming languages, is a complete specification
Starting point is 00:28:15 written down as a, I think, Markdown document, which is completely independent of the implementation of RESTIC itself. This was very handy to have. I've started with RESTIC and implemented the chunk cutting algorithm and everything. But then I sat down and wrote the first version of the design document, which is, as I said, independent of the implementation. It is really valuable to get back to that and improve the wording and so on, and also show it to other programmers who are interested in understanding the data structures involved. So we can just point to the document and say, okay, this is the set of vocabulary that we use.
Starting point is 00:28:57 These are the data structures. And at the end of the day, a RESTIC repository is just a collection of files and folders. And there are files in there consisting of the data chunks. There are files in there consisting of the metadata chunks. And there are a few other files, like, for example, for each snapshot, there is a small file that contains the timestamp of the snapshot and the user and which metadata and file and folder structure it references. And you can start from that and then look at the implementation
Starting point is 00:29:28 and how RESTIC does things. This was one of the very interesting discoveries when I discussed it initially with my friends and my co-workers that the most important thing about the RESTIC project is not the implementation itself, but it is the repository format. Because people or even users expect that they can restore their backups even like 10 years or 20 years from now.
Starting point is 00:29:54 So the most important thing is not what features does the backup program have, but how good is the specification of the storage format. And there are toy implementations that reimplement all the things needed to access data in a RESTIC repository from scratch just by using the design document. And this is somehow like the,
Starting point is 00:30:14 I think there is a FreeBSD manual which explains the design of the operating system FreeBSD from the ground up, which is, I haven't read it, but it's on my bucket list to do that. And this is something like that for the repository format. Well said. You obviously saw where I was driving to when I said you need RESTIC to restore RESTIC, because if restore is the feature and it's not stored as like an operating system level primitive, although it is at the end of the day, but it has its own format, then you obviously need,
Starting point is 00:30:43 you know, you want RESTIC to be around, but it sounds like you guys have well prepared for a backwards compatibility and even this specification where, you know, you could disappear, RESTIC could disappear. It could be completely changed or something, but somebody could go out and re-implement the restore because it's been documented so well.
Starting point is 00:31:00 So that's spectacular. Yeah, this was really important for me, because the realization that the storage format is more important than the implementation, and also the community, is something that's easy to see in hindsight once you've thought about it. But to arrive at this point, I had several discussions with many people before realizing it. We have a forum where people can ask questions and share usage stories, and one person there asked why RESTIC was written in Go. I've tried to answer that, but the first
Starting point is 00:31:40 thing that I made sure to include in this section was that the programming language is nice to have, and I really love the Go ecosystem, but the programming language is not the most important part of the project, and even the implementation isn't. It's the repository format and the backwards compatibility. At the beginning, I also had to decide which license I should use for RESTIC. And I've decided that, at least for all of my code and all code that's contributed to RESTIC, it's the two-clause BSD license, which is one of the most permissive licenses there is. So it's no GPL or anything like that. So you can even take the code and use it commercially without contributing back.
Starting point is 00:32:25 What led you to that decision? At first, when I started with free software in the late 90s, the GPL was really popular. If you're using GPL software and you're developing it further, then you have to contribute your changes back, at least once you start publishing your software. But in practice, what I also saw for our customers, for example, it was really hard sometimes to use GPL software in a commercial context
Starting point is 00:32:53 because of the considerations that the legal departments of the companies, for example, have against the GPL. So whenever they are not just a user of a program, but they are modifying it and using it for internal processes, for example, then it's sometimes hard to get a GPL program or GPL library approved at all. And this is another thing that can be a source of friction: whenever you have a license that you need to get approved before you can use a program, then maybe you postpone implementing backups until someday you need a restore and you don't have a backup. So maybe I'm too naive or too idealistic, but I think that the two-clause BSD
Starting point is 00:33:36 license is a great choice for a backup program. What else is cool or unique about RESTIC? Everything in a RESTIC repository, besides a few really tiny bits of data, is completely encrypted. As I said in the beginning, when you initialize a repository, you have to supply a password, and this password is not optional. You have to supply some kind of password, and it uses strong cryptography and stores everything encrypted. So all these data chunks that I talked about earlier and all the metadata, it's all encrypted. There is almost nothing that's not encrypted in a RESTIC repository.
Starting point is 00:34:10 So the security is very important to me personally there. So write down that password somewhere because if you lose it, you lose your backup. Yeah, sometimes we get discussions with people who'd like to use RESTIC but don't need strong cryptography to encrypt their data. For example, they are storing the repository on an already encrypted drive and they'd like to spare the CPU cycles needed to encrypt the data, so they would like to have an option
Starting point is 00:34:37 to turn off the password requirement, for example. But that's really hard to do with the current design. The easiest would be to just use some dummy password. But if people are required to input some password, even if it's a password like "test", this alone makes it a bit harder for attackers to just guess and use the default password. So this is why we don't permit using an empty password, for example, so that you have some kind of hurdle for attackers there, even if you'd like to use a single-character password. What happened with other backup programs was also that
Starting point is 00:35:16 once you have a code path where RESTIC, for example, could be used with a repository without supplying a password, then, and there were bugs like this in the past in other backup programs, attackers could, for example, remove the original repository for a user and create a new repository that's not encrypted, and on the next run of the backup tool, the data is saved without encryption. This was something that I'd like to prevent with RESTIC. So there is no code path in RESTIC which leads to a repository with data that is not encrypted. Yeah, I think that's a good stance to take. I could definitely see where the pushback
Starting point is 00:35:58 would come because it's definitely a convenience versus security trade-off. Well, yes, some things you might want them to exist in duplicate, but you do not care about them being private or secret, right? And so you want the convenience. Maybe it's, I'm sure you have better reasons than I would have why people might want that, but I can see where that would be something that certain folks would want.
Starting point is 00:36:20 That being said, the other advantage of staying strong on that particular feature is you don't have to bifurcate anything in the code. Where it's like now, and I'm not sure how it's architected, maybe this is a simple place where it's like to encrypt or not to encrypt. And it's just like a toggle. But lots of times those kinds of decisions end up just kind of permeating the code base where you have to check in a bunch of places what you're going to do or not do. How is RESTIC designed? Would it be a simple change or would it be a complex thing to allow for unencrypted backups? It would be a really complex thing. So
Starting point is 00:36:57 supplying a default password and just encrypting the data anyway, that would be a rather easy change. But everything in a RESTIC repository is encrypted, which means that a chunk of data, for example, coming back to the example of your photo that you took on the 1st of January, all the data chunks are encrypted and then signed and then concatenated together as a file of multiple chunks and then uploaded to the repository.
Starting point is 00:37:25 And the metadata information for the folder that contains the file, it's a JSON document internally, it's encrypted, it is signed, and then it's uploaded to the repository. So whenever you have these things, you'd need to, at every place, you would need to check
Starting point is 00:37:41 is it a repository that should only contain encrypted data, or is it not? So you'd have to insert this check at every place. The other thing is that almost all the files in the repository are named after the SHA-256 hash of their content, so that on a server you can easily check with the sha256sum tool that a file is unmodified, that there is no bit rot. And you can do that from the outside without even having a password for the repository. And there are some design decisions within RESTIC, for example, there's also a lock file. Whenever you start a backup, RESTIC uploads a lock file to the repository to tell other clients that might be running concurrently that a backup is currently in progress, so that you don't start removing just-uploaded data chunks, for example. And the
Starting point is 00:38:36 creation of these lock files, for example, requires that the file name is always unique, and we guarantee that by taking the encryption properties into account. For example, each encryption generates a new nonce value, which is like a 16-byte random value, which leads to a completely different file name because the SHA-2 hash of the content is completely different. All these things you would lose whenever you rip out the encryption. You could say that, okay, let's not encrypt, but let's sign the data, which is not the worst idea, because you can verify that the data has not been modified, for example, by accident even. But you'd need to rip out the cryptography and the encryption everywhere,
Starting point is 00:39:20 and it's tightly integrated. And I think that's a good idea. That's a feature. In general, the RESTIC program and the community around it are pretty opinionated. We took that from the Go project, which is also pretty opinionated. We're not trying to cater to every use case. This episode is brought to you by our friends at O'Reilly. Many of you know O'Reilly for their animal tech books and their conferences,
Starting point is 00:40:04 but you may not know they have an online learning platform as well. The platform has all their books, all their videos, and all their conference talks. Plus, you can learn by doing with live online training courses and virtual conferences,
Starting point is 00:40:18 certification practice exams, and interactive sandboxes and scenarios to practice coding alongside what you're learning. They cover a ton of technology topics, machine learning, AI, programming languages, DevOps, data science, cloud, containers, security, and even soft skills like business management and presentation skills. You name it, it is all in there. If you need to keep your team or yourself up to speed on their tech skills, then check out O'Reilly's online learning platform. Learn more and keep your team skills sharp at O'Reilly.com slash changelog.
Starting point is 00:40:49 Again, O'Reilly.com slash changelog. So Alex, like I said at the beginning, one of the things I'm impressed by is how long you've been working on this project. RESTIC 0.12, your most recent release, February 14th. Still trucking, still making improvements. That one had many speed improvements, and a special thanks went out to Alexander Weiss, or "Vice", who did those. So you have a bunch of people helping you out. Here's a major release with a lot of cool things done by Alexander. Tell me about the community around RESTIC and how you've built it to be something that people are making major contributions to. Yeah, that's a very interesting thing. And I'd like to thank all the community around RESTIC because when I started the project,
Starting point is 00:41:50 I was on my own with like two friends helping me out sometimes, but I was the main developer. And I still think that most of the code is written by me, but I think at some point I will be in the minority, because somehow the project attracted a few very talented engineers.
Starting point is 00:42:11 And we even have people like hanging out in the discourse forum that we installed for RESTIC and just helping other people. And in the beginning, I made sure to set the right tone. So like I responded in a cheery way and whenever somebody hit a bug or something like that, then I would say like, oh, I'm sorry that you hit the bug. Here's what you can do.
Starting point is 00:42:31 And this was really important and set the tone for all the other interactions happening in RESTIC project spaces. I mentioned the Discourse forum a few times already. And I think this is an excellent piece of software that helped us very much, because it's almost no work to moderate it. There are community moderators, and sometimes users can even just flag things, spam for example, and whenever three or more people who have spent enough time around in the forum have flagged it, it will be all
Starting point is 00:43:02 automatically hidden from the public and so on. And setting the tone in the beginning really set the tone for the whole project in the GitHub issues, in the forum. And people are so helpful and so positive. It's amazing. And this attracted people who just hang around in the forum and helping people. They don't contribute code, but they don't need to. They are just trying to help other people.
Starting point is 00:43:27 And I think that's completely amazing. That blew me away whenever I see that. And we managed somehow, I'm not sure how, we managed to attract several great engineers. For example, there's Michael, who does a lot of the bug fixing and triaging things. There's Leo, who responds to issues and there is Alexander Weiss who,
Starting point is 00:43:48 yeah, from scratch, more or less re-implemented the garbage collection algorithm that at the beginning took a lot of time because I wrote the algorithm in the dumbest way that I can imagine in order to be really sure that no data that is still being used
Starting point is 00:44:03 is accidentally removed. And I took a week of vacation last year to just read through all the code that he did to really make sure that there are no bugs that I could spot. And afterwards, I merged it, and these speed improvements are completely awesome because we didn't have to change the repository format in any way. Changing the repository format is out of the question for
Starting point is 00:44:30 most things. But when you set people limits, technical limits, like the data structures are this way and we have to keep changes backwards compatible, sometimes they get really great ideas on how to improve the speed without changing the repository format and without crossing the barriers that I set them. And this is amazing. Let me back up for a second,
Starting point is 00:44:55 because you took a week off of vacation to work on RESTIC. I mean, talk about amazing. Yeah, yes, I did. Sometimes I like to call myself a recreational programmer, and I had a bit of vacation days saved up at the end of last year, and my wife just started working again after having kids, and she didn't have the vacation days. So I just took them, and I had a bit of spare time for myself. And then I, you know, took a coffee, got into the basement, and started reading GitHub issues and pull requests, for example.
Starting point is 00:45:28 At the moment, unfortunately, the project is way too large for me to manage it myself. So we have a bunch of people helping out there. And at the moment, it's like I'm not contributing as much as I'd like to. We have a global pandemic going on and my life is crazy right now. But it's great to know that I can jump back in, and they will just ask questions that are more like management things, like, shall we do this, or is it maybe a better idea to leave it out of the project? And it just feels great to know that there are other people caring about the project and keeping the work going and improving it. Yeah, that's spectacular. I think, you know, they follow your lead. Here's a guy who's
Starting point is 00:46:10 willing to sit down and triage issues on his vacation and look at PRs. I mean, you very much have shown that you're dedicated to this project even many, many years after you began it. And, you know, kudos to you on the community that you've built, because as you had that insight of like, well, you have to lead with regards to the way you want people to act. And like, you had to be out front with that because culture really does come from the core, right? And the first person that starts the project is the core. And so you built this cool community culture around RESTIC. Any other lessons learned that you've had over the years? Because you've been doing this for, what, seven years, eight years now, maybe, working on it?
Starting point is 00:46:50 Yeah, something like that. So the first released version was in, I think, April. I started working on it, I think, in April 2014 or something like that. So it's quite a while back. Yeah. And the first recommendation that I have for other projects is install a Discourse forum. The software is really amazing, and it's valuable to distinguish
Starting point is 00:47:10 between bug reports and feature requests and other things, like users asking, okay, I have this setting and I'd like to do backups this way, is there a better way to do it, or something like that. So you have a separate place for other discussions. And the second thing is that sometimes people sound harsh on the internet, but it's not meant to be harsh. And sometimes it really helps to clearly point it out to people, like, okay, you come across as very aggressive or very demanding, or sarcasm doesn't help here, can you please say it in another way, or something like that, and just write it in a GitHub issue. And most of the time they respond like, oh, you're right, I'm really sorry, I was in a bad mood, or something like that. So this is what happened. And another trick that I've just copied from another
Starting point is 00:48:02 open source project is that whenever you report a bug or a feature request for RESTIC, you get like on GitHub, you get a questionnaire of things that you'd like to do, like report the version number, which operating system are you on? What are you using for storing the repository? And at the end of the issue template,
Starting point is 00:48:21 there is the question: did RESTIC help you today? Did it make you happy in any way? And whenever I read through a bug report, I can see that, okay, this failed, and the user got a strange error message that I didn't manage to format in a nice way, and the program spat a backtrace at the user, and they are confused and didn't know what to do. Maybe it was important or something like that. And at the end of the issue template, you read like, okay, RESTIC is an amazing program.
Starting point is 00:48:50 It saved my ass several times already, and just keep doing what you do. And oftentimes you have a really dry and maybe even bad-sounding or negative-sounding bug report, and at the end of it there is a really positive ending, because the user is really happy and would just like to improve the program and, yeah, get a bit of help on how they can, for example, restore files. And this trick is really amazing because it gives you a personal connection to the bug reporter and really makes it much
Starting point is 00:49:22 easier to gauge: is the user just pissed at the program because it didn't work, or is it just like, okay, you can fix this sometime, it's not important anyway? And this is a really nice trick. And if you look at the issue template in RESTIC's repository, I even included a link to the other repository that I got this trick from. Yeah, that's spectacular. I think it's always been advice that I've given and I try to practice when I open a bug report or I ask a question, why is this not working the way I expect it to work, that I try to provide some level of praise or positivity about the thing, either at the beginning or at the end, or if you can sandwich it, great. Sometimes you're not feeling all that positive
Starting point is 00:50:06 about it, so you have to work harder. But I think, and I've seen it happen, so I see other people do that as well, but I think if you're prompting somebody, you know, you're kind of actually giving them that explicit opportunity where maybe they weren't even thinking about it. A lot of times when it's time to report
Starting point is 00:50:22 a bug, you're, and it depends on the project. Maybe with RESTIC it's not this way, but if it's a library you're using or a framework, sometimes you're hours into it, you know, and you've thrown up your hands and you can't figure it out. And maybe it's your own problem for a while, but then you realize, oh, it's not,
Starting point is 00:50:39 oh, it was the library or it was RESTIC's fault, you know? And it's just tough at that spot to take a breath and look at the bigger picture. But maybe you've been down that road because you've been using RESTIC for all these years and backing up everything perfectly, and then you found this little issue, and that prompts somebody to say, oh no, I love RESTIC, this is like one of my favorite things ever, I'm just really mad at this particular moment, you know? So I think giving those people that explicit, that prompt to have that opportunity is a great idea.
Starting point is 00:51:13 Yeah, it's really motivating to read that because you get all the issues. Usually, for most projects, you get all the negativity, all the bugs or the missing features or whatever. And having every issue report end with some kind of positive note really helps tremendously. And I was completely blown away by how people use RESTIC. For example, CERN, C-E-R-N, the European atomic research institute, I'm not sure what the correct name is, I think it's in French, they are using RESTIC. I found that out from somebody who tweeted at me and said, okay, hey, here's a presentation
Starting point is 00:51:49 about RESTIC at CERN, what are they doing? And then the author chimed in, and it turns out they're using it for one of their computer pool installations for like 60k users, something like that. And sometime, I think it was November one year, several years back, somebody opened an issue and said, okay, yeah, RESTIC's not working here. And I said, okay, can you debug this and paste the output? And he said, okay, I can do that, but in order to download the debug binary, it will take until tomorrow. And I thought, okay, do you have some kind of problem with the downstream bandwidth? Why are you using a remote repository and so on?
Starting point is 00:52:29 And it turned out they were on a ship cruising through the Arctic in a scientific expedition, and they were using RESTIC with Minio to back up all their research data. So they had only satellite internet with 64 kilobits of downstream bandwidth and Go binaries are great, but they are not that small.
Starting point is 00:52:50 So I made sure to send them a source code patch so they could build it locally. And this was just amazing, knowing that my little backup program that I'm doing in my spare time for recreational purposes is used by scientific installations and scientific institutions to save really important data. Right. Well, the fact is that
Starting point is 00:53:12 some data is so important that the backup, it's like everything, you know. It's peace of mind, right? Especially when you know you can restore it. But having that backup is such a peace of mind. That's why I think it's not a surprise that so many people who talk about RESTIC love it, because it's like, this program has my back. This has my backup, literally. And maybe my job's on the line. Maybe nuclear research is on the line, right?
Starting point is 00:53:38 Maybe this science experiment is on, whatever it is. But if it's working the way I expect it to, I can sleep better at night. And not much software does that, gives you peace of mind. And so I think it makes sense to me that this is a hobby passion project that you've been able to sustain for so long, with no financial, you know, arrangements and a lot of hard work over the years, because you're really affecting people's lives in a really positive way. And I'm sure when you hear those stories,
Starting point is 00:54:11 it has to feel so good. Oh yeah, it does. Unfortunately, sometimes it guilt trips me into spending more time on RESTIC than I'd like. So years ago, I've switched off all GitHub email notifications because my inbox on GitHub is completely unusable with a project with 12K stars.
Starting point is 00:54:30 And we also have many more pull requests open than I'd like to have, but we don't have the resources or the time to review them all. So sometimes people contribute something and it takes a lot of back and forth, or they even don't get a response for several months. This is an issue with several other open source projects as well. And sometimes it's like, in the evening I have a bit of spare time and I read an issue from somebody who's lost
Starting point is 00:54:56 their master's thesis, and their repository is broken because the RAM was bad or something like that. And at some point I spent half of my night writing a patch for it so that they could at least restore part of it, and they were really grateful, and it felt amazing to help them. But I cannot do that every week. So I turned off all notifications, and I only look at RESTIC whenever I have a dedicated hour or two to look at it. And in the last two or three months, I haven't really been able to do that regularly, but I'd like to do that.
Starting point is 00:55:33 But at the moment we have winter in Germany and so it's long nights and it's very dark and I just usually go to bed early and leave the GitHub notifications when I have the time for them. Yeah. Do you have an exit strategy? Do you have it?
Starting point is 00:55:51 Is there a future for RESTIC beyond Alex or no? Are you eternally linked? That's a great question. At the beginning, I made sure to not link the project to my person so much. So I created a GitHub organization for it, which is independent of my personal account, and I also made sure that other people have administrative access. So
Starting point is 00:56:13 two of my best friends have administrative access to the organization in case I'm not available anymore. And there are, I think, around 10 or 12 people having write access to the most important repositories, for example RESTIC itself. So whenever somebody submits a pull request and one of the other people who have write access approves the pull request, it can be merged even without my intervention. And I made some notes: there's a GOVERNANCE.md Markdown file in the RESTIC repository to tell people how the project is structured. So at the moment, I'm the benevolent dictator for life, but it doesn't need to be that way forever. So I can see that the RESTIC project is taken over
Starting point is 00:56:58 by somebody else at some point in time. So it works really well at the moment with me being in the loop for big decisions and for the day-to-day bug triaging. Many other people invest their resources. And at the moment, it works really well. But I can think of situations where I will step back whenever there's the need for it
Starting point is 00:57:21 and appoint somebody else as the new benevolent dictator for life. There you go. So let's talk about the future a little bit. RESTIC is at 0.12, as I mentioned. That's seven or eight years in the making to get there. Is a 1.0 ever going to be a thing? And it looks like it's maybecember, so you're hesitant for 1.0, I suppose, because it's such a big thing to commit to. But just curious, what does 1.0 look like, or what is the next release of RESTIC? What's going on down the road? Initially, I started with the zero-point-something releases to be able to at some point say, okay, at this point we break the repository format and add something or change something that's not backwards compatible. But this hasn't happened the last couple of years.
Starting point is 00:58:05 So the most important thing that I think in terms of the backwards compatibility for the repository would be to add compression. RESTIC does not support compression yet because the data would need to be compressed before it's encrypted by RESTIC. So you need to have something built into RESTIC. I can see a way in how to add compression to a repository,
Starting point is 00:58:32 but this would break compatibility with prior versions of RESTIC, which don't know about compression. So this would be something, once we add that, where I think it would warrant releasing a 1.0 or something like that. So say, okay, before, everything was compatible and you can even use the newer version of RESTIC to restore from old repositories. But whenever you initialize a repository with 1.0 and have compression support, then you cannot restore with an older version. We also have a version field in the repository so that RESTIC can even give you a nice error message that it's unable to understand the repository format because it's too new. So this would, I think, warrant a 1.0. Unfortunately, changing the repository format can open a can of worms, because there are so many things that could be
Starting point is 00:59:23 made better. And personally, I started working on this, but I'm not sure where to stop. Do I just add compression, or do I also add support for error correction, forward error correction? Whenever you have a file where there's one bit flip, you cannot restore this chunk in this file, because there's a bit flip and the signature doesn't match anymore, and RESTIC says, okay, the ciphertext verification failed because something is wrong. It would be nice to have forward error correction, like Reed-Solomon codes, for example, where you make every file 10% bigger in order to be able to correct one or two bit flips.
Starting point is 01:00:02 This is an interesting feature to have, but does it warrant another repository version or do we do this in one step? And I'm not sure where to go from there. And on the other hand, changing the repository format is not an easy thing to do because you have to keep so many things in mind. And until somebody else steps up and really does that, to my liking, I don't think we will get that for now.
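The version field mentioned a moment ago is what makes a breaking format change survivable: an older client can refuse a newer repository with a clear message instead of misreading it. Here is a minimal Go sketch of that kind of gate. The `repoConfig` struct, the `version` JSON key, and the `maxSupportedVersion` constant are assumptions for illustration and are not taken from RESTIC's source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// maxSupportedVersion is a hypothetical constant: the newest repository
// format this (imaginary) client knows how to read.
const maxSupportedVersion = 1

// repoConfig is an illustrative stand-in for a repository config record.
type repoConfig struct {
	Version int `json:"version"`
}

// checkRepoVersion returns a readable error when the repository format is
// newer than what this client understands.
func checkRepoVersion(raw []byte) error {
	var cfg repoConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return fmt.Errorf("reading repository config: %w", err)
	}
	if cfg.Version > maxSupportedVersion {
		return fmt.Errorf(
			"repository format version %d is too new; this program understands up to version %d",
			cfg.Version, maxSupportedVersion)
	}
	return nil
}

func main() {
	for _, raw := range []string{`{"version":1}`, `{"version":2}`} {
		if err := checkRepoVersion([]byte(raw)); err != nil {
			fmt.Println("cannot open repository:", err)
			continue
		}
		fmt.Println("repository format is supported:", raw)
	}
}
```

A gate like this is cheap, and it is the reason a format bump for something like compression or error correction can be a deliberate, one-time decision rather than a silent incompatibility.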
Starting point is 01:00:26 But I hope to find the time in the future to really do that. I'd really like to do that. And I'd like to add compression. In the beginning, I didn't add it because there were concerns by several users with also a crypto background, which crypto means cryptography in this case, that adding compression would mean to increase the attack surface that attackers can can use for example there were several issues with compression
Starting point is 01:00:54 in the tls protocol which is something different because it's an interactive protocol and sometimes i have a like a man in the middle a person in the middle modifying the packets as they go this would be a bit different with a repository, but there's also like, yeah, I can access a repository, the attacker changes something, next day I access it again. So you have some kind of back and forth. And the other thing was that at the time
Starting point is 01:01:17 I designed the repository format, there was no like great compression algorithm already baked into the go standard library or available as a external library but this has changed because there is a person called klaus post uh he's working he's working in copenhagen and he's he does all kinds of completely crazy stuff with compression and he's he's also a rustic user by the way and there is like this issue 21 which is infamous and i've locked it for now um because there is so much discussion about shall we add compression to restic because this answer is obviously yes but people tend to get distracted
Starting point is 01:01:56 by discussing the merits of different compression algorithms over like uh yeah this is the classic bike shedding problem right klaus turned up in the issue and made a comment like okay it would be nice to have compression and then i responded and like okay what would be the compression algorithm and after a bit of back and forth we decided that the standard the std would be a great fit but there was no go library available for it we could use a c library and link it from go but i'd like to keep it go a go only project if possible to not have any c or c++ code in it because i like the memory safety guarantees that go gives me but unfortunately that would all be void as soon as you link any c code into it. And then Klaus started writing a compression library in Go
Starting point is 01:02:46 and implemented Zstandard for Go. And it's almost as performant as the C version, the last time I looked. And sometimes he is on Twitter and tweets like, okay, I had an evening of free time, and then I made the compression algorithm 10% faster. And he keeps doing that month after month after month. That's completely amazing. So this would be my obvious choice
Starting point is 01:03:09 for a compression algorithm. I will link up the famous issue number 21. 167 comments by the time that you locked it. So if you want a long read and probably some fun back and forth and some real bikeshedding action, you can find that in the show notes. That's funny.
Starting point is 01:03:26 The problem with compression algorithms is that there are so many of them. And if we were to add compression to RESTIC, there would be, like, three settings. The default one would be auto, which would leave RESTIC to decide whether some chunk of data should be compressed or not. The other ones would be completely off, like, I want the speed to be as fast as possible, and optimize for minimum size, so whenever I have a small upstream bandwidth of just one megabit or something like that, I can make the most use of it. So this would be everything, and I would like to avoid having the user be able to choose the compression algorithm
Starting point is 01:04:06 as a user. Other backup programs do that, but for restic we are opinionated and say, like, okay, we will make the decisions for the users, which means we don't cater to every use case, but that's fine for us. Very cool. Alex, anything else? Any ground we have not covered, or anything in your notes that you want to make sure, oh, this has to be in the conversation, that we haven't quite gotten to? I don't think so. I think we did not cover all the different commands that are available for RESTIC, but give it a try, kick the wheels, and let us know how it goes. And sometimes just come by, hang out in the forum, and just tell us what you like and what you don't like. That's perfectly fine. Excellent. Well, listener, know that all the links to all the things are in
Starting point is 01:04:51 the show notes. You can go back and listen to Alex on Go Time number 48. We've got restic in there, Relica, we've got the GitHub issue number 21, you can check out that issue template, all the things. So definitely follow up and check out RESTIC. It's got your back. Alex, thanks so much for coming back on the show and talking to us about backup. Thanks for all the work you put in over the years. I mean, taking vacation to work on an open source backup program is so epic.
Starting point is 01:05:17 I just appreciate your dedication to the program and all the value provided for backing up people's files all these years. It's pretty awesome. Yeah, thank you. You're welcome. And thanks for having me. That's it for this episode of The Changelog.
Starting point is 01:05:29 Thanks for tuning in. If you aren't subscribed yet to our weekly newsletter, you are missing out on what's moving and shaking in software and why it's important. It's 100% free. Fight your FOMO at changelog.com slash weekly. Huge thanks to our partners, Linode, Fastly, and LaunchDarkly. When we need music, we summon the beat freak Breakmaster Cylinder. Huge thanks to
Starting point is 01:05:49 Breakmaster for all their awesome work. And last but not least, subscribe to our master feed at changelog.com slash master. Get all our podcasts in a single feed. That's it for this week. We'll see you next week. Game on!
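
Editor's sketch: the three compression settings Alex outlines in the interview (auto, off, and optimize for minimum size) could look roughly like the Python below. This is only an illustration under stated assumptions, not restic's implementation. The names `pack_chunk`, `unpack_chunk`, and the mode strings are hypothetical, the compression levels are guesses, and Python's standard-library `zlib` stands in for Zstandard, which is the algorithm Alex actually names.

```python
import os
import zlib
from typing import Tuple

# Hypothetical mode names modelled on the three settings from the interview.
MODES = ("auto", "off", "max")


def pack_chunk(chunk: bytes, mode: str = "auto") -> Tuple[bool, bytes]:
    """Return (is_compressed, payload) for one chunk of backup data."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "off":
        # Fastest path: store the chunk untouched.
        return False, chunk
    # Assumption: "max" uses the strongest zlib level, "auto" the default one.
    level = 9 if mode == "max" else -1
    candidate = zlib.compress(chunk, level)
    # Keep the compressed form only when it actually saves space, so data
    # that is already compressed or random is stored as-is.
    if len(candidate) < len(chunk):
        return True, candidate
    return False, chunk


def unpack_chunk(is_compressed: bool, payload: bytes) -> bytes:
    """Reverse pack_chunk using the stored flag."""
    return zlib.decompress(payload) if is_compressed else payload


if __name__ == "__main__":
    samples = {
        "text": b"restic backup " * 1000,  # highly repetitive
        "noise": os.urandom(4096),         # effectively incompressible
    }
    for name, data in samples.items():
        for mode in MODES:
            flag, payload = pack_chunk(data, mode)
            assert unpack_chunk(flag, payload) == data
            print(name, mode, "compressed" if flag else "raw", len(payload))
```

The per-chunk flag is what lets the program decide for the user: the same repository can hold compressed and uncompressed chunks side by side, and nobody has to pick an algorithm.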
