Screaming in the Cloud - Bringing FreeBSD to EC2 with Colin Percival
Episode Date: May 27, 2020

About Colin Percival

Colin is the founder of Tarsnap, a secure online backup service which combines the flexibility and scriptability of the standard UNIX "tar" utility with strong encryption,... deduplication, and the reliability of Amazon S3 storage. Having started work on Tarsnap in 2006, Colin is among the first generation of users of Amazon Web Services, and has written dozens of articles about his experiences with AWS on his blog.

Colin has been a member of the FreeBSD project for 15 years and has served in that time as the project Security Officer and a member of the Core team; starting in 2008 he led the efforts to bring FreeBSD to the Amazon EC2 platform, and for the past 7 years he has been maintaining this support, keeping FreeBSD up to date with all of the latest changes and functionality in Amazon EC2.

In his spare time, Colin serves as an alumni representative on the Senate of his alma mater, Simon Fraser University, where he frequently brings a perspective from the world of startups to the ivory tower.

Links Referenced
Company site: https://www.tarsnap.com/
Twitter: https://twitter.com/cperciva
Blog: http://www.daemonology.net/blog/
Patreon: https://www.patreon.com/cperciva
Transcript
Hello, and welcome to Screaming in the Cloud, with your host, cloud economist Corey Quinn.
This weekly show features conversations with people doing interesting work in the world
of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles
for which Corey refuses to apologize.
This is Screaming in the Cloud.
No billing surprises. With simple, predictable pricing that's flat across 12 global data center regions
and a UX developers around the world love,
you can control your cloud infrastructure costs
and have more time for your team to focus on growing your business.
See what businesses are building on DigitalOcean and get started for free
at do.co slash screaming. That's do.co slash screaming.
And my thanks to DigitalOcean for their continuing support of this ridiculous podcast.
This episode is sponsored in part by N2WS.
You know what you care about? Many things, but never backups. At least until right after you really,
really, really needed to care about backups. That's what N2WS does for your AWS account.
It allows you to cycle backups through different storage tiers. You can back things up cost
effectively and safely. For a limited time, N2WS is offering you $100 in AWS credits for setting
up their free trial, and I encourage you to give it a shot.
To learn more, visit snark.cloud slash N2WS.
That's snark.cloud slash N2WS.
Welcome to Screaming in the Cloud. I'm Corey Quinn.
I'm joined this week by Colin Percival.
Colin is the founder of Tarsnap, which is a secure
online backup service, as well as having been a staple in the EC2 history for the one true
operating system, FreeBSD. Colin, welcome to the show.
Good to be here.
So let's start at the very beginning. What is a FreeBSD for someone who might never have encountered such a thing in the wild?
FreeBSD, for people who have very little computing background, I often say it's like Linux, but it's not Linux.
Oh, I bet that irritates some people.
I'm sure that does irritate some people, and I don't like it when people refer to FreeBSD as being other Linux, which EC2 still does in some places.
But for people with somewhat more technical background, I say FreeBSD is Unix, and it's about as close as you can get to the natural successor to the original Unix.
So once upon a time, back when I was first starting out in my career, I found myself
at a university, and FreeBSD was what I wound up single-handedly deploying because of a
few different failure modes.
One, it turns out that when you have someone who pretty much bluffed their way through
the technical interview, and then you give them carte blanche to deploy whatever they want, you get some strange things happening.
Not that this was necessarily a bad decision, but years later when the statute of limitations has run its course, I can now say the reason that I went in that direction was because I had a mentor who was very anti-Linux and very pro-FreeBSD. And quite simply, he would help me
if I had a FreeBSD question,
but he would look down his nose at Linux.
Therefore, I was pretty much in a position of,
well, beggars can't be choosers.
So I made a full-throated endorsement of FreeBSD,
rolled it out, and ran it for a year.
Then I moved on to other jobs and haven't touched it in anger
or in production ever since, but I still miss it 15 years later or so.
That makes sense to me. To be honest, the reason that I started using FreeBSD was it was easier to install than OpenBSD.
The problem I ran into was that, I guess, how to frame this for someone who hasn't done a whole lot of work with either one.
Because I tend to assume that you don't need to have a background as a Linux or Unix administrator to listen to the show and get something out of it.
But from my perspective, it felt like FreeBSD was an environment where everything was very clearly ordered.
Everything belonged in a certain place.
There was a right way to do things.
It didn't have a manual. It had a handbook that told you how to go through any aspect of the system.
There was a start to it, a middle, an end, and it was great. Going from that to Linux felt like
suddenly I'm living in the middle of barely organized chaos. And I kept waiting for that
feeling to fade. It hasn't. I'll let you know if it ever does.
I don't know if I would say that in FreeBSD
there is always one right way to do things.
We have binary packages you can install
or you can build things from the ports tree if you prefer.
You can even build your own binary packages
if you want to build them and then install them, just
to make your life more complicated, for instance. I would say that FreeBSD is developed
by people who try very hard to make sure that the options that they offer are good ones and
sometimes that means there is one good option and we tell people, this is what you should do.
Sometimes it means there are several good options and we tell people, pick one of these options.
But I will agree that in some other platforms, there are no clear good options or there are many options which are not good ones.
And people flounder and end up with things that are not good.
That's probably a fair way of assessing it. Now, back in the day when I was playing with these
things, it was all on-premises hardware. I would say servers, but that's putting it generously.
It turns out that when you have, and in this era, this was not an unreasonable operating system
choice, but a bunch of desktops that were running Windows XP. And then after three years, they were deprecated off the books and users
wouldn't tolerate how badly they performed. You could then repurpose them, install a different
operating system on that, and put it in your giant shelf of badly maintained servers. So we had 15
mail servers running like that. And one of the earlier projects that I had during my year there was to rip a lot of that out.
But I got to experience an awful lot of jank the fact, because you were effectively one of the driving forces behind getting FreeBSD working on EC2 in the early days.
I would say in the early days, I was the person that decided this is something that should happen.
And happened it did. But I guess my question is, what does it take to look at an existing offering like
EC2, where you could ask what operating systems they supported, and back then the answer was,
oh, both kinds, Windows and Linux. And from there, okay, how do you go from having something like,
I guess, a Linux or Linux-y operating system, and then effectively doing a wholesale
replacement of the OS in a, I guess, first, in a way that works, and secondly, ideally,
in a way that doesn't offend the purist sensibilities of a number of Unix aficionados.
Well, I want to just correct one thing you said there. You said both kinds, Linux and Windows.
In fact, when I first decided I wanted to get FreeBSD working,
there was one version of Linux that was supported on EC2.
And then later, there were both kinds, being CentOS and Ubuntu.
It was actually a few years before Windows came along.
And by that point, I was already trying and failing in a wide variety of different ways
to get FreeBSD working.
Amazing.
I guess that's one of those history lessons
that I wound up avoiding
by virtue of not being at all involved in the cloud
back in those early days.
But, well, it's not often that I wind up getting exposure
to a trivia fact about AWS I didn't already know. Good work.
Well, I mean, there's a lot of interesting trivia from back
then, like the fact that in the early para-virtualized days of EC2, you didn't just
have a machine image. You also had a kernel image and a RAM disk image, because you couldn't just say whatever's on this disk.
You had to give the paravirtualized Xen the kernel it was going to run, and Linux at the
time needed a RAM disk with something, I don't know exactly what on it, before it could load
everything else off of the file system on disk.
Back in those days, there was the RAM disk,
and you had to pick what kernel you had to run through.
I don't want to say that it was complicated or Byzantine,
but there was a company, RightScale,
back before they were acquired and no one heard from them ever again,
where their entire business value was wrapping the EC2 APIs
into a front-end dashboard that a human being could understand,
and then charging a percentage of whatever you ran through it,
which sounds ridiculous today,
but pretty much everyone I knew,
to a large part, back in 2008, 2009,
was running through this
just because it was so complicated to get up and running.
The documentation wasn't there,
and the folks who were super involved with
it largely were themselves AWS employees, or effectively the next closest thing. How did you
dive in and get started with something like that back in those days where effectively it was the
digital equivalent of rubbing two sticks together to make fire? So in some ways it was easier to get
started back then because AWS was really small.
And pretty much everybody inside Amazon, or inside AWS at least, knew what everybody else was doing in there.
And they didn't have huge numbers of customers asking them for help.
So I sent an email to Jeff Barr.
And I said, hey, I want to get FreeBSD working on EC2.
And he wrote back to me and said, here's some people at Amazon you should talk to.
And for the first few years of trying to get things working, pretty much all of my contacts at Amazon were going through Jeff Barr.
I talked to him on Twitter and so on, but if I needed somebody, Jeff knows everybody.
I just sent Jeff an email. He connects me with the right people.
And the Amazon engineers were always incredibly enthusiastic.
I got the feeling that Amazon, as a corporate entity, didn't really appreciate FreeBSD.
The managers didn't know what FreeBSD was, except that they could tell they didn't have any customers using it. But all the
engineers, they love the idea of getting a different operating system running on their platform.
It seemed like it was almost a great hobbyist direction back then, in that people were excited
to see what the potential use cases of the platform were. Because this was back in the day
before you really had giant companies going all in on this.
Now, for that same level of excitement, people instead have to settle for watching me misuse
Route 53 as a database or similar. But back in those days, it was, is this even possible,
was the burning question in everyone's mind. You proved that it was. And what astonishes me is now,
years later, there is still a thriving FreeBSD offering on top of AWS. Is that entirely you?
Is there a larger community behind it now?
Is it officially supported by folks at Amazon?
So the FreeBSD offering on AWS now
is officially supported by the FreeBSD project.
So in the early days, it was me building disk images.
And at one point, I had about a two-hour round-trip time to
test anything: I needed to upload a 10-gigabyte disk image before I could boot it and see where
it failed to boot. These days, the FreeBSD release engineering team is doing all the builds just as
part of our standard release building process. Now, that's the actual building of the images.
Making things work, that's a completely
different matter. And there's been a lot of work, a little bit by me, but largely by other FreeBSD
developers in the early days working on Xen, dealing with new sorts of Xen devices we needed
to handle. More recently, a lot of bug fixes on our NVMe driver because all the Nitro instances expose NVMe disks.
Amazon, I was very happy to hear
when they launched the Elastic Network Adapter a few years back,
they had a Linux driver,
and I got word that they were going to pay people
to port their Linux driver to FreeBSD.
And I mean, in FreeBSD, we're used to, there's a Linux driver over there, go ahead and try to port it.
Sometimes there's a Linux driver and here's some documentation for it.
It's absolutely wonderful when we have a company actually present us with a FreeBSD driver.
And Amazon paid for that to be done,
and in fact has been having people maintain it for us ever since.
So other than hobbyists and you,
who's using FreeBSD on top of AWS these days?
Are there public reference customers?
Is this mostly a bunch of hobbyists building interesting things?
I mean, you're running an entire business on top of this,
which is not nothing.
But who's
playing around in this space these days?
To be honest, I don't know exactly who is running
FreeBSD on EC2. I can say that from the EC2 marketplace, we have around 3,000 or 4,000
instances running that were launched through the marketplace. I'm sure far more than that were just launched by somebody copying
and pasting the AMI ID into the console
or onto the command line.
There are companies that we know use FreeBSD like NetApp.
I would assume that some of their cloud offerings
also run on FreeBSD because why would they not use
the same platform for their cloud offerings?
But large companies have been very reticent to talk to me about what it is that they're doing
with FreeBSD on EC2. It's one of the things I really regret, not hearing from these large customers.
This episode is sponsored in part by Chaos Search. Now their name isn't in all caps, so they're definitely worth talking to.
What is Chaos Search?
A scalable log analysis service that lets you add new workloads in minutes, not days or weeks.
Click. Boom. Done.
Chaos Search is for you if you're trying to get a handle on processing multiple terabytes
or more of log and event data per day at a disruptive
price. One more thing for those of you who've been down this path to disappointment before,
Chaos Search is a fully managed solution that isn't playing marketing games when they say
fully managed. The data lives within your S3 buckets, and that's really all you have to care
about. No managing of servers, but also no data movement.
Check them out at chaossearch.io and tell them Corey sent you.
Watch for the wince when you say my name.
That's chaossearch.io.
That's always been one of the challenges I've seen in the FreeBSD universe, to be fair,
is that because the license is such, the BSD license is you can use the source code,
you can do whatever you want to it,
and you don't have to recontribute any changes back,
it becomes very uncertain to be able to attribute
who is using this in any meaningful way.
In fact, the only way that I was able to find FreeBSD jobs
when I went looking was by looking specifically
for the term FreeBSD in job descriptions.
That's how I learned companies like, for example, Juniper were big proponents of FreeBSD.
But there was remarkably little representation in the common community-style circles.
That definitely is an issue.
The license being more generous and also, to be honest, the fact that the license is so brief
does make it harder to identify who's using BSD code.
If you buy a TV and it comes with a copy of the GPL, it gives you some idea of what software is running in it.
BSD license, you might not even notice because it's half a page rather than 10 pages.
Yeah, when the entire license fits in a tweet, one starts to wonder.
Exactly. I don't think the BSD
license is quite that small, but same
idea, yes. So, yeah,
it is harder to tell who's running FreeBSD.
As far as a large company
who's using FreeBSD now,
if you look at FreeBSD developers
and where they work, I mean, it's
clear. Companies like
Juniper, I don't know if they
have FreeBSD developers right now, but they certainly had many in the past.
Netflix, of course, has many FreeBSD developers and goes to conferences and talks about the work they're doing on FreeBSD,
getting 200 gigabits per second of TLS throughput from their movie streaming devices. So there certainly are large companies out there using FreeBSD,
being open about the fact they're using FreeBSD and contributing changes back.
But I'm sure there are others out there that are quieter about it.
Which is very fair.
So let's talk instead for a minute about the company you actually run,
because it turns out that volunteering your spare time to get FreeBSD working on EC2
is not, in fact,
your primary vocation these days. You run a service called Tarsnap. What is Tarsnap?
Tarsnap, well, the slogan is online backups for the truly paranoid. It is an online backup service
with a tar command line front end. So you type in a command that,
actually it could just be a tar command
except with the word tarsnap in the front instead of tar.
And say you want to create an archive
containing certain files or directories,
it bundles those all up,
it deduplicates them, it compresses them,
and then it encrypts everything before it uploads it to the storage service, which is ultimately backed by Amazon S3.
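The bundle, deduplicate, compress sequence Colin describes can be sketched roughly as follows. This is a minimal illustration and not Tarsnap's actual implementation: the fixed 4096-byte chunk size, the in-memory dictionary standing in for remote storage, and the function names `store_archive` and `restore_archive` are all assumptions made for the example, and the encryption step is only marked by a comment because the Python standard library has no suitable cipher.

```python
import hashlib
import zlib

# Hypothetical fixed chunk size, chosen only to keep the example simple.
CHUNK_SIZE = 4096


def store_archive(data: bytes, chunk_store: dict) -> list:
    """Split data into chunks, keep each unique chunk once, return the chunk keys."""
    keys = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        key = hashlib.sha256(chunk).hexdigest()
        if key not in chunk_store:
            # Only previously unseen chunks are compressed and stored.
            # A real backup client would also encrypt the chunk here,
            # before anything leaves the machine.
            chunk_store[key] = zlib.compress(chunk)
        keys.append(key)
    return keys


def restore_archive(keys: list, chunk_store: dict) -> bytes:
    """Rebuild the original bytes from the ordered list of chunk keys."""
    return b"".join(zlib.decompress(chunk_store[key]) for key in keys)
```

Under these assumptions, storing a second archive that shares chunks with the first only adds the chunks that are new, which is why frequent backups of mostly unchanged data stay cheap.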
Gotcha. So you say that they're backups for the truly paranoid.
Everyone likes to think of themselves as being paranoid with backups, but in my experience, everyone cares an awful lot about backups right after they really needed to care about backups. And even then, they are diligent about making sure that things back
up, but they never test a restore. So it leads you to a fun place where backups for the truly
paranoid mean different things for different folks. What does it mean for you?
So I started this when I was a FreeBSD security
officer. And as FreeBSD security officer, I would get advance notice of security vulnerabilities
that affected FreeBSD. So problems in Sendmail, problems in BIND, problems in OpenSSL.
And at a certain point, I was looking at all of the vulnerabilities I had
sitting on my laptop waiting to be fixed
because, of course, we always coordinate these disclosures
and we pick some dates so that everybody can roll out their patches
at the same time.
And I was thinking to myself,
if somebody got their hands on my laptop,
bad things could happen
because they could exploit that one and they could exploit that one and they could
exploit that one, they could exploit that other vulnerability. And then I thought to myself,
wait a minute, if somebody got their hands on my backups, we would be in trouble as well. Now,
like most people, I didn't really do very good backups at the time. But then I was thinking,
well, if I start doing backups more often, then that means
there's more copies of all this scary information sitting around somewhere. How do I do this
securely? So I looked around at what I could find online in 2006, and there really wasn't anything
out there that I could trust to do backups securely. I was previously a security officer.
I had quite a background in security and cryptography at that point.
And just based on my expertise in the field, I didn't trust what was out there.
So I asked around some of my friends and posted on my blog, I asked, if I build this myself,
would anybody else want to use it?
Lots of people said, yes, this is something we would pay for. So it happened. I was looking for work at the time. I had a job offer from Google to go
down to San Francisco and do research. But I wasn't all that enthusiastic about that offer for
a few reasons. So I decided, well, okay, I'll build it myself and see how it goes. So Tarsnap is very much a startup in the open source tradition of
scratching your own itch. I had a problem. Some other people said that they had the same problem,
so I decided to fix it.
And fix it, you did. It's been around for a while. You have some
very impressive name brand customers who are publicly referenced on your site.
Stripe is, I guess, the canonical example.
If you take a look at how much they care about the sanctity of their backups, I don't feel like there's really enough words to express the answer to that question.
If you get, effectively, the internet's payment system's data, there is disaster and hellfire and brimstone, and nothing looks the
same tomorrow if that happens. So it's obviously validated and tested by folks who take their
workloads seriously. I guess the question is, why did you go down the path of A, using FreeBSD for
this, and B, building it on top of EC2 instead of a bunch of different
options that you could have potentially gone with. So in 2006, when I decided I wanted to do this,
I knew, I mean, I'm a software guy. I knew that I did not want to be dealing with physical hard
drives, and I definitely didn't want to be driving down to the data center to swap out failed hard
drives. So I wanted something out there that could store the data for me and not lose it.
S3 launched earlier that year.
So I said to myself, okay, S3 sounds like the backend I want to use for this.
And then, well, I need to have some code running in front of that. Oh, look, here's this elastic compute cloud service
that lets you have servers that are really close to S3
and can push bits in and out of S3 without paying any bandwidth costs.
So it was just a natural connection there.
EC2 was what I needed to be able to use S3 efficiently.
And one thing sort of leads to another.
And as I think as anyone tends to learn sooner or later,
they kind of wind up staying
wherever they wind up initially building something out,
barring a tremendous strategic reason to change providers.
So one thing that I found interesting
that I saw a while back,
and I'll link to it in the show notes,
was Patrick McKenzie wound up doing an entire analysis of Tarsnap and writing a—an essay doesn't really encapsulate the entirety of what he wound up writing.
It was more or less a day-long teardown of your entire market positioning and effectively giving a laundry list of things he would change if he
were doing the marketing piece for Tarsnap. What led to, first, him doing that? And secondly,
what was your response when you wound up going through all of the copious detail that he wound
up putting in there?
I can't remember the exact history leading up to that, although we had exchanged comments about
Tarsnap on Hacker News for a couple
years leading up to that. He did ask me,
by the way, was I okay with him doing this? And I was
very enthusiastic, and I still am very enthusiastic.
That blog post actually brought Tarsnap more customers
than anything anybody has ever written
by probably a factor of 10.
Just wait until this podcast goes out and we'll see if we can beat it.
That would be fantastic.
But as far as my opinions on what he wrote about Tarsnap,
I think his view of Tarsnap is somewhat different from mine.
He and also Thomas Ptacek, who shares very similar opinions to Patrick about Tarsnap,
have said that the worst thing for a small business to be is a utility.
This idea of pricing Tarsnap the same way that you pay your power bill is just terrible as far as they're concerned.
My view is exactly the opposite.
I think backups should be a utility.
And if people can pay their Tarsnap bill the same way that they pay their AWS bill,
that is not bad in my opinion.
I would say that there's definitely an argument that could be made in either direction. The joy of looking at things from a utility perspective is that, okay, great, you wind up paying for things that turn on, turn off.
And we've seen companies move away from this. in most respects and beat your CPU to death whenever something touched a disk, when it used to just be a folder
that would sync between various computers
and have the same contents in it at all times,
like magic, that felt like a utility.
Now, of course, it's a platform
and it certainly worked for them.
They've gone public and done super well,
but they clearly have departed from their roots
of being do one thing, do it well in a utility fashion.
So maybe that means that Tarsnap is not fated to become a publicly traded company worth billions of dollars.
That is quite possible.
And honestly, I don't really mind if Tarsnap fails to become a publicly traded company.
Being a publicly traded company is an awful lot of work, and I don't think I would really want to do that. No, no, there's certainly a list of things I want to
deal with versus don't want to deal with. And paperwork is very clearly in the second category.
That's why the company is never just me. It's always good to have people who are better at
things that I suck at. So something else you've written recently that I wanted to talk about was imds-filterd.
That's India Mike Delta Sierra-filterd. And one thing that I love about IMDS is that the
first sentence in the readme tells you how to pronounce it, which is a rarity around anything
that touches AWS. Usually it leads to warfare, character assassination,
actual assassination, and I still stand by my AMI pronunciation. But what is imds-filterd?
So imds-filterd is a filtering daemon for the instance metadata service. Amazon refers to the
instance metadata service as IMDS. I'm not quite sure why metadata gets two letters instead of one,
but maybe they think meta and data are different words.
I'm not sure.
In any case, they call it IMDS, so I call it IMDS.
And imds-filterd is a daemon which restricts access
to whichever parts of the instance metadata service
you would like to restrict access to.
And it does this based on rules that you provide with user IDs and also group IDs, if you want.
So this means that you could say this web proxy should not be accessing IAM credentials. We do not want people to use this web proxy
to get the credentials to access S3
and steal all of the information
on 100 million credit card holders.
Or you could tell it,
user nobody should not be accessing things.
So that privilege-separated SSHD that you've got running,
the pre-auth process, if there's a vulnerability in there,
somebody should not be able to exploit a pre-auth vulnerability
in order to steal those same IAM credentials.
In general, I would say you probably want to let root access everything
because, well, root is root and can turn off the filtering daemon
if it wants to anyway.
But it's essentially a way of fixing the fact that in the early days of IAM,
they decided the right way to expose credentials was via the instance metadata service,
which is accessible via HTTP from any process on the system.
Honestly, I think that was the worst security mistake Amazon has ever made in AWS,
but they haven't fixed it.
So I figured, well, I need to step in and I need to fix it.
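The kind of per-user rule Colin describes can be pictured with a small sketch. Everything here is hypothetical: the `Rule` structure, the first-match-wins ordering, the default-allow fallback, and the use of uid 65534 for user nobody are assumptions for illustration, not the configuration syntax or logic of the actual daemon. The only detail taken as given is that IAM role credentials live under `/latest/meta-data/iam/security-credentials/` in the instance metadata service.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """One hypothetical filter rule: allow or deny a path prefix for a user or group."""
    allow: bool
    path_prefix: str
    uid: Optional[int] = None  # None means "any user"
    gid: Optional[int] = None  # None means "any group"


def is_allowed(rules, path, uid, gid, default=True):
    """Return the verdict of the first rule that matches; fall back to default."""
    for rule in rules:
        if not path.startswith(rule.path_prefix):
            continue
        if rule.uid is not None and rule.uid != uid:
            continue
        if rule.gid is not None and rule.gid != gid:
            continue
        return rule.allow
    return default


# Example policy: root may read everything, nobody else may read IAM credentials.
EXAMPLE_RULES = [
    Rule(allow=True, path_prefix="/", uid=0),
    Rule(allow=False, path_prefix="/latest/meta-data/iam/"),
]
```

With this example policy, a process running as user nobody could still read harmless metadata such as the instance ID, but a request for the credentials path would be refused, while the same request from root would go through.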
Step in and fix it, you did.
They wound up releasing the v2 endpoint,
but that's going to take, as I believe you've mentioned, ages before that is globally supported to the point where the v1 endpoint can be turned off.
That's going to be a painful thing for a lot of shops.
Getting to the point that people can turn off version 1 access is going to be painful because, yeah, you need to have code that supports v2 before you can block version 1.
Also, v2 doesn't solve the problem completely.
It solves the problem of I have a misconfigured proxy,
but it doesn't solve the problem of
somebody managed to break into my server,
but within a sandbox.
They're running as user nobody,
or they found a bug in Apache,
so they're able to run code as the www user.
Those users should not have access to IAM credentials,
unless there's some credential you need them to have,
but in general, they shouldn't have access to those credentials.
And even with version 2 of the instance metadata service,
right now, they do have access because they can make the necessary requests.
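To make the version 1 versus version 2 distinction concrete, here is a small sketch that only constructs the HTTP requests involved and does not send them. The helper names are made up for this example. The address `169.254.169.254`, the `/latest/api/token` endpoint, and the two `X-aws-ec2-metadata-token` headers are the documented instance metadata service interface.

```python
import urllib.request

IMDS = "http://169.254.169.254"
CREDS_PATH = "/latest/meta-data/iam/security-credentials/"


def v1_request(path=CREDS_PATH):
    """Version 1: a plain GET, which is why a misconfigured proxy can be tricked into it."""
    return urllib.request.Request(IMDS + path, method="GET")


def v2_token_request(ttl_seconds=21600):
    """Version 2, step one: a PUT that asks for a session token (maximum TTL is 21600 seconds)."""
    return urllib.request.Request(
        IMDS + "/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def v2_request(token, path=CREDS_PATH):
    """Version 2, step two: the same GET as before, now carrying the session token."""
    return urllib.request.Request(
        IMDS + path,
        method="GET",
        headers={"X-aws-ec2-metadata-token": token},
    )
```

The extra PUT stops a proxy that can only be coaxed into forwarding simple GET requests, but nothing in either step depends on which local user is asking. Any process on the instance that can make both requests gets the credentials, which is the gap Colin is pointing at.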
Excellent. And I will throw a link to that in the show notes as well.
Last question before I let you go.
You wind up doing an awful lot of work for the larger community in order to make FreeBSD on EC2 run.
If people want to support you, how can they do that?
So a couple of years ago, I set up a Patreon.
The original idea when I set it up was just,
this will be a way that people can cover things like my travel expenses.
Because there have been times I've considered going to conferences and said,
you know, it might be useful for me to go somewhere like AWS re:Invent.
But I don't really want to pay for that out of my own pocket.
As it turns out, now I'm an Amazon community hero,
so Amazon pays for me to go to re:Invent.
But there have been other events I've considered going to
and decided not to because I didn't want to pay for it myself.
And at this point also, it would be nice
if the community could pay for some of the
time I spend working on this, because I do have a day job, and the more time I spend working on
getting FreeBSD working on EC2 and fixing issues as they arise, the less time I get to spend on
Tarsnap.
That's absolutely something that is worth supporting. I think that we take people
doing what amounts to volunteer work in the open source space far too much for granted. So absolutely thrilled to
throw in a link to that.
Great.
Colin, thank you so much for taking the time to speak
with me. If people want to hear more about what you have to say, where can they find you?
They can follow me on Twitter at cperciva or follow my blog, daemonology.net slash blog.
Excellent. And we will
absolutely toss links to that in the notes as well. Thanks once again for taking the time to
speak with me. I appreciate it.
Great to talk to you.
Colin Percival, founder of Tarsnap and
effectively a one-man force of nature in the FreeBSD ecosystem on AWS.
I'm cloud economist Corey Quinn,
and this is Screaming in the Cloud.
If you've enjoyed this podcast,
please leave a five-star review in Apple Podcasts.
If you've hated this podcast,
please leave a five-star review in Apple Podcasts
and a comment explaining why FreeBSD
is your favorite distribution of Linux.
This has been this week's episode of Screaming in the Cloud.
You can also find more Corey at screaminginthecloud.com
or wherever fine snark is sold.
This has been a HumblePod production.
Stay humble.