Screaming in the Cloud - Building Reliable Open-Source Social Media with Jake Gold
Episode Date: June 27, 2023

Jake Gold, Infrastructure Engineer at Bluesky, joins Corey on Screaming in the Cloud to discuss his experience helping to build Bluesky and why he's so excited about it. Jake and Corey discuss the major differences when building a truly open-source social media platform, and Jake highlights his focus on reliability. Jake explains why he feels downtime can actually be a huge benefit to reliability engineers, and how he views abstractions based on the size of the team he's working on. Corey and Jake also discuss whether cloud is truly living up to its original promise of lowered costs.

About Jake
Jake Gold leads infrastructure at Bluesky, where the team is developing and deploying the decentralized social media protocol, ATP. Jake has previously managed infrastructure at companies such as Docker and Flipboard, and most recently, he was the founding leader of the Robot Reliability Team at Nuro, an autonomous delivery vehicle company.

Links Referenced:
Bluesky: https://blueskyweb.xyz/
Bluesky waitlist signup: https://bsky.app
Transcript
Hello, and welcome to Screaming in the Cloud, with your host, Chief Cloud Economist at the
Duckbill Group, Corey Quinn.
This weekly show features conversations with people doing interesting work in the world
of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles
for which Corey refuses to apologize.
This is Screaming in the Cloud.
Welcome to Screaming in the Cloud.
I'm Corey Quinn.
In case folks have missed this,
I spent an inordinate amount of time on Twitter over the last decade or so,
to the point where my wife, my business partner, and a couple of friends all
went in over the holidays and got me a leather-bound set of books titled The Collected Works of Corey
Quinn. It turns out that I have over a million words of shitpost on Twitter. If you've also been
living in a cave for the last year, you'll notice that Twitter has basically been bought and driven
into the ground by the world's saddest manchild.
So there's been a bit of a diaspora as far as people trying to figure out where community lives.
Jake Gold is an infrastructure engineer at Blue Sky, which I will continue to be mispronouncing as Blue Ski because that's the kind of person I am, which is, as best I can tell, one of the leading contenders, if not the leading contender,
to replace what Twitter was for me. Jake, welcome to the show.
Thanks a lot, Corey. Glad to be here.
So there's a lot of different angles we can take on this. We can talk about the policy side of it.
We can talk about social networks and things we learn watching people in large groups with
quasi-anonymity.
We can talk about all kinds of different nonsense, but I don't want to do that because I
am an old school Linux systems administrator. And I believe you came from the exact same path,
given that as we were making sure that I had the right person on the show,
you came into work at a company after I'd left previously. So not only are you good at the whole Linux server thing,
you also have seen exactly how good I am not at the Linux server thing.
Well, I don't remember there being any problems at Truecar where you worked before me.
But yeah, my background is doing Linux systems administration,
which turned into sort of Linux programming.
And these days we call it site reliability engineering. But yeah, I discovered Linux in the late 90s as a teenager and installing
Slackware on 50 floppy disks and things like that. And I just fell in love with the magic of being
able to run a web server. I got a hosting account at my local ISP and I was like, how do they do
that? And then I figured out how to do it. I ran Apache, and it was, like, still one of my core memories of getting, you know, httpd running and being able to access it
over the internet and tell my friends on IRC. And so I've done a whole bunch of things since then,
but that's still like the part that I love the most.
The thing that continually surprises me is just when I think I'm out and we've moved into a fully
modern world where, oh, all I do is I write code anymore, which I didn't realize I was doing until I realized if you call YAML code, you can get away with anything.
And I keep getting dragged back in. It's the falling back to fundamentals in these weird moments of, yes, yes, immutable everything, infrastructure as code. But when the server's
misbehaving and you want to log in and get your hands dirty, the skill set rears its head yet
again.
At least that's what I've been noticing,
at least as far as I've gone down
a number of interesting IoT-based projects lately.
Is that something you experience
or have you evolved fully and not looked back?
Yeah, no, what I try to do is on my personal projects,
I'll use all the latest cool, flashy things,
any abstraction you want, I'll try out everything.
And then what I do is at work,
I kind of have like a one- or two-year sort of lagging adoption of technologies. Like, when I've actually shaken them out on my own stuff, then I use them at work. But yeah, one of my favorite quotes is, like, programmers first learn the power of abstraction, then they learn the cost of abstraction, and then they're ready to program. And that's how I view infrastructure; it's a very similar thing, where, you know, certain abstractions like container orchestration or, you know, things like that can be super powerful if you need them. But, like, you know, that's generally very large companies with lots of teams and things like that. And if you're not that, it pays dividends to not use overly complicated, overly abstracted things. And so that tends to be where I come down most of the time.
I'm sure someone's going to consider this to be heresy, but if I'm tasked with getting a web application up and running in short order, I'm putting it on an old school traditional three-tier
architecture. You have a database server, a web server or two, maybe a job server that lives
between them, because is it the hotness? No. Is it going to be resume bait? Not really. But you know,
it's deterministic as far as where things live. When something breaks, I know where to find it.
And you can miss me with the "well, that's not web scale" response, because, yeah, by the time I go from getting something up overnight to "this has to serve the entire internet," there's probably a number of architectural iterations I'm going to be able to go through. The question is, what am I most comfortable with, and what can I get things up and running with that's tried
and tested? I'm also remarkably conservative on things like databases and file systems because
mistakes at that level are absolutely going to show. Now, I don't know how much you're able to
talk about the Blue Ski infrastructure without getting yelled at by various folks.
But how modern versus reliable?
I guess that's probably a fair axis to put it on.
Modernity versus reliability.
Where on that spectrum does the official Blue Ski infrastructure land these days?
Yeah, so I mean, we're in a fortunate position of being an open source company working on an open protocol.
And so we feel very comfortable talking about basically everything. And I've talked about this a bit on the app. But the basic idea we have right now is we're using AWS; we have auto scaling groups, and those auto scaling groups are just EC2 instances running Docker CE, the Community Edition, as the runtime for containers. And then we have a load balancer in front and a multi-AZ Postgres instance on RDS in the back. And it is really, really simple.
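[Sketch: a minimal AWS CDK (Python) rendering of the kind of stack Jake describes, an auto scaling group of EC2 instances behind a load balancer with a multi-AZ Postgres RDS instance in the back. This is not Bluesky's actual code; the names, instance sizes, and user-data commands are illustrative assumptions.]

```python
from aws_cdk import App, Stack
from aws_cdk import aws_ec2 as ec2
from aws_cdk import aws_autoscaling as autoscaling
from aws_cdk import aws_elasticloadbalancingv2 as elbv2
from aws_cdk import aws_rds as rds
from constructs import Construct


class SimpleStack(Stack):
    def __init__(self, scope: Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)

        vpc = ec2.Vpc(self, "Vpc", max_azs=2)

        # App tier: plain EC2 instances that run containers under Docker,
        # installed via user data rather than a container orchestrator.
        asg = autoscaling.AutoScalingGroup(
            self, "AppAsg",
            vpc=vpc,
            instance_type=ec2.InstanceType("t3.medium"),  # illustrative size
            machine_image=ec2.MachineImage.latest_amazon_linux(
                generation=ec2.AmazonLinuxGeneration.AMAZON_LINUX_2
            ),
            min_capacity=2,
            max_capacity=4,
        )
        asg.add_user_data(
            "amazon-linux-extras install -y docker",
            "systemctl enable --now docker",
        )

        # Load balancer in front of the app tier.
        alb = elbv2.ApplicationLoadBalancer(
            self, "Alb", vpc=vpc, internet_facing=True
        )
        listener = alb.add_listener("Http", port=80)
        listener.add_targets("App", port=80, targets=[asg])

        # Multi-AZ Postgres on RDS in the back.
        rds.DatabaseInstance(
            self, "Db",
            engine=rds.DatabaseInstanceEngine.postgres(
                version=rds.PostgresEngineVersion.VER_15
            ),
            vpc=vpc,
            multi_az=True,
            instance_type=ec2.InstanceType.of(
                ec2.InstanceClass.BURSTABLE3, ec2.InstanceSize.MEDIUM
            ),
        )


app = App()
SimpleStack(app, "SimpleThreeTier")
app.synth()
```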
And when I talk about the difference between like a reliability engineer and a normal software
engineer is software engineers tend to be very feature focused, you know, they're adding
capabilities to a system. And the goal of and the mission of a reliability team is to focus
on reliability, right? Like that's the primary thing that we're worried about. So what I find
to be the best resume builder is that I can say with a lot of certainty that if you talk to any
teams that I've worked on, they will say that the infrastructure I ran was very reliable. It was
very secure, and it ended up being very scalable because, you know, the way you solve the sort of iteration thing is you just version your infrastructure, right? And I think this works really well.
You just say, hey, this was the way we did it now. And we're going to call that V1. And now
we're going to work on V2. And what should V2 be? And maybe that does need something more
complicated. Maybe you need to bring in Kubernetes. Maybe you need to bring in a super cool reverse
proxy that has all sorts of capabilities that your current one doesn't. Yeah, but by versioning it, it takes away a lot of the interpersonal issues that can happen
where like, hey, we're replacing Jake's infrastructure with Bob's infrastructure or whatever.
I just say it's V1, it's V2, it's V3. And I find that solves a huge
number of the problems with that dynamic.
But yeah, at Bluesky, the big thing that we are focused on is federation; federation is scaling for us, because the idea is not for us to run the entire global infrastructure for AT Proto, which is the protocol that Bluesky is based on.
The idea is that it's this big open thing like the web, right?
Like Netscape popularized the web, but they didn't run every web server.
They didn't run every search engine, right?
They didn't run all the payment stuff.
They just did all of the core stuff. You know, they created SSL, right? Which became TLS. And they did all the things
that were necessary to make the whole system large, federated, and scalable, but they didn't
run it all. And that's exactly the same goal we have. The obvious counter example is no, but then
you take basically their spiritual successor, which is Google, and they build the security.
They run a lot of the servers. They have the search engine. They have the payments infrastructure. And then they turn
a lot of it off for fun and, I would say, profit, except it's the exact opposite of that.
But I digress. I do have a question for you that I love to throw at people whenever they start
talking about how their infrastructure involves autoscaling. And I found this during the pandemic in that a lot of people believed in
their heart of hearts that they were auto-scaling, but people lie mostly to themselves. And you would
look at their daily or hourly spend of their infrastructure and their user traffic dropped
off a cliff and their spend was so flat you could basically eat off of it and set a table on top of
it. If you pull up Cost Explorer
and look through your environment, how large are the peaks and valleys over the course of a given
day or week cycle? Yeah, no, that's a really good point. I think my basic approach right now is that
we're so small, we don't really need to optimize very much for cost. You know, we have this sort of
base level of traffic and it's not worth a huge
amount of engineering time to do a lot of dynamic scaling and things like that. The main benefit we
get from auto scaling groups is really just doing the refresh to replace all of them, right? So we're
also doing the immutable server concept, right, which was popularized by Netflix. And so that's
what we're really getting from auto scaling groups. We're not even doing dynamic scaling,
right? So it's not keyed to some metric, you know, the number of instances that we have at
the app server layer. But the cool thing is you can do that when you're ready for it, right?
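[Sketch: the immutable-server rollover Jake mentions maps to the EC2 Auto Scaling "instance refresh" API; a minimal boto3 call might look like this. The group name and preference values are hypothetical.]

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Roll every instance in the group onto a fresh image/launch template version,
# a batch at a time, instead of mutating running servers in place.
autoscaling.start_instance_refresh(
    AutoScalingGroupName="app-asg",  # hypothetical group name
    Preferences={
        "MinHealthyPercentage": 90,  # keep most of the fleet serving during the roll
        "InstanceWarmup": 120,       # seconds before a new instance counts as healthy
    },
)
```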
The big issue is, you know, okay, you're scaling up your app instances, but is your database
scaling up, right? Because there's not a lot of use in having a whole bunch of app servers if the
database is overloaded. And that tends to be the bottleneck for kind of any complicated kind of
application like ours. So right now,
the bill is very flat. You could eat off it if it wasn't for the CDN traffic and the load balancer
traffic and things like that, which are relatively minor. I just want to stop for a second and marvel
at just how educated that answer was. I talk to a lot of folks who are early stage who come and
ask me about their AWS bills and what sort of things should they concern themselves with.
And my answer tends to surprise them, which is you almost certainly should not unless
things are bizarre and ridiculous.
You are not going to build your way to your next milestone by cutting costs or optimizing
your infrastructure.
The one thing that I would make sure to do is plan for a future of success, which means
having account segregation where it makes
sense, having tags in place so that when, aha, this thing's gotten really expensive, what's
driving all of that can be answered without a six-week research project attached to it.
But those are baseline AWS Hygiene 101. "How do I optimize my bill further?" Usually the right answer is: go build. Don't worry about the small stuff.
What's always disturbing is people have that perspective
and they're spending $300 million a year.
But it turns out that not caring about your AWS bill
was in fact a zero interest rate phenomenon.
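[Sketch: the kind of check this exchange points at, pulling daily cost out of Cost Explorer grouped by a cost-allocation tag, via boto3. The "project" tag key and the dates are made up, and a tag only shows up here after it's been activated for cost allocation in the billing console.]

```python
import boto3

ce = boto3.client("ce")  # Cost Explorer

# Daily unblended cost for a month, grouped by a cost-allocation tag.
response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2023-05-01", "End": "2023-06-01"},
    Granularity="DAILY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "TAG", "Key": "project"}],
)

# Print a day-by-day breakdown so spikes (or suspicious flatness) stand out.
for day in response["ResultsByTime"]:
    for group in day["Groups"]:
        tag_value = group["Keys"][0]
        amount = group["Metrics"]["UnblendedCost"]["Amount"]
        print(day["TimePeriod"]["Start"], tag_value, amount)
```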
Yeah, so we do all of those basic things.
I think I went a little further than many people would
where every single one of our,
so we have different projects, right?
So we have the big graph server, which is sort of like the indexer for the whole network, and we have the PDS, which is the personal data server, which is kind of where all of people's actual social data goes: your likes and your posts and things like that. And then we have a dev, staging, sandbox, prod environment for each one of those, right? And there's more services besides. But the way we have it is those are all in completely separated VPCs with no peering whatsoever between them. They are all on distinct IP addresses, IP ranges, so that we could do VPC peering very easily across all of them.
That's someone who's done data center work before with overlapping IP address ranges and swore never again.
Exactly. I have been burned. I have cleaned up my mess and other people's messes. And there's nothing less fun than renumbering a large,
swore never again. Exactly. That is what I have been burned. I have cleaned up my mess and other people's messes. And there's nothing less fun than renumbering a large,
complicated network. But yeah, so we have all these separate VPCs. And so it's very easy for
us to say, hey, we're going to take this whole stack from here and move it over to a different
region, a different provider. And the other thing that we're doing is we're completely
cloud agnostic, right? I really like AWS. I think they are the market leader for a reason.
They're very reliable.
But we're building this large federated network.
So we're going to need to place infrastructure in places where AWS doesn't exist, for example.
Right.
So we need the ability to take an environment and replicate it in wherever.
And of course, they have very good coverage, but there are places they don't exist.
And that's all made much easier by the fact that we've had this very strong separation of concerns.
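[Sketch: the addressing discipline Jake describes, every environment in its own VPC with a distinct, non-overlapping CIDR so any pair can be peered later, can be sanity-checked with nothing more than the Python standard library. The CIDRs below are invented for illustration.]

```python
import ipaddress
from itertools import combinations

# Hypothetical addressing plan: each service/environment pair gets its own VPC
# with its own range, so any two can be peered without a renumbering project.
plan = {
    "bgs-dev":     "10.10.0.0/16",
    "bgs-staging": "10.11.0.0/16",
    "bgs-prod":    "10.12.0.0/16",
    "pds-dev":     "10.20.0.0/16",
    "pds-staging": "10.21.0.0/16",
    "pds-prod":    "10.22.0.0/16",
}

networks = {name: ipaddress.ip_network(cidr) for name, cidr in plan.items()}

# Fail loudly if any two VPC ranges overlap.
for (a, net_a), (b, net_b) in combinations(networks.items(), 2):
    if net_a.overlaps(net_b):
        raise ValueError(f"{a} ({net_a}) overlaps {b} ({net_b})")

print("All VPC ranges are disjoint; safe to peer any pair.")
```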
I always found it fun that when you had these decentralized projects that were invariably NFT
or cryptocurrency driven over the past five or six years or so, and then AWS would take a US East
One outage in a variety of different and exciting ways. And all these projects would go down hard.
It's, okay, you talk a lot about decentralization, while having hard dependencies on one company in one data center effectively doing something, right?
And it becomes a harder problem in the fullness of time.
There is the counter argument in that when US East 1 is having problems, most of the
internet isn't working.
So does your offering need to be up and running at all costs?
There are some people for whom that answer is very much, yes, people will die if what we're running is not up and running.
Usually a social network is not on that list.
Yeah, one of the things that is surprising, I think, often when I talk about this as a reliability engineer,
is that I think people sometimes over-index on downtime.
You know, they just they think it's a much bigger deal than it is.
You know, I've worked on systems where there was credit card processing where you're losing a million dollars a minute or something. And like in that case, OK, it matters a lot because you can put a real dollar figure on it.
But it's amazing how a few of the bumps in the road we've already had with Bluesky have turned into sort of fun events, right? Like we had a bug in our invite code system where people were getting too many invite codes, and it sort of caused a
problem, but it was a super fun event. We all think back on it fondly, right? And so outages
are not fun, but they're not life and death generally. And if you look at the traffic,
usually what happens is after an outage, traffic tends to go up. And a lot of the people that join, they're just they're talking about the fun outage that they missed because they weren't even on the network.
Right. So it's like, I also like to remind people that eBay for many years used to have like an outage Wednesday, right? Where they could put a huge dollar figure on how much money they lost every Wednesday.
And yet eBay did quite well. Right. Like it's amazing what you can do if you relax the constraints of downtime a little bit. You can do maintenance things that
would be impossible otherwise, which make the whole thing work better the rest of the time,
for example. I mean, it's 2023 and the Social Security Administration's website still has
business hours. They take a nightly four to six hour maintenance window. It's like the last person
out of the office turns off the server or something. I imagine it's some horrifying mainframe job that needs to wind up sweeping after itself or
running some compute jobs. But yeah, for a lot of these use cases, that downtime is absolutely
acceptable. I am curious as to, as you just said, you're building this out with an idea that it
runs everywhere. So you're on AWS right now because, yeah, they are the market leader for a reason.
If I'm building something from scratch,
I'd be hard-pressed not to pick AWS for a variety of reasons.
If I didn't have cloud expertise,
I think I'd be more strongly inclined toward Google,
but that's neither here nor there.
But the problem is, these large cloud providers
have certain economic factors that they all treat similarly
since they're competing with each other.
And that causes me to believe things that
aren't necessarily true.
One of those is that egress bandwidth
to the internet is very expensive.
I've worked in data centers. I know
how 95th percentile commit
bandwidth billing works.
It is not overwhelmingly expensive,
but you can be forgiven for believing that it
is looking at cloud environments.
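[Sketch: how 95th percentile commit billing works in a data center, per Corey's aside: sample utilization every five minutes for a month, discard the top five percent of samples, and bill on the highest value that remains. The sample data and the per-Mbps rate are made up.]

```python
import math
import random

# One month of 5-minute bandwidth samples (Mbps). Random data stands in for
# what the switch counters would actually record.
samples = [random.uniform(50, 400) for _ in range(30 * 24 * 12)]

# 95th percentile billing: sort the samples, throw away the top 5% (the
# bursts), and bill for the highest remaining value.
samples.sort()
index = math.ceil(0.95 * len(samples)) - 1
billable_mbps = samples[index]

rate_per_mbps = 0.50  # hypothetical $/Mbps on the commit
print(f"Billable: {billable_mbps:.0f} Mbps -> ${billable_mbps * rate_per_mbps:,.2f}")
```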
Today, Bluesky does not support animated GIFs, however you want to mispronounce that word.
They don't support embedded videos.
And my immediate thought is, oh, yeah, those things would be super expensive to wind up sharing.
I don't know that that's true.
I don't get the sense that those are major cost drivers.
I think it's more a matter of
complexity than the rest. But how are you making sure that the large cloud provider economic models
don't inherently shape your view of what to build versus what not to build?
Yeah, no, I kind of knew where you were going as soon as you mentioned that, because anyone who's
worked in data centers knows that the bandwidth pricing is out of control. And I think one of the cool things that Cloudflare did is they stopped charging for egress bandwidth in
certain scenarios, which is kind of amazing. And I think it's the other thing that a lot of people
don't realize is that, you know, these network connections tend to be fully symmetric, right?
So if it's a gigabit down, it's also a gigabit up at the same time, right? There's two gigabits
that can be transferred per second. And then the other thing that I find a little bit frustrating
on the public clouds is that they don't really pass on
the compute performance improvements that have happened
over the last few years, right?
Like computers are really fast, right?
So if you look at a provider like Hetzner,
they're giving you these monster machines
for $128 a month or something, right?
And then you go and try to buy that same thing
on one of the public, the big cloud providers,
and the equivalent is 10 times that, right? And then if you add in the bandwidth,
it's another multiple depending on how much you're transferring.
You can get Mac minis on EC2 now, and you do the math out and the Mac mini hardware is paid for in
the first two or three months of spinning that thing up. And yes, there's value in AWS's
engineering and being able to map IAM and EBS to it.
In some use cases, yeah, it's well worth having, but not in every case.
And the economics get very hard to justify for an awful lot of workloads.
Yeah, I mean, to your point, though, about limiting product features and things like that,
one of the goals I have with doing infrastructure at Blue Sky is to not let the infrastructure be a limiter on our product decisions.
And a lot of that means
that we'll put servers on Hetzner, we'll colo servers, things like that. I find that there's
a really good hybrid cloud thing where you use AWS or GCP or Azure, and you use them for your
most critical things, your relatively low bandwidth things, and the things that need to be the most
flexible in terms of region and things like that, and security. And then for these sort of bulk services, pushing a lot of video content,
right, or pushing a lot of images, those things you put in a colo somewhere and you have these
sort of CDN-like servers, and that kind of gives you the best of both worlds. And so,
you know, that's the approach we'll most likely take at Bluesky.
I want to emphasize something you said a minute ago about Cloudflare,
where when they first announced R2, their object store alternative, when it first came out, I did an analysis on this to explain to people just why this was as big as it was.
Let's say you have a one gigabyte file and it blows up and a million people download it over the course of a month.
AWS will come to you with a completely straight face, give you a bill for $65,000 and expect you to pay it.
The exact same pattern with R2 in front of it,
at the end of the month,
you will be faced with a bill for 13 cents rounded up
and you will be expected to pay it.
And something like nine to 12 cents of that initially
would have just been the storage cost on S3
and the single egress fee for it.
The rest is that there was no
egress cost tied to it. Now, is Cloudflare going to let you send petabytes to the internet and not
charge you on a bandwidth basis? Probably not, but they're also going to reach out with an upsell,
and they're going to have a conversation with you of, would you like to transition to our enterprise
plan, which is a hell of a lot better than "I got slashdotted, or whatever the modern version of that is, and here's a surprise bill that's going to cost as much as a Tesla."
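[Sketch: the rough arithmetic behind Corey's comparison. The rates here are indicative S3 internet-egress tiers from around this era, and they vary by region and over time, so treat the dollar figure as an order-of-magnitude estimate rather than a quote.]

```python
# One 1 GB file downloaded a million times is roughly a petabyte of egress.
gb_served = 1 * 1_000_000


def s3_egress_cost(gb: float) -> float:
    # (tier size in GB, $/GB) -- indicative published tiers, an assumption here.
    tiers = [
        (10_240, 0.09),
        (40_960, 0.085),
        (102_400, 0.07),
        (float("inf"), 0.05),
    ]
    cost, remaining = 0.0, gb
    for tier_size, price in tiers:
        used = min(remaining, tier_size)
        cost += used * price
        remaining -= used
        if remaining <= 0:
            break
    return cost


# Lands in the tens of thousands of dollars, the same ballpark as the figure
# above once request fees, storage, and older pricing are layered on. R2
# charges nothing for egress, so the same traffic is basically just storage
# (around $0.015/GB-month) plus request fees.
print(f"S3 egress for ~1 PB: ${s3_egress_cost(gb_served):,.0f}")
print("R2 egress for the same traffic: $0")
```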
Yeah. I mean, I think, I think one of the things that the cloud providers should
hopefully eventually do, and I hope Cloudflare pushes them in this direction,
is to return to the original vision of AWS. When I first started using it in 2006, or whenever it launched, they said this: they're going to lower your bill every so often, you know, as Moore's law makes their costs lower. And that kind of happened a little bit
here and there, but it hasn't happened to the same degree that, you know, I think all of us
hoped it would. And I would love to see a cloud provider, and, you know, Hetzner does this to some degree, but I'd love to see these really big cloud providers that are so great in so many ways,
just pass on the savings of technology to the customer.
So we will use more stuff there.
I think it's a very enlightened viewpoint to just say, hey, we're going to lower the cost,
increase the efficiency,
and then pass it on to customers.
And then they will use more of our services as a result.
And I think Cloudflare is kind of leading the way in there,
which I love. I do need to add something there, because otherwise we're going to get letters, and I don't think we want that, where AWS reps will of course reach out
and say that they have cut prices over a hundred times, and they're going to ignore the fact that
a lot of these were "a service you don't use, in a region you couldn't find on a map if your life depended on it, is now going to be 10% less." Great, but let's look at the general case, where from C3 to C4, if you get the same size instance, it cut the price by a lot; C4 to C5, somewhat; C5 to C6 was effectively no change; and now from C6 to C7, it is six percent more expensive, like for like. And they're making noises about how price performance is
still better. But there are an awful lot of us who say things like, I need 10 of these servers
to live over there. That workload gets more expensive when you start treating it that way.
And maybe the price performance is there, maybe it's not. But it is clear that the bill always
goes down is not true. Yeah. And I think for certain kinds
of organizations, it's totally fine the way that they do it. They do a pretty good job on price
and performance. But for sort of more technical companies, especially, it's just you can see the
gaps there that Hetzner is filling and that co-location is still filling. And I personally,
you know, if I didn't need to do those things, I wouldn't do them, right? But the fact that you need to do them, I think,
says kind of everything. Tired of wrestling with Apache Kafka's complexity and cost?
Feel like you're stuck in a Kafka novel, but with more latency spikes and less existential dread by
at least 10%? You're not alone. What if there was a way to 10x your streaming data performance without having to rob a bank?
Enter Red Panda.
It's not just another Kafka wannabe.
Red Panda powers mission-critical workloads without making your AWS bill look like a phone
number.
And with full Kafka API compatibility, migration is smoother than a fresh jar of peanut butter. Imagine cutting as much as
50% off your AWS bills. With Red Panda, it's not a pipe dream, it's reality. Visit go.redpanda.com
slash duckbill today. Red Panda, because your data infrastructure shouldn't give you Kafka-esque
nightmares. There are so many weird AWS billing stories
that all distill down to you not knowing
this one piece of trivia about how AWS works,
either as a system, as a billing construct,
or as something else.
And there's a reason this has become my career
of tracing these things down.
And sometimes I'll talk to prospective clients
and they'll say, well, what if you don't discover
any misconfigurations like that in our account?
It's, well, you would be the first company I've ever seen where that was not true.
So, honestly, I want to do a case study if we ever do. And I've never had to write that case study, just because that's the tax of not having the forcing function of building in data centers.
There's always this idea that in a data center, you're going to run out of power, space, or capacity at some point.
It's going to force a reckoning.
The cloud has what distills down to infinite capacity.
They can add it faster than you can fill it.
So at some point, it's always just keep adding more things to it.
There's never a, let's clean out all of the cruft story.
And it just accumulates and the bill continues to go up and to the right.
Yeah, I mean, one of the things that they've done so well is handle the provisioning part,
right? Which is kind of what you're getting at there. One of the hardest things in the old days,
before we all used AWS and GCP, is you'd have to sort of requisition hardware and there'd be this
whole process with legal and financing. And there'd be this big lag between the time you need a bunch
more servers in your data center and when you actually have them.
Right. And that's not even counting the time it takes to rack them and get them all networked.
The fact that basically every developer now just gets an unlimited credit card they can just use, that's hugely empowering.
And it's for the benefit of the companies they work for almost all the time.
But it is an uncapped credit card.
I know they actually support controls and things like that.
But in general, the way we treat it.
Not as much as you would think, as it turns out.
But yeah, that's a problem.
Because again, if I want to spin up $65,000 an hour worth of compute right now,
the fact that I can do that is massive.
The fact that I can do that accidentally when I don't intend to is also massive.
Yeah, yeah.
It's very easy to think you're going to spend a certain amount and then, oh, traffic's a lot higher or, oh, I didn't realize when you enable that thing, it charges you an extra fee or something like that. So it's very opaque. It's very complicated. All of these things are the result of just building more and more stuff on top of more and more stuff to support more and more use cases, which is great. But then it does create this very sort of opaque billing problem, which, you know, you're helping companies solve. And I totally get why they need your help.
What's interesting to me about distributed social networks is that I've been using Mastodon for a little bit, and I've started to see some of the challenges around a lot of these things, just from an infrastructure and architecture perspective. Tim Bray, former distinguished engineer at AWS, posted a blog post yesterday.
And okay, if Tim wants to put something up there that he thinks people should read, I
advise people generally read it.
I have yet to find him wasting my time.
And I clicked it and got a "server over resource limits" error. It's like, wow, you're very popular. You wound up getting effectively slashdotted on it.
And he said, no, no, whenever I post a link to Mastodon, 2000 instances all hit it at the same time. And it's, ooh, yeah, the hug
of death. That becomes a challenge. Not to mention the fact that depending upon architecture and
preferences that you make, running a Mastodon instance can be extraordinarily expensive in
terms of storage, just because it'll, by default, attempt to cash everything that it encounters for a period of time. And that gets very heavy very quickly.
Does the AT protocol, A-T protocol, I don't know how you pronounce it officially these days,
take into account the challenges of running infrastructures designed for folks who have
corporate budgets behind them? Or is that really a future problem for us to worry about
when the time comes? No, yeah, that's a core thing that we talked about a lot in the recent
sort of architecture discussions. I mean, they go back quite a ways, but there were some changes
made about six months ago in our thinking. And one of the big things that we wanted to get right
was the ability for people to host their own PDS, which is equivalent to like hosting a WordPress
or something. It's where you post your content. It's where you post your likes and all that kind of thing.
We call it your repository or your repo. But we wanted to make it so that people could self-host
that on a $4, $5, $6 a month droplet on DigitalOcean or wherever, and that not be a problem,
not go down when they got a lot of traffic. And so the architecture of AT Proto in general, but the Bluesky app on AT Proto,
is such that, yeah, you really don't need a lot of resources.
The data is all signed with your cryptographic keys.
Not something you have to worry about as a non-technical user,
but all of the data is authenticated.
That's what it is: authenticated transfer protocol.
And because of that, it doesn't matter where you get the data, right?
So we have this idea of this big indexer that's looking at the entire network called the BGS,
the big graph server.
And you can go to the BGS and get the data that came from somebody's PDS.
And it's just as good as if you got it directly from the PDS.
And that makes it highly cacheable, highly conducive to CDNs and things like that.
So no, we intend to solve that problem entirely.
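[Sketch: the general principle behind "it doesn't matter where you get the data." This is not the actual AT Protocol repository format (which signs commits over a Merkle Search Tree, with different key types); it just shows a signed record being verified regardless of which server handed over the bytes, using the 'cryptography' package.]

```python
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# The account owner signs a record once... (key algorithm chosen only for
# illustration; it is not what AT Proto actually uses)
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()
record = b'{"text": "hello from my PDS", "createdAt": "2023-05-30"}'
signature = private_key.sign(record)


# ...and anyone holding the public key (resolved from the account's DID
# document) can verify it, whether the bytes came from the PDS, the BGS,
# or a cache in front of either.
def verify(data: bytes, sig: bytes) -> bool:
    try:
        public_key.verify(sig, data)
        return True
    except InvalidSignature:
        return False


print(verify(record, signature))                # True
print(verify(record + b"tampered", signature))  # False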
I'm looking forward to seeing how that plays out,
because the idea of self-hosting always kind of appealed to me when I was younger, which is why
when I met my wife, I had a two-bedroom apartment because I lived in Los Angeles, not San Francisco,
and could afford such a thing. And the guest bedroom was always, you know, 10 to 15 degrees
warmer than the rest of the apartment because I had a bunch of quote-unquote servers there,
meaning deprecated desktops that my employer had no use for and said, it's either
going to e-waste or your place if you want some. And okay, why not? I'll build my own cluster at
home. And increasingly over time, I found that it got harder and harder to do things that I liked
and that made sense. I used to have a partial rack in downtown LA where I ran my own mail server, among other things. And when I switched to Google for email solutions, I suddenly
found that I was spending five bucks a month at the time instead of the rack rental. And I was
spending two hours less a week just fighting spam in a variety of different ways. Because that is
where my technical background lives. Being able to not have to think about problems like that and just do the fun part
was great. But I worry about the centralization that that implies. I was opposed to the idea because I didn't want to give Google access to all of my mail. And then I checked and something
like 43% of the people I was emailing were at Gmail hosted addresses. So they already had my email
anyway. What was I really doing by not engaging with them? I worry that self-hosting is going to
become passe. So I love projects that do it in sane and simple ways that don't require
massive amounts of startup capital to get started with.
Yeah, the account portability feature of AT Proto is super, super core.
You can back up all of your data to your phone. The app doesn't do this yet,
but it most likely will in the future. And you can back up all of your data to your phone,
and then you can synchronize it all to another server. So if, for whatever reason,
you're on a PDS instance and it disappears, which is a common problem in the Mastodon world,
it's not really a problem. You just sync all that data to a new PDS and you're back where you are. You didn't lose any
followers. You didn't lose any posts. You didn't lose any likes. And we're also making sure that
this works for non-technical people. So you don't have to host your own PDS, right? That's something
that technical people can self-host if they want to. Non-technical people can just get a host from
anywhere and it doesn't really matter where your host is. But we are absolutely trying to avoid the fate of SMTP and other protocols.
The web itself, right, it's hard to launch a search engine because, first of all, the
bar is billions of dollars a year in investment.
And a lot of websites will only let you crawl them at a high rate if you're actually coming
from a Google IP, right?
They're doing reverse DNS lookups and things like that to verify that you are Google.
And the problem with that is now there's sort of a decentralization problem with search engines that can't be fixed.
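[Sketch: the verification dance Jake describes, as sites commonly implement it: reverse-resolve the client IP, check that the hostname sits under a Google domain, then forward-resolve that hostname and confirm it maps back to the same IP.]

```python
import socket


def is_googlebot(ip: str) -> bool:
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)  # reverse DNS (PTR lookup)
    except socket.herror:
        return False
    if not hostname.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        return socket.gethostbyname(hostname) == ip  # forward-confirm
    except socket.gaierror:
        return False


print(is_googlebot("66.249.66.1"))  # an IP in a published Googlebot range
```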
With AT Proto, it's much easier to scrape all of the PDSs, right? So if you want to crawl all the PDSs out on the AT Proto network, they're designed to
be crawled from day one.
It's all structured data.
We're still working on sort of how you handle rate limits and things like that.
But the idea is that it's very easy to create an index of the entire network, which makes
it very easy to create feed generators, search engines, or any other kind of sort of big
world networking thing out there.
And then without making the PDSs have to be very high power, right?
So they can be low power and still scrapable,
still crawlable.
Yeah, the idea of having portability is super important.
Question I've got, you know,
while I'm talking to you,
we'll turn this into technical support hour as well,
because why not?
I tend to always historically
put my Twitter handle on conference slides.
When I had the first template made,
I used it as soon as it came in.
And there was an extra N in the QuinnyPig username at the bottom.
And of course, someone asked about that during Q&A.
So the answer I gave was, of course, N plus one redundancy.
But great.
If I were to have one domain there today and change it tomorrow, is there a redirect option
in place where someone could go and find that on Blueski and they'll get redirected to where I am now?
Or is it just one of those 404 sucks to be you moments?
Cause I can see validity to both.
Yeah.
So the,
the way we handle it right now is if you have a something.bsky.social
name and you switch it to your own domain or something like that,
we don't yet forward it from the old .bsky.social name,
but that is totally feasible.
It's totally possible.
Like, the way that those are stored in your, what's called your DID record or DID document, is that there's like a list that currently only has one item in general, but it's a list of all of your different names, right?
So you could have different domain names, different subdomain names, and it would all
point back to the same user.
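[Sketch: a hypothetical, heavily trimmed DID document of the kind Jake is describing. Real did:plc documents carry more fields (verification methods, among others), and today the alsoKnownAs list usually holds a single handle; the identifier, second alias, and endpoint below are invented to illustrate the aliasing idea.]

```python
did_document = {
    "id": "did:plc:abc123xyz",  # made-up identifier
    "alsoKnownAs": [
        "at://corey.example.com",   # current canonical handle (hypothetical)
        "at://corey.bsky.social",   # an older alias that could forward to it
    ],
    "service": [
        {
            "id": "#atproto_pds",
            "type": "AtprotoPersonalDataServer",
            "serviceEndpoint": "https://pds.example.com",  # hypothetical host
        }
    ],
}

# Any alias identifies the same account; the handle is just a pointer to the
# DID, so changing handles never orphans followers or posts.
print(did_document["alsoKnownAs"])
```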
And so, yeah,
so basically the idea is that you'll have these aliases and they will forward to the new one,
whatever the current canonical one is. Excellent. That is something that concerns
me because it feels like it's one of those one-way doors in the same way that picking
an email address was a one-way door. I know people who still pay money to their ancient
crappy ISP because they have a few mails that come in once
in a while that are super important. I was fortunate enough to have jumped on the bandwagon
early enough that my vanity domain is 22 years old this year and my email address still works,
which, great. Every once in a while, I still get stuff to, like, variants of my name I haven't used since 2005, and it's usually spam, but every once in a blue moon, it's something important. Like, "Hey, I don't know if you remember me. We went to college together many years ago." It's, holy crap, the world is smaller than we think.
Yeah. I mean, I love that we're using domains. I think that's one of the greatest decisions we
made is that you own your own domain. You're not really stuck in our namespace, right?
Like one of the things with traditional social networks is you're sort of theirdomain.com slash your name, right? And
with the way that AT Proto and Bluesky work is you can go and get a domain name from any registrar.
There's hundreds of them. You know, we like Namecheap. You can go there, you can grab a domain
and you can point it to your account. And if you ever don't like anything, you can change your domain,
you can change which PDS you're on.
It's all completely controlled by you.
And there's really no way we as a company
can do anything to change that.
That's all sort of locked into the way
that the protocol works,
which creates this really great incentive
where if we want to provide you services
or somebody else wants to provide you services,
they just have to compete on doing a really good job.
You're not locked in.
And that's one of my favorite features of the network. I just want to point something out
because you mentioned, oh, we're big fans of Namecheap. I am too, for weird, half-drunk domain registrations on a lark. You're like, why am I poor? It's like, $3,000 a month of my budget goes to domain purchases. Great. But I did a quick whois on the official Bluesky domain, and it's hosted at Route 53,
which is Amazon's, of course, premier database offering. But I'm a big fan of using an enterprise
registrar for enterprise-y things. Wasabi, if I recall correctly, wound up having their primary
domain registered through GoDaddy. And the public domain that their bucket equivalent would serve
data out of got
shut down for 12 hours because some bad actor put something there that shouldn't have been.
And GoDaddy is not an enterprise registrar, despite what they might think. For God's sake,
the word daddy is in their name. Do you really think that's enterprise? Good luck.
So the fact that you have a responsible company handling these central singular points of failure
speaks very well
to just your own implementation of these things, because that's the sort of thing that everyone figures out the second time.
Yeah, yeah. I think there's a big difference between corporate domain registration and corporate DNS and, like, your personal handle on social networking. I think a lot of the consumer sort of domain registries or registrars are great for consumers.
And I think if you're running a big corporate domain, you want to make sure it's transfer locked and there's two-factor authentication and doing all those kind of things right.
Because that is a single point of failure.
You can lose a lot by having your domain taken.
So I agree with you on that.
Oh, absolutely.
I am curious about this to see if it's still the case or not.
Because I haven't checked this in over a year and they did fix it. Okay. As of at least when
we're recording this, which is the end of May, 2023, Amazon's authoritative name servers are
no longer half at Oracle. Good for them. They now have a bunch of Amazon-specific name servers on
them instead of, you know, their competitor that
they clearly despise. Good work. Good work. I really want to thank you for taking the time
to speak with me about how you're viewing these things and honestly giving me a chance to go
ambling down memory lane. If people want to learn more about what you're up to, where's the best
place for them to find you? Yeah, so I'm on Bluesky. It's invite-only. I apologize for that right now. But if you check out bsky.app, you can see how to sign up for the waitlist. And we are
trying to get people on as quickly as possible. And I will, of course, be talking to you there.
And we'll put links to that in the show notes. Thank you so much for taking the time to speak
with me. I really appreciate it. Thanks a lot, Corey. It was great. Jake Gold, infrastructure engineer at Bluesky, slash Blue Ski. I'm cloud economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this
podcast, please leave a five-star review on your podcast platform of choice. Whereas if you've
hated this podcast, please leave a five-star review on your podcast platform of choice,
along with an angry comment that will no doubt result in a surprise $60,000 bill after you post it.
If your AWS bill keeps rising and your blood pressure is doing the same,
then you need the Duckbill Group.
We help companies fix their AWS bill by making it smaller and less horrifying.
The Duckbill Group works for you, not AWS.
We tailor recommendations to your business
and we get to the point.
Visit duckbillgroup.com to get started.