Screaming in the Cloud - Building Reliable Open-Source Social Media with Jake Gold

Episode Date: June 27, 2023

Jake Gold, Infrastructure Engineer at Bluesky, joins Corey on Screaming in the Cloud to discuss his experience helping to build Bluesky and why he's so excited about it. Jake and Corey discuss the major differences when building a truly open-source social media platform, and Jake highlights his focus on reliability. Jake explains why he feels downtime can actually be a huge benefit to reliability engineers, and how he views abstractions based on the size of the team he's working on. Corey and Jake also discuss whether cloud is truly living up to its original promise of lowered costs.

About Jake
Jake Gold leads infrastructure at Bluesky, where the team is developing and deploying the decentralized social media protocol, ATP. Jake has previously managed infrastructure at companies such as Docker and Flipboard, and most recently, he was the founding leader of the Robot Reliability Team at Nuro, an autonomous delivery vehicle company.

Links Referenced:
Bluesky: https://blueskyweb.xyz/
Bluesky waitlist signup: https://bsky.app

Transcript
Starting point is 00:00:00 Hello, and welcome to Screaming in the Cloud, with your host, Chief Cloud Economist at the Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud. Welcome to Screaming in the Cloud. I'm Corey Quinn.
Starting point is 00:00:33 In case folks have missed this, I spent an inordinate amount of time on Twitter over the last decade or so, to the point where my wife, my business partner, and a couple of friends all went in over the holidays and got me a leather-bound set of books titled The Collected Works of Corey Quinn. It turns out that I have over a million words of shitpost on Twitter. If you've also been living in a cave for the last year, you'll notice that Twitter has basically been bought and driven into the ground by the world's saddest manchild. So there's been a bit of a diaspora as far as people trying to figure out where community lives.
Starting point is 00:01:19 Jake Gold is an infrastructure engineer at Blue Sky, which I will continue to be mispronouncing as Blue Ski because that's the kind of person I am, which is, as best I can tell, one of the leading contenders, if not the leading contender, to replace what Twitter was for me. Jake, welcome to the show. Thanks a lot, Corey. Glad to be here. So there's a lot of different angles we can take on this. We can talk about the policy side of it. We can talk about social networks and things we learn watching people in large groups with quasi-anonymity. We can talk about all kinds of different nonsense, but I don't want to do that because I am an old school Linux systems administrator. And I believe you came from the exact same path,
Starting point is 00:01:56 given that as we were making sure that I had the right person on the show, you came into work at a company after I'd left previously. So not only are you good at the whole Linux server thing, you also have seen exactly how good I am not at the Linux server thing. Well, I don't remember there being any problems at Truecar where you worked before me. But yeah, my background is doing Linux systems administration, which turned into sort of Linux programming. And these days we call it site reliability engineering. But yeah, I discovered Linux in the late 90s as a teenager and installing Slackware on 50 floppy disks and things like that. And I just fell in love with the magic of being
able to run a web server. I got a hosting account at my local ISP and I was like, how do they do that? And then I figured out how to do it. I ran Apache and it was like still one of my core memories of getting, you know, httpd running and being able to access it over the internet and tell my friends on IRC. And so I've done a whole bunch of things since then, but that's still like the part that I love the most. The thing that continually surprises me is just when I think I'm out and we've moved into a fully modern world where, oh, all I do is I write code anymore, which I didn't realize I was doing until I realized if you call YAML code, you can get away with anything. And I get dragged, I'm also getting dragged back in. It's the falling back to fundamentals in these weird moments of, yes, yes, immutable everything, infrastructure as code. But when the server's
Starting point is 00:03:19 misbehaving and you want to log in and get your hands dirty, the skill set rears its head yet again. At least that's what I've been noticing, at least as far as I've gone down a number of interesting IoT-based projects lately. Is that something you experience or have you evolved fully and not looked back? Yeah, no, what I try to do is on my personal projects,
I'll use all the latest cool, flashy things, any abstraction you want, I'll try out everything. And then what I do is at work, I kind of have like a one or two year sort of lagging adoption of technologies, like when I've actually shaken them out on my own stuff, then I use them at work. But yeah, I think one of my favorite quotes is: programmers first learn the power of abstraction, then they learn the cost of abstraction, and then they're ready to program. And that's how I view infrastructure, a very similar thing, where, you know, certain abstractions like container orchestration or, you know, things like that can be super powerful if
you need them. But like, you know, that's generally very large companies with lots of teams and things like that. And if you're not that, it pays dividends to not use overly complicated, overly abstracted things. And so that tends to be where I fall out most of the time. I'm sure someone's going to consider this to be heresy, but if I'm tasked with getting a web application up and running in short order, I'm putting it on an old school traditional three-tier architecture. You have a database server, a web server or two, maybe a job server that lives between them, because is it the hotness? No. Is it going to be resume bait? Not really. But you know, it's deterministic as far as where things live. When something breaks, I know where to find it. And you can miss me with the, well, that's not web scale response because yeah, by the time I'm
Starting point is 00:04:55 getting something up overnight to this has to serve the entire internet, there's probably a number of architectural iterations I'm going to be able to go through. The question is, is what am I most comfortable with and what can I get things up and running with that's tried and tested? I'm also remarkably conservative on things like databases and file systems because mistakes at that level are absolutely going to show. Now, I don't know how much you're able to talk about the blue ski infrastructure without getting yelled at by various folks. But how modern versus reliable? I guess that's probably a fair axis to put it on.
Starting point is 00:05:32 Modernity versus reliability. Where on that spectrum does the official blue ski infrastructure land these days? Yeah, so I mean, we're in a fortunate position of being an open source company working on an open protocol. And so we feel very comfortable talking about basically everything. Yeah. And so and I've talked about this a bit on the app. But the basic idea we have right now is we're using AWS, we have auto scaling groups. And those auto scaling groups are just EC2 instances running Docker CE, the community edition for the runtime for containers. And then we have a load balancer in front and a Postgres multi AZ instance in the back on RDS. And it is really, really simple. And when I talk about the difference between like a reliability engineer and a normal software engineer is software engineers tend to be very feature focused, you know, they're adding capabilities to a system. And the goal of and the mission of a reliability team is to focus
on reliability, right? Like that's the primary thing that we're worried about. So what I find to be the best resume builder is that I can say with a lot of certainty that if you talk to any teams that I've worked on, they will say that the infrastructure I ran was very reliable. It was very secure, and it ended up being very scalable because, you know, the way you solve the sort of iteration thing is you just version your infrastructure, right? And I think this works really well. You just say, hey, this was the way we did it now. And we're going to call that V1. And now we're going to work on V2. And what should V2 be? And maybe that does need something more
complicated. Maybe you need to bring in Kubernetes. Maybe you need to bring in a super cool reverse proxy that has all sorts of capabilities that your current one doesn't. Yeah, but by versioning it, it takes away a lot of the interpersonal issues that can happen where like, hey, we're replacing Jake's infrastructure with Bob's infrastructure or whatever. I just say it's V1, it's V2, it's V3. And I find that solves a huge number of the problems with that dynamic. But yeah, at Blue Sky, the big thing that we are focused on is that federation is scaling for us, because the idea is not for us to run the entire global infrastructure for AT Proto, which is the protocol that Blue Sky is based on. The idea is that it's this big open thing like the web, right?
Starting point is 00:07:36 Like Netscape popularized the web, but they didn't run every web server. They didn't run every search engine, right? They didn't run all the payment stuff. They just did all of the core stuff. You know, they created SSL, right? Which became TLS. And they did all the things that were necessary to make the whole system large, federated, and scalable, but they didn't run it all. And that's exactly the same goal we have. The obvious counter example is no, but then you take basically their spiritual successor, which is Google, and they build the security. They run a lot of the servers. They have the search engine. They have the payments infrastructure. And then they turn
Starting point is 00:08:08 a lot of it off for fun and, I would say, profit, except it's the exact opposite of that. But I digress. I do have a question for you that I love to throw at people whenever they start talking about how their infrastructure involves autoscaling. And I found this during the pandemic in that a lot of people believed in their heart of hearts that they were auto-scaling, but people lie mostly to themselves. And you would look at their daily or hourly spend of their infrastructure and their user traffic dropped off a cliff and their spend was so flat you could basically eat off of it and set a table on top of it. If you pull up Cost Explorer and look through your environment, how large are the peaks and valleys over the course of a given
Starting point is 00:08:51 day or week cycle? Yeah, no, that's a really good point. I think my basic approach right now is that we're so small, we don't really need to optimize very much for cost. You know, we have this sort of base level of traffic and it's not worth a huge amount of engineering time to do a lot of dynamic scaling and things like that. The main benefit we get from auto scaling groups is really just doing the refresh to replace all them, right? So we're also doing the immutable server concept, right, which was popularized by Netflix. And so that's what we're really getting from auto scaling groups. We're not even doing dynamic scaling, right? So it's not keyed to some metric, you know, the number of instances that we have at
the app server layer. But the cool thing is you can do that when you're ready for it, right? The big issue is, you know, okay, you're scaling up your app instances, but is your database scaling up, right? Because there's not a lot of use in having a whole bunch of app servers if the database is overloaded. And that tends to be the bottleneck for kind of any complicated kind of application like ours. So right now, the bill is very flat. You could eat off it if it wasn't for the CDN traffic and the load balancer traffic and things like that, which are relatively minor. I just want to stop for a second and marvel at just how educated that answer was. I talk to a lot of folks who are early stage who come and
Starting point is 00:10:01 ask me about their AWS bills and what sort of things should they concern themselves with. And my answer tends to surprise them, which is you almost certainly should not unless things are bizarre and ridiculous. You are not going to build your way to your next milestone by cutting costs or optimizing your infrastructure. The one thing that I would make sure to do is plan for a future of success, which means having account segregation where it makes sense, having tags in place so that when, aha, this thing's gotten really expensive, what's
driving all of that can be answered without a six-week research project attached to it. But those are baseline AWS Hygiene 101. How do I optimize my bill further? Usually the right answer is: go build. Don't worry about the small stuff. What's always disturbing is people have that perspective and they're spending $300 million a year. But it turns out that not caring about your AWS bill was in fact a zero interest rate phenomenon. Yeah, so we do all of those basic things.
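For anyone who wants to run the flat-bill check Corey describes above, here is a minimal sketch that pulls daily spend from the Cost Explorer API. It assumes boto3 is installed, credentials are configured, and Cost Explorer is enabled on the account; the 30-day window is just an example.

```python
# Sketch: pull daily unblended cost for the last 30 days and eyeball the peaks and valleys.
from datetime import date, timedelta

import boto3

ce = boto3.client("ce")  # Cost Explorer

end = date.today()
start = end - timedelta(days=30)

resp = ce.get_cost_and_usage(
    TimePeriod={"Start": start.isoformat(), "End": end.isoformat()},
    Granularity="DAILY",
    Metrics=["UnblendedCost"],
)

daily = [
    (day["TimePeriod"]["Start"], float(day["Total"]["UnblendedCost"]["Amount"]))
    for day in resp["ResultsByTime"]
]

for day, cost in daily:
    print(f"{day}  ${cost:,.2f}")

# If max and min are nearly identical, the "autoscaling" isn't actually scaling anything.
lo, hi = min(c for _, c in daily), max(c for _, c in daily)
print(f"min ${lo:,.2f}  max ${hi:,.2f}  swing {100 * (hi - lo) / (hi or 1):.1f}%")
```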
I think I went a little further than many people would where every single one of our, so we have different projects, right? So we have the big graph server, which is sort of like the indexer for the whole network, and we have the PDS, which is the personal data server, which is kind of where all of people's actual social data goes: your likes and your posts and things like that. And then we have a dev, staging, sandbox, and prod environment for each one of those, right? And there's more services besides. But the way we have it is those are all in completely separated VPCs with no peering whatsoever between them.
They are all on distinct IP addresses, IP ranges, so that we could do VPC peering very easily across all of them. That's someone who's done data center work before with overlapping IP address ranges and swore never again. Exactly. That is where I have been burned. I have cleaned up my mess and other people's messes. And there's nothing less fun than renumbering a large, complicated network. But yeah, so we have all these separate VPCs. And so it's very easy for us to say, hey, we're going to take this whole stack from here and move it over to a different region, a different provider. And the other thing that we're doing is we're completely cloud agnostic, right? I really like AWS. I think they are the market leader for a reason. They're very reliable.
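As a rough illustration of the address-planning discipline Jake is describing, here is a small sketch that carves one supernet into non-overlapping per-environment ranges, so any two VPCs could be peered later without renumbering. The service names and CIDR ranges are invented for the example, not Bluesky's actual layout.

```python
# Sketch: allocate non-overlapping /16s per service/environment from one supernet.
from ipaddress import ip_network
from itertools import product

supernet = ip_network("10.0.0.0/8")          # example RFC 1918 supernet
services = ["bgs", "pds"]                    # example services
environments = ["dev", "staging", "sandbox", "prod"]

blocks = supernet.subnets(new_prefix=16)     # generator of /16 blocks
plan = {f"{svc}-{env}": next(blocks) for svc, env in product(services, environments)}

for name, cidr in plan.items():
    print(f"{name:12} {cidr}")

# Sanity check: nothing overlaps, so VPC peering stays an easy option everywhere.
cidrs = list(plan.values())
assert not any(a.overlaps(b) for i, a in enumerate(cidrs) for b in cidrs[i + 1:])
```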
Starting point is 00:12:08 But we're building this large federated network. So we're going to need to place infrastructure in places where AWS doesn't exist, for example. Right. So we need the ability to take an environment and replicate it in wherever. And of course, they have very good coverage, but there are places they don't exist. And that's all made much easier by the fact that we've had this very strong separation of concerns. I always found it fun that when you had these decentralized projects that were invariably NFT or cryptocurrency driven over the past five or six years or so, and then AWS would take a US East
Starting point is 00:12:41 One outage in a variety of different and exciting ways. And all these projects would go down hard. It's, okay, you talk a lot about decentralization for being, having hard dependencies on one company in one data center effectively doing something right. And it becomes a harder problem in the fullness of time. There is the counter argument in that when US East 1 is having problems, most of the internet isn't working. So does your offering need to be up and running at all costs? There are some people for whom that answer is very much, yes, people will die if what we're running is not up and running.
Starting point is 00:13:15 Usually a social network is not on that list. Yeah, one of the things that is surprising, I think, often when I talk about this as a reliability engineer, is that I think people sometimes over-index on downtime. You know, they just they think it's a much bigger deal than it is. You know, I've worked on systems where there was credit card processing where you're losing a million dollars a minute or something. And like in that case, OK, it matters a lot because you can put a real dollar figure on it. But it's amazing how a few of the bumps in the road we've already had with Blue Sky have turned into sort of fun events, right? Like we had a bug in our invite code system where people were getting too many invite codes and it was sort of caused a problem, but it was a super fun event. We all think back on it fondly, right? And so outages
Starting point is 00:13:56 are not fun, but they're not life and death generally. And if you look at the traffic, usually what happens is after an outage, traffic tends to go up. And a lot of the people that join, they're just they're talking about the fun outage that they missed because they weren't even on the network. Right. So it's like I also like to remind people that eBay for many years used to have like an outage Wednesday. Right. Whereas they could put a huge dollar figure on how much money they lost every Wednesday. And yet eBay did quite well. Right. Like it's amazing what you can do if you relax the constraints of downtime a little bit. You can do maintenance things that would be impossible otherwise, which make the whole thing work better the rest of the time, for example. I mean, it's 2023 and the Social Security Administration's website still has business hours. They take a nightly four to six hour maintenance window. It's like the last person
Starting point is 00:14:42 out of the office turns off the server or something. I imagine it's some horrifying mainframe job that needs to wind up sweeping after itself or running some compute jobs. But yeah, for a lot of these use cases, that downtime is absolutely acceptable. I am curious as to, as you just said, you're building this out with an idea that it runs everywhere. So you're on AWS right now because, yeah, they are the market leader for a reason. If I'm building something from scratch, I'd be hard-pressed not to pick AWS for a variety of reasons. If I didn't have cloud expertise, I think I'd be more strongly inclined toward Google,
Starting point is 00:15:15 but that's neither here nor there. But the problem is, is these large cloud providers have certain economic factors that they all treat similarly since they're competing with each other. And that causes me to believe things that aren't necessarily true. One of those is that egress bandwidth to the internet is very expensive.
Starting point is 00:15:34 I've worked in data centers. I know how 95th percentile commit bandwidth billing works. It is not overwhelmingly expensive, but you can be forgiven for believing that it is looking at cloud environments. Today, BlueSky does not support animated GIFs, however you want to mispronounce that word. They don't support embedded videos.
Starting point is 00:15:53 And my immediate thought is, oh, yeah, those things would be super expensive to wind up sharing. I don't know that that's true. I don't get the sense that those are major cost drivers. I think it's more a matter of complexity than the rest. But how are you making sure that the large cloud provider economic models don't inherently shape your view of what to build versus what not to build? Yeah, no, I kind of knew where you're going as soon as you mentioned that, because anyone who's worked in data centers knows that the bandwidth pricing is out of control. And I think one of the cool things that Cloudflare did is they stopped charging for egress bandwidth in
Starting point is 00:16:28 certain scenarios, which is kind of amazing. And I think it's the other thing that a lot of people don't realize is that, you know, these network connections tend to be fully symmetric, right? So if it's a gigabit down, it's also a gigabit up at the same time, right? There's two gigabits that can be transferred per second. And then the other thing that I find a little bit frustrating on the public clouds is that they don't really pass on the compute performance improvements that have happened over the last few years, right? Like computers are really fast, right?
Starting point is 00:16:53 So if you look at a provider like Hetzner, they're giving you these monster machines for $128 a month or something, right? And then you go and try to buy that same thing on one of the public, the big cloud providers, and that's the equivalent is 10 times that, right? And then if you add in the bandwidth, it's another multiple depending on how much you're transferring. You can get Mac minis on EC2 now, and you do the math out and the Mac mini hardware is paid for in
the first two or three months of spinning that thing up. And yes, there's value in AWS's engineering and being able to map IAM and EBS to it. In some use cases, yeah, it's well worth having, but not in every case. And the economics get very hard to justify for an awful lot of use cases. Yeah, I mean, to your point, though, about limiting product features and things like that, one of the goals I have with doing infrastructure at Blue Sky is to not let the infrastructure be a limiter on our product decisions. And a lot of that means that we'll put servers on Hetzner, we'll colo servers for things like that. I find that there's
Starting point is 00:17:50 a really good hybrid cloud thing where you use AWS or GCP or Azure, and you use them for your most critical things, your relatively low bandwidth things, and the things that need to be the most flexible in terms of region and things like that, and security. And then for these sort of bulk services, pushing a lot of video content, right, or pushing a lot of images, those things you put in a colo somewhere and you have these sort of CDN-like servers, and that kind of gives you the best of both worlds. And so, you know, that's the approach we'll most likely take at Blue Sky. I want to emphasize something you said a minute ago about Cloudflare, where when they first announced R2, their object store alternative, when it first came out, I did an analysis on this to explain to people just why this was as big as it was.
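Corey's 95th percentile remark a few turns up is worth unpacking for cloud-native readers: colo bandwidth is usually billed on a committed rate at the 95th percentile of sampled throughput, not per gigabyte. Here is a toy comparison of the two models; the traffic samples, the $/Mbps commit price, and the $/GB egress rate are all invented for the illustration, and real contracts and list prices vary.

```python
# Toy comparison: 95th percentile (burstable) billing vs. per-GB cloud egress.
import random

random.seed(42)

# Pretend we sampled outbound throughput (Mbps) every 5 minutes for a 30-day month.
samples = sorted(random.uniform(100, 400) for _ in range(30 * 24 * 12))

# 95th percentile billing: discard the top 5% of samples, bill the highest remaining rate.
p95_mbps = samples[int(len(samples) * 0.95) - 1]
price_per_mbps = 0.50  # assumed commit price, $/Mbps/month
colo_bill = p95_mbps * price_per_mbps

# Per-GB billing: integrate the same samples into total data transferred.
total_gb = sum(samples) * 300 / 8 / 1000  # Mbps * 300s = megabits; /8 -> MB; /1000 -> GB
per_gb_rate = 0.09  # assumed cloud egress price, $/GB
cloud_bill = total_gb * per_gb_rate

print(f"95th percentile: {p95_mbps:,.0f} Mbps -> ${colo_bill:,.2f}/month")
print(f"Total transfer:  {total_gb:,.0f} GB   -> ${cloud_bill:,.2f}/month")
```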
Starting point is 00:18:33 Let's say you have a one gigabyte file and it blows up and a million people download it over the course of a month. AWS will come to you with a completely straight face, give you a bill for $65,000 and expect you to pay it. The exact same pattern with R2 in front of it, at the end of the month, you will be faced with a bill for 13 cents rounded up and you will be expected to pay it. And something like nine to 12 cents of that initially would have just been the storage cost on S3
Starting point is 00:19:02 and the single egress fee for it. The rest is there was no egress cost tied to it. Now, is Cloudflare going to let you send petabytes to the internet and not charge you on a bandwidth basis? Probably not, but they're also going to reach out with an upsell, and they're going to have a conversation with you of, would you like to transition to our enterprise plan, which is a hell of a lot better than I got slash dotted or whatever the modern version of that is. And here's a surprise bill that's going to cost as much as a Tesla. Yeah. I mean, I think, I think one of the things that the cloud providers should
hopefully eventually do, and I hope Cloudflare pushes them in this direction, is to get back to the original vision of AWS when I first started using it in 2006 or whenever it launched. And they said this, they said they're going to lower your bill every so often, you know, as Moore's law makes their bill lower. And that kind of happened a little bit here and there, but it hasn't happened to the same degree that, you know, I think all of us hoped it would. And I would love to see a cloud provider, and, you know, Hetzner does this to some degree, but I'd love to see these really big cloud providers that are so great in so many ways,
Starting point is 00:20:08 just pass on the savings of technology to the customer. So we will use more stuff there. I think it's a very enlightened viewpoint is to just say, hey, we're going to lower the cost, increase the efficiency, and then pass it on to customers. And then they will use more of our services as a result. And I think Cloudflare is kind of leading the way in there,
which I love. I do need to add something there, because otherwise we're going to get letters, and I don't think we want that, where AWS reps will of course reach out and say that they have cut prices over a hundred times, and they're going to ignore the fact that a lot of these were a service you don't use, in a region you couldn't find on a map if your life depended on it, now being 10% less. Great, but let's look at the general case: from C3 to C4, if you get the same size instance, it cut the price by a lot; C4 to C5, somewhat; C5 to C6 was effectively no change; and now from C6 to C7, it is six percent more expensive like-for-like. And they're making noises about how price performance is still better. But there are an awful lot of us who say things like, I need 10 of these servers
Starting point is 00:21:11 to live over there. That workload gets more expensive when you start treating it that way. And maybe the price performance is there, maybe it's not. But it is clear that the bill always goes down is not true. Yeah. And I think for certain kinds of organizations, it's totally fine the way that they do it. They do a pretty good job on price and performance. But for sort of more technical companies, especially, it's just you can see the gaps there where that Hetzner is filling and that co-location is still filling. And I personally, you know, if I didn't need to do those things, I wouldn't do them, right? But the fact that you need to do them, I think, says kind of everything. Tired of wrestling with Apache Kafka's complexity and cost?
Starting point is 00:21:53 Feel like you're stuck in a Kafka novel, but with more latency spikes and less existential dread by at least 10%? You're not alone. What if there was a way to 10x your streaming data performance without having to rob a bank? Enter Red Panda. It's not just another Kafka wannabe. Red Panda powers mission-critical workloads without making your AWS bill look like a phone number. And with full Kafka API compatibility, migration is smoother than a fresh jar of peanut butter. Imagine cutting as much as 50% off your AWS bills. With Red Panda, it's not a pipe dream, it's reality. Visit go.redpanda.com
Starting point is 00:22:35 slash duckbill today. Red Panda, because your data infrastructure shouldn't give you Kafka-esque nightmares. There are so many weird AWS billing stories that all distill down to you not knowing this one piece of trivia about how AWS works, either as a system, as a billing construct, or as something else. And there's a reason this has become my career of tracing these things down.
And sometimes I'll talk to prospective clients and they'll say, well, what if you don't discover any misconfigurations like that in our account? It's, well, you would be the first company I've ever seen where that was not true. So honestly, I want to do a case study if we do. And I've never had to write that case study, just because it's the tax on not having the forcing function of building in data centers. There's always this idea that in a data center, you're going to run out of power, space, or capacity at some point. It's going to force a reckoning.
Starting point is 00:23:28 The cloud has what distills down to infinite capacity. They can add it faster than you can fill it. So at some point, it's always just keep adding more things to it. There's never a, let's clean out all of the cruft story. And it just accumulates and the bill continues to go up and to the right. Yeah, I mean, one of the things that they've done so well is handle the provisioning part, right? Which is kind of what you're getting at there. One of the hardest things in the old days, before we all used AWS and GCP, is you'd have to sort of requisition hardware and there'd be this
Starting point is 00:23:57 whole process with legal and financing. And there'd be this big lag between the time you need a bunch more servers in your data center and when you actually have them. Right. And that's not even counting the time it takes to rack them and get them all networked. The fact that basically every developer now just gets an unlimited credit card they can just use, that's hugely empowering. And it's for the benefit of the companies they work for almost all the time. But it is an uncapped credit card. I know they actually support controls and things like that. But in general, the way we treat it.
Not as much as you would think, as it turns out. But yeah, that's a problem. Because again, if I want to spin up $65,000 an hour worth of compute right now, the fact that I can do that is massive. The fact that I can do that accidentally when I don't intend to is also massive. Yeah, yeah. It's very easy to think you're going to spend a certain amount and then, oh, traffic's a lot higher or, oh, I didn't realize when you enable that thing, it charges you an extra fee or something like that. So it's very opaque. It's very complicated. All of these things are the result of just building more and more stuff on top of more and more stuff to support more and more use cases, which is great. But then it does create this very sort of opaque billing problem, which, you know, you're helping companies solve. And I totally get why they need your help. What's interesting to me about distributed social networks is that I've been using Mastodon for a little bit, and I've started to see some of the challenges around a lot of these things, just from an infrastructure and architecture perspective. Tim Bray, former distinguished engineer at AWS, posted a blog post yesterday.
And okay, if Tim wants to put something up there that he thinks people should read, I advise people to generally read it. I have yet to find him wasting my time. And I clicked it and got a "server over resource limits" error. It's like, wow, you're very popular. You wound up getting effectively slashdotted. And he said, no, no, whenever I post a link to Mastodon, 2,000 instances all hit it at the same time. And it's, ooh, yeah, the hug of death. That becomes a challenge. Not to mention the fact that depending upon architecture and
preferences that you make, running a Mastodon instance can be extraordinarily expensive in terms of storage, just because it'll, by default, attempt to cache everything that it encounters for a period of time. And that gets very heavy very quickly. Does the AT protocol, A-T protocol, I don't know how you pronounce it officially these days, take into account the challenges of running infrastructures designed for folks who have corporate budgets behind them? Or is that really a future problem for us to worry about when the time comes? No, yeah, that's a core thing that we talked about a lot in the recent sort of architecture discussions. I mean, they go back quite a ways, but there were some changes made about six months ago in our thinking. And one of the big things that we wanted to get right
was the ability for people to host their own PDS, which is equivalent to like hosting a WordPress or something. It's where you post your content. It's where you post your likes and all that kind of thing. We call it your repository or your repo. But we wanted to make it so that people could self-host that on a $4, $5, $6 a month droplet on DigitalOcean or wherever, and that not be a problem, not go down when they got a lot of traffic. And so the architecture of AT Proto in general, but the Blue Sky app on AT Proto, is such that, yeah, you really don't need a lot of resources. The data is all signed with your cryptographic keys.
Not something you have to worry about as a non-technical user, but all of the data is authenticated. That's why it's the authenticated transfer protocol. And because of that, it doesn't matter where you get the data, right? So we have this idea of this big indexer that's looking at the entire network called the BGS, the big graph server. And you can go to the BGS and get the data that came from somebody's PDS. And it's just as good as if you got it directly from the PDS.
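To make the "it doesn't matter where you get the data" idea concrete, here is a minimal sketch of signing and verifying a record with a keypair. It uses Ed25519 from the Python cryptography package purely as an illustration; it is not the actual AT Protocol record format, key type, or tooling.

```python
# Illustration only: a signed record can be fetched from any host (origin PDS, BGS
# index, CDN cache) and still be verified against the author's public key.
import json

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# The account holder's keypair (in reality this is tied to the user's identity).
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

record = {"type": "post", "text": "hello from my PDS", "createdAt": "2023-05-30T00:00:00Z"}
record_bytes = json.dumps(record, sort_keys=True).encode()

signature = private_key.sign(record_bytes)

# Later, a consumer gets (record_bytes, signature) from wherever is convenient.
try:
    public_key.verify(signature, record_bytes)
    print("signature valid: the content is authentic no matter where it was fetched")
except InvalidSignature:
    print("signature invalid: the content was altered or is not from this account")
```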
Starting point is 00:27:34 And that makes it highly cacheable, highly conducive to CDNs and things like that. So no, we intend to solve that problem entirely. I'm looking forward to seeing how that plays out, because the idea of self-hosting always kind of appealed to me when I was younger, which is why when I met my wife, I had a two-bedroom apartment because I lived in Los Angeles, not San Francisco, and could afford such a thing. And the guest bedroom was always, you know, 10 to 15 degrees warmer than the rest of the apartment because I had a bunch of quote-unquote servers there, meaning deprecated desktops that my employer had no use for and said, it's either
Starting point is 00:28:09 going to e-waste or your place if you want some. And okay, why not? I'll build my own cluster at home. And increasingly over time, I found that it got harder and harder to do things that I liked and that made sense. I used to have a partial rack in downtown LA where I ran my own mail server, among other things. And when I switched to Google for email solutions, I suddenly found that I was spending five bucks a month at the time instead of the rack rental. And I was spending two hours less a week just fighting spam in a variety of different ways. Because that is where my technical background lives. Being able to not have to think about problems like that and just do the fun part was great. But I worry about the centralization that that implies. I was opposed to it at the idea because I didn't want to give Google access to all of my mail. And then I checked and something
like 43% of the people I was emailing were at Gmail hosted addresses. So they already had my email anyway. What was I really doing by not engaging with them? I worry that self-hosting is going to become passe. So I love projects that do it in sane and simple ways that don't require massive amounts of startup capital to get started with. Yeah, the account portability feature of AT Proto is super, super core. You can back up all of your data to your phone. The app doesn't do this yet, but it most likely will in the future. And you can back up all of your data to your phone, and then you can synchronize it all to another server. So for whatever reason,
Starting point is 00:29:38 you're on a PDS instance and it disappears, which is a common problem in the Mastodon world, it's not really a problem. You just sync all that data to a new PDS and you're back where you are. You didn't lose any followers. You didn't lose any posts. You didn't lose any likes. And we're also making sure that this works for non-technical people. So you don't have to host your own PDS, right? That's something that technical people can self-host if they want to. Non-technical people can just get a host from anywhere and it doesn't really matter where your host is. But we are absolutely trying to avoid the fate of SMTP and other protocols. The web itself, right, it's hard to launch a search engine because, first of all, the bar is billions of dollars a year in investment.
And a lot of websites will only let you crawl them at a high rate if you're actually coming from a Google IP, right? They're doing reverse DNS lookups and things like that to verify that you are Google. And the problem with that is now there's sort of a centralization of search engines that can't be fixed. With AT Proto, it's much easier to scrape all of the PDSs, right? So if you want to crawl all the PDSs out on the AT Proto network, they're designed to be crawled from day one.
Starting point is 00:30:42 It's all structured data. We're working on sort of handling how you handle rate limits and things like that still. But the idea is that it's very easy to create an index of the entire network, which makes it very easy to create feed generators, search engines, or any other kind of sort of big world networking thing out there. And then without making the PDSs have to be very high power, right? So they can be low power and still scrapable, still crawlable.
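As a sketch of what "designed to be crawled from day one" can look like from the consumer side, here is a toy crawler that pages through records from a list of hosts with a crude rate limit. The hosts, the endpoint path, and the response fields are placeholders invented for the example, not the real XRPC method names.

```python
# Toy crawler sketch: page through structured records from each (hypothetical) PDS host.
import json
import time
import urllib.parse
import urllib.request

PDS_HOSTS = ["pds.example.com", "pds.example.org"]  # placeholder hosts
REQUESTS_PER_SECOND = 2  # assumed polite crawl rate


def fetch_page(host, cursor=None):
    params = {"limit": "100"}
    if cursor:
        params["cursor"] = cursor
    url = f"https://{host}/records?{urllib.parse.urlencode(params)}"  # placeholder path
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def crawl(host):
    records, cursor = [], None
    while True:
        page = fetch_page(host, cursor)
        records.extend(page.get("records", []))
        cursor = page.get("cursor")
        if not cursor:
            return records
        time.sleep(1 / REQUESTS_PER_SECOND)  # crude client-side rate limit


if __name__ == "__main__":
    index = {host: crawl(host) for host in PDS_HOSTS}
    print({host: len(recs) for host, recs in index.items()})
```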
Starting point is 00:31:08 Yeah, the idea of having portability is super important. Question I've got, you know, while I'm talking to you, we'll turn this into technical support hour as well, because why not? I tend to always historically put my Twitter handle on conference slides. When I had the first template made,
I used it as soon as it came in. And there was an extra N in the QuinnyPig username at the bottom. And of course, someone asked about that during Q&A. So the answer I gave was, of course, N plus one redundancy. But great. If I were to have one domain there today and change it tomorrow, is there a redirect option in place where someone could go and find that on Blueski and they'll get redirected to where I am now? Or is it just one of those 404, sucks to be you moments?
Cause I can see validity to both. Yeah. So the, the way we handle it right now is if you have a something.bsky.social name and you switch it to your own domain or something like that, we don't yet forward it from the old .bsky.social name, but that is totally feasible. It's totally possible.
Like the way that those are stored in your, what's called your DID record or DID document, is that there's a list that currently only has one item in general, but it's a list of all of your different names, right? So you could have different domain names, different subdomain names, and it would all point back to the same user. And so, yeah, so basically the idea is that you'll have these aliases and they will forward to the new one, whatever the current canonical one is. Excellent. That is something that concerns
Starting point is 00:32:33 me because it feels like it's one of those one-way doors in the same way that picking an email address was a one-way door. I know people who still pay money to their ancient crappy ISP because they have a few mails that come in once in a while that are super important. I was fortunate enough to have jumped on the bandwagon early enough that my vanity domain is 22 years old this year and my email address still works, which great. Every once in a while, I still get stuff to like variants of my name. I don't go use anymore since 2005 and it's usually spam, but every once in a blue moon, it's something important. Like, Hey, I don't remember. We went to college together many years ago. It's holy crap. The world is smaller than we think.
Starting point is 00:33:14 Yeah. I mean, I love that we're using domains. I think that's one of the greatest decisions we made is, is that you own your own domain. You're not really stuck as a, in our namespace, right? Like one of the things with traditional social networks is you're sort of theirdomain.com slash your name, right? And with the way that AppProto and Blue Sky work is you can go and get a domain name from any registrar. There's hundreds of them. You know, we like Namecheap. You can go there, you can grab a domain and you can point it to your account. And if you ever don't like anything, you can change your domain, you can change which PDS you're on. It's all completely controlled by you.
Starting point is 00:33:49 And there's really no way we as a company can do anything to change that. That's all sort of locked into the way that the protocol works, which creates this really great incentive where if we want to provide you services or somebody else wants to provide you services, they just have to compete on doing a really good job.
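The "list of all of your different names" Jake mentioned a few turns up lives in the account's DID document. Here is a simplified sketch of the shape; the alsoKnownAs field name follows the W3C DID convention, but the identifiers, the service entry, and the "first entry is canonical" rule are assumptions made for the illustration rather than the exact AT Protocol layout.

```python
# Simplified sketch of a DID document carrying handle aliases (values are made up).
did_document = {
    "id": "did:plc:exampleonly123",
    "alsoKnownAs": [
        "at://corey.example.com",   # current canonical handle: a domain you own
        "at://corey.bsky.social",   # older alias that could keep forwarding
    ],
    "service": [
        {"id": "#pds", "type": "PersonalDataServer",
         "serviceEndpoint": "https://pds.example.com"},
    ],
}


def canonical_handle(doc):
    # Assumed convention for this sketch: the first alsoKnownAs entry is canonical.
    return doc["alsoKnownAs"][0].removeprefix("at://")


print(canonical_handle(did_document))  # -> corey.example.com
```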
Starting point is 00:34:03 You're not locked in. And that's one of my favorite features of the network. I just want to point something out because you mentioned, oh, we're big fans of Namecheap. I am too for weird half-drunk domain registrations on a lark. You're like, why am I poor? It's like $3,000 a month. My budget goes to domain purchases. Great. But I did a quick who is on the official Blue Sky domain, and it's hosted at Route 53, which is Amazon's, of course, premier database offering. But I'm a big fan of using an enterprise registrar for enterprise-y things. Wasabi, if I recall correctly, wound up having their primary domain registered through GoDaddy. And the public domain that their bucket equivalent would serve
data out of got shut down for 12 hours because some bad actor put something there that shouldn't have been. And GoDaddy is not an enterprise registrar, despite what they might think. For God's sake, the word daddy is in their name. Do you really think that's enterprise? Good luck. So the fact that you have a responsible company handling these central singular points of failure speaks very well to just your own implementation of these things, because that's the sort of thing that everyone figures out the second time. Yeah, yeah. I think, I think there's a big difference between corporate
Starting point is 00:35:15 domain registration and corporate dns and like your personal handle on social networking i think a lot of the consumers sort of domain registries or registrars are great for consumers. And I think if you're running a big corporate domain, you want to make sure it's transfer locked and there's two-factor authentication and doing all those kind of things right. Because that is a single point of failure. You can lose a lot by having your domain taken. So I agree with you on that. Oh, absolutely. I am curious about this to see if it's still the case or not.
Because I haven't checked this in over a year and they did fix it. Okay. As of at least when we're recording this, which is the end of May 2023, Amazon's authoritative name servers are no longer half on Oracle. Good for them. They now have a bunch of Amazon-specific name servers on them instead of, you know, their competitor that they clearly despise. Good work. Good work. I really want to thank you for taking the time to speak with me about how you're viewing these things and honestly giving me a chance to go ambling down memory lane. If people want to learn more about what you're up to, where's the best place for them to find you? Yeah, so I'm on Blue Sky. It's invite only. I apologize for that right now. But if you check out bsky.app, you can see how to sign up for the waitlist. And we are
Starting point is 00:36:29 trying to get people on as quickly as possible. And I will, of course, be talking to you there. And we'll put links to that in the show notes. Thank you so much for taking the time to speak with me. I really appreciate it. Thanks a lot, Corey. It was great. Jake Gold, infrastructure engineer at BlueSky slash BlueSky. I'm cloud economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice. Whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry comment that will no doubt result in a surprise $60,000 bill after you post it. If your AWS bill keeps rising and your blood pressure is doing the same,
Starting point is 00:37:14 then you need the Duck Bill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duck Bill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
