Two's Complement - Squashing Compilers

Episode Date: August 10, 2025

Matt uses you as his therapist to vent about three days fighting systemd and boot time. Ben patiently listens while Matt explains why mounting things shouldn't consume 200% CPU. AWS sponsorship news provides a silver lining.

Transcript
Starting point is 00:00:00 There are no minions. ...the intro. I was like, oh, because I was chatting to you before pre-show and then, yeah, got distracted. How long could you go not saying that before people would just be like, is this thing broken? Are we on the right podcast? Is this the right podcast? I don't even know. I have a funny story about that. Maybe we've already said it, but one of my pals, one of my C++ pals, his name is Ben, Ben Dean, right? And he's also British, and, you know, and so like he was listening to the first episode of the podcast and he just nearly fell off his chair, because the first thing I do in the intro is say, hi Ben. And he's like, how... what? I mean, I'm sure that other people
Starting point is 00:01:03 called Ben exist, I appreciate that, but like, it was just him telling the story that made me laugh. So it's like, did I forget to leave a call or a Google Meet somewhere, and now someone is just talking to me? Yeah, hey, you left your phone on, you bum-dialed me. Yeah, exactly. Yeah, we've all been there, shouting as loud as we can, trying to get someone's attention. But yeah, not that. But anyway, yeah, I do wonder how long we could get away with it. But it's what we do. It's what we say. Oh, another thing. I met somebody who is a listener, or our listener. The listener. You met the listener. And it occurred to me, that's about the fourth person now who's told me that they are our listener.
Starting point is 00:01:45 Oh, so it's four listeners. We might have to accept that we have more than one now. It's so funny, given how the internet is all about tracking, and every, you know, YouTube video tells you how many people have watched it and all that kind of good stuff, it's really hard to know how many listeners there really are out there. Yes, yes. It seems broken because of one of the, you know, classic problems of computer science: cache invalidation. Yeah, yeah, everyone wants to cache your... I mean, I suppose that's the thing: like, videos, nobody would want to cache, you know, gigabytes of videos. But you're like, hey, a couple of meg of MP3 file.
Starting point is 00:02:27 Sure, Spotify, we'll take a copy of that and we'll hand it out to everyone when they press play. Yeah. And then, yeah, then we get you to try and sign up for our Spotify podcast publisher account that then lets you see how many plays you've had on Spotify. That's great. But you can also listen to it on Apple and Google and YouTube and off of our website. Oh, and you can pay... there are these places that will, you know, charge you decent amounts of money to tell you how many people listen to your podcast. But I'm just like, somebody should write a web scraper that goes to all of these places and gets the numbers from more than one place, right? And by somebody, I mean probably an LLM that I could just tell to do that thing.
Starting point is 00:03:10 Anyway, that's not what we're here to talk about today. No, we're not. What are we talking about today? We're talking about the last three days of my life, which have been supposedly preparing for conference presentations, which I have in September. I've got three conference presentations coming up, which is great. And while you're sort of writing slides and throwing out ideas,
Starting point is 00:03:34 you're kind of like, well, I want a background window that I can kind of just tap into occasionally. And there's tons of stuff in Compiler Explorer which is just like, hey, run this thing, wait for it to finish, have a look at the output, see if it makes sense, push it to production if it does, you know, updating libraries, that kind of good stuff. And so we've hit August now, which I believe will actually be when this podcast comes out, in a rare departure from the norm. But we've hit August, and I've long since had a calendar reminder to say, hey, we should really upgrade to Ubuntu 24 for all of the production nodes for Compiler Explorer. And we tend to drag our feet a little bit because we've been bitten before.
Starting point is 00:04:14 This is where we should see a big... oh boy. Yeah, a big foreshadowing thing should pop up here. And in fact, 10 years ago, we were bitten by an issue. I was doing some research, trying to work out
Starting point is 00:04:28 what happened, and I found my own blog post from 10 years ago. Oh yeah. You know that kind of thing where you're like, oh, who had this problem? Oh, I had this problem. Former me. Former me. So anyway,
Starting point is 00:04:41 you know, how hard can it be to upgrade the operating system? We have everything scripted up the wazoo. We use Packer to build our images, all of our images, you know, apt install all the things they need, and then they shut themselves down. Great. And then, you know, we make an image, a machine image, out of that. And that's what starts up and is a Compiler Explorer. It's pretty straightforward. And we do this fairly often because if you don't, then it rots; as you know, you're relying on a ton of stuff that's, you know, install this from this random website.
Starting point is 00:05:15 So anyway, that wasn't a problem. So in theory, it was just change a 22 to a 24 and then rerun the packer. And it failed the first time. Yeah. For an easy to diagnose reason, luckily, I'd already hit this before because I started a 24 upgrade for a different part of the system and hit it. I was like, oh, I must remember this. And of course, completely failed to remember to do it this time round.
Starting point is 00:05:42 Which was to do with the way AppArmor has been updated, and some of the jailing things that we do. You know, obviously we run things in secure environments, and AppArmor has opinions about which things can write, which can run. And certainly our own jailing code needs to be configured so that it can do the things that are usually dodgy. You know, like, hey, I want to make a whole new namespace. I want to make all of these sort of isolated environments. It's like, that shouldn't just be allowed to happen randomly. So AppArmor comes in and tells you, no, you can't do that. And then Compiler Explorer doesn't work.
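(A quick aside for the curious: this class of failure is easy to probe. Ubuntu 24.04 ships with AppArmor restricting unprivileged user namespaces by default, so a bare unshare(CLONE_NEWUSER), the first thing most sandboxing code does, can fail where it succeeded on 22.04. This is a hedged sketch, not Compiler Explorer's actual jailing code; it's Linux-only.)

```cpp
// userns_probe.cpp -- check whether unprivileged user-namespace creation works.
// Build and run as a normal user: g++ userns_probe.cpp -o userns_probe && ./userns_probe
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <sched.h>

int main() {
    // Jailing code typically starts by making a fresh user namespace.
    if (unshare(CLONE_NEWUSER) != 0) {
        std::printf("user namespace denied: %s\n", std::strerror(errno));
        return 1;
    }
    std::puts("user namespace created fine");
    return 0;
}
```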
Starting point is 00:06:16 So we fixed that. Cool. Deployed. You know, anecdotally, it took a while to start up. And I was like, ah, watched pot, you know, never boils kind of thing. But to sort of make this slightly less of a shaggy dog story than it already is going to be, it turns out that our boot-up time went from a not-great couple of minutes to, you know, four or five minutes; it was timing things out. Yeah. And even more interestingly, while the node was booting up, it was unresponsive. Like, SSH would
Starting point is 00:06:52 time out. And I'm like, what is going on here? What on earth could it be doing at startup that could, you know, bring it to its knees like that? So what it ended up being is that at startup we mount around 2,000 squashfs images. And I don't know if we've talked about this before on the podcast, but Compiler Explorer has a very unusual... Yeah, okay, so let's do a bit of an interview. This is definitely a therapy session, by the way. Thank you for being my therapy. And thank you, listener, for being... uh, well, yeah, you spent three days of your life on this. You deserve to be able to vent about it. I'm just, yeah, I'm going to feel better. Catharsis, yes. Yeah, uh-huh. So Compiler Explorer has many compilers.
Starting point is 00:07:43 They are immutable, because once I've installed GCC 12.1, I never need to do anything with it again. Again, with a massive footnote that sometimes we break things and we have to redo it, whatever. But they also need to be shared amongst up to about 40 different machines. And that constitutes about two and a half, three terabytes of binary files. Yeah, yeah. There's just not really a good solution for sharing that amount of data with the kind of access patterns that are, well, it's a compiler, it's an executable, I'm going to run it from NFS or wherever you're storing it.
Starting point is 00:08:20 So, you know, the first thing... when we first started out with Compiler Explorer, every node... all the compilers were built into that AMI image. So the image that I said I was building from 22 to 24 would actually contain the compilers. They were actually apt installed at one stage. And that was okay, but it does not scale. Right. Because, you know, effectively every build gets slightly slower than the previous builds as more and more compiler images get sort of unpacked onto it.
Starting point is 00:08:48 And once it took more than 24 hours to complete the AMI image, I knew that we were in trouble, and I thought, we have to come up with a different solution. So we used NFS, and NFS was great for a while. We hit some funny performance issues with NFS just out of the gate, but ultimately we got that kind of settled down. And then things like Boost, which is a C++ header-only library, makes a lot of use of the fact that you can include a file from itself, you know, you can self-reference the file. And so it has a bunch of preprocessor trickery that includes the file multiple times and hash-defines something to be, you know, N plus 1. So it can, like, include the same file 50 times to get expansions of certain things that you can't do in the template system.
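(To make the self-inclusion trick concrete, here is a minimal sketch. It is not Boost's actual machinery, which is far more elaborate; the file name and the DECLARE macro are invented for illustration.)

```cpp
// repeat.hpp -- deliberately has no include guard: it includes *itself*.
// Each inclusion bumps PASS and expands DECLARE once, so one #include of
// this header generates DECLARE(1), DECLARE(2), DECLARE(3).
#if !defined(PASS)
#  define PASS 1
#elif PASS == 1
#  undef PASS
#  define PASS 2
#elif PASS == 2
#  undef PASS
#  define PASS 3
#endif

DECLARE(PASS)

#if PASS < 3
#  include "repeat.hpp"   // the self-inclusion trick
#endif
```

```cpp
// main.cpp
#define CAT_(a, b) a##b
#define CAT(a, b) CAT_(a, b)   // indirection so PASS expands to 1, 2, 3 before pasting
#define DECLARE(n) int CAT(value_, n) = n;
#include "repeat.hpp"          // expands to: int value_1 = 1; int value_2 = 2; int value_3 = 3;

int main() { return value_1 + value_2 + value_3; }
```

Multiply that by hundreds of headers and iteration counts in the dozens, and you get the storm of tiny reads described next.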
Starting point is 00:09:35 Those kinds of tricks, right? Either way, what that is is tiny text files including other tiny text files, which is like the worst-case scenario for NFS, right? You know, even if you cache the file contents, NFS will always, or pretty quickly, go and fetch the metadata to see if the thing has changed on the remote side, right, before it will serve up the cached content it has locally. And at that point, you might as well have just read the darn file, because it's, you know, 80 bytes long, right? So latency: massive, massive problem.
Starting point is 00:10:09 And so Boost was timing out. And so our first solution to this was we would sync a few libraries to the local disk when we started up, so that they were local, and then substitute the path. Great, lovely, but not scalable, because we have thousands of libraries. And then the boot-up time was getting longer and longer and longer as well. So take two was: for every compiler or library that is a final build, like a 12.1, a released compiler, we install it on NFS, but then we also build a squashfs image of that compiler.
Starting point is 00:10:45 Okay. That squashfs image is also on NFS. Okay. So far, you're thinking, like, this is just manifestly worse, which, you know, maybe it is. And then at startup, we mount the squashfs image over the top of the NFS location where it is also stored. So it looks like we have one unified /opt/compiler-explorer, all the compilers in the world.
Starting point is 00:11:12 But some of those directories in there have actually been mounted over the top and are actually being served from a squashfs image that's on NFS as well. Right. And the intention here is to sort of solve the tiny-file problem. And instead of shipping across an 80-byte file, you're shipping across a squashfs for a particular compiler, and all of the bits come along with it in one swell foop. In one swell foop, it's actually better than that,
Starting point is 00:11:38 because the way squashfs works is, yes, it does do packing of smaller files into an area. It also compresses them. And as far as the kernel is concerned, squashfs is like a block-device file system, like a hard disk. And so the kernel is caching and loading things as, like, 4K pages, 8K, whatever size it's reading and writing. Squashfs doesn't know that the underlying image is actually on a mutable network file system. And so it caches them forever. It's like, yeah, fine, I'll keep this in my page cache, right? I read page seven of this file system to hand to squashfs, which then
Starting point is 00:12:19 unpacks it, and then the files are also cached, right? But the raw block-level accesses are cached essentially forever; we've laundered the fact that, behind the scenes, it's an NFS drive. And so we get much better caching, much better performance of, like, reading the metadata for things, because, yeah, you read the directory contents and it gives you, like, a 4K block that probably has all of the directories for everything that's inside that squashfs image. And then the tiny files are all packed together as well. So, like, we're winning on every level. And it was a huge improvement in the compile time.
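(Concretely, the trick amounts to something like this sketch. The paths are invented, and the real tooling is considerably more involved.)

```cpp
// overlay_mount.cpp -- mount a squashfs image over its own unpacked NFS path.
// Requires root; "-o loop" has the kernel attach a loop device for us.
#include <cstdlib>
#include <iostream>
#include <string>

bool mountOverTop(const std::string& image, const std::string& dir) {
    const std::string cmd = "mount -t squashfs -o loop,ro " + image + " " + dir;
    return std::system(cmd.c_str()) == 0;   // zero exit status means the mount succeeded
}

int main() {
    // Hypothetical twin locations: the unpacked compiler tree and its image.
    const std::string dir   = "/opt/compiler-explorer/gcc-12.1.0";
    const std::string image = "/efs/squash-images/gcc-12.1.0.img";
    if (mountOverTop(image, dir))
        std::cout << dir << " is now served from the squashfs image\n";
    else
        std::cout << "mount failed; " << dir << " is still plain NFS\n";
}
```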
Starting point is 00:12:55 But it's not free to mount the image in the first place. And obviously, it takes up a little bit of memory on the machine to have, like, 4,000 or 3,000 mounts, right? So each mount, you know, takes up a certain amount of kernel space. And, you know, part of the acceleration is that pre-caching, effectively, of that sort of top level of the directory of every compiler and library. So over the years, you know, we went from tens to hundreds to thousands of compilers, and our boot-up time became dominated by, well, it takes 50 milliseconds to mount each squashfs image. Yeah.
Starting point is 00:13:30 Times 2,000. Suddenly, that's an appreciable amount of time. Yeah, yeah. And so then you're like, what if we do this in the background while we're starting up? And then that didn't work out. Now I know why. Ah.
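(The "few mounts on the go at once" idea looks roughly like this sketch, a handful of workers draining a job list; the real startup scripts differ.)

```cpp
// parallel_mounts.cpp -- mount a batch of squashfs images with bounded parallelism.
#include <atomic>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <thread>
#include <vector>

struct MountJob { std::string image, target; };

void mountAll(const std::vector<MountJob>& jobs, unsigned workers = 4) {
    std::atomic<std::size_t> next{0};          // shared index into the job list
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&jobs, &next] {
            for (std::size_t i; (i = next.fetch_add(1)) < jobs.size();) {
                const std::string cmd = "mount -t squashfs -o loop,ro "
                                      + jobs[i].image + " " + jobs[i].target;
                std::system(cmd.c_str());      // each one still pays the per-mount cost
            }
        });
    }
    for (auto& t : pool) t.join();
}

int main() {
    std::vector<MountJob> jobs;   // in reality: ~2,000 of these, discovered from NFS
    jobs.push_back({"/efs/squash-images/gcc-12.1.0.img",
                    "/opt/compiler-explorer/gcc-12.1.0"});
    mountAll(jobs);
}
```

Even with the latency overlapped like this, every mount still wakes whatever is tracking mounts, which is exactly where the story goes next.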
Starting point is 00:13:44 So anyway, that was the situation that we're in. That's why we mount thousands of files at startup. And we have any number of ways of thinking about consolidating the number of squashfs images we have, so that we can combine all the GCCs together in one image or another. But there are a number of other issues with that, which is a whole other podcast episode. And every time I describe this problem to people, by the way, everyone's like, yeah, I don't know... I can't think of a solution to this problem. This is the general problem of, like, I need to ship immutable binaries with low latency to many places, and I need to take advantage of the immutability as much
Starting point is 00:14:19 as possible. And yeah, and then the management of that, right? I've got thousands of these things, right? Anyway. So the problem turned out to be: when you mount an image on a modern Ubuntu, systemd comes along and says, oh, you've added a mount, I need to make an ad hoc unit, which is kind of its dependency-tracking node in its graph. I don't know much about systemd. I've learned a lot more about it in the last few days. But what it effectively does is it creates a node in some dependency graph, so that when you shut the system down, it knows to unmount it, and it knows who depends on it and things like that, right? Sort of that stuff. Because you can also phrase the
Starting point is 00:15:14 entirety of, like, you know, /etc/fstab or certain services to say, hey, this service needs this thing, and this thing is a network mount, and that network mount needs networking, and networking needs to be up, so systemd can, like, follow this graph and say, well, if you're turning on this service, I'll make sure all the mounts it needs are in place, and when you turn it off, I can unmount them as well. So it makes a ton of sense. I discovered, actually, side note, the other day that systemd also does that for shared memory directories. Interesting. Which bit me in a very painful way, because a technology that you're familiar with uses shared memory. I'm aware of that. And it suddenly disappeared when we turned the service off. And we're like, what just happened? Oh my gosh. Yeah. Yeah. systemd has its fingers
Starting point is 00:16:00 in a lot of pies. Mm-hmm. So when you mount an ad hoc thing, be it apparently shared memory or a squashfs image, which itself is mounted through loopback, which is another sort of mount, which is, you know, all this kind of stuff, systemd tracks it. And I don't know what systemd is doing to take so long, because this is the rub: systemd essentially takes 100% CPU twice over. So on our two-core machine that we run these things on, I can run top. When I actually got in, as I said to you, the machine was unresponsive, right? Because it's all in kernel land.
Starting point is 00:16:40 Locks are being taken out left, right, and center. You know, we're trying to mount these things in parallel at sensible levels, because we want to try and mount them and deal with the latency. If it takes 50 milliseconds and most of that is network latency, I should be able to fire off two or three mounts at once and get the squashfs, you know, read the root directory and then have the mounts, whatever. Even if the kernel is, like, sequencing them, you'd think I'd like to have some mounts on the go at once. But no, every time one comes in, systemd does a ton of work. And so PID 1, which is, you know, init... Right.
Starting point is 00:17:18 but it's called systemd on these systems... takes 100% CPU, and a second process called systemd, which is the one that is per user, I think, takes 100% CPU while this is going on. And that is not the case on Ubuntu 22. It does take some CPU, like, I measured 30 and 40% for those, which again is like, what on earth are you doing? But I'm sure it's got this massive dependency graph that it's running through.
Starting point is 00:17:45 So, yeah, the short answer is: on Ubuntu 24, either systemd has changed or some aspect of the configuration has, and every time you mount something, it takes a little bit of time, which is probably not a big deal, unless you're doing 3,000 of them back to back, or trying to do four of them at a time. And then it jams the system up completely. And the knock-on effects for this when I rolled out the 24 was: A, our machines took ages to boot, but they did eventually come in just under the threshold of them getting whacked for, you know, the machine didn't become responsive, essentially. Yeah. Exactly. But of course, they'd been chewing, you know, 200% CPU for, like, those three minutes while they were booting up. And the way that we do our auto-scaling for our cluster is we take the average CPU of the cluster, which is a terrible metric, but it's also the simplest one to do in AWS. Now, we've got... one of our committers is working hard on getting a much better way of scaling up and scaling down, using metrics that make sense.
Starting point is 00:18:53 But the only sensible one that you can go to, that just has a drop-down entry in AWS, is add or remove nodes from the cluster to keep the average CPU at blah. Yeah. And so we've got that set to, like, 30%, 25% right now, which is, again, not ideal. But it does mean that now suddenly you get into this runaway situation where a little bit of load comes in, you fire up a new node, and for the three minutes it's taking 100% CPU, which pulls the average up even further. And before you know it, you have 40 nodes that are booted up. And then once it hits that maximum of 40, obviously the CPU thing plunges down, and so it quickly starts dropping them all.
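(A toy simulation shows the shape of that feedback loop; all the numbers here are invented.)

```cpp
// scaling_runaway.cpp -- new nodes burn ~100% CPU for their first three
// minutes; the scaler targets a 25% fleet average, so each booting node
// drags the average up and triggers yet another scale-up.
#include <cstdio>
#include <vector>

int main() {
    std::vector<int> bootMinutesLeft = {0, 0};      // two warm nodes
    const double target = 25.0, steadyLoad = 30.0;  // a little load arrives
    for (int minute = 0; minute < 10; ++minute) {
        double total = 0;
        for (int& t : bootMinutesLeft) {
            total += (t > 0) ? 100.0 : steadyLoad;  // booting nodes peg the CPU
            if (t > 0) --t;
        }
        const double avg = total / bootMinutesLeft.size();
        std::printf("t=%2d min  nodes=%2zu  avg CPU=%5.1f%%\n",
                    minute, bootMinutesLeft.size(), avg);
        if (avg > target && bootMinutesLeft.size() < 40)
            bootMinutesLeft.push_back(3);           // scale up: ~3 minutes at 100%
    }
}
```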
Starting point is 00:19:36 But then, yeah, so we rolled back to Ubuntu 22, and I spent the last two days, or day and a half, trying to turn off systemd, trying to disable this part of systemd, trying to add every mount option known to mankind to the end of the squashfs mount to say, for the love of God and all that's holy, don't track this. I don't care. I'm going to mount it and then I'm going to throw it away. And neither me, my internet searches, nor any of the LLMs that I asked could come up with a way. It just doesn't seem like there's a way to do it. In fact, by the end of one session, Claude was saying, you really need to file this as a bug.
Starting point is 00:20:16 And I'm like, I don't... I'm probably doing it wrong. I still think I'm doing it wrong, right? And, you know, for the longest time... having 3,000 things mounted is not really an ideal situation. And so that was fun. Yeah, you're pulling the face of, oh... I just think, is there really no way to tell systemd just not to track these things? I couldn't find it. You can do all these sorts of suppression
Starting point is 00:20:43 things, and it still didn't, you know... it was still churning through it, and it's monitoring some mount thing. The best I could do was kill -STOP on the second systemd process, like the user-level systemd process, which essentially is like breakpointing it. Yeah. Then mount them all, and then kill -CONT the process. And that got rid of one of the 100-percenters. But that process is sort of lazily, on demand, created, so until you start doing work that needs it, it's not there. And so my scripts were dying because they were trying to send it the stop, and it was like, well, that PID doesn't exist yet. And then I'd log on to the machine and finally get through. Yeah, computers, man. I mean, how do they even work? I mean, how even? Wow.
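(The "breakpoint systemd" workaround, sketched. As noted, the fragile part is that the per-user systemd is spawned lazily, so the PID may not exist yet; finding it is left out here.)

```cpp
// stop_cont.cpp -- freeze a process around a critical section, e.g. pausing
// the per-user systemd so a burst of mounts doesn't wake it per-event.
#include <cstdio>
#include <cstdlib>
#include <signal.h>
#include <sys/types.h>

template <typename Fn>
bool pauseAround(pid_t pid, Fn doAllMounts) {
    if (kill(pid, SIGSTOP) != 0) {      // "breakpoint" the process
        std::perror("SIGSTOP failed (process may not exist yet)");
        return false;
    }
    doAllMounts();                      // do the work while it's frozen
    kill(pid, SIGCONT);                 // then let it catch up once
    return true;
}

int main(int argc, char** argv) {
    if (argc < 2) return 1;             // usage: stop_cont <pid-of-user-systemd>
    return pauseAround(static_cast<pid_t>(std::atoi(argv[1])),
                       [] { /* mount everything here */ }) ? 0 : 1;
}
```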
Starting point is 00:21:31 And so what I would ideally like is to not... well, first of all, I would like to have a much cleverer way of managing a large nest of squashfs images. And in fact, this whole approach to mounting squashfs images through NFS or whatever was something that I was doing around the same time that the company-wide solution at Aquatic was being developed. So there is no surprise that the thing I've just been describing to you is familiar to you, at least in part. Because, for our audience, and I don't believe it's revealing any IP... we'll have to do some very creative cutting if it is... Aquatic has a solution for storing environments by effectively putting them in squashfs images. And, you know, that's not new either. That's what snap images are.
Starting point is 00:22:22 That's what flatpaks are, or various types of things. They just mount them. So, like, this is one of the ways that one solves a, hey, I just want an immutable bundle of things. Yeah. And then one of the people who worked on that at Aquatic, who is also a Compiler Explorer committer, has come up with some solutions that are a bit more clever, about, like, having a list of symlinks that point from the file system to a well-known path, and that well-known path has autofs, that auto-mounts the thing on demand.
Starting point is 00:22:53 And so you still present this apparent file system that looks like it's got every compiler known to mankind, but only when you actually cd into the directory or try and run it does the symlink get resolved. And now that squashfs image gets mounted, and it appears in that position, and then you can carry on with your life. And that's great. And obviously, if we'd designed it for that from the beginning, we wouldn't have to retrofit. It would have been fine.
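(The shape of that symlink-plus-autofs scheme, sketched. The directory names are invented, and the autofs configuration itself, which is the hard part, is omitted.)

```cpp
// symlink_farm.cpp -- the visible tree is just symlinks into an autofs-managed
// path, so merely resolving a link triggers the on-demand squashfs mount.
#include <filesystem>
namespace fs = std::filesystem;

int main() {
    // Assume /autofs/ce is configured so that the first access to
    // /autofs/ce/gcc-12.1.0 mounts the matching squashfs image there.
    fs::create_directories("/opt/compiler-explorer");
    fs::create_directory_symlink("/autofs/ce/gcc-12.1.0",
                                 "/opt/compiler-explorer/gcc-12.1.0");
}
```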
Starting point is 00:23:14 But retrofitting it into our current setup is really, really, really difficult. And then you still have these problems. So there's that: that gives you on-demand mounting, which is one thing you can do. And, you know, you could try and configure autofs to do this, but there are a number of reasons why it's difficult, which are much too complicated to go into now. So it doesn't just work out of the gate, although it sounds like it ought to.
Starting point is 00:23:37 But we don't really want 3,000 squashfs images. That's a pain. I do want to have, like, here are all the GCCs. And what you could do is mount sub-parts of those images into this unified tree, which means that, like, hey, I've got all the GCCs. But now GCC 15 has just come out. I don't want to have to rebuild that image, because that's, you know, 500 gigabytes of GCC.
Starting point is 00:24:04 And squashfs is immutable. You can't add things after the fact; you have to unpack it and then repack it again. So if your solution for, well, I'm adding one more GCC is unpack all the GCCs and then repack all the GCCs with the new one, then you're kind of back to that original AMI problem we had, which is that it's going to get incrementally worse every time you add a new thing. So what you really want to be able to do, again, is, like, have this, well, all of the GCCs
Starting point is 00:24:29 for the last, you know, 10 years are in an "older GCCs" image, rather than mounted in each one individually; there's one image that has, like, the old ones. And then periodically you add kind of layers. This is sort of like a layered file system. Yeah, right. You consolidate the layers. You can have a process that goes away and says, hey, layers three through nine,
Starting point is 00:24:47 I can now net them out and make a new layer three. And then I rewrite the file system to be these things. So that's where we want to go ultimately with this. But there isn't a quick MVP for it that gets me out of my current hot-potato situation right now. So you're back to Ubuntu 22. We're back to Ubuntu 22, although... although... I had a sort of an idea. So while banging my head on my keyboard, trying to go, like, how even does systemd notice when I mount things or when I do stuff to the system, I was like, wait a second,
Starting point is 00:25:40 what if I wrote something that looked at file system accesses? So I have, for every file that's stored in a squashfs image somewhere... it is also out naked on NFS, because if your squashfs isn't around or whatever, or for those things we genuinely update quicker than the squashfs images, it allows me to have access to the files that are just in that /opt/compiler-explorer, right? So if I could post hoc, that is, you know, run some trace through the whole system and say, hey, any time I notice somebody accesses a file that is on NFS, that is inside one of the directories that I have a squashfs image for, that's when I'm going to choose to mount it.
Starting point is 00:26:17 And for the first few times, they're still going... they're still reading from NFS, but once that mount has finished... But eventually, when it mounts, yeah, we sweep in... sweep, that's not a word... swap in or flip in or something, I don't know, what those... sweep in. Fly in. Fly in the mounted squashfs image over the top. And so that is what I've been doing for the last two hours, which is why I was slightly like,
Starting point is 00:26:39 let's just talk about it. It's top of mind for me right now. And that's showing some early promise. So in this world, what we would do is we would not mount anything. Yeah. And then we'd run this daemon, and all it does is sit there and watch file accesses and then sort of lazily bring in the squashfs images. And obviously, in the worst case, a Compiler Explorer node will eventually mount all of the images, but like as not, it'll never get close.
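(A very rough sketch of that daemon. A real implementation would likely want fanotify for whole-filesystem visibility, since plain inotify only reports on direct children of each watched directory, and all the paths here are illustrative.)

```cpp
// lazy_mounter.cpp -- watch compiler directories; on the first access to any
// of them, swap the matching squashfs image in over the top.
#include <cstdlib>
#include <map>
#include <set>
#include <string>
#include <sys/inotify.h>
#include <unistd.h>

int main() {
    const int fd = inotify_init1(0);
    std::map<int, std::string> watchToDir;     // watch descriptor -> directory
    std::set<std::string> mounted;
    for (const char* dir : {"/opt/compiler-explorer/gcc-12.1.0",
                            "/opt/compiler-explorer/clang-18.1.0"}) {
        watchToDir[inotify_add_watch(fd, dir, IN_OPEN | IN_ACCESS)] = dir;
    }
    alignas(inotify_event) char buf[4096];
    for (;;) {
        const ssize_t len = read(fd, buf, sizeof buf);  // blocks until something is touched
        for (ssize_t off = 0; off < len;) {
            auto* ev = reinterpret_cast<const inotify_event*>(buf + off);
            const std::string& dir = watchToDir[ev->wd];
            if (mounted.insert(dir).second) {           // first touch only
                // Readers keep hitting plain NFS until this completes -- that's fine.
                const std::string image = "/efs/squash-images/"
                    + dir.substr(dir.rfind('/') + 1) + ".img";
                std::system(("mount -t squashfs -o loop,ro " + image + " " + dir).c_str());
            }
            off += static_cast<ssize_t>(sizeof(inotify_event) + ev->len);
        }
    }
}
```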
Starting point is 00:26:55 I was going to ask you about that, actually: does that mean that as the Compiler Explorer nodes are running, they're just slowly accumulating these squashfs mount points? And do you need to, like, clean them up on any sort of regular basis? Or do you just restart the machines every once in a while? Like, how does that work?
Starting point is 00:27:20 Well, so in the current situation, we just mount them all at startup and they're up forever. So that was... that's, yeah, right. And the only memory they're really consuming in that state is just that small amount of kernel memory that you were talking about before. Yeah, which isn't that small. I can't remember what it was, but it was, like, trivial enough... no, sorry, non-trivial enough that you can see that the machine's, like, memory has gone down having finished mounting it. Yeah. Oh, well. Which is less than ideal.
Starting point is 00:27:48 So, yeah, at the moment we pay the cost for all of them, even though, you know, we have some, like, GCC 1.23. Yeah. How often do people use that? Probably not very often. And again, we have 40 nodes. They're recycled really quickly. It's very unlikely that any one node will need all 3,000 in its lifetime.
Starting point is 00:28:08 So in this new world order, the way that I was imagining it is, at least for V1, we just mount them and leave them up, because it's no worse than what we had before. You know, it might take an extra half a second, even a second, on the first access. But we're not holding up the compiler in that case. It's just going through the slow NFS part. Yeah, yeah. And also, you know, obviously the squashfs image is extra NFS accesses that we didn't need to do otherwise.
Starting point is 00:28:36 But the hope is it will net out pretty quickly. And then by the time we either run it again, or even by the time it's finished, you know, the first reading of, like, the ELF, and it's starting to look at the DLLs that it needs to load in, then it's going to pull the DLLs from the squashfs image. So that is the hope. See how it goes. I have already found one situation where the squashfs image is not actually up to date with respect to the changes on the disk, which is like, oh, well, this is going to throw a wrench, a spanner, in the works. Yeah. Yeah. So that is an issue. Interesting. I guess you
Starting point is 00:29:13 could have the daemon that you're writing actually do that, right? It would... maybe. I guess so. I think I'm just going to go kind of caveat emptor; we'll find them as we hit them. Or maybe it'll be something we do as a post-process that's just running and looking for things that are out of date. Yeah. You know, and I guess the other thing you could do with that is, and you probably have this already, but you could farm that thing for usage statistics, right?
Starting point is 00:29:52 Yes. In fact, about a year and a half ago, we changed the privacy policy and our backend to track statistics. Because we don't like to track things; that's not what we're into. I don't care what you're doing with it, really. But it is incredibly useful to say, how often do we use this compiler versus that compiler, which I think is a fair use of non-identifiable data. For exactly, like, problems like this, where you're just like, I'm optimizing these things. I want to optimize them based on the usage, not on, like, you know...
Starting point is 00:30:28 Exactly. And so certainly we can do things, like, down the line, we can do things, you know, like, hey, let's have a cluster that only does legacy compilers, right? And then that cluster just sits there and lives there forever. There's two machines that run all the time. They sit there in their old-timey world, and requests for GCC 1.2.3... You know, they're on the front porch with their shotgun across their lap. Yeah, that's right, waiting for, you know, the youth. Yeah. And then we can even have, you know, conversely, some faster nodes that are serving the GCC, you know, 15.1s that have just come out, and the trunk builds and things like that.
Starting point is 00:30:58 But our management is not good enough. At the moment, having multiple clusters is painful for us. And so we have kind of two or maybe three clusters, you know: one for the GPU things, one for the Arm compilers, and then one for everything else. And retrofitting in everything without breaking the site is so hard. You know, this is why, when we did our talk... er, conversation... you talked a little bit about your sort of branch-based deployment, you know. Yeah, yeah. That was like, oh, man, I wish... I wish I'd thought of that ahead of time.
Starting point is 00:31:22 Just twist the knife. I mean, you know, I think it sounds like at the scale that you guys are at right now, just spinning up one of those environments would be prohibitive in terms of cost. But maybe you could structure it in a way where it wasn't quite so bad, you know. So cost is not such a big deal anymore. And I say that with a massive footnote. People are surprised at how relatively cheap Compiler Explorer is to run. We're currently at around about three grand a month
Starting point is 00:31:47 burn rate of AWS stuff, although it's just gone up, but it's gone up for good reasons. The good reason is that I've sent a message out... a blog post, in fact. That's what we call these things, a message. I've made a blog post, kind of... You made one internet, and you sent it out into the webs. I did. I sent it out into the webs, about the cost breakdown of Compiler Explorer. I did a big sort of dive into it so I could justify... you know, we're very lucky.
Starting point is 00:32:16 We have a lot of commercial sponsors. We have a lot of people who donate on Patreon and GitHub Sponsors. And my dog's barking, which I can't be bothered to edit out. So we have a lot of money coming in, which is fantastic. And I like to be very upfront and open about what we do with the money, as much as I can within the reasonableness of the fact that it's still my private finances at some level, right? I'm sort of hand-waving and gesturing about this stuff.
Starting point is 00:32:43 So I like people to know where the money's going. And so telling people, like, this is what we're spending it on, and this is how much it breaks down to... it was interesting. And so it ended up on Hacker News, which was great. And one person said, have you ever considered talking to, you know, the Grafana people or the SolarWinds or whatever? You know, the people that we pay money to for subscriptions for, like, monitoring stuff. And I was like, yeah, kind of. But, you know, there's something to be said for: it's not that much. It's not a huge... you know, that costs me, you know, 40 bucks a month. That's kind of noise.
Starting point is 00:33:09 And I don't want to give up too much of my... you know, I don't want to sell out. I would rather pay 40 bucks a month than have them say, hey, you have to put an ad on the top, or, you know, say thank you. And I'm like, I don't know if they would do that. But we went back and forth on that. And then it's like, oh, you do know that AWS have an open source budget?
Starting point is 00:33:32 And I'm like, oh, yeah. The what now? And so I looked it up, and it was a blog post from, like, 2012 that made some mention of, you know, like, hey, if you're an open source project, contact us, we might be able to help you. Here's a form to fill in. And I'm like, okay. Now, the form looked complicated. So given how old the blog post was, I just emailed the email address that it gave. I said, hey, is this thing still on? Right. Right. I'm, you know, I'm the creator of Compiler Explorer. I'm interested in talking, if this thing's still on. I immediately got an email bounce.
Starting point is 00:34:11 And I thought, well, there you go. That tells me. I'm glad I did this rather than spending all the time to fill the form out. So I thought nothing more of it. An hour and a half later, somebody replied going, oh, wow, yes, we use Compiler Explorer. We'd love to help; definitely fill in the form and send it to us. And I looked back at the bounce, and it was like, oh, it must be that one person's inbox was full on the distribution list that it ultimately ended up at. So I filled it in, and it asked for what is a year's amount of money that your site or your project might need. And so I was like, I guess three grand times 12... or four grand... I can't remember exactly.
Starting point is 00:34:49 I think it was three, three and a half grand. So 30 grand, which, you know, is like a monstrous amount of money. And I was expecting them to just say, no, seriously, how much do you need? And I thought, again, nothing more of it. Someone much more business-oriented got back to me and said, thank you for your, you know, your email; we'll get back to you within 30 days. And I thought, all right, this is the... now it's gone into bureaucracy. We'll never see it again. Right.
Starting point is 00:35:18 And I just happened to log into my AWS account, and I saw a 36-grand credit had been applied to it. And I had to email them back and say, like, okay, before I even touch this, what are the... do I have to tell people about this? Do I have to go out of my way to, like, say thank you? What's the deal here? And the woman eventually got back to me: oh, sorry, I meant to tell you that you'd been approved. I'm like, were you just going to let me find out? Wow! And so the short version, and this is such
Starting point is 00:35:47 an off-topic thing, is that, yeah... AWS are now funding the cost that I put forward, you know, the year cost as it was last year. Which means immediately you're like, oh, maybe we can start, you know, running more instances and whatever, because I have the cash for it now. So it feels more like I'm treating it like a subsidy. But what I'm mindful of is they may not renew it at the end of this year, in which case, you know, I have to be able to scale everything back again. So there's a bit of thought. Yes.
Starting point is 00:36:18 So I don't know. It's an interesting situation to be in, but it's an amazing situation in the short term. It means, like, I'm looking at, like, Redis caching, whereas before I was like, ah, I don't really think that I can justify, you know, another hundred bucks a month just to have something that I might not use, or these kinds of things. So it's very exploratory. So I'm excited about that. But how do we get to this?
Starting point is 00:36:41 I'm forgetting... yeah. Oh, yeah, you were saying about how expensive it might be. Yeah, you know, the branch-based stuff might be too expensive. But now it might be okay. You know, honestly, I look at my load balancer now. So load balancers cost, what, 10 bucks a month plus the transfer. Yeah, it's the data that you really pay for there, right? Yeah. It is, yeah. And I've often wanted to have multiple load balancers, you know, one for each environment. I used to have subdomains for, like, staging environments and things like that. And then I made it so that every subdomain ends up in godbolt.org, you know, comes to Compiler Explorer.
Starting point is 00:37:15 So you'd have to be, like, say, dub, dub, dub, dot staging, dot godbolt, dot org, and then you're into multi-level DNS, and that's a pain in the bum because you can't do wildcards and all that kind of stuff, you know. But I stopped doing that, because I was originally using it to route to a different load balancer, but, you know, to have one load balancer per environment was expensive. So everything now goes to the same load balancer, and it's URL-matched to go off on its merry way. And that's not scaling
Starting point is 00:37:51 all that well now. I've got multiple of them, and there's other things on there. And, you know, there was a time when 10 bucks a month for another load balancer was, like, meaningful. And now, not to, you know, put too fine a point on it: whatever, that's noise. I don't worry about it. So maybe I should go... maybe I should explore branch-based development. You can spend 10 bucks a month to make your life a lot easier. I would suggest that you give it a go. Yeah. Yeah, that is the cost, I think, right now: you know, the trade-off between the cost of time. And I have supposedly three presentations to prepare for, and not be doing Compiler Explorer stuff. And then I've got kind of two and a half clear months before I have to go and work for a real job. What? I know.
Starting point is 00:38:25 A job. That sounds terrible. I know. It does, doesn't it? It sounds awful. So, yeah, I'm sort of very much top of mind thinking about how we're going to, how I'm going to go back to work, Ben. I don't know. Yeah. I think in the first week you're going to be like, this is awesome.
Starting point is 00:38:46 That's what I predict. You're just going to go, right, I remember why I love this. Yeah. I think you're absolutely right. I'm pretty bullish about it. You know, I check in with my new gig from time to time. And, you know, I always come away feeling excited. Buoyed, as I would say: "boyd". Or "boo-ed", would you say, as a Yank? I would probably say buoyed. But I wouldn't say either of those words. I would just say excited. You say excited, yeah... It's just too nautical. I'm not... Says the man who works for a company called Aquatic.
Starting point is 00:39:21 Yeah, "booey". We have a service, actually, at Aquatic called Buoy, and I, like, trip over it every time I say it or spell it. It's... So in British English, that is "boy". It's always been "boy". Like, you know, what is the property of being able to float? It is...
Starting point is 00:39:39 Booie. Listener, look at the contortions on Ben's face as he tried to justify pronouncing it that way. Yeah. Okay, point taken. All right. But, you know, very few of these language-based justifications hold water if you start looking too deeply, because English is not very logical.
Starting point is 00:40:04 Well, none of them hold water because they're buoyant, because they float. That's what you're... it's... never mind, I'll go home. Maybe we should actually... somehow, we've been talking for... well, I've been talking, and you've very kindly, and listener has very kindly, been listening to me vent my spleen. Oh, no, I mean, I love...
Starting point is 00:40:22 I love these war stories. I think we should do more of these. It's like, let me tell you about this bug that consumed two days of my life. Yeah. I mean, I think it's valuable sometimes to hear them. I mean, it's fun to tell them, but sometimes it's nice to hear them, because then, secretly, you go back to your desk and you're like, I don't feel so bad about spending four hours tracking down this thing now. Yeah. You know, it's like the old thing about, like, you know, hacking and programming in movies: it's, like, people with, like, you know, one hand on each keyboard, and, like, you know, all the things scrolling by on the screen and the charts and graphs,
Starting point is 00:40:55 and in reality, it's just staring at a stack trace for 30 minutes going, like, I am bad at my job. Well, in case there was any doubt, you're not bad at your job. I don't think I'm bad at my job, but, yeah, I feel that way occasionally. It's like, oh, well, how did this ever work? I don't understand how that would even ever work, let alone what's happening now. Yeah, like, who wrote this? Git blame. And you're like, oh. Oh, it's me.
Starting point is 00:41:22 Yeah. All right, friend, well, short of starting a whole new conversation, I think we should finish it up here, unless there's anything you want to impart in words of wisdom. This is a good episode. I like it. I dig it. Fantastic, mate.
Starting point is 00:41:36 I will. I will. I will get the minions to edit and put it out soon. Perfect. And by the minions, I mean me. There are no minions. Right. Cool.
Starting point is 00:41:49 All right, friend. Until next time. Until next time. You've been listening to Two's Complement, a programming podcast by Ben Rady and Matt Godbolt. Find the show transcript and notes at www.twoscomplement.org. Contact us on Mastodon. We are @twoscomplement@hachyderm.io. Our theme music is by Inverse Phase. Find out more at InversePhase.com.
