CoRecursive: Coding Stories - Story: Godbolt's Rule - When Abstractions Fail

Episode Date: November 4, 2025

What do you do when your code breaks and the only fix is to dig into the runtime below? Matt Godbolt lives for that. Tile-based renderers, color-coded scanlines, zero-copy NICs: each story is a clue that leads past the abstraction to the real machine. He shares the rule that guides him: master your layer, learn the one below, and know the outline of the layer under that. Matt Godbolt's journey proves the real breakthroughs are hidden behind the abstractions where you are comfortable and familiar.

Transcript
Starting point is 00:00:00 Hello, this is CoRecursive and I'm Adam Gordon Bell. Today I want to talk about abstractions and when they break and how to see past them. Like you have this idea of how a network request works or how memory works. It's very simple and concise. But it's also a lie or at the very least a simplification. I was reading this book called Database Internals and about half of the book digs into this core challenge of database design, which is how to spend less time reading and writing to disk.
Starting point is 00:00:37 There are all these tricky tradeoffs and a lot of the big breakthroughs in databases in the back end of a database come from finding smarter ways to handle disk input and output for a particular type of workload. So that things work fast and you don't lose any data. But here's the thing, it's all very tied to how disks work, right? Whether a spinning old-fashioned disk or a modern SSD disk, how they perform and what they're good at and what they tell the computer they can do. But here's a thing that surprised me.
Starting point is 00:01:09 It kind of blew my mind because once you dig into how a hard drive actually works, especially a solid state drive, it only gets more confusing because the disk interface, it's actually a lie. It's such a simplification. It's just this convenient abstraction, but it doesn't match what's actually happening under the hood in some pretty important ways. It's like realizing that car you've been tuning to get faster is actually three motorcycles bolted together under a plastic shell.
Starting point is 00:01:35 And the reason that some of your clever tuning worked and some of it didn't is because you didn't know about that illusion. You didn't know about the reality underneath. You weren't debugging the real thing. You were debugging its shadow. Abstractions help us by giving us an easy concept to hold onto, but sometimes they trip us up. And so as you can tell, I was excited about this concept, and I had to talk to somebody about this and I knew just the person.
Starting point is 00:01:58 Because the craziest abstraction, the one that blew my mind, is on AWS. A lot of data lives in RDS where it's running Postgres or MySQL, but that disk that it's optimized to write to is actually another machine. The abstraction is such a lie that you don't even realize that every time you write to disk in your database it's actually a network request. It's like a SCSI hard disk where you rip out the SCSI controller and just plug it into the network and say, okay, as long as someone sends a packet, a SCSI-looking packet to this,
Starting point is 00:02:27 I can read and write a block. It's like bonkers stuff, yeah. Yeah, because that is often, like, the bottleneck on so many systems and they just put a network in the middle of it. I mean, they're pretty good. It's a fast network. It's not the same as having, it always used to confuse me. And there's this provisioned IOPS and whatever.
Starting point is 00:02:46 You're like, why does this matter? It's in the machine, surely. It's like, well, this one is because it's, you know, a physical disk that you're provisioning in a machine that happens to have a disk in it. But everything else is virtualized through a file system that's really writing to what it thinks is local storage, but is actually a network blast of an iSCSI packet or whatever it's actually using behind the scenes to racks and racks and racks of disks. Yeah. So, and then, you know, you can take the abstraction
Starting point is 00:03:13 even further. Those disks, right, they present somewhere like, here's a number of cylinders, sectors, whatever, but it's like, that's not really what's happening either. If it's SSD backed, then the SSD has a mapping table between where you say to put the data and where it's actually going to put the data, because it has to do wear leveling. It says, you know, you write over and over and stuff. I don't want to keep writing to the same place over and over again. I need to keep moving where I'm writing around the physical characteristics of the system in order to not have the boot sector get completely screwed because you keep rewriting, whatever you do, you know, some sector like that. And so that's a lie as well. And then, you know, back to your point about,
Starting point is 00:03:51 you know, traditional databases, you write a file and say flush and the operating system says, it's good. And then you're like, but it isn't, is it? No, because it's not yet been flushed to the disk controller. Oh, yeah, it has been flushed to the disk controller. Oh yeah, but the disk controller has a cache. Oh, that's cool. Oh, right, but it's been written to the drive, yes? Yeah. No, because the drive has several CPUs on there that are doing some of this like scheduling of I/O and whatever, and they've stored it for now and they're waiting for the disk head to come back round before they actually put it on disk to be faster. You know, when does it actually get to the disk? Just tell me.
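A minimal sketch of the "write a file and say flush" step, assuming a POSIX system. The function name write_durably is made up for illustration. It shows the one request an application can make, fsync(), and the comments note where the program's knowledge ends: a successful return says the operating system handed the data to the device, not what the controller cache or drive firmware did with it afterwards.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write `data` to `path`, then ask the kernel to
 * push it to the storage device. Returns 0 on success, -1 on failure. */
int write_durably(const char *path, const char *data)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    size_t len = strlen(data);
    size_t done = 0;
    while (done < len) {
        /* write() may accept fewer bytes than requested, so loop. */
        ssize_t n = write(fd, data + done, len - done);
        if (n < 0) {
            close(fd);
            return -1;
        }
        done += (size_t)n;
    }

    /* After write() the bytes may exist only in the OS page cache.
     * fsync() asks the OS to transfer them to the device. Anything the
     * device does beyond that point (controller cache, firmware
     * scheduling) is invisible to this program. */
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

A caller would treat a 0 return as "the operating system says it's good", which is exactly the layer of the conversation above where the questions start.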
Starting point is 00:04:23 I just need to know. That's Matt Godbolt. And yeah, to me, he is the king of pulling back these abstractions. You know, I knew when I realized that disks were more complicated than I thought, metaphorically, you know, I was discovering my car is actually three motorcycles, that Matt would already know this, and he would have more facts, like that one of the motorcycle engines was not, in fact, a motorcycle engine, but really a lawnmower engine running sideways. That's just how curious he is. He's the person who gets into these details. And that's the same curiosity that led him to build Compiler Explorer, a tool that shows you exactly what assembly your compiled code turns into. It lets you sort of pull back the veil on the compiler abstraction. I feel like Matt is pure
Starting point is 00:05:02 curiosity in motion. When you talk to him, you can't help but feel energized. I, you know, everyone knows I get excited about it. And you know, my eyes light up and, and a curious thing: the inside of my nose gets itchy when I'm excited. There is something about what we do as software engineers that is magical. We can use magic words to make a computer do something that we couldn't do or to make it do something, I don't know, sort of list of things, thousands of things that would take you forever. It just feels powerful and it feels exciting. But also just like the constraints you have in these systems where it's like, well, it's not really supposed to work this way or it's hard to get it to work at all, but you kind of fight through
Starting point is 00:05:44 those constraints. Or you use the constraints as a positive and kind of come out of it and go like, oh, this is. And when you're down really deep down in the hardware, there's loads of funny little constraints that you can take advantage of. And I think there's just an intellectual, like, riddle to solve, a puzzle to solve there. Where you're like, hey, wait a second. If we use this thing, we can, we've already squared it. It's already in the R2 register. Don't touch the R2 register. We're going to do something else over here. I don't know. It feels like you're solving some cool Sudoku puzzle. You get that endorphin rush. Oh, you know how Ethernet works? We can do this clever thing with it. And it contributes to something that's fun and interesting
Starting point is 00:06:17 that you can explain to, I don't know, your mom and say, hey, I've made a game. She's like, that's nice, dear. Right. So that's today's show. Matt is back. If you haven't listened before, he was here in episode 57, which was five years ago, crazy enough. And there, Matt talked about pulling things apart and about building games and we're going to do the same a bit today, but this time we're taking that curiosity all the way down to the metal, to where hardware and software blur. And sometimes with Matt, I can get a bit out of my depth, but I always find it inspiring and I always find it interesting. So let's do it. Matt's story starts when he was in university. And towards the end of '95, beginning of '96, I started looking for a job. And
Starting point is 00:07:04 there was this thing called IRC, which is Internet Relay Chat, and I was in one of the many chat rooms, and I happened to mention it in passing. And somebody said, hey, you should try, you know, apply to where I work. It's a cool game development place. So I did. Got in, and they were like, when can you start? I'm like, well, no, I haven't actually finished university yet. So I don't think I can start until I've graduated. They said, okay, well, how about you come in the summer, do some work for us, and then come back again once you've graduated. So that was, that's what I did. And so when I joined in 1996, they didn't know what to do with me. Although I had applied for and got a programming job, there wasn't really any intern-like work. So I was just a
Starting point is 00:07:45 game tester at Argonaut Games. Argonaut was a small studio in North London, and they were the studio that could mix hardware and software in ways that most places didn't. They built the Super FX chip that made 3D possible on the Super Nintendo, for instance. It was an interesting office. It was essentially what you would imagine that you would get if you took a whole bunch of people who had been programming in their bedrooms from when they were like teenagers, 12, 13, 14 upwards. And as soon as they had reached the age where you could legally employ them, you had transported them from their bedroom into this building. The environment was a converted car dealership. So it was a very weird, very long, thin building that was designed to have, you know, you could walk in and there were cars that would have been parked down the side inside. And so it was, I mean, you know, programmers, artists, designers, level creators. And so everyone was self-taught. There wasn't really any formal training. There was no software engineering, or very limited software engineering ideas at the time. And it was, yeah, it was pretty fast and loose.
Starting point is 00:09:01 The hours were very, very, very long. The pressure from publishers was high. And as you might imagine, in a more liberal, chill, artsy environment, there was a lot of things going on whose legality were questionable, especially in the evenings once the management had gone home. I mean, there were very, very incredibly motivated people. They believed in what they were doing. They knew what they had to do and they went on and got it done.
Starting point is 00:09:28 I think, you know, we did pick up software engineering practices as time ticked on. But at the time, the big project at Argonaut was Croc: Legend of the Gobbos for the original PlayStation. And Matt was seated in the Croc room. The Croc room was a room that had probably 12 people in it, 12, 14 people, something like that, and sort of dotted around. There was like the animator with the one precious SGI machine that ran the animation software and so she was like set up with that. Everyone else had, you know, some form of like 486-ish-era DOS computer. Desks were covered with crap, frankly, you know, anything and everything that was around. You could tell where the animators were and where the artists were because they
Starting point is 00:10:16 would have every kind of manga doll or, you know, Transformer or, it was, you know, like a shrine to the things that they found really interesting and exciting. And it was a real eye-opener for me, having sort of come from a very boring background, that, you know, wow, there are people out there that are like arty, and I'm used to nerds. And these are different, these are nerds in a different way. They're a very different kind of nerd, but they are extremely talented in a way very non-overlapping with me until that point. And so that was a huge eye-opening experience. Croc actually started as a Yoshi game for the Nintendo 64, but after some deal fell through, the dinosaur got turned into a crocodile
Starting point is 00:10:58 and the game shifted to the PlayStation. And Matt's first job was testing this PlayStation version of this little crocodile. That was my first introduction to Argonaut Games was playing through a VHS recorder, which was an amazing transformation in the testing ability. If you imagine you're playing a game and it crashes
Starting point is 00:11:15 and you tell somebody and they're like, what did you do? And you're like, I didn't do anything. It just crashed. And they're like, no, I bet you did something. I didn't do anything. So it became an argument. And it's like you went to the VHS, you rewound the tape, right?
Starting point is 00:11:25 You went, look, you see how I was just standing there, nothing was going on, and then it just froze. That summer, Matt got thrown into the deep end of game development. It was hands-on and it was unpredictable, and he loved it. And then, when he came back after college, he wasn't a tester anymore. He got his first real programming job, which was to take that same little green crocodile, who honestly looks a lot like Yoshi, and make him run on the PC. One of the things that we had was a tie-in deal with Intel, and they wanted to sort of
Starting point is 00:11:56 ship the game alongside the motherboards for certain of their new upcoming chips as like a, you know, hey look, it runs great on. And of course back then, you know, PlayStation was like head and shoulders above what most PCs were doing, or at least, you know, without spending a huge amount of money on these newfangled GPUs, of which, you know, how different the world is these days. So they were like, no, no, the CPU is fast enough to do this work, being Intel, let's show them by showcasing it. So I was very lucky to be given one of the engineering samples of the then new Pentium II. And so I got to play with this beast, 266 megahertz, if I remember right.
Starting point is 00:12:36 And we had, you know, if you put a bunch of game developers in a room and connect all their computers up, what's going to happen every lunchtime? You're going to play Quake, right, or Doom or anything. And so I was, for a short period of time, I was the best Quake player in the entirety of Argonaut, but only because my frame rate was four times quicker than everyone else's. And so I have no skill. It was just that it was like shooting fish in a barrel, right? They couldn't keep up with me.
Starting point is 00:13:01 That changed pretty quickly. But yeah, so the day in, day out was take fairly grotty C code that has been beaten to work on the PlayStation, along with tons of assembly code that they had also written for all the various GPU-like chips that are in there, and port it to work in Visual Studio, and compile in Visual Studio and then connect it to DirectX, which was this relatively new thing back then. So I learned far too much about how DirectInput works.
Starting point is 00:13:35 I did the front end. So I had to deal with all of the keyboard remapping and reading the mouse and the joysticks and all of those things, along with a bunch of other bits and pieces that were just bits in between the bits. By then Matt and Argonaut had grown up. The company had moved into a more normal office space, and for the first time it felt like a real studio, not just a bunch of bedroom coders all working together. But at the same time, the industry was changing.
Starting point is 00:14:00 PCs were slowly growing in importance, and John Carmack and others were showing what was possible with 3D technology, with games like Doom and Quake. And studios started to realize that they might need their own in-house engines, that they shouldn't just be coding every game from scratch. Games were getting too complex. And for Argonaut, this meant trying something new, building a 3D engine called BRender, short for Blazing Renderer.
Starting point is 00:14:23 But then the Dreamcast was about to launch and everybody was excited, and a game producer pitched an idea for a new game and a new engine that would be built just for this Dreamcast, Sega's exciting new console. And my ears pricked up because by complete coincidence the producer of that game who was pitching the idea to Sega happened to have a spare, it was the only seat spare in the office, and so he was sat in the same room as the ATL group who were doing technical support for this BRender product, and then me in the corner doing Croc. And I heard him talking about this new project. And I said, oh, I could write an engine. I mean, honestly, the hubris of youth. And so I was like 22, didn't really know what I was doing. Again, no formal training in this, but then neither did anybody else. I didn't know what I didn't know about 3D or matrix transformations or high performance anything. Right. I was,
Starting point is 00:15:23 You know, I'd done a ton of programming as a kid growing up, and, you know, it was all in assembly up until that point. I'd sort of learned C kind of as a begrudging, I guess I need to learn this thing. So that's my background. So I felt I was well positioned. And this guy, Nick Clark his name, gave me the break and said, yeah, sure. And then he threw me in with a couple of other programmers who'd just finished a Sega Saturn port of Croc. Like, we'd all suffered through the same pain of porting Croc to a new platform. That project became the game Red Dog, the Dreamcast project that gave Matt his first real shot at building a game engine.
Starting point is 00:15:59 And before long, he was all in. Here's what made the Dreamcast stand out. Its graphics pipeline didn't use a traditional full-screen frame buffer. Most systems, they render each frame of a game into this off-screen frame buffer. It's a big block of memory that stores the color values for every pixel. And then once that frame is complete, the buffer is swapped onto the display, where you can see it. You know, it has, for every pixel it has what color that pixel is stored in memory. And then a typical graphics accelerator draws triangles into that by being given a triangle,
Starting point is 00:16:31 one triangle at a time. And as it gets that triangle, it goes, okay, well, here's the extent of that triangle. I'm just going to fill in those pixels now. Okay, next triangle. And there may be queues and there may be, you know, other bits and pieces. But broadly, that's what's happening. Every time a triangle is drawn, the frame buffer has been updated. And that frame buffer may be the back buffer and then you may switch it around to suddenly go, ha ha, here's all the things I've just drawn, but that's it. The PowerVR chip that lives inside the Dreamcast was different. Every time you gave it a triangle, it goes, that's lovely, thank you for this, I will note that down.
Starting point is 00:17:07 And you're like, well, did you draw it? It goes, nope, but I'll remember to draw it later. And then at the end, provided you didn't run the graphics unit out of its memory for storing all these extra triangles you're storing and the linked lists and everything, it would take each 16 by 16 tile of the screen in turn, and it would look through the list of triangles that happened to overlap that 16 by 16 grid and it would only draw them into a tiny 16 by 16 buffer, which could be in like chip cache, right? Rather than being on the main video memory of the system,
Starting point is 00:17:32 it was in a, and it could be done in floating point because why the heck not? I'm going to store 32 bits per pixel or 16 bits per pixel in floating point. I'm going to render all of them. And when I finished, I've got the perfect 16 by 16 tile that needs to go in the top left-hand corner of the frame buffer. and now, and only now, do I dither it down to the horrible 16 bit that is all we can afford to store our screen in, and so then I send it out to screen, and I start working on the next one.
Starting point is 00:17:56 And so it was this deferred rendering thing. It was fascinating. In other words, Red Dog the game looked cleaner because of the engine, but also really because of the Dreamcast's PowerVR chip, which didn't render like other consoles. It processed each 16 by 16 tile in full detail, and then it could cull out all the hidden surfaces before doing the shading.
Starting point is 00:18:18 And then the dithering was only done when they wrote those final pixels. So by designing the engine around this tile-based approach, Matt unlocked sharper colors and smoother gradients and a quality that could rival the PlayStation 2 or even the Xbox.
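A rough sketch of the "note it down now, draw it per tile later" idea described above. This is not the PowerVR's actual data layout, which the episode does not describe; the names (Tri, submit_triangle, triangles_for_tile) and the fixed-size list are assumptions for illustration. Submitting a triangle only records it, and a later per-tile pass picks out the triangles whose bounding box overlaps a given 16 by 16 tile. The bounding-box test is conservative: it can include a triangle that turns out not to cover any pixel in the tile.

```c
#include <stddef.h>

#define TILE 16       /* tile edge in pixels, as in the conversation */
#define MAX_TRIS 64   /* arbitrary capacity for this sketch */

typedef struct { int x0, y0, x1, y1, x2, y2; } Tri;

static Tri tri_list[MAX_TRIS];
static size_t tri_count = 0;

/* "Drawing" a triangle only records it for later.
 * Returns 0 on success, -1 if the list is full. */
int submit_triangle(Tri t)
{
    if (tri_count >= MAX_TRIS)
        return -1;
    tri_list[tri_count++] = t;
    return 0;
}

static int min3(int a, int b, int c)
{
    int m = a < b ? a : b;
    return m < c ? m : c;
}

static int max3(int a, int b, int c)
{
    int m = a > b ? a : b;
    return m > c ? m : c;
}

/* Does the triangle's bounding box overlap tile (tx, ty)? */
int tri_touches_tile(const Tri *t, int tx, int ty)
{
    int left = tx * TILE, top = ty * TILE;
    int right = left + TILE - 1, bottom = top + TILE - 1;
    return max3(t->x0, t->x1, t->x2) >= left &&
           min3(t->x0, t->x1, t->x2) <= right &&
           max3(t->y0, t->y1, t->y2) >= top &&
           min3(t->y0, t->y1, t->y2) <= bottom;
}

/* Collect pointers to the recorded triangles that overlap one tile.
 * Returns how many were written to `out` (at most `out_cap`). */
size_t triangles_for_tile(int tx, int ty, const Tri **out, size_t out_cap)
{
    size_t n = 0;
    for (size_t i = 0; i < tri_count && n < out_cap; i++) {
        if (tri_touches_tile(&tri_list[i], tx, ty))
            out[n++] = &tri_list[i];
    }
    return n;
}
```

A renderer built this way would loop over every tile, call triangles_for_tile, shade only those triangles into a small 16 by 16 working buffer, and write the finished tile out once.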
Starting point is 00:18:32 But once that part was working, weird bugs started to appear and the development workflow on the Dreamcast started to become very top of mind. When we develop video games for a console, you get a very special version of the console. It may look like just a tower PC
Starting point is 00:18:48 that just happens to have all the components of that console inside of it in some way. It might look like a retail version of the console with a different colour or slightly taller or things like that. And so the Dreamcast was like a mini tower PC and it connected back to the host through SCSI. It looked like a SCSI device as far as the PC was concerned
Starting point is 00:19:10 and actually there was some clever stuff going on to communicate both for the debugging of it, but also so the PC could pretend to be a CD-ROM drive to the Dreamcast so that you could boot your game off of the CD-ROM drive. As they got closer to shipping the game, the team started burning real discs. These were called GD-ROMs, but they were basically CD-ROMs. And these let them see how the game would actually run on real production Dreamcast hardware. So you put it in the machine, you burn the disc, and now you can take that disc,
Starting point is 00:19:40 and for the first time, you can put it into a nearly retail Dreamcast. This Dreamcast has had its copy protection taken out, so it will boot anything you put into it. So you put it into the disc drive, you close the lid, you turn it on and it spins up and nothing happens. And you're like, but, but I have a game. The game works on the emulator. It works fine. It works perfectly well on the emulator over here where I'm emulating. This is, sorry, this is the CD emulator on a real Dreamcast reading the CD that's actually on your PC, an ISO image you've built on your PC. Why does it not work on a real Dreamcast? What the heck is going on here? And so how do you debug that? You know, normally you'd bring out your debugger, you put printfs in or whatever, but it's a retail Dreamcast or near enough a retail Dreamcast. And you need to know
Starting point is 00:20:34 where it crashed. So the solution I came up with was, so there is a single hardware register inside the Dreamcast that you could just poke to at any time, you know, in C, star volatile char star blah or DWORD star blah equals some number. And that controlled the TV output colour when there was no picture. So there is no picture when you haven't configured the screen. There's no picture around the border, the top and the sides of the picture. But if you look at old video game hardware, you know, the C64's screen was actually sort of in the middle of the CRT screen. And the bit around the outside was the border, right? And that was out of spec, really, and it was meant to be black and there's some signal processing stuff, but you could set the colour to be whatever you liked.
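A sketch of the register poke Matt reads out, and of using it as a breadcrumb trail through start-up code. The real register address is not given in the episode, so this version points at an ordinary variable (fake_border_register) to stay runnable on a PC; on hardware the pointer would be a fixed memory-mapped address from the console's documentation. The colour values and the init_* steps are also invented, with init_sound returning failure to stand in for the hang.

```c
#include <stdint.h>

/* ASSUMPTION: stand-in for the memory-mapped border colour register. */
static volatile uint32_t fake_border_register;
static volatile uint32_t *const BORDER_COLOUR = &fake_border_register;

enum {
    COLOUR_RED    = 0xFF0000,
    COLOUR_YELLOW = 0xFFFF00,
    COLOUR_PURPLE = 0x800080
};

/* One store through a volatile pointer: the compiler may not remove or
 * reorder it relative to other volatile accesses. */
void set_border(uint32_t rgb)
{
    *BORDER_COLOUR = rgb;
}

uint32_t last_border_colour(void)
{
    return *BORDER_COLOUR;
}

/* Hypothetical start-up steps. The second one fails, standing in for
 * the point where the real console stopped responding. */
static int init_video(void) { return 0; }
static int init_sound(void) { return -1; }
static int init_input(void) { return 0; }

/* Set a distinct colour before each step. Whatever colour is left on
 * screen names the last step that was entered. */
int boot(void)
{
    set_border(COLOUR_RED);
    if (init_video() != 0)
        return -1;

    set_border(COLOUR_YELLOW);
    if (init_sound() != 0)
        return -1;

    set_border(COLOUR_PURPLE);
    if (init_input() != 0)
        return -1;

    return 0;
}
```

With this set-up, a boot that stops with the border yellow means execution reached the second step and never got to the line that would have turned it purple.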
Starting point is 00:21:20 Taking an aside to this, even, the story: for the longest time, profiling on these machines, right? How do you profile your code? You want to have a really low overhead, how long did this part of the code take? So we would use that register. While the game was running, we would say, okay, I'm about to do the transformation, and you would set the border colour to be red and then you would do the transformation code. I'm about to do the AI, you'd set it to green,
Starting point is 00:21:48 you do that code. And then because the beam of the CRT is tracing out the whole time and that write into that register takes instantaneous effect, that little border colour around the outside, you're using it as a time measurement, right? How many scan lines of a 60 hertz TV screen did it take before the next thing happened? And you know as well that it's very visually representative because as you get closer to the bottom of the screen, you're getting closer to dropping a frame and being slower than a frame rate, a frame refresh, right? So it's a very visible thing. And so any developer of like my vintage who worked in the games industry would refer to things in scan lines. That was your unit of time currency, was always, I managed to shave a scan line off of that routine or whatever. You know, that was what you were doing. It would be one, I don't know, 512th of a 60th of a second type of thing.
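To put a number on "one 512th of a 60th of a second", here is a small conversion sketch. The 60 and 512 figures are Matt's rough ones from the conversation, not the line count of any particular video standard, and the function names are made up for illustration.

```c
/* ASSUMPTION: rough figures quoted in the conversation. */
#define FRAMES_PER_SECOND   60.0
#define SCANLINES_PER_FRAME 512.0

/* How long a band of border colour `scanlines` tall lasted. */
double scanlines_to_microseconds(double scanlines)
{
    return scanlines * 1e6 / (FRAMES_PER_SECOND * SCANLINES_PER_FRAME);
}

/* Fraction of one frame's time budget consumed by that band. */
double frame_budget_used(double scanlines)
{
    return scanlines / SCANLINES_PER_FRAME;
}
```

Under those assumptions one scan line is roughly 32.6 microseconds, so a routine whose colour band covers half the screen height has used about half of the frame.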
Starting point is 00:22:37 It's awesome because it's so visual, right? Like, it really gives you an intuition for what's happening. And it's super low overhead because you're just changing, writing a single value somewhere. It's brilliant. Anyway, so we knew that this existed and we used it for this purpose and it was very valuable for that. But that's what we ended up using to debug this issue. So I peppered the code with different colored, the initialization code with different colors. And then we burnt one of these disks.
Starting point is 00:23:02 And now, these are 1x speed and they're like 850 meg disks because we packed it full to the rafters. So you're sat on tenterhooks waiting for the stupid thing to burn. And the technology of the time was incredibly sensitive to movement. So if you banged into it, it would go, oh, fault, the lens would lose track of where it was writing. And our, admittedly, by this time we're in our nice headquarters in North London, there was like an area that we kind of marked out. It was like, do not walk around here because the suspended floor is bouncy enough that it will knock the machine around and we'll maybe lose a burn. So this was sort of behind my desk and I was like frantically changing the code,
Starting point is 00:23:42 building a new image, burning the disc, putting it in the machine. It's yellow. Okay, now I kind of had a chart. What does yellow mean? Yellow means it got to this line of code, but clearly didn't get to the thing that changes it to purple that's afterwards. Okay. So now we put, we do a new build with the colours reset and wherever that yellow was, we start with red and we move through the spectrum.
Starting point is 00:24:04 And you kind of, I was going to say binary search, but, you know, how-many-colours-were-distinguishable search through. And eventually we found it. And it was the absolute classic of C programs of yore. Although there was a little bit of C++ in there as well. But from an empty, from a cold boot, like literally turning the machine off, putting the disc in and turning it on, memory is just random numbers, right?
Starting point is 00:24:26 You hit this issue. And I don't even remember exactly which line it was. I have got the source code. We have open-sourced the code to this. And I can't find where it was for the life of me. But it was pretty much a single line change. You're like, oh, yeah, equals zero on the end of a line or something like that. And then that fixed everything.
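A sketch of the class of bug described here, not the actual Red Dog code (Matt says he can no longer find the line). The GameState struct and its fields are invented. Because reading a truly uninitialized variable is undefined behaviour in C, the sketch fakes "memory is just random numbers" by filling the struct with a junk byte pattern before running each initializer.

```c
#include <string.h>

/* Hypothetical state block. */
typedef struct {
    int lives;
    int level;
    int paused;   /* the field the buggy initializer forgets */
} GameState;

/* Stand-in for RAM contents after a cold boot: arbitrary leftover bytes. */
void simulate_cold_boot(GameState *s)
{
    memset(s, 0xA5, sizeof *s);
}

/* Buggy: works whenever the memory happened to be zero beforehand,
 * for example after an earlier program or boot screen cleared it. */
void init_buggy(GameState *s)
{
    s->lives = 3;
    s->level = 1;
    /* s->paused is never assigned */
}

/* Fixed: the "equals zero on the end of a line" style of change. */
void init_fixed(GameState *s)
{
    s->lives = 3;
    s->level = 1;
    s->paused = 0;
}
```

Run after simulate_cold_boot, init_buggy leaves paused holding junk while init_fixed leaves it at zero, which mirrors a build that behaves on the development kit but not from a cold start.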
Starting point is 00:24:41 And you're like, crikey, you know. Do you remember what color it was? I don't remember what color it was. No, no, unfortunately not. No, I don't. I don't remember which colors we used or how many I could actually distinguish, you know, unambiguously. And the irony was that a very similar bug had happened on the Croc Saturn version that I alluded to before, where a hardware register, so this wasn't uninitialized memory in the CPU's, like, domain,
Starting point is 00:25:09 this was one of the many registers that was inside the graphics unit, was not being set. And exactly the same thing. If you booted it through the normal screen, then that hardware register had been set up. But if you just booted from cold, it wasn't set correctly. And the worry about this, or the worst part of this, is this got to retail, right? They didn't notice this until they'd already shipped them to stores. And then it was only on cold boots with the ones that had come from the factory. The issue.
Starting point is 00:25:40 And the thing is, almost worse again, it's not that it didn't work. It's the fact that actually some of the graphics were inside out, which if you have a cute crocodile with a tongue and a tooth, you know, inside his head, the wrong way round, that's kind of disturbing. A whole generation of children were terrified of this sort of garish-looking. inside out crocodile head, which, you know, whoops, they put it, actually for the first set of things, they put it, printed a little slip in saying, you know, hey, to play croc, turn it on with the lid open first so that it goes into the, hey, insert a disc thing, which is enough to
Starting point is 00:26:15 initialize the registers, then put the disc in and close the lid. So the same thing twice. Back then making a console game felt like chasing a moving target, that moving target being PCs, because PC hardware was evolving at warp speed. There were new 3D cards, there were new drivers, there were new buzzwords every few months. Quake 2 and the Unreal Engine dropped, and all of a sudden overnight, everything else looked kind of dated. And consoles couldn't upgrade. You got one shot with the hardware you shipped on, and that gap was starting to show,
Starting point is 00:26:46 especially for the Red Dog team. So they said, look, I don't think this game's up to scratch. This is Sega, who are publishing it for us. We need dynamic lighting, and this was on like a Thursday. or, you know, I think we're going to have to reconsider whether this is okay to go forward or not. And I didn't get any sleep that weekend, but by Monday morning, we had a full lighting system. It wasn't too bad to actually put in. I think we were kind of relenting on it because we're like, we don't have the time budget to spend doing extra lighting.
Starting point is 00:27:17 So we want to put more explosions and things in rather than light the scene all the time. But that was a fun, yeah, time of being under the gun. But yeah, it came out. It was, it was reasonably well received. And, you know, there's something nice about going into a store and seeing something that you made. You know, obviously, I'd seen Croc. Croc had done really, really well. But that wasn't my game. It was a game I worked on. Whereas Red Dog, I was there from the beginning. You know, I felt like I was part of the team that came up with the ideas. We argued about what the arcs should look like. I designed the engine. I designed all of the tooling for it. I wrote the build system. You know, there's a load of stuff around it that felt like, oh, yeah, this is, this is mine. And it worked out. So very fortunate. And again, that was just because one guy happened to mention it loud enough. And I put my hand up and I said, I can do that. And he believed me. So I'm endlessly thankful to him. Their next project, after Red Dog, was something fun.
Starting point is 00:28:18 We prototyped a sort of real-time strategy-ish, team-based combat game using the Red Dog engine on the Dreamcast, which was wonderful because we could throw away everything and use all the stuff we'd learned. And I was able to get something running at 60 frames a second at some beautiful thing rather than the 30 frames that the Red Dog was. So it was really sweet and it was lovely and whatever. But, you know, the Dreamcast was probably dead on arrival when it even shipped. You know, it was not well received commercially overall. So unfortunately, it was never to be on the Dreamcast. So anyway, we showed the company, and I'm forgetting who it was now. And they were like, yeah, that looks good, but we've got this SWAT license. So we were lucky we got one of
Starting point is 00:28:59 the SWAT, you know, special weapons and tactics, US police, can we kind of like put them together? And we've never done a SWAT game on a console before. So we've pitched some ideas. We came in, and it became SWAT: Global Strike Team. And it was originally an Xbox exclusive. Instead of reusing the Red Dog codebase, the team built a new Xbox engine with one big focus. So we want really beautiful, sharp shadows. And we want lights that are very, very dynamic so you can shoot out any light because this is a, it's a game, you know, like you want to be going stealthy, so maybe you want to take out all the lights and then you go in with your night vision goggles and stuff. So ideas we had for the engine would actually become
Starting point is 00:29:36 gameplay elements later on. And so we had this really cool and convoluted system for doing our lighting, which I still think is fab, I don't know that anything else has ever done it, but it was definitely not the way the rest of the industry went. And so what we did is for every light in the scene, we effectively shot out rays and worked out what it could see, right? What would be the geometry that it could see? And then we stored a separate mesh of the weird, spidery mass of where the light would land on all of the surfaces. So instead of storing it as a texture, we stored it as actual geometry. So you would have this area of lighting. Now, that was hugely computationally expensive at the time. You're taking scene geometry that's got thousands and thousands of
Starting point is 00:30:22 polygons, you're taking every light and you're saying, what can, what geometry can this light reach and then cut out the shadows, effectively, where the light can't be reached, store that as well. But it meant that we could store each light individually, and so we had this, it was great. Turning lighting into geometry is challenging, but it could be done with an understanding of the Xbox hardware, and it totally paid off. They ended up getting crisp shadows, shootable light bulbs, and scenes that reacted to the players in real time. The line between engine design and game design was blurring. Anyway, this was fab.
Starting point is 00:30:56 We were having a whale of a time. We learned so much about how big, big, big C++ projects end up looking because we were writing one. But then a spanner came or a wrench, depending on your side of the Atlantic, in the works, and said, well, this Xbox exclusive title, Xbox, eh, not doing so well. It's looking like a Dreamcast all over again. We should probably also do this for PlayStation 2. And that was when the bottom fell out of our world
Starting point is 00:31:26 because we're like, but all of these tricks that we're doing aren't really PlayStation-y. The PlayStation 2 was powerful, but it was super low level. Developers had to juggle these DMA engines and vector units manually with little support for the type of per-pixel effects that the Xbox could do. The Xbox GPU could handle all the fancy lighting out of the box. It had these shaders, but the PS2, you could push pixels fast, but it lacked the shaders.
Starting point is 00:31:54 But we were able to get the same lighting system in, which is like a really difficult thing. We were able to come up with a way of remapping the 3D texture into 2D dynamically so that we could get the same kind of light fall off. And then the blending, oh, so the blending was a real trick. We essentially lie to the hardware and say, hey, you see this screen over here, you think it's the screen, right? We've just drawn all these lights to it.
Starting point is 00:32:24 But actually, it's not a 32-bit per-pixel screen like you thought it was. We're just going to tell you that it's an 8-bit-per-pixel alpha image, right? So you're basically telling it it's the wrong format. Now, because of some really strange hardware characteristics and for convenience of being able to, when it's in 32-bit mode, write out red, green and blue to separate RAM chips in the system so that that blending was fast so that it could effectively be reading and writing red and green and blue in parallel.
Starting point is 00:32:54 They're in literal physical different chips. When you changed it from one way to the other, you suddenly became exposed to that. So if you could somehow get a microscope and peer into the memory of that 32-bit color image, you would see that what you would see is blocks of all the red pixels grouped together, then all of the blue pixels, then all of the green pixels, but in weird banding and alternating again, because this was really a hardware optimization that was meant to be hidden from you,
Starting point is 00:33:19 but you'd lied to it and said it's an 8-bit. But what that meant was you could pluck out the red pixels by drawing thousands and thousands and thousands of tiny triangles that drew out the regions where you knew the red pixels were going to be, and that gave you an alpha image of the red colour, and then you could just draw, using that as the source texture. So this is where I'm reading from.
Starting point is 00:33:45 You draw a big red triangle over the main screen with this texture on it. And that effectively gives you that multiply, but just for the red component. Then you do it again for the green and then you do it again for the blue. And each of these is like that. But again, luckily there were these funny little DSPs.
Starting point is 00:34:01 And you could write a program that generates these millions of stupid little pointless triangles just to draw. Again, without a photograph or an image of me pointing it, it's difficult to convey, so I hope that your listeners have got a decent visual understanding of what I mean. Basically, they're hacking the frame buffer so that they could pull out this red and green and blue as separate layers and rebuild the lighting step by step. It's a classic Matt move.
Starting point is 00:34:24 Ignore the rules, see how things actually work under the covers and talk straight to the hardware. But in principle, it was a massive trick and it was one of those like, oh, that's clever. And it wasn't my trick, it wasn't Nick's trick. We found it. It was one of the things that actually, I think it was Mike Abrash, who was working, I think, for Sony at the time. And that's why Matt is proud of this. There was a business constraint,
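For readers who think better in code, here is a loose sketch in C of the idea behind the trick: treat the same 32-bit frame buffer as plain bytes, lift one colour out as its own 8-bit plane, and multiply one channel at a time. Everything here is an assumption made for illustration. The function names are hypothetical, the layout is a simple interleaved R, G, B, A (the real PS2 memory was banded across chips, as Matt describes), and the team did this by drawing triangles on the graphics hardware, not with a CPU loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout for this sketch: each 32-bit pixel is 4 bytes, R, G, B, A. */
enum { CH_RED = 0, CH_GREEN = 1, CH_BLUE = 2, BYTES_PER_PIXEL = 4 };

/* Reinterpret the frame buffer as raw bytes and copy a single colour
   channel out into its own 8-bit-per-pixel plane. */
void extract_channel(const uint8_t *frame_bytes, size_t pixel_count,
                     int channel, uint8_t *plane_out)
{
    assert(channel >= CH_RED && channel <= CH_BLUE);
    for (size_t i = 0; i < pixel_count; i++)
        plane_out[i] = frame_bytes[i * BYTES_PER_PIXEL + channel];
}

/* Multiply one channel of the frame buffer by an 8-bit light plane,
   treating 255 as "fully lit" and 0 as "fully dark". */
void modulate_channel(uint8_t *frame_bytes, size_t pixel_count,
                      int channel, const uint8_t *light_plane)
{
    assert(channel >= CH_RED && channel <= CH_BLUE);
    for (size_t i = 0; i < pixel_count; i++) {
        uint8_t *dst = &frame_bytes[i * BYTES_PER_PIXEL + channel];
        *dst = (uint8_t)((*dst * light_plane[i]) / 255);
    }
}
```

Calling `modulate_channel` three times, once each for red, green and blue, mirrors the "do it again for the green, then again for the blue" passes in the story.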
Starting point is 00:34:46 and it forced them to do something uncomfortable. The Dreamcast prototype became this SWAT project on the Xbox, where they built this new engine, but then they had to stretch it and find ways to do it on the PlayStation 2 through deep hardware-specific hacks. And the payoff wasn't just getting these ports working, although that was a big thing.
Starting point is 00:35:05 It was also this lesson about how knowing about the next layer below you gives you options when the rules change. After games, Matt carried the same instinct for pulling back the curtain into a new world, high-speed finance. The problems look different, but the pattern was the same. When something felt off, he couldn't stop
Starting point is 00:35:23 until he understood what was happening. Why was this abstraction breaking down? In game engines, that might mean poking hardware registers and debugging with colors, but in trading systems, it would be more like chasing down timing bugs, asking what the machine was really doing underneath all of the abstractions.
Starting point is 00:35:39 Here's a real example. A weird issue started cropping up at Market Open. An expensive, high-performing network card kept dropping packets. It wasn't the most urgent problem, but something about it had that same feel as debugging the Dreamcast. Something about it implied that the operating system abstraction wasn't quite working the way it said it should. And so it grabbed Matt's attention and he started digging in. The first thing he noticed was that this network card needed a huge chunk of memory ready to go. And they had a flag that let us pre-fault in all of the memory that it was going to use so that it was never going to be demand paged.
Starting point is 00:36:19 So normally when you allocate a big slab of memory in Linux, Linux goes, yes, and it hands you back an area of memory that has no memory in it. And then as you start reading or writing to it, it actually starts finding the memory and giving it to you. So that means that when you allocate 20 gig, it will go, yep. And it's like, well, I haven't got 20 gigs. So yeah, hopefully you won't look at all 20 of those gigabytes. Because if you do, we're in trouble. But if you don't, no harm, no foul, right? You know, so, but that's really important because that process of faulting in those pages takes time.
Starting point is 00:36:53 And it's also magic invisible time. You know, as far as you're concerned, you just read some bytes from the network. But actually, or you wrote some bytes in this instance. But actually in this instance, that caused the operating system to go, whoops, I lied to you, there's no memory there. I'm going to come in right now. You know, nobody expects the operating system. It comes in, like the Spanish Inquisition and starts like taking up loads of your time.
Starting point is 00:37:15 You're like, I'm in the middle of doing something really time critical, and now you're allocating memory for me? This is bonkers. The vendor saw this problem coming. That's why they had added this flag. Their idea was to avoid page faults by reading all the memory up front. They were basically poking through this virtual memory abstraction to force pre-allocate everything. But even with this workaround, things were still breaking.
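The lazy allocation Matt describes can be observed directly. The following is a minimal sketch in C, assuming Linux with `mmap` and `mincore` available. The helper names (`reserve_lazy`, `touch_every_page`, `count_resident_pages`) are made up for this illustration and have nothing to do with the vendor's code.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Ask the kernel for anonymous memory. The call returns quickly because
   no physical pages are attached yet; they arrive on first touch. */
void *reserve_lazy(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Write one byte into every page so each page fault happens now,
   not later in the middle of time-critical work. */
void touch_every_page(void *addr, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    volatile unsigned char *bytes = addr;
    for (size_t off = 0; off < len; off += page)
        bytes[off] = 1;
}

/* Count the pages in [addr, addr + len) that mincore reports as resident.
   addr must be page-aligned, which mmap guarantees. */
size_t count_resident_pages(void *addr, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t npages = (len + page - 1) / page;
    unsigned char *vec = malloc(npages);
    assert(vec != NULL);
    size_t resident = 0;
    if (mincore(addr, len, vec) == 0)
        for (size_t i = 0; i < npages; i++)
            resident += vec[i] & 1u;
    free(vec);
    return resident;
}
```

On a typical Linux machine, calling `count_resident_pages` right after `reserve_lazy` should report few or no resident pages, and calling it again after `touch_every_page` should report all of them. The exact numbers before touching depend on the kernel and its settings.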
Starting point is 00:37:39 And the way it was breaking is what caught Matt's attention because it just didn't make sense. Why the hell, you know, this network thread is the one that's actually having the problem. And it should be doing nothing other than looking at these bits of memory. Very long story, but we discovered a tool called SystemTap, which allows us to like really hook in the lowest level of the operating system and go, why are you doing this? What's going on? And we discovered that the network process when it was under heavy load, was trying to take out a lock and we're like, what are you doing network code? You're literally like zero copy network code with zero lock,
Starting point is 00:38:11 lock free code and everything. Where the hell is this lock coming from? Skipping over some steps, here's what Matt found. The flag for pre-allocating memory tried to trigger page faults by reading a byte from each block in that 20 gig chunk. Okay, seems reasonable.
Starting point is 00:38:27 That definitely causes it to become faulted in, except that it had been written in C and when you read from memory there's no side effect for reading to memory and so a compiler, an optimising compiler will go, well, you just read that and did nothing with it and so it optimised it away. But it had worked for years because
Starting point is 00:38:43 it had, you know, compilers had been crap. Luckily, there was this website that I could use where I could submit a bug report and the patch to say, like, I think you'll find that you're not actually faulting in the memory after all and there's a much better way to do it anyway. But, you know, but we dug and we dug and we dug and we found that issue.
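To make the compiler issue concrete, here is a hedged sketch in C. It is not the vendor's code or the patch Matt submitted; the function names are hypothetical. It contrasts a read loop that an optimising compiler is allowed to delete with one that uses `volatile` so the reads must stay, and shows one Linux-specific alternative, the `MAP_POPULATE` flag to `mmap`, which asks the kernel to fault pages in at allocation time. Which "much better way" Matt actually suggested is not stated, so treat that part as an assumption.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Fragile: the value read is never used and a plain read has no side
   effect, so an optimising compiler may remove the whole loop. */
void prefault_fragile(const unsigned char *buf, size_t len, size_t page)
{
    for (size_t off = 0; off < len; off += page) {
        unsigned char unused = buf[off];
        (void)unused;
    }
}

/* Sturdier: a read through a volatile pointer counts as a side effect,
   so the compiler has to perform every one. Returns pages touched. */
size_t prefault_volatile(const unsigned char *buf, size_t len, size_t page)
{
    assert(page > 0);
    const volatile unsigned char *p = buf;
    size_t touched = 0;
    for (size_t off = 0; off < len; off += page) {
        unsigned char byte = p[off];
        (void)byte;
        touched++;
    }
    return touched;
}

/* Alternative on Linux: let the kernel populate the mapping up front. */
void *alloc_prefaulted(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

One more wrinkle worth keeping in mind: on Linux, merely reading untouched anonymous memory can be satisfied by a shared zero page, so whether a read is enough to get real memory behind an address depends on how the region was mapped.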
Starting point is 00:39:00 And I learnt what SystemTap was, and that has been a regular part of my armoury for this kind of like tool going forward. And we solved a problem which was not just in our code. It was in, I mean, we solved our actual business goal, which was like don't drop packets during the open. And we made the open source software better. So, you know, everyone was a winner in this. And it took a long time to get here. And, you know, there's that famous XKC, well, there are many famous XKCDs, right?
Starting point is 00:39:25 But, you know, there's an XKCD that talks about automating your tools. And, you know, like if it takes you one minute and it takes you two minutes to build a tool, it's not worth it, those kinds of things, right? That matrix. And I fundamentally disagree, unfortunately, with that particular thing, because it does not take into account the lessons you learnt along the way. And so while you could sort of say to me, you know, if this issue wasn't quite as, you know, business limiting as it was, it was actually causing us trouble, but it was just one of those niggling problems where we're like, why is that? And I'd left there. Okay, you could have said, like, it's equivocal whether it was actually better or not.
Starting point is 00:39:58 sure, but I learned SystemTap and I learned how kernel bypass worked and I learned how the operating system does this and I learned a compiler thing. You know, all of these things are more important than actually solving the problem. And so I think the journey is worth it, even if the thing you get to is not, you don't even get there, right? It's just you're learning all the way. Maybe you don't know how the operating system itself exactly works, but know that page tables exist because that may be some reason why your thing doesn't work. Again, it's got to be like somewhere in your head that there's a place as a starting point for me to look at to know that these things exist. Not being familiar with it is fine, but just knowing that it's there.
Starting point is 00:40:38 And similarly, like if you're up in the web world, like what layers are between the web page that I'm writing in this format and the actual HTML that goes to a browser? How does a browser actually understand the HTML? Right? There's a lot of subtleties in there. Why does the browser make these stupid decisions sometimes? Well, if you thought about how a browser has to work, well, maybe you should think about that or, you know, why doesn't my thing work this way? You know, like, when I had a friend who
Starting point is 00:41:01 would always say, you know, like, oh, this can't be that hard. How hard can this be? And it would be like package management or build system. And then I'd say, go give it a go. And he comes back and says, it was really, really hard. I'm glad we used something else. I'm like, yes, you know, but those kinds of experiences, I think, are part of what makes you understand, like, where the boundaries are. And all you need to do is be able to look, see beyond that boundary and then be aware of the boundary beyond that. So like we all work on a convenient level of abstraction, right? The floor will not fall out underneath me. I don't have to understand exactly why, but I take it as read that I'm not going to fall down anytime soon. And when you're working in software, it's so convenient to be
Starting point is 00:41:38 able to say, like, I'm just going to call this function and it does the thing that it says it's going to do and nothing more and nothing less. I don't have to look at it anymore. I don't have to understand any more than like string copy or printf. How does printf work? I've never stopped to think, but it doesn't matter because it works, right? So you live at that abstraction and it gives you so much power because it frees up brain space to do loads of other things that are important. But what I would like to, this is my thesis now, is that while you should have a layer of abstraction you're familiar with and you're comfortable with, you should also have a decent understanding of the layer beneath you. So if you're a C programmer, you should probably understand how the C runtime works to some level. And you should understand how the operating system interaction works.
Starting point is 00:42:20 You don't have to know exactly, but have some knowledge. Because one day, printf will not work, and then you'll be like, what? And if it's a complete mystery to you, you won't know how to take the first level off and look at it. So I think you should know one level well, have a working knowledge of the level beneath you. And then fundamentally, this is probably the strongest thing, is be aware of the shape of the layer beneath that. That was the show. That story is a good reminder that real progress happens when you stay curious. Because if you're curious, you can dig into tough problems and you won't be afraid to relearn when you get things wrong.
Starting point is 00:43:01 Because if you really understand the layer you work in, and if you understand what's below, you can turn these limits into skills. So let's call that Matt Godbolt's rule. You should know your layer well. But you should also know one layer below it a little bit, and you definitely need to know the shape of the layer that's beneath that. So yeah, write that down. That's Godbolt's rule. Probably also something to be said for paying attention to what's exciting to you. For Matt, it's this digging into the layers below. For you, maybe it's something different.
Starting point is 00:43:32 Thank you to everybody who's been sending me emails lately. I do very much appreciate that. Emails, LinkedIn messages, Twitter messages, or Bluesky, I appreciate it all. Sometimes I'm super motivated and sometimes I'm not motivated at all. And definitely somebody telling me that they appreciate what I do raises the bar. So seriously, thank you. And thank you so much to my supporters. You can go to co-recursive.com slash supporters if you want to join them. And thank you to Matt. And until next time,
Starting point is 00:44:03 thank you so much for listening.
