Embedded - 506: How Do I Fit a Whale Into an Apartment Building?

Episode Date: July 25, 2025

Dmitry Grinberg joined us to talk about running Linux on small microprocessors (physically small and/or 4-bit). Dmitry does this by emulating a MIPS processor. Boot times vary between minutes and days, depending on the processor.

Dmitry’s projects are on his website (dmitry.gr) including:
8-pin Linux (Cortex-M0+!)
Linux on an 8-bit micro?
Linux/4004

Dmitry recommended NandGame, an online game about building up a processor.

We mentioned Eric Schlaepfer of TubeTime. He was on the show on 419: Fission Chips, with EMSL’s Windell Oskay, talking about their book Open Circuits.

Mouser Electronics has a dedicated Empowering Innovation Together hub that covers the latest breakthroughs in tech. Their new series explores how AI is reshaping engineering, from design automation to rapid prototyping and predictive maintenance. You’ll find insightful articles, podcasts, and videos that showcase real-world applications across industries. If you’re ready to see how AI is powering the next generation of engineering, head over to Mouser.com/empowering-innovation.

Transcript
Starting point is 00:00:00 Welcome to Embedded. I am Elecia White alongside Christopher White. Our guest this week is Dmitry Grinberg and we're going to talk about Linux on microcontrollers or whatever's smaller than microcontrollers, like itty bitty controllers. Hi Dmitry, welcome. Hi, nice to be here. Dmitry, could you tell us about yourself as if we, I don't know, met on an airplane?
Starting point is 00:00:38 Sure. I'm a software engineer. I mostly like to write code for small systems. I occasionally have delusions of designing hardware. Sometimes it even works. And I like to write up my projects because I like to think that people sometimes will try to reproduce them
Starting point is 00:00:54 and actually learn something from it. Sometimes they will write to me and tell me that's what happened, and that's kind of cool. And you have some really interesting projects. But before we talk to you about those, we want to do lightning round, where we ask you short questions and we want short answers. And if we're behaving ourselves, we won't ask, well, what about the networking system
Starting point is 00:01:14 or other longer questions? Are you ready? Bring it. Favorite processor? Probably one of the Apple Silicon processors. ARM64. Favorite Linux distribution? Debian, because it mostly just sticks to doing what it's supposed to be and not showing off.
Starting point is 00:01:35 How many bits is enough bits? Depends for what. I could do a lot with just one. How many bits is too many bits? I don't play too much with 64-bit and higher. It's just too big, not much of a challenge. KDE or GNOME? Neither, I use XFCE. Okay.
Starting point is 00:02:00 Complete one project or start a dozen? I don't allow myself to have more than two projects in flight because if I did, I would never finish anything. I don't consider a project done until it's finished and written up. So I'd like to start a dozen, but I don't let myself do that. Favorite fictional robot?
Starting point is 00:02:21 Absolutely, Marvin the Paranoid Android. Do you have a tip everyone should know? Yes, if you don't know how to do something, find some corner of it that you kind of know how to do and then do it. That'll get you into the problem more and both motivate you and help you figure out how to do some other adjacent parts of it. Eventually, before you know it, the whole problem will be done. Okay, I gotta go write that down. That's really, really good. It is a really good
Starting point is 00:02:46 one. And it applies to lots of things besides technology. That's very good. I like that. Before we get back to today's discussion, I want to share a new resource for anyone curious about the rise of AI in engineering. Mouser Electronics has a dedicated Empowering Innovation Together hub that covers the latest breakthroughs in tech. Their new series explores how AI is reshaping engineering, from design automation to rapid prototyping and predictive maintenance. You'll find insightful articles, podcasts, and videos that showcase real-world applications across industries. If you're ready to see how AI is powering the next generation of engineering, head over
Starting point is 00:03:31 to mouser.com slash empowering dash innovation. Now let's get back to the show. We have been kind of skirting around small processors and we mentioned Linux. So last spring you posted about an 8-pin Linux project. This was three chips that each have eight pins. Could you tell us about it? Yeah. So the idea was that sometime long ago
Starting point is 00:04:06 in computing history, you could actually buy a kit computer that was comparable to the sort of computer you could buy in a store. But you could assemble this at home and have comparable computing power. We are a very long way from that now. But I thought it'd be cool to try something similar. So for some definitions of modern computer,
Starting point is 00:04:24 being able to run Linux and compile a C program can sort of qualify. So I tried to find the minimal set of parts I could use to produce such a computer that one could still hand assemble at home, assuming no soldering skill, which means only using very large chips with very few pins. And I built on some previous code I'd written to emulate MIPS and boot that on it. And then I designed a small board that is about an inch across circular with a USB connector on it that actually does all this. And if you assemble this at home, it will over about a minute boot and you'll be faced with a command line with the real Linux on it that you can then actually write some code and compile it and run it.
Starting point is 00:05:06 So the Linux has, you mentioned, the GCC compiler and make and then an editor, vi, right? Yep. But there's no networking? No. So networking was something I thought about, but there was not much I could do with just chips with eight pins, which was of course an artificial limitation, but that's what made the project interesting. So in order to interface to USB, the options are very few in terms of things that have eight pins. It's either USB to serial chips, which is what I used, or you can use
Starting point is 00:05:43 something like an 8-bit AVR, an ATtiny, to bit-bang USB. That gives you a few more options, but networking's not happening anyways. So this USB to serial chip was one of your three allowed chips. Yes, it was a really cute chip I found from Prolific. I'd never seen it before, but now I like it a lot.
Starting point is 00:06:03 Basically, you connect USB to one end of it, serial comes out the other, it doesn't need any external parts, and it even will supply 3.3 volts for the rest of your project. It's actually pretty awesome. Yeah, I've seen those on a previous project, and unfortunately we didn't get to use it, but I was pretty excited that it existed.
Starting point is 00:06:21 And one of your other chips was RAM, right? Yes, so Linux needs megabytes of RAM. If you're going to stick with text-mode Debian, you can fit into about 8 megabytes. So I used an 8 megabyte PSRAM chip. There's actually a bunch of them out now. They're pretty convenient for all sorts of projects. They're made by AP Memory in the US and in China. There's a company called Vilsion that also makes them. And an SD card, which isn't counting against your chip count.
Starting point is 00:06:49 Well, I don't think it's a chip. It doesn't have legs and you're not soldering it. So I think that's a fair excuse. I will accept that, yes. And now the big question. You actually, in your write-up, did an analysis of half a dozen, almost a dozen microcontrollers with the pros and cons. And I have to admit that I have a little bias concerning the PIC16. And so I loved your analysis. Why was that not suitable for this particular project? So the annoying part about PIC16 is it's not very performant at all. So I actually
Starting point is 00:07:36 started my microcontroller project programming long, long ago with PICs, and I was a reasonably big fan of them at the time because they could go relatively fast at the time, but now we have other microcontrollers that go much, much faster. PIC16 will go at 32 megahertz, which doesn't sound terrible until you realize that it takes four cycles to do an instruction, so you're down to about 8 million operations per second. But then it's also an accumulator-based architecture, so even moving a value from one location to another is two instructions, and God help you if you need to actually do something with the value. So before you know it, you're going really, really slowly.
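Editor's sketch (not from the episode): the arithmetic behind the PIC16 numbers quoted above can be written out in a few lines of Python. The clock rate, the four cycles per instruction, and the two instructions per move come from the conversation; the function names are made up for illustration, and real instruction timing varies (branches, for example, take longer).

```python
# Back-of-the-envelope throughput for the PIC16 figures quoted in the episode.
CLOCK_HZ = 32_000_000          # maximum clock mentioned in the conversation
CYCLES_PER_INSTRUCTION = 4     # one instruction takes four clock cycles
INSTRUCTIONS_PER_MOVE = 2      # accumulator machine: load into W, then store from W


def instructions_per_second(clock_hz: int, cycles_per_instruction: int) -> int:
    """Peak instruction rate, ignoring branches and other multi-cycle cases."""
    return clock_hz // cycles_per_instruction


def moves_per_second(clock_hz: int, cycles_per_instruction: int,
                     instructions_per_move: int) -> int:
    """How many memory-to-memory moves fit in one second at the peak rate."""
    return instructions_per_second(clock_hz, cycles_per_instruction) // instructions_per_move


if __name__ == "__main__":
    print(instructions_per_second(CLOCK_HZ, CYCLES_PER_INSTRUCTION))
    print(moves_per_second(CLOCK_HZ, CYCLES_PER_INSTRUCTION, INSTRUCTIONS_PER_MOVE))
```

So the "8 million operations per second" figure halves again as soon as the work is mostly moving data around, which is most of what an emulator does.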
Starting point is 00:08:15 That and even though there is a C compiler for these, it doesn't produce very good code. It can't. This architecture was never designed for C. It was designed for being programmed in assembly, which is fine. I've done plenty of assembly programming, but even assuming I wrote my MIPS simulator in assembly, the other options were a lot more performant. And my goal was not only to have this thing be small, but also performant enough that I could claim
Starting point is 00:08:41 it's interactive with some degree of being honest. And I don't think that could be done on that PIC. But I did fairly evaluate every 8-pin microcontroller I could find. Yeah, including an AVR, which you've run Linux on before. Yes. But you didn't think that would be interactive because the boot time on that was a little long. Yeah. So first time I did this was six hours long
Starting point is 00:09:07 because I wrote it in C. If I did it in assembly now, I could probably get it down to two, but that's still a lot. I mainly chose the fastest thing that had enough useful pins. You'd be surprised how many manufacturers make an eight pin part and then do completely stupid stuff with the actual pins.
Starting point is 00:09:26 Like if you're making an eight pin part, there should be one supply pin and one ground pin. Why would you have two supply pins? Why would you waste an extra pin out of your array of just eight? I don't know, and yet that was really, really common. And then you determined you needed one MIPS of CPU. How did you make that determination? Is it a Debian thing? No, I mean, there's no official requirement. Among my other projects, I booted Linux at 74 hertz.
Starting point is 00:09:55 That was basically me trying to evaluate how fast it would need to be in order to boot within a few minutes so a user doesn't get bored and so that vi is interactive enough that you're not annoyed using it. And I just evaluated that by trying. I literally made an emulator for MIPS a while ago for a different project. And I was able to slow it down on purpose on my PC and come up with that approximate number.
Starting point is 00:10:21 So this MIPS emulator, we are, okay, maybe take me through the boot process, like from the reset vector. I'm on an ARM processor and I've hit the reset vector and normally now I would be doing the C init and then going to main. So mostly the same thing: there is init and you go to main. Main will then just load the first sector from the SD card into the external RAM and then it'll start a MIPS emulator. And the MIPS emulator sees the external RAM as being its only RAM and it jumps to the beginning of it. The code there loads the partition table and looks at it to look
Starting point is 00:11:07 for a partition which is marked with a particular type. I think I used type BB. It then sees which sectors it contains and loads those sectors into RAM. Then that is the next stage bootloader, which then looks at the partition table again and finds which partition is marked as active. Actually, this one is much bigger, so it has file system support. So it mounts it as FAT16, finds a file on there called vmlinux, parses it as an ELF file, loads it properly into RAM. Then it looks for a file with clock in the name, and that is used to figure out how fast the CPU should be running since I'm overclocking it and I wanted to give people a chance to
Starting point is 00:11:49 decide how fast. And after that, it just jumps into executing the MIPS code in the Linux kernel, which will then come up and eventually it'll load partition number three, because that's what the command line says, as the rootfs, and then you're just running MIPS Linux. Okay, so this clock file, I wanna separate between the MIPS emulator and the-
Starting point is 00:12:16 The bare metal, the C running bare metal. Yeah. Does the C running bare metal, it has access to the file system? Yes. And so it has something like TinyFS or some other FAT file system? Yeah. So a very long time ago for a different project, I wrote a very tiny implementation of FAT16 that only supports read-only and needs about nine bytes of state.
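Editor's sketch (not from the episode): the partition lookups in the boot sequence described a moment ago (find the partition marked with a particular type, later find the active one) can be illustrated against the standard MBR layout. The layout constants are the usual MBR ones; the type value 0xBB is what Dmitry recalls using ("I think I used type BB"), and the function and class names are made up for illustration. The real loader is MIPS code running inside the emulator, not Python.

```python
import struct
from typing import List, NamedTuple, Optional

SECTOR_SIZE = 512
TABLE_OFFSET = 446        # the four partition entries start here in the MBR
ENTRY_SIZE = 16
STAGE2_TYPE = 0xBB        # type byte recalled in the episode; treat as approximate


class Partition(NamedTuple):
    active: bool          # status byte 0x80 marks the active (bootable) entry
    type: int             # partition type byte
    first_lba: int        # first sector of the partition
    sector_count: int     # length in sectors


def parse_mbr(sector: bytes) -> List[Partition]:
    """Decode the four primary partition entries of an MBR sector."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    parts = []
    for i in range(4):
        start = TABLE_OFFSET + i * ENTRY_SIZE
        # status, 3 CHS bytes (skipped), type, 3 CHS bytes (skipped), LBA, count
        status, ptype, first_lba, count = struct.unpack(
            "<B3xB3xII", sector[start:start + ENTRY_SIZE])
        parts.append(Partition(status == 0x80, ptype, first_lba, count))
    return parts


def find_by_type(parts: List[Partition], ptype: int) -> Optional[Partition]:
    """First non-empty partition carrying the given type byte, if any."""
    return next((p for p in parts if p.type == ptype and p.sector_count > 0), None)


def find_active(parts: List[Partition]) -> Optional[Partition]:
    """First partition flagged active, if any."""
    return next((p for p in parts if p.active), None)
```

A first-stage loader would call something like `find_by_type(parts, STAGE2_TYPE)` and copy `sector_count` sectors starting at `first_lba` into RAM; the second stage would use `find_active` to locate the FAT16 volume.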
Starting point is 00:12:49 That's what I'm using. Wow. That's a cool idea, read-only. It saves a lot of headache. It really does. Exactly. If you're clever, if you're using an SD card, because it's SPI and you can pause the clock any time,
Starting point is 00:13:03 you can get away without even having a 512-byte sector buffer. This was used when I was writing a project to play audio from an SD card on a microcontroller with only 128 bytes of RAM. The idea is you can start reading a sector, get to the byte offset you need and read it. If the next offset you need is further, you just keep going. If not, you keep going and skip to the end and start it again. It's slightly inefficient,
Starting point is 00:13:24 but you can get away with only having nine bytes of state. Sorry, my brain is whirling about that. And so then you read this clock and you have to go to the bare metal for that because that needs to access the chip registers for this particular microcontroller. And then after that, you're all in emulation. There's a few steps here that I guess I sort of skipped over because they're not relevant to the emulator. But there's also a bootloader on the MCU in order to allow me to update it, since there's no usable debug connection; not enough pins.
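Editor's sketch (not from the episode): the buffer-free sector access described above can be modeled in Python. The card is simulated by a class that only hands out bytes in order, the way an SPI read does; the reader keeps just a sector number and an offset as state. All class and method names are made up for illustration, and details of a real SD read (start token, trailing CRC bytes) are left out.

```python
SECTOR_SIZE = 512


class FakeCard:
    """Stand-in for an SD card in SPI mode: bytes arrive strictly in order."""

    def __init__(self, image: bytes):
        self.image = image
        self.bytes_clocked = 0    # how many bytes were clocked out in total
        self._pos = 0

    def start_read(self, sector: int) -> None:
        self._pos = sector * SECTOR_SIZE

    def next_byte(self) -> int:
        value = self.image[self._pos]
        self._pos += 1
        self.bytes_clocked += 1
        return value


class StreamingReader:
    """Reads single bytes with no sector buffer; state is sector + offset only."""

    def __init__(self, card: FakeCard):
        self.card = card
        self.sector = None    # sector currently being streamed, if any
        self.offset = 0       # next byte offset the card will deliver

    def _finish_sector(self) -> None:
        # Clock out and discard the rest of the sector in progress.
        if self.sector is not None:
            while self.offset < SECTOR_SIZE:
                self.card.next_byte()
                self.offset += 1
            self.sector = None

    def read_byte(self, sector: int, offset: int) -> int:
        # Going backwards or to another sector: skip to the end, start again.
        if self.sector != sector or offset < self.offset:
            self._finish_sector()
            self.card.start_read(sector)
            self.sector, self.offset = sector, 0
        # Going forwards within the same sector: just keep clocking.
        while self.offset < offset:
            self.card.next_byte()
            self.offset += 1
        value = self.card.next_byte()
        self.offset += 1
        if self.offset == SECTOR_SIZE:
            self.sector = None    # sector fully consumed
        return value
```

The cost of the trick shows up in `bytes_clocked`: a backwards seek within a sector clocks out the remainder of that sector before restarting, which is the "slightly inefficient" part.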
Starting point is 00:14:03 So in order to allow people to update it, there's also a bootloader which also mounts the SD card, looks for an update file and if it finds one will apply it, then jumps to the application that we discussed. It'll work without the bootloader as well. It's not an integral part. It'll just be very, very difficult to update if you wanted to change the emulator or fix something. You just needed JTAG.
Starting point is 00:14:23 No, you wouldn't just need a JTAG unit because this processor doesn't have, how would you update it? Well, so there's only eight pins, which means there's not enough pins to do everything. So to allow me to debug it, I do have an SWD connector here. And when the processor starts, it waits for about six seconds before doing anything, including reconfiguring those pins. So if you attach a debugger within the first six seconds, you can debug it. But this board cannot boot with the debugger attached,
Starting point is 00:14:53 because the debugger will mess with the other uses of those pins. So initially, debugging I did over that. But now, basically, you can only update it using the SD card. You put a file called update.bin on the SD card, and at a particular byte offset is a version number. It's compared to the current firmware's version number. And if the one on the card is a higher one, it'll be copied from card into Flash,
Starting point is 00:15:15 and then a reboot will happen. So you can update it that way. No debugging. Most you can do is output serial console out through the serial port. I want to take a little step back because we talked about the emulator history of Linux stuff. Like you said, Debian runs in text only with 8 megabytes of RAM.
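Editor's sketch (not from the episode): the update decision described above (compare the version number embedded in update.bin with the running firmware's, and flash only if the card's is higher) looks roughly like this. The episode only says the version lives at "a particular byte offset", so the offset and the 32-bit little-endian encoding below are assumptions, and the function names are made up.

```python
import struct

# Hypothetical: the real offset and field width are not given in the episode.
VERSION_OFFSET = 0x20


def image_version(image: bytes, offset: int = VERSION_OFFSET) -> int:
    """Read the version field (assumed 32-bit little-endian) from a firmware image."""
    if len(image) < offset + 4:
        raise ValueError("image too short to contain a version field")
    return struct.unpack_from("<I", image, offset)[0]


def should_apply_update(update_image: bytes, running_image: bytes) -> bool:
    """Apply the update only if the card's version is strictly higher."""
    return image_version(update_image) > image_version(running_image)
```

Using a strict greater-than means a card left in the slot after a successful update does not trigger another flash-and-reboot cycle on every power-up.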
Starting point is 00:15:35 When I first started using Linux, it was a million years ago, on a 486 with, I think, four or eight megabytes of RAM, something like that. It was very small, but that was as big as Linux got back then when it first came out. Okay, long way around to my question. Use a MIPS emulator. Why is that architecture the right choice for something like this? Well, so two parts to this question. One is why use an emulator at all since Linux technically supports ARM.
Starting point is 00:16:07 Good to answer that, yeah. The answer here is that my RAM is not real RAM. It's an SPI device. It's not actually in the address space of a microcontroller. I can't just read and write to it. So things that Linux expects like let's allocate an 8 megabyte buffer or a 2 megabyte buffer. That's not going to happen. Because of that, we can't run Linux natively. We have to emulate to basically make this fake RAM look like real RAM. Every time an emulated device tries to do a memory load or a store, I can actually issue an SPI transaction.
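Editor's sketch (not from the episode): the point that every emulated load or store turns into an SPI transaction can be modeled with a fake serial RAM. The command bytes 0x03 (read) and 0x02 (write) followed by a 24-bit address are the opcodes commonly used by serial memories and are assumed here; the little-endian word order matches the little-endian DECstation but is also an assumption, as are all class and method names.

```python
class FakeSpiRam:
    """Models a serial RAM: each transaction is command, 24-bit address, data."""

    CMD_WRITE = 0x02   # assumed opcode, common on serial memories
    CMD_READ = 0x03    # assumed opcode, common on serial memories

    def __init__(self, size: int):
        self.mem = bytearray(size)
        self.transactions = 0

    def transfer(self, tx: bytes, read_len: int = 0) -> bytes:
        """One chip-select cycle: send tx, optionally read back read_len bytes."""
        self.transactions += 1
        cmd = tx[0]
        addr = int.from_bytes(tx[1:4], "big")
        if cmd == self.CMD_WRITE:
            data = tx[4:]
            self.mem[addr:addr + len(data)] = data
            return b""
        if cmd == self.CMD_READ:
            return bytes(self.mem[addr:addr + read_len])
        raise ValueError("unsupported command")


class GuestMemory:
    """What the emulated CPU sees as plain RAM; every access goes over 'SPI'."""

    def __init__(self, ram: FakeSpiRam):
        self.ram = ram

    def load32(self, addr: int) -> int:
        header = bytes([FakeSpiRam.CMD_READ]) + addr.to_bytes(3, "big")
        return int.from_bytes(self.ram.transfer(header, 4), "little")

    def store32(self, addr: int, value: int) -> None:
        header = bytes([FakeSpiRam.CMD_WRITE]) + addr.to_bytes(3, "big")
        self.ram.transfer(header + (value & 0xFFFFFFFF).to_bytes(4, "little"))
```

The `transactions` counter makes the cost visible: without a cache in front of it, every guest memory access pays for a full command-plus-address header.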
Starting point is 00:16:37 But why MIPS? I guess I should have made a table for that too. I didn't. But basically, if you have to emulate, there's a few choices you can use for what's properly supported by Linux. You're looking at x86, ARM, MIPS, and RISC-V nowadays. Almost everything else is more or less not properly supported if you want a real kernel
Starting point is 00:16:57 and a real file system. x86 is a giant pain in the ass to emulate. Just decoding x86 instructions properly is enough to make a man go mad. And then there is all sorts of other arcane stuff you have to support. So that goes out the window right away. Next up is ARM.
Starting point is 00:17:15 So I've done ARM before. My first attempt running Linux on AVR was ARM. But emulating ARM is also slightly tedious because all instructions can be conditional. And what that forces you to do is constantly be evaluating that condition. That slows down your emulation because before you get to the guts of emulating instruction, you're first forced to evaluate if it should execute or not. Great for actually writing code, annoying for emulation. So that leaves us with MIPS and RISC-V. Those are quite similar in design,
Starting point is 00:17:45 but RISC-V designers did one really annoying thing for emulation, which is that a lot of immediates, the values in instructions, are not contiguous in the instruction. They're split over some bits here, some bits there. It makes it convenient for making a hardware implementation of RISC-V, but very annoying for emulation. Because now in order to assemble this immediate value,
Starting point is 00:18:07 you have to AND with one constant, AND with another constant, shift and add. I just described six instructions that I'm doing to get this immediate. Whereas in MIPS, it's always the bottom 16 bits. One single UXTH or SXTH instruction and you've got your immediate. And when you're trying to go fast, every cycle in the emulation counts.
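Editor's sketch (not from the episode): the difference in immediate decoding can be seen side by side in Python. The bit positions follow the published MIPS I-type and RISC-V S-type and B-type instruction formats; the function names are made up, and Python obviously does not show cycle counts, only how many masks and shifts each format needs.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def mips_i_imm(insn: int) -> int:
    """MIPS I-type: the immediate is simply the bottom 16 bits."""
    return sign_extend(insn & 0xFFFF, 16)


def riscv_s_imm(insn: int) -> int:
    """RISC-V S-type (stores): imm[11:5] sits in bits 31:25, imm[4:0] in bits 11:7."""
    low = (insn >> 7) & 0x1F
    high = (insn >> 25) & 0x7F
    return sign_extend((high << 5) | low, 12)


def riscv_b_imm(insn: int) -> int:
    """RISC-V B-type (branches): the immediate is scattered over four fields."""
    imm = (((insn >> 31) & 0x1) << 12      # imm[12]   from bit 31
           | ((insn >> 7) & 0x1) << 11     # imm[11]   from bit 7
           | ((insn >> 25) & 0x3F) << 5    # imm[10:5] from bits 30:25
           | ((insn >> 8) & 0xF) << 1)     # imm[4:1]  from bits 11:8
    return sign_extend(imm, 13)
```

On the ARM host, the MIPS case collapses to a single halfword-extend instruction, while each RISC-V field costs its own shift and mask before the pieces can be combined.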
Starting point is 00:18:28 So picking the architecture where I can get the value out four cycles faster matters, which is how you end up with MIPS. Plus it's still quite well supported by the Linux kernel and by Debian user space, which is kind of cool. But you do, part of that emulation is some sort of hardware abstraction where you mentioned the RAM needed to be, I'm going to say translated, wrapped. And I assume the SD card interaction and also the serial port? So you emulate not just a processor, you emulate a machine. Linux doesn't support just processors. So the machine I'm emulating here is a DECstation
Starting point is 00:19:10 2100. It's something DEC made a while ago, just because I happen to be familiar with that. That's a deep cut. Yes. Okay, and that one did have a serial port card in it and it's a pretty simple one. It's based on the DZ11, which was actually a serial port card back in the PDP-11 days. DEC just grabbed it and reused it. So I'm emulating a serial port card for real. Linux thinks it's talking to a real DZ11 card. I am not emulating a SCSI disk drive because that is very tedious. So instead I'm using the fact that Linux is open source to my advantage here and I wrote my own disk driver, which I call PVD, for partially
Starting point is 00:19:53 virtualized disk, and the idea is that I'm using an undefined MIPS instruction to make a call from the emulated world into the emulator itself. And one of those calls I can make is: give me a sector, or write a sector. And you provide a physical address in your RAM and a sector number. And then the actual emulator does the SD card access. So Linux just thinks it has a very weird disk device that is instantly fast.
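Editor's sketch (not from the episode): the "partially virtualized disk" idea, where a reserved instruction traps out of the guest and the emulator performs the sector transfer itself, can be modeled as below. The instruction encoding, the call numbers, and the choice of registers for arguments and results are all hypothetical placeholders; the episode only says that an undefined MIPS instruction is used and that the guest passes a physical RAM address and a sector number.

```python
SECTOR_SIZE = 512

# Placeholder word standing in for "an undefined MIPS instruction";
# the real encoding is not given in the episode.
HYPERCALL_WORD = 0x4F646776

# Hypothetical call numbers and register convention.
CALL_READ_SECTOR = 1
CALL_WRITE_SECTOR = 2
V0, A0, A1, A2 = 2, 4, 5, 6   # MIPS register numbers for v0, a0, a1, a2


class Machine:
    """Just enough of an emulator to show the hypercall path."""

    def __init__(self, ram_size: int, disk: bytearray):
        self.ram = bytearray(ram_size)
        self.regs = [0] * 32
        self.disk = disk

    def execute(self, insn: int) -> None:
        if insn == HYPERCALL_WORD:
            self._hypercall()
        else:
            raise NotImplementedError("normal MIPS decoding omitted in this sketch")

    def _hypercall(self) -> None:
        call, addr, sector = self.regs[A0], self.regs[A1], self.regs[A2]
        start = sector * SECTOR_SIZE
        if call == CALL_READ_SECTOR:
            # Disk -> guest RAM; the guest sees the data appear instantly.
            self.ram[addr:addr + SECTOR_SIZE] = self.disk[start:start + SECTOR_SIZE]
            self.regs[V0] = 0
        elif call == CALL_WRITE_SECTOR:
            # Guest RAM -> disk.
            self.disk[start:start + SECTOR_SIZE] = self.ram[addr:addr + SECTOR_SIZE]
            self.regs[V0] = 0
        else:
            self.regs[V0] = 1   # unknown call: report failure to the guest
```

From the guest driver's side there are no device registers, interrupts, or DMA descriptors to model, which is why this is so much less work than emulating a SCSI controller.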
Starting point is 00:20:22 And I assume you weren't using Debian 12 because that I think released later than your project notes. Yes. So, Debian did finally drop MIPS 1 support. Let me see if I can find out when was the time they dropped it because I researched all this when I was writing it and I forgot the actual answer. It looks like I think it was the last release that they dropped MIPS 1 support. It lasted for a while but now it is dead, which makes sense; MIPS 1 is long, long, long, long, long dead. So I think it might be the one before
Starting point is 00:21:02 Bookworm that is the release I'm using. The thing is, MIPS had a number of revisions. So it was MIPS 1, then there was 2, 3, then there was MIPS32, MIPS64, then there was MIPS32 R2, R3, R5, R6. I don't think I skipped any there. There was no 4. At least not that I know of. But anyways, MIPS 1 was supported for a very, very long time. So that made my life very easy. Just grab a rootfs and it all works. The other cool thing there is that MIPS was the last architecture I could find supported by modern Linux,
Starting point is 00:21:33 where having a floating point is optional, floating point unit that is. Which is really convenient for emulation because emulating floating point units is a pain in the ass. When I download a new Linux, which I usually end up with Ubuntu, for all that says about me, I'm sorry, it has a whole bunch of cruft that I don't want. And yet I have trouble getting rid of it because I don't know how everything interacts and
Starting point is 00:22:00 because I just don't care. How did you make a Debian distribution that was this small? What did you have to go through to get a clean distribution that didn't have all of the cruft? There's kind of two parts to this question. So one is Debian has a minimal rootfs, which has no graphical stuff. That makes it quite a bit easier. Why didn't I use Debian? But the other part, I mean, you are in fact using it. Ubuntu is based on Debian. But the other part is all this stuff gets started by an init system.
Starting point is 00:22:39 And to make this project faster, I actually didn't use the init system. So initially for testing, I booted directly into bash. But that leaves you with too few things, to a point where the console will not act correctly, or you won't have tmpfs, which will break some applications. So I wrote my own little tiny init system here, which will mount tmpfs and procfs and start a couple of services and then dump you into bash. However, the emulator is complete. You can enable swap on the SD card and run the proper init and it will bring up the entire thing, complete with sendmail and everything else you might not want. It takes a while though. So spawning a new process takes a really long time, just because
Starting point is 00:23:26 we are still running only at one MIPS. And you'll be amazed how many processes get spawned by even Debian, which is pretty small, starting. At some point, I counted a few hundred. So basically, if you actually want to exec init, which is the command you would use to properly boot up the whole system all the way up to the proper login prompt and all, it takes a little over an hour. Oh. Oh, okay. There's just no need to ever do that because my init brings up all you need in order to interact with it: the compiler, vi, you have a shell.
Starting point is 00:24:00 I mean, basically all the parts you would need you have. No, no, that totally makes sense. But I was having trouble with the idea you were booting it all up in a few minutes, if you were booting it all up. Kernel is up, you have a command line, proc is mounted, sys is mounted, dev is mounted, your compiler works, vi works, Emacs works, and your command prompt works. What more could you need? But you didn't trim it.
Starting point is 00:24:29 You didn't try to get to- No, I just started with Debian minimal. Okay. If you don't start, I mean, they're not hurting me by sitting on the SD card. If I was trying to use a smaller SD card just for some other reason, maybe yes, but 1 gigabyte SD cards nowadays are in fact becoming more expensive than larger ones.
Starting point is 00:24:49 So at this point in time, the cheapest SD cards I buy on eBay is 100 packs of 2 gig SD cards for like 40 cents each. So trimming it down to less than 2 gigs made no sense. Part of my brain still has 100 megabyte SD card is probably the largest you're going to get. And I know that that part of my brain lives in like 2004, but it's just sometimes I forget that what used to be rare now is common. That's because we're venerable. That's the word. Venerable, that's the word.
Starting point is 00:25:24 I know podcast listeners are probably tired of me saying things from the past, but I remember how much it cost to get a 15 megabyte hard drive in my computer. Yeah, I'm with you there. But nowadays, go on eBay, search for 2 GB SD card. There are people selling like 100 packs and they'll come to you in a Ziploc baggie. Buy a couple of those and you're set for projects for life.
Starting point is 00:25:50 You never even reuse one. You just stick a card to a project and that's it. That's a really great idea actually. Because when I need SD cards, I usually go to some larger vendor and buy, well, the smallest one I get is 32 gigs. Well, I need 500 megs for this project, but that's the smallest I can get,
Starting point is 00:26:08 so I guess I'll pay the five or 10 bucks for that. But that's a much better idea to just stock up with the two gigs and forget about it. Yeah, I'll just buy them in 100 packs. Okay, this is back from 2004. The larger SD cards and their SPI interface didn't used to work well, but it sounds like it's all fine now, that in the intervening 20 years, they became okay.
Starting point is 00:26:32 The larger two gigabyte SD cards. No. So maybe what you're thinking of is when we switched from SD to SDHC, a lot of code that supported SD didn't work. But yeah, because the protocol changed. I had some fun time with that back in Palm OS days. But yeah, nowadays they're all HC cards and it's all nice and easy.
Starting point is 00:26:53 SPI protocol is still mandated in the spec, so they all do support it. Though there are bugs, there are all sorts of exciting bugs because there are corners of the spec that no one reads quite accurately. Yeah. I mean, there always have been. It's just which ones there are right now. And which SD cards.
Starting point is 00:27:12 Do you end up being reliant on one or three brands of SD cards or calling out some that don't work? I just buy whichever ones are on eBay. And then I have an SD card driver I've been slowly improving since about 2004. It's the one that I initially wrote for Palm OS. So over time, I just improve it. So when I find a card that does something weird, I will add support for that weirdness. And I actually have a stack of SD cards I test with, the more weird ones over the years,
Starting point is 00:27:44 to make sure I didn't break any of them. And that's it. So my driver over time slowly evolves to support all sorts of stuff. Okay. My latest one was a fake SanDisk-branded card that claims to support command 23. It's optional. You're not required to, but there's a way to query if the card supports it. It claims to support it, but if you send it, it rejects it as invalid. Why? I don't know. It's not a marketing thing. No one cares if you support that command, but claiming to support it and then rejecting it
Starting point is 00:28:13 is an interesting thing I've never seen before. Yes. Okay. I know you didn't want to talk about your day job and I'm not going to ask direct questions, but does this work at all reflect your day job and I'm not going to ask direct questions. But does this work at all reflect your day job? Does it contribute to your day job? Is this purely separate? Well, I mean, I suppose this keeps me sharp so I can do clever things in my day job occasionally,
Starting point is 00:28:39 but no one's going to pay me to tinker with microcontrollers all day, I'm afraid. To be able to minimize Linux in this way, I think you'd be surprised. Some of my clients have wanted it, and I am not the person for it. Oftentimes, if someone asks you that question, I think this is called an XY problem, is what the name for this is. When someone asks you a question where an answer may exist, but really you got to wonder why they're asking the question because they probably already headed the wrong way. So if someone tells you, how do I fit a whale into an apartment building? There may be an
Starting point is 00:29:13 answer, but first you got to ask them why they're asking the question. So if someone asks you to minimize Linux into a tiny space, almost always the first question you should be asking them is what are you actually trying to accomplish? Because you're probably going around it the wrong way. Yes, that is exactly what I usually do. And that's our whole job, pretty much, is to ask that question. Yep. Yes, and all of the horrible questions after, like, do you want the whale to come out of the apartment building later?
Starting point is 00:29:43 Do you want to be able to replace the whale? The client usually responds with what color they want the whale to be. Yes. Sorry, Dmitry, please go on. Oh no, I was going to say that usually when they ask you for that, what you'll find is they're going to try to run a five line Python script
Starting point is 00:29:59 they have lying around. Yes. Yes, indeed. I've had some fun experience with that because a really long time ago, I wrote the world's tiniest JVM that fit on microcontrollers. And I've had people actually license it from me for all sorts of very strange reasons. This is the Java virtual machine. Just before we go on. Okay, go ahead. Yeah, I wanted to see how Java works, so I figured the best way is to write a JVM.
Starting point is 00:30:23 So I did. And I wrote one that fits in microcontrollers in like a few hundred bytes of RAM. And I've had some companies license it from me, including one that makes agricultural products. Apparently, there's weather stations for farmers, and they wanted a way for them to program arbitrarily complex alerts to be alerted about. And the way they decided to solve this was by letting them upload a compiled Java class file to this weather station's microcontroller. And they licensed my JVM from me to do that. But you know, you probably don't need a JVM.
Starting point is 00:30:55 You help them with the whale. The wind is greater than five miles an hour and it's raining. Yeah. And yet it was the way they wanted to do it. Yeah. Okay. So we mentioned, did we mention 8-bit AVRs in Linux? We did, didn't we? Briefly.
Starting point is 00:31:11 And the boot time there was six hours. Well that was not speed optimized actually. That emulator was actually written in C, so it could have been optimized quite a ways in assembly. Okay. At the time it was just a yes or no thing. Could I make it work or not? Once I made it work, I moved on with my life.
Starting point is 00:31:29 This time I actually care about performance, so the emulator's written in assembly. But that's not the smallest or most venerable processor you have put Linux on? No, that would be the Intel 4004. Okay. A four-bit microprocessor from 1971. Yep, that's the one. Finally, something that predates me.
Starting point is 00:31:57 And this doesn't have eight megabits of RAM and one MIPS CPU? Well, it does. It still uses the same RAM chip, the 8 megabyte PSRAM. I'm just talking to it over a bit-banged SPI that I'm bit-banging out of the 4004 at a great speed of 7 kilohertz. 7 kilohertz, kilohertz! Okay. It's the best I could do. The 4004 is not very fast. The fun part there is that the RAM chip, it's PSRAM, so it needs to self-refresh internally.
Starting point is 00:32:30 So they specify how long you're allowed to keep it selected over SPI so you don't break refresh. And it said eight milliseconds, so I spent a lot of time optimizing my 4004 code to fit into that eight millisecond boundary. And after I got it working, I realized that I misread the data sheet, it was actually eight microseconds. Oh, no. Oh, at that point you're-
Starting point is 00:32:49 But it turns out it still worked. It still worked? Yes. The reason the limitation is there is to give it enough time to refresh, because it's designed that you select it, read or write, deselect it, select it again, etc. It's not designed to just sit
Starting point is 00:33:02 there deselected for a long time. So it expects to do its own internal refresh in small chunks. But I select it so rarely because I'm running so slowly, that it has enough time in between these 8 millisecond selections to refresh itself plenty. So it turns out to work fine. Ah, so 120 Hz, 8 milliseconds? Did I do that right? Yeah. So what was running at 120 hertz? Well, I also did a MIPS emulator on the 4004 because once again MIPS is the easiest thing to emulate. And that emulated CPU runs at an effective speed of 74 hertz. So it's still booting now.
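A rough timing sketch for the bit-banged SPI numbers in this exchange. It assumes the 7 kilohertz figure is the effective SPI bit rate, and that a one-byte read is a command byte, a 24-bit address, and one data byte; that layout is common for SPI PSRAM but is not stated in the conversation. `LoopbackDevice` and `spi_transfer_byte` are hypothetical names, and the loopback device only stands in for the RAM chip so the transfer loop has something to talk to.

```python
# Sketch only: bit-banged SPI (mode 0, MSB first) against a simulated device,
# plus the timing implied by a 7 kHz bit rate. All names are hypothetical.

SPI_BIT_RATE_HZ = 7_000   # "7 kilohertz" from the conversation (assumed to be the bit rate)
LIMIT_MISREAD_S = 8e-3    # chip-select limit as first read: 8 milliseconds
LIMIT_ACTUAL_S = 8e-6     # chip-select limit as corrected: 8 microseconds


class LoopbackDevice:
    """Stand-in for the RAM chip: returns every bit it is sent and counts clocks."""

    def __init__(self):
        self.clock_pulses = 0

    def clock_bit(self, mosi_bit):
        self.clock_pulses += 1
        return mosi_bit


def spi_transfer_byte(device, value):
    """Shift one byte out MSB first while shifting one byte in."""
    received = 0
    for shift in range(7, -1, -1):
        out_bit = (value >> shift) & 1
        received = (received << 1) | device.clock_bit(out_bit)
    return received


# Assumed transaction: command byte, three address bytes, one data byte.
device = LoopbackDevice()
echoed = [spi_transfer_byte(device, b) for b in (0x03, 0x00, 0x12, 0x34, 0xA5)]

bit_time_s = 1 / SPI_BIT_RATE_HZ
select_time_s = device.clock_pulses * bit_time_s
max_select_rate_hz = 1 / LIMIT_MISREAD_S

print(f"one bit: {bit_time_s * 1e6:.0f} us")
print(f"{device.clock_pulses}-bit transaction: {select_time_s * 1e3:.2f} ms selected")
print(f"fits the misread 8 ms limit: {select_time_s <= LIMIT_MISREAD_S}")
print(f"times over the real 8 us limit: {select_time_s / LIMIT_ACTUAL_S:.0f}")
print(f"8 ms selections back to back: {max_select_rate_hz:.0f} Hz")
```

Under these assumptions a 40-bit transaction keeps the chip selected for a little under 6 ms, which fits the misread 8 ms limit but is several hundred times longer than 8 microseconds. Back-to-back 8 ms selections also work out to 125 Hz rather than 120.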
Starting point is 00:33:41 No, it takes about five and a half days. I love it. Where did you get an Intel 4004? eBay. I had to beat out some collectors who buy them just because it's the first microprocessor ever made. Most of them just put it on a table and never look at it again, but unfortunately there are a lot of them, so the prices tend to climb up.
Starting point is 00:34:07 The plastic ones are 250 bucks. If you want one of the pretty ceramic ones, you'll be paying thousands. TubeTime, Eric, I think you're familiar with him. Yep. He has the 8008 board. Yep. Have you run Linux on his exploded... It's a 6502. Oh, is it?
Starting point is 00:34:31 Yeah. He made a 6502 out of transistors, yeah. Right. And so, did you... Have you run Linux on his transistor board? That's way too slow. Well, it wasn't when I thought it was an 8008.
Starting point is 00:34:45 No, a similar thing is on my to-do list. So the reason I did the 4004 thing is after I ran Linux on AVR in 2012, that was basically the record for lowest-end machine to run Linux for a very long time. And then in 2023, two people actually beat that record. One person used a smaller AVR, and another one did it on a Commodore 64, which has basically a 6502 processor on it. So I needed to regain my record,
Starting point is 00:35:12 I needed something lower end than a 6502. So I went with a 4004. So that's why I explored 6502. Oh no, it's not setting the floor. I already have plans for the next ones. My plan is to try to beat this record once every 10 years to give people a chance to beat me as well since this time someone did.
Starting point is 00:35:32 So my next few steps: the next one will be a one-bit industrial controller Motorola did, it's an MC14500B I think. After that, I'll try to make a board with just transistors, and after that, I'm hoping I'll have time to put together a board with only vacuum tubes. Running on ENIAC. I'll have to put together my own CPU there, since there's not really a mass-market vacuum-tube-based CPU I can go buy.
Starting point is 00:35:59 But the idea is to keep moving this record for lowest end machine to run Linux down further and further, just for fun. Obviously, we're well below the speed where it's practical. On the 4004, you type ls and you come back 20 hours later to see the directory listing. It seems like you should be collaborating with museums at some point. Well, at some point in time, it'll become a museum. I think about Eric's board and how you can see it light up as it works.
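A back-of-the-envelope sketch of what the quoted figures imply. It assumes "74 hertz" means roughly 74 emulated MIPS instructions per second at a steady rate, and it takes "five and a half days" and "20 hours" at face value; both are rounded figures from the conversation, so the results are only order-of-magnitude estimates. The function name is hypothetical.

```python
# Sketch only: instruction counts implied by figures quoted in the conversation.
# Assumption: "74 hertz" means about 74 emulated instructions per second.

EMULATED_HZ = 74
SECONDS_PER_HOUR = 3_600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR


def emulated_instructions(seconds, rate_hz=EMULATED_HZ):
    """Instructions executed in `seconds` at a steady `rate_hz`."""
    return round(seconds * rate_hz)


boot_instructions = emulated_instructions(5.5 * SECONDS_PER_DAY)  # "five and a half days"
ls_instructions = emulated_instructions(20 * SECONDS_PER_HOUR)    # "20 hours" for ls

print(f"boot: about {boot_instructions / 1e6:.1f} million emulated instructions")
print(f"ls:   about {ls_instructions / 1e6:.1f} million emulated instructions")
print(f"ls is about {ls_instructions / boot_instructions:.0%} of a boot")
```

On these assumptions the boot is a few tens of millions of emulated instructions, and a single ls costs a noticeable fraction of that.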
Starting point is 00:36:24 That would make the five days worth watching. Yeah, it would be really interesting. So once I get to a transistor-based board, it'll be something similar. But even now on my projects, I literally output the current instruction counter, the PC, to LEDs. So you can literally see Linux boot. When it's going so slow, you can actually notice it. You can look at the patterns of lights and know which part of the kernel it's in. While watching it, I learned to recognize the common things like the memory zeroing function.
Starting point is 00:36:52 I know what that looks like in LEDs now. Is that on the 8-pin Linux? No, that's on the 4004-based one. The 8-pin one is too fast. The whole point was that one had to be interactive. If you could see how fast the instruction counter is moving, it would not be interactive. Have people come to you and ask you to build the
Starting point is 00:37:14 4004 one for them? I've built two more, one for a friend and one for a paying customer who wanted one. I tried to set it up as a kit that people could actually order, but the companies that make kits pushed back on it
Starting point is 00:37:32 with very legitimate concerns that it would be very strange to sell half a kit because you could get the board and the modern components, but no one's gonna buy you the original components. You have to go on Scow or eBay for those. And that would be a very strange kind of kit. And at the same time, we couldn't really go and source them, A, because it's a huge capital expenditure and it's hard to predict the sales, but also there is no guaranteed supply. Some days eBay has supplies of 4289s and some days it doesn't. So basically now it's just, here's the design files, go buy
Starting point is 00:38:06 the parts and put it together. I know of at least a few people currently building one of these boards. You mentioned earlier that you limit yourself to two projects until they're written up, and offline you mentioned that you didn't really enjoy writing. Why do you bother? I'm not discouraging you, you do a great job, and I'm really glad you do it. The number of people who have reached out to me over the years and told me that reading about my projects encouraged them to get into embedded
Starting point is 00:38:36 or encouraged them to get better makes it worth it. There is a shortage of people doing low-level software. You can get a whole CS degree nowadays without ever touching assembly or even C, which is terrifying because all these people who think the world runs on JavaScript don't realize someone has to write their JavaScript VM. And getting people into embedded is sort of one of my hobbies and this is how you do it. You do it by trying. It's like riding a bicycle. You can read all the books on riding bicycles you want. Until you get a bicycle and skin your knee
Starting point is 00:39:10 a couple of times, you're not actually going to learn anything. And it's the same here. And you learn by doing. And you don't just wake up one day and know how to do this. So having a project you can follow can be helpful. There are people who started from my projects and modified them. So my original, for example, Linux on AVR, I know a guy who actually took that, attached a display and an I2C real-time clock to it and made a nightstand clock for himself that just happens to run emulated Linux.
Starting point is 00:39:40 And he wrote a Python app that runs on there that updates the time on this display, which was really, really cool. So I try to make sure that my write-ups are clear for people and that they can be followed. So that plays into part selection because I hate when I read someone's project and says, oh, I use this part that my grandpa left me in his will. Okay, great. Where do I get one? Get a grandpa.
Starting point is 00:40:04 So I try to make sure that my directions are reproducible and the parts I use are something you could at least have a hope of sourcing. And I limit myself to two projects because I hate the writing part. It's not fun. Going there with soldering iron is fun. Seeing things work for the first time is fun. Writing isn't. But the way I force myself to do it is I'm allowing myself two projects in progress at a time. And a project is in progress either until I've given up on it and destroyed it to make sure that I can't try to lie and say it's not in progress but come back to it. Or until it's been completed and an article about it has been published. So that's basically how I force myself to write because if I have two projects that need to be
Starting point is 00:40:46 written up I can't start another one until I've written one up. Do you tend to write as you go or do you kind of do a retrospective after you finish? No. I take very messy notes as I go so I can later figure out what I did and when, but like I said, I hate writing, I wouldn't want to spoil the final project with it. Usually it'll take me, depending on the project, a few months to go from start to finish and then probably about a week or two of writing. First, I'll take my notes and try to make an outline. Then I'll go and write a very haphazard and typo-ridden article.
Starting point is 00:41:20 And then I'll go and spell check it a few times and read through a few times and organize it and then add a table of contents. I really hate that part. I tend to do it in times when I have nothing else to do, like if I'm on a plane or if I just have a headache and I can't think right. You write complex and interesting technical articles when you have a headache and can't think right. Because when I can't think right, I'd rather not go and think about something challenging. One thing, and this was totally me, I think: I couldn't really find your code.
Starting point is 00:41:54 I couldn't find you on GitHub or anything. That's on purpose. If I wanted to rebuild this, I saw the list of parts, but how do I get the rest of it? Oh, every one of my articles has the code down there. So if you look at, for example, 8-pin Linux, there's a section that says downloads and use. There's a subsection called the files and it says the main download is here. There's a link there that has all the design files,
Starting point is 00:42:19 the source code, everything. Yes. I am purposefully not on GitHub. A lot of my code is uploaded by other people, but I'm not on there on purpose. I don't really want to contribute to training some models with my work. If they want to train on it, they can at least go crawl my website. And then unzip the file. That's right. Yep. They're not a human. It's a poor human test. I've been working on implementing a better anti-ML model thing
Starting point is 00:42:48 for my website, but I haven't finished it yet. I wanted to ask, we talked a lot about this project, and I'm about to make a horrible pun, so warning upfront. Preparation for pun. What was the kernel of this idea? Like, did this just come to you? Like, I want to attempt this.
Starting point is 00:43:07 Were you thinking about it tangentially for another reason, or was it suggested to you, or? Well, so the 8-pin Linux one, the physically small one, that one was mainly because I was explaining to a neighbor's kid that back in the day you could just go and assemble a computer from a kit. And I got this look back from him, like I was telling him that we used to go strolling around the main street with Martians. So I decided to try to see if I could recapture some of that in a way that could be appreciated today. Because if I give you a computer kit from the 80s,
Starting point is 00:43:45 it doesn't run anything modern today. So it's not actually believably a computer anymore. Definition has shifted. So I wanted something that can be considered a computer today for some definition of that. And for lower end things like the 4004, there, I just thought it was cool to have this record. Now that someone actually took the time to beat it, I wanted to provide a return challenge.
Starting point is 00:44:11 Okay, here, here's the next thing to beat. And it's funny, I see why you came to it from that description, but I would have given them a Raspberry Pi, not a Pico, but that's a computer, that's more of a computer than I had through college. Sure, absolutely, but you're not going to assemble a Raspberry Pi. No. The whole point was I was explaining that there used to be a time when a computer you would buy and a kit you would buy to assemble would produce approximately the same computing power, running the same exact software more or less. So my goal was to run modern-ish software on something you could actually hand assemble. In theory with a reflow plate and all the right
Starting point is 00:44:56 parts you could maybe assemble a Raspberry Pi with some effort, but certainly not if it's your second time soldering. It'd be more likely to do a BeagleBoard, but yeah. Yeah, yeah, but I mean the original Apples came in a kit version. I think the Apple I, exactly. Yeah, Heathkit had a ton of computers, a lot of Z80-based stuff. So yeah. But the actual limitation of eight pins was my own, just to keep it fun for me. Yeah, that's because realistically Allwinner makes a chip that has pins, so it's not BGA, it's not even QFN. They have a real QFP with actual solderable pins,
Starting point is 00:45:33 a V3s I think, that will run proper real Linux. It's an actual ARM core, a modern one, with built-in RAM in the chip. So you can literally plop this thing on a board, sprinkle maybe a dozen parts around it, and you'll have real Linux running on that board. They're very cheap too. So the limitation of 8 pins was just my own,
Starting point is 00:45:53 because I decided to lower the requirement from hand solderable to hand solderable as your first soldering project. So when you embark on the transistor-only one, let's not talk about the vacuum tubes for a second. Are you going to invent your own architecture, or? Yes. The idea is to minimize the number of transistors. So I have to create an architecture that is just designed to run this emulator.
Starting point is 00:46:21 Luckily, that does make a lot of things simpler. There are some things you just don't need if you're only going to run an emulator. Don't need memory protection because you can emulate that anyways, and you don't need serial ports because you can use SPI for everything, because life is much easier. So yes, the idea then is to write my architecture. Probably an eight or four bit one at that point in time. Working with less than eight or four bits would be tedious. Yeah. And more bits would require a lot more transistors. The idea here is I'll be writing in assembly anyway. So the fact that it's a new architecture is in no
Starting point is 00:46:56 way a difficulty. If I was going to do this, and then you can tell me how wrong I am because this is not the sort of thing I think about. But if I was going to do this, I would go to the Nand to Tetris book from Shimon Schocken and look at how he has done the transistors to his small assembly. Yeah. Because I don't know all of the different assemblies very well. I really only know ARM and some TI. But...
Starting point is 00:47:31 That's a perfectly good approach. And then I think because he actually builds this up the opposite direction you're going, that would be a good way to figure out the possibilities of it. Yeah, that's a perfectly good approach. By the way, if you like NAND to Tetris, in case you haven't seen it, there's a nandgame.com, which is pretty cool, it's by a different person, but it's basically you start with putting together
Starting point is 00:47:59 a NAND gate out of MOS, and then you build other components, and this website will guide you through a set of projects all the way to a working maze-solving robot. So if you haven't seen it, really recommend it. I'm not connected to it in any way, but I recommend it to everyone. It is really, really, really cool. We've talked about some similar games, but that one's on the web, and it is really fun, especially for folks who are like,
Starting point is 00:48:24 okay, but what's under the processor? And it starts out with truth tables, which is kind of painful, but it builds up pretty quickly to, okay, this is how you actually put it all together. And then suddenly you have AND gates and OR gates, and then suddenly you're interpreting things to be much more complicated. And suddenly you have a robot that solves a maze. I'll make sure that's in the show notes. Okay.
Starting point is 00:48:51 We did not talk about processors. I mean, we talked about PIC16s and their C compiler. Their joy of a C compiler. Their anti-joy of a C compiler, I mean. It's not even theirs. It's a third party company. And you ended up looking at some Cortex-M0s and M0 pluses, which was what my brain went to when you first mentioned this.
Starting point is 00:49:17 And then I was a bit surprised about the 8-pin chips, because I had kind of forgotten that anyone made 8-pin chips. I think I knew it one time because I've had some really tiny projects, but they've gotten, I mean now you can get 16 pins in a tiny, tiny package, but your goal to solder it. Let me blow your mind. There are actually six pin Cortex-M0s you can buy. That's... All right, so we got power and ground and... Serial.
Starting point is 00:49:51 Serial. And SPI. Well, four pins for whatever you want, whether it be serial or SPI or whatever else. Yeah, and USART. But you do have two SPI things, so you do have to have a chip select. I managed without it in this project.
Starting point is 00:50:09 Did you? How? Well, so six pins after you take away power and ground. Two for serial port leaves you with four. For RAM you need SPI. So master out, master in, clock and chip select. That's your four. Now you have no pins left for SD card.
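The pin count just described can be tallied in a few lines. This is only a sketch of the arithmetic in the conversation for an 8-pin part; the single extra chip select for an SD card sharing the same SPI bus is an assumption about the conventional wiring, and the variable names are made up for illustration.

```python
# Sketch only: pin budget for an 8-pin microcontroller, following the count in the conversation.

TOTAL_PINS = 8

requirements = [
    ("power and ground", 2),
    ("serial port", 2),
    ("SPI RAM (master out, master in, clock, chip select)", 4),
]

remaining = TOTAL_PINS
for name, pins in requirements:
    remaining -= pins
    print(f"{name:55s} uses {pins}, leaves {remaining}")

# Assumption: an SD card sharing the same SPI bus would still want its own chip select.
SD_CARD_EXTRA_PINS = 1
shortfall = SD_CARD_EXTRA_PINS - remaining
print(f"pins left for the SD card: {remaining}, short by {shortfall}")
```

With nothing left over, the conventional approach of one chip select per SPI device does not fit, which is the constraint the rest of the exchange works around.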
Starting point is 00:50:25 Oh, but if you put an inverter in there, you're either selecting SPI or SD card. That was one of the things I considered. The problem is I found some cards really didn't like being selected for a very long time. But also inverters would be an extra chip on the board. I considered it, but I found a solution without it. So SD cards, besides SPI, also support their native SD protocol.
Starting point is 00:50:48 It's not as well documented in the open specs, but it's not hard to figure out. It's not hard to figure out? I have tried to figure it out, but go ahead. Well, you can go grab my code now. Yeah. But the idea is that protocol doesn't use a chip select, it uses three wires. But instead of SPI having all unidirectional wires, this uses bidirectional wires. But with that protocol, you can combine it with SPI on the same exact pins. Basically, when you're accessing the memory, the SD card just sees a clock and the command pin is in idle state, which is allowed by the spec. And when you're accessing the SD card, memory sees itself being selected and deselected
Starting point is 00:51:28 repeatedly with no command byte coming in, which is also allowed. Oh, tricky. So I run them on the same exact wires at the same exact time. Tricky, tricky, tricky, okay. Okay, this goes along with overclocking in that it would make me uncomfortable
Starting point is 00:51:43 to ship a product with this. Sure. Well, it's not a product. No, no. It isn't a product, but also by default, it doesn't overclock very much. But I allow people to do it because it's faster that way. It doesn't overclock very much. The chip goes 64 megahertz and you've overclocked it to 150 megahertz. Well, by default, I think I'd run it 130, but yes.
Starting point is 00:52:06 So mainly to make things go faster when it comes to processors, it's a question of giving it a higher supply voltage and making sure it doesn't melt itself. So the core in this particular STM chip runs at around one volt. There's a built-in regulator. And in order to allow power savings,
Starting point is 00:52:25 ST gives the regulator two settings. In one, it supplies just under one volt, and the chip is allowed to run, I think, up to 32 megahertz. Let me go see if I can find it for you. Yes, up to 16. And then they have a second setting, which supplies 1.2, where the chip can run up to the stock 64. And if you try, it'll run at
Starting point is 00:52:45 about 80 or so. But some of the older chips with very similar power regulator registers also had another setting which supplied 1.35 volts. So I tried that setting on this chip figuring ST was probably too lazy to design a whole new regulator. They probably plopped the same one in. And I was right. So with that setting, the chip will run a lot faster. So a few chips I have here, because obviously it's very individual, can get up to about 180. But every single one ran at 136, which is obviously
Starting point is 00:53:19 quite a bit faster than 64. So I allow people who put this project together to run as fast as they want and see how fast they can get before it gets unstable. If you have an external supply of vCore, you could go even higher, but obviously in this chip not enough pins for that. But that's how people overclock like RP2040s and such. Supply vCore externally can go even faster. That's some external cooling and yeah. Yeah. I'm really uncomfortable with overclocking.
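A small sketch of what the overclock described above buys. The clock speeds and voltages are the ones quoted in the conversation; the power estimate uses the textbook first-order rule that dynamic power scales with frequency times voltage squared, which is an assumption added here and not a figure from the conversation. It ignores leakage and everything outside the core. Function and variable names are hypothetical.

```python
# Sketch only: speed-up and a first-order dynamic-power estimate for the overclock described above.
# The P ~ f * V**2 scaling is a textbook approximation, not a measurement.

STOCK_MHZ, STOCK_VOLTS = 64, 1.2   # stock setting as described
OVERCLOCK_VOLTS = 1.35             # extra regulator setting as described
OVERCLOCK_MHZ = {"every chip tried": 136, "best chips": 180}


def speedup(mhz):
    """Clock ratio relative to the stock 64 MHz."""
    return mhz / STOCK_MHZ


def relative_dynamic_power(mhz, volts):
    """Dynamic power relative to stock, assuming P is proportional to f * V**2."""
    return (mhz / STOCK_MHZ) * (volts / STOCK_VOLTS) ** 2


for label, mhz in OVERCLOCK_MHZ.items():
    print(f"{label}: {mhz} MHz, {speedup(mhz):.2f}x speed, "
          f"~{relative_dynamic_power(mhz, OVERCLOCK_VOLTS):.1f}x dynamic power")
```

On this rough model the higher voltage makes power grow faster than clock speed, which is why heat is the thing to watch when pushing the frequency.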
Starting point is 00:53:48 I know you've done it with your computers. I mean, a lot of computers, a lot of Intel will sell you a chip that is unlocked specifically so you can overclock it. But it's, I guess I'm just so, engineering design spec mustn't, mustn't go over. Well, remember the... Yes, I'm certainly not claiming that this will run across the automotive temperature range and will age properly across a decade. Those are just not design guidelines here. I certainly wouldn't do this if I was shipping a product that's going to run someone's airplane or something. But this is a hobby project, so faster silicon aging due to higher voltages isn't a problem. And performance in sub-arctic winter outside is also not a problem, because I suspect no one's going to try to use this outside in Alaska in February.
Starting point is 00:54:34 I don't know, those Antarctica scientists do get really bored. They can do it indoors. Exactly. And the chip manufacturers, they're trying to hit a spec where they want the chip to survive even if you put it on a board and don't do any cooling or anything. Yeah, no, I know. They have to over-engineer it and then people who want to go outside that span can. It's not how my brain works.
Starting point is 00:55:05 It's not the way you ship a product. But for a hobby project where you only need to make like two of these and maybe a couple dozen people will follow your footsteps, it's fine. Plus the performance benefits. Think about this. Going from 64 megahertz to 160 is not something you'd really want to leave on the table. Not for this. No. Not when you're talking about a boot time.
Starting point is 00:55:26 Yeah. And you're right. It's a hobby project. It's for fun. Let's make it fun. And overclocking, I guess, because it's a do not do, is kind of fun, just on principle. Just so you know, your computer's overclocked.
Starting point is 00:55:43 Things he doesn't tell me. Because the BIOS has a thing where you just push a button and it determines the appropriate clock setting. I thought the BIOS was just so I could change its little LEDs. Yeah. Sorry, that was not true. I'm sorry.
Starting point is 00:55:59 I just wanted to make sure that you knew you were lying to people, that you don't overclock things. I have overclocked on your behalf. Thank you. So you mentioned that you actually chose an ST part, STM32, a G031, which is the M0 plus, and you had looked at a TI part and a PSoC part as well. But this will work on almost any M0 at this point. There's not a lot of difference between them. No, so because I'm using basically no hardware peripherals
Starting point is 00:56:38 at all, it'll work on anything. I begrudgingly was forced to choose the TI part just because of the speed. The STM part? Or the TI? That's right. That's what I meant, yes. Yeah, okay.
Starting point is 00:56:49 Just because it had the highest clock speed? Yes. And you begrudgingly chose this because you don't, their errata is not as up to date as you'd like. I am a huge foe of STMicro. I try to convince everybody who I know is using them not to. I even offer my time for free to companies who are considering switching away from STMicro to anything else to help them do so. Really? We'd like to thank our sponsor, STMicroSense.
Starting point is 00:57:19 No, they're not our sponsor. I've had beef with them from a number of years ago on a project. They released a part, STM32H7, which was supposed to support external PSRAM, memory-mapped in read-write mode. Oh, right. The problem is it was broken. If you enable caching, it would sometimes hang if you ran code from it.
Starting point is 00:57:41 If you disable caching, some writes would sometimes get lost or write neighboring bytes. I tried to show them and even with the demo, they basically denied the issue was there. Then I found a workaround for the issue and I was able to complete my project. And then a few months later, they came back and said, hey, we have a customer who is experiencing similar issues. Can you tell us the workaround? You've got to be kidding me. As soon as you guys admit that the chip is broken, and that's basically as far as that went. So the chip remains broken.
Starting point is 00:58:12 I still haven't told them what the workaround is. But basically, if people are out there knowing that their chips are broken and not documenting it, I'm not okay with that. So I've made it my personal mission to try to make sure as few people as possible use STMicro parts. So being forced to use it was not my personal favorite, but it was the best part in this particular case.
Starting point is 00:58:36 And because I'm using no IP on there at all, except the actual core, there was very little chance of hitting an errata because they don't design the ARM core. ARM does. I'm trying to think which processors I've used that were fresh that didn't have some errata that they then denied and later came back for. I recommend RP2350 to anybody doing anything nowadays that needs an ARM core or even a RISC-V core. I am a huge fan. The documentation is great. The support forum has actual people from Raspberry Pi answering questions, and these things are really, really fast. The hardware abstraction layer is beautiful.
Starting point is 00:59:16 That I wouldn't know, I'm afraid. My favorite thing is when the chip vendor says, oh, wow, you found a new problem. Thank you for telling us when they've already admitted it to somebody else separately. Yeah, it's like we don't talk to each other. Yeah. Yeah.
Starting point is 00:59:32 That was what I was thinking of. OK, well, I think we are about out of time, but I wanted to know if you wanted to say anything about DEF CON. No, I think that's already played out. OK. They've said everything they want to say, I've said everything I want to say, everyone has already made up their mind, I think that's over.
Starting point is 00:59:53 Do you think you will be attending in the future? You know, I never attended DEF CON before this event. I'm not usually big on conferences. I went just to make my appearance. I don't think I have any reason to do that again. Okay. Dmitry, do you have any thoughts you'd like to leave us with? More people should get into embedded.
Starting point is 01:00:16 It's fun and I think that it's going to be a very good career going forward. Everyone who's worried about LLMs taking over their jobs should go ahead and try to get one of these coding LLMs to spit out some useful embedded C code for you. It's like having a very enthusiastic three-year-old help you in your wood shop. You're mostly just very, very scared of the results. I don't think that's going to change.
Starting point is 01:00:41 It's fun and it'll be a good job. Anybody who wants to get into embedded, please do. Please encourage everyone else you know as well. We need more people who understand how things actually work. Fully endorse. Our guest has been Dmitry Grinberg, hardware hacker. You can find his website at https://dmitry.gr. And I will have links in the show notes.
Starting point is 01:01:11 Thanks, Dmitry. That was great. Thank you. Thank you to Christopher for producing and co-hosting. Thank you to Dennis Jackson for the introduction. Thank you to Mouser for sponsoring the show. And thank you to our Patreon listeners Slack group for their continuing encouragement of our book discussion. Finally, thank you for listening.
Starting point is 01:01:34 You can always contact us at show@embedded.fm or hit the contact link on embedded.fm, where the show notes also live, as well as the transcripts. And now a quote to leave you with from Cory Doctorow. It's the stupid questions that have some of the most surprising and interesting answers. Most people never think to ask the stupid questions.
