Python Bytes - #441 It's Michaels All the Way Down

Episode Date: July 21, 2025

Topics covered in this episode:
* Distributed sqlite follow up: Turso and Litestream
* PEP 792 – Project status markers in the simple index
* Run coverage on tests
* docker2exe: Convert a Docker image to an executable
* Extras
* Joke

Watch on YouTube

About the show
Sponsored by Digital Ocean: pythonbytes.fm/digitalocean-gen-ai
Use code DO4BYTES and get $200 in free credit

Connect with the hosts
* Michael: @mkennedy@fosstodon.org / @mkennedy.codes (bsky)
* Brian: @brianokken@fosstodon.org / @brianokken.bsky.social
* Show: @pythonbytes@fosstodon.org / @pythonbytes.fm (bsky)

Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Monday at 10am PT. Older video versions available there too.

Finally, if you want an artisanal, hand-crafted digest of every week of the show notes in email form? Add your name and email to our friends of the show list, we'll never share it.

Michael #1: Distributed sqlite follow up: Turso and Litestream
* Michael Booth: Turso marries the familiarity and simplicity of SQLite with modern, scalable, and distributed features. Seems to me that Turso is to SQLite what MotherDuck is to DuckDB.
* Mike Fiedler: Continue to use the SQLite you love and care about (even the one inside the Python runtime) and launch a daemon that watches the db for changes and replicates changes to an S3-type object store.
* Deeper dive: Litestream: Revamped

Brian #2: PEP 792 – Project status markers in the simple index
* Currently 3 status markers for packages
  * Trove Classifier status
  * Indices can mark distributions and releases as yanked
  * PyPI projects - admins can quarantine a project, owners can archive a project
* Proposal is to have something that can have only one state
  * active
  * archived
  * quarantined
  * deprecated
* This has been Approved, but not Implemented yet.

Brian #3: Run coverage on tests
* Hugo van Kemenade
* And apparently, run Ruff with at least F811 turned on
* Helps with copy/paste/modify mistakes, but also subtler bugs like consumed generators being reused.
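The two test bugs from the "Run coverage on tests" item can be reproduced in a few lines. This is only a sketch: the function names, the docstrings, and the image sizes are made up for illustration and are not taken from Hugo's post.

```python
# Hypothetical names throughout; this is not the code from the blog post.

# Bug 1: copy/paste/modify without renaming. The second definition rebinds
# the name, so the first test body can never be collected or run.
def test_add():
    """first"""
    assert 1 + 1 == 2


def test_add():
    """second"""
    assert 2 + 2 == 4


surviving_doc = test_add.__doc__  # only the second definition is left


# Bug 2: a generator that is consumed before the assertions loop over it.
def total_width(images):
    """Add up the widths, consuming whatever iterable it is given."""
    return sum(width for width, _height in images)


sizes = [(10, 20), (30, 40)]

lazy_images = (size for size in sizes)  # generator comprehension
lazy_total = total_width(lazy_images)   # this exhausts the generator
lazy_checked = 0
for _size in lazy_images:               # the loop body never runs
    lazy_checked += 1

eager_images = [size for size in sizes]  # list: square brackets
eager_total = total_width(eager_images)
eager_checked = 0
for _size in eager_images:               # the loop body runs for every item
    eager_checked += 1

print(surviving_doc, lazy_total, lazy_checked, eager_total, eager_checked)
```

In a real test file, a coverage report would flag the first `test_add` body and the body of the loop over `lazy_images` as lines that never ran, which is the signal discussed in the episode.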
Michael #4: docker2exe: Convert a Docker image to an executable
* This tool can be used to convert a Docker image to an executable that you can send to your friends.
* Build with a simple command: $ docker2exe --name alpine --image alpine:3.9
* Requires docker on the client device
* Probably doesn’t map volumes/ports/etc, though could potentially be exposed in the dockerfile.

Extras
Brian:
* Back catalog of Test & Code is now on YouTube under @TestAndCodePodcast
  * So far 106 of 234 episodes are up. The rest are going up according to daily limits.
  * Ordering is rather chaotic, according to upload time, not release ordering.
* There will be a new episode this week: pytest-django with Adam Johnson

Joke: If programmers were doctors

Transcript
Starting point is 00:00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 441, recorded July 21st, 2025, and I am Brian Okken. Who am I? I am Michael Kennedy. Hello everyone. This episode is sponsored by DigitalOcean. Thank you, DigitalOcean. Hear about them later in the show. If you'd like to give us some topics for the show or just ask us questions or connect, the links to Mastodon and Bluesky are in the show notes and on our webpage, pythonbytes.fm. While you're there, you may as well go ahead and sign up for our newsletter, and we'll
Starting point is 00:00:36 send you links for the show afterwards. And I apologize because last week it was a little late because I've been sick, but I'm better now. And if you'd like to join the show, we do post it live on usually Tuesday mornings. Monday mornings. Sorry. Monday mornings at 10. It used to be Tuesday. But just go ahead and go over to pythonbytes.fm slash live and it'll tell you when the next one is. And then even if we change it, you can see there. Yeah, so let's kick it off. What do you got for us, Michael? Let's do some follow up for our first item here. So remember, last week I talked about RQLite, the lightweight user-friendly distributed relational database built on SQLite. Right? And that was pretty cool.
Starting point is 00:01:27 You could build fault-tolerant relational databases in clusters, even potentially geo-distributed, using SQLite. Well, as always, our wonderful listeners are like, that's cool, did you know about? Right? Yeah, I love that though. I do too, yeah, thank you. So two people wrote in and said, did you know about?
Starting point is 00:01:50 And one was Michael Booth. And Michael Booth said, you should check out Turso, the next evolution of SQLite. So Turso is interesting in that it is kind of like SQLite, but rewritten in Rust with the idea that it can be a distributed database type of thing. Also adding some of these AI features, as in vector search and other things
Starting point is 00:02:17 that you need to be able to support working with AI. So pretty cool. If you look through the website, their proper turso.tech, it's cool, but I don't know, it kind of leaves it a little bit unsure what exactly it is. Like, what is its relation to SQLite, right? So if you go over to their GitHub, it's a little bit clearer, right? So it has SQLite compatibility.
Starting point is 00:02:45 It is not based on SQLite like RQLite is, but it has language bindings for Python and maybe some of these other languages people have heard of, like Java and JavaScript and so on. But it has async I/O support, which is really cool. And you might say, well, it's SQLite. Can it really have async I/O support? Yes.
Starting point is 00:03:05 At least on Linux, it uses io_uring to make that happen. But it runs on all the platforms. So you can develop on macOS, but then you get better perf on Linux, for example. Also, they're working on adding a BEGIN CONCURRENT operation, like a SQL statement or query statement, and indexing for vector search, probably pretty much required, and better ALTER support, right? So basically think of it as a more modern take, not just because it's
Starting point is 00:03:36 rewritten in Rust, which seems to be like the thing you do, but as part of that taking on a lot of ideas like concurrency and others. Big question, is it ready for production use? No. No. But these folks also worked on this thing called libSQL. So it says that the Turso database is the next evolution of SQLite in Rust. They had previously worked on something called libSQL, which is an attempt to evolve SQLite
Starting point is 00:04:02 in a similar direction, but through a fork rather than a rewrite. And they're like, we think this new database thing is a better way to go. So if what I talked about with RQLite inspires you, this also might. They also have some kind of what you might expect as open core, but we'll do the infrastructure side of things for you. So they've got this thing called Turso Cloud, which is a production or a paid-for thing, right? So serverless access or sync or whatever you want, right, replication and syncing, vector search, all that kind of stuff. Basically managed. All right, so that's thing one. The other one that comes to this is like, it's all Michaels all the way down.
Starting point is 00:04:43 So we had Michael Booth recommend that one, got me, Michael, presenting it. We got Mike Fiedler talking about the next one here. And Mike Fiedler says, have you heard, well, listened to the most recent Python Bytes and RQLite is cool, hadn't heard of that. Something related that might be interesting is litestream.io.
Starting point is 00:05:02 Now litestream.io is more SQLite than before. You just keep using SQLite, like the one built into Python, but you launch a little daemon that watches it and constantly syncs it with S3 or Azure Blob Storage or something like that. When you do that, what do you get? You get something pretty cool. It just constantly streams changes to AWS S3,
Starting point is 00:05:27 Azure Blob Storage, Google Cloud Storage, et cetera. And that means you basically get failover. So imagine you're like, I would love to just have a SQLite database that I can run on a simple little app. But what if that server goes down? What if that server fails? Then all my data's
Starting point is 00:05:46 lost from the last time I backed it up, because my alternative was to run some kind of cluster of Postgres or Mongo or whatever. But then you're into lots of DevOps and managing those things, or you're paying for managed databases, which can be relatively expensive compared to small servers, like DigitalOcean has. And what you can do is you can just keep on using regular SQLite but then use this Litestream, which has no real performance differences. I mean, these other two things are like trying to change how SQLite works or re-implement a compatible API. This one just says, we promise to hook into,
Starting point is 00:06:26 I put a link in the show notes. There's a nice write-up about this that they did. It's sort of like, why is this? Called Litestream: Revamped. And it also just talks about like, why did they build this, how does it work? So it says it takes over the WAL checkpointing process to continuously stream updates to the database. So basically as commits are committed to the SQLite file,
Starting point is 00:06:49 it just keeps pushing that to blob storage. Cool, right? So if that RQLite thing was interesting, but not quite what you're looking for, maybe one of these two things is. Yeah, interesting. Pennies a day, I like that. Dirt cheap.
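Since the write-up describes Litestream as working through SQLite's write-ahead log (WAL), here is a minimal sketch of what "keep using the SQLite built into Python" looks like with WAL mode turned on. It uses only the standard library `sqlite3` module and does not touch Litestream itself; the file name `app.db` and the `notes` table are made up for illustration.

```python
import os
import sqlite3
import tempfile


def open_wal_database(path):
    """Open a SQLite database and ask it to use write-ahead logging."""
    conn = sqlite3.connect(path)
    # The PRAGMA returns the journal mode that is actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    return conn, mode


with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "app.db")  # hypothetical application database
    conn, mode = open_wal_database(db_path)

    conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
    conn.execute("INSERT INTO notes (body) VALUES (?)", ("hello",))
    conn.commit()  # committed pages are appended to the WAL file first

    row_count = conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
    conn.close()

print(mode, row_count)
```

The application code stays ordinary `sqlite3`; a replication daemon like the one described in the episode would run as a separate process alongside it.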
Starting point is 00:07:03 Yeah, dirt cheap. Only costs pennies a day because it's just blob storage. So however you do that. And Litestream, I don't know that they, I don't see pricing, how about that? So this is just pure open source. This is pretty cool. Whereas Turso was like, you can do the open source thing
Starting point is 00:07:21 yourself or we can do our cloud thing, right? I don't know which of those makes you happier, but they're not the same in that regard. Or they are the same with a pay wrapper. Anyway, yeah. Okay, cool. Awesome, over to you. Uh, I want to talk about project status a little bit. So I do want to talk about this new PEP, PEP 792, project status markers in the simple index. So project status, what is this about?
Starting point is 00:07:48 So this is a new PEP that was, well, sorry about that. It is pretty new. So it actually updates in February of this year and it's now accepted. So I'm gonna talk about this a little bit, but I wanna talk about project status and what this means. So if I take a look at a couple of my silly projects, I've got pytest-check, which is a project that I maintain,
Starting point is 00:08:11 and I actually intend people to be able to use it. So it's maintained. And so I have added a Trove classifier to say the status of the project is Production/Stable. I think you can use it in production, but this is optional. Maintainers don't have to put this in. For instance, I also have another one called pytest-crayons that I just did for a talk that I gave. I didn't put the status in there, so that's an optional thing. So what we have, in the current situation, there are the Trove classifiers that you can optionally add to your project.
Starting point is 00:08:50 And if you like, for instance, start taking it down, you can say that it's inactive. But I don't think very many people do that. Often inactive just means somebody left, but it's still, they haven't updated a new one. That's like, please close the door on your way out. Usually they just walk out the door. So, and then there's also,
Starting point is 00:09:08 there's actually three kinds of status. So there's that, the Trove classifiers, but then there's also indices can mark distributions and releases as yanked. So you can, like a version you can yank, so it's not there anymore. So then it's still like got an entry there, but it's been yanked, whether it's true or false.
Starting point is 00:09:33 So, and then- I've started seeing that a lot lately, actually. Have you? At least uv pip warnings. So I'll say, update the dependencies of this project, basically uv pip compile, and then I'll install the requirements, and it'll say, this thing has been yanked
Starting point is 00:09:50 because of, and it'll have some reason, like broken metadata or whatever. I'm like, well, I don't know what I'm supposed to do about this, because something needed it. Yeah, okay. So I've not seen anything fail, like I haven't seen the apps no longer run, because something got that status,
Starting point is 00:10:04 but I've seen warnings about this now. And then there's another type of status that's the PyPI itself can have a status for an entire project. And the entire project can be quarantined if the administrators of PyPI think that the project has been, it's got malware in it or something.
Starting point is 00:10:25 So they can quarantine it. And also a project owner can archive a project and say, basically, you can disable new releases and it's still around, but nobody can update to it and it's just archived. It's there for historical purposes, I guess. Anyway, so there's three statuses and it's a little confusing on how to use those if you had an alternative index, so if you were trying to do dependency resolution or other things. So trying to clean that up a bit.
Starting point is 00:10:58 So this proposal is that a project always has exactly one status. And the status will be, it'll be active, and then there's some semantics around that. Active or archived or quarantined or deprecated. And this makes sense. So it's either in use, it's active, or it's not. But it's also, what does it not mean? It's either, you know,
Starting point is 00:11:26 it's various levels of why you can't use it or shouldn't use it. And I think this is a good way forward, to kind of consolidate these. Now this has been accepted, this active, archived, quarantined, deprecated status, and it can only be one of those. But it's not implemented yet. So don't go out and look in PyPI to try to find this yet. So this just was accepted, July 8th resolution.
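The "exactly one status" idea can be sketched as an enumeration. The four status names come from the episode; everything else here is an assumption for illustration: the helper names are hypothetical, treating a missing marker as active is an assumed default, and the "warn on anything not active" policy is made up rather than taken from the PEP or from any installer.

```python
from enum import Enum


class ProjectStatus(Enum):
    """The four mutually exclusive statuses named in the episode."""

    ACTIVE = "active"
    ARCHIVED = "archived"
    QUARANTINED = "quarantined"
    DEPRECATED = "deprecated"


def parse_status(marker=None):
    """Turn a status string into a ProjectStatus.

    Assumption for this sketch: a missing marker is treated as active.
    Unknown strings raise ValueError via the Enum lookup.
    """
    if marker is None:
        return ProjectStatus.ACTIVE
    return ProjectStatus(marker)


def should_warn(status):
    """Hypothetical client policy: warn about anything that is not active."""
    return status is not ProjectStatus.ACTIVE


for marker in (None, "active", "archived", "quarantined", "deprecated"):
    status = parse_status(marker)
    print(marker, status.name, should_warn(status))
```

Because a project holds a single enum member, a tool never has to reconcile a classifier, a yank flag, and an archive flag that disagree with each other, which is the consolidation discussed above.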
Starting point is 00:11:58 So we will update you further as we hear from people in the Python community that this is all implemented. Yeah, excellent. I think I'm behind it. The quarantine one is good. I'm very glad I've never seen that warning. And the archive one feels to me like a little bit of a supply chain safety type thing. Like I'm not going to mess with this. I will never update this. But let's not allow something to happen where someone else could either. Well, and I'm not sure, so I was trying to read this
Starting point is 00:12:28 and I was trying to understand kind of how these, how it's gonna be set, so it's probably not, it's not gonna be something just in the Trove classifier. So it probably is gonna be something outside of the data itself. So because there's times where it's clear there's nobody updating it anymore and you can't get ahold of the maintainer. You need to be able to say this one's dead.
Starting point is 00:12:52 I think it's too bad that we don't have a better way to say but after the fact if somebody like just stops maintaining something, hey, does anybody else want to start maintaining this? I know there's like security problems with around that, but you know, it might be good to be able to have some things live on. Yeah, and just archiving old stuff, it's not good. Was it GitHub, I think that got into a big uproar because they decided that they're going to archive
Starting point is 00:13:21 a bunch of stuff if it hadn't been updated in like years or was it NPM? I don't know, one of these these types of places was like if stuff doesn't get touched people are hey it's not it's not unmaintained it's just done there's nothing to add it's perfect. Exactly cool all right well before we move on let's talk about our sponsor huh? Yeah. Yeah, so super excited to have DigitalOcean back to support the show. As always, this episode is brought to you by DigitalOcean. And DigitalOcean is a comprehensive cloud infrastructure
Starting point is 00:13:56 that's simple to spin up, even for the most complex workloads, and it's way better value than most cloud providers. I've looked at the big three, and they're not even close in how much value you get per dollar and how much easier it is to use. So at DigitalOcean companies can save up to 30%
Starting point is 00:14:13 off their cloud bill. They boast a 99.99% uptime SLA. That means they promise to support that level of uptime. Our experience, Brian, running Python bytes and other things on Digital Ocean for many years was way higher than four nines. Really, really reliable, love them. They also have industry leading pricing on bandwidth,
Starting point is 00:14:35 also true like 10 times, eight times cheaper than AWS and Azure, really good. So it's built to be the cloud backbone of businesses, small and large, and now they have GPU powered virtual machines, plus storage, database, networking capabilities, all on one platform. So AI developers can confidently create apps that their users will love. And devs have access to a complete set of infrastructure tools. They need both training and inference so they can build anything they dream up.
Starting point is 00:15:03 DigitalOcean provides full service cloud infrastructure that's simple to use, reliable no matter the use case, and scalable for any business. So, should I say good value? $4 a month for a server. That's yours. Your Linux server, you SSH into it. Amazing. And GPU servers for under a dollar per hour.
Starting point is 00:15:20 So, easy to spin up infrastructure to simplify even the most intense business demands. That's DigitalOcean. If you use the DO4BYTES code, then you can get up to $200 in free credit to get started. DO4BYTES. Now just click the link in your podcast player show notes. DigitalOcean is the cloud that's got you covered. So like I said, please use our link. You'll find it in the podcast player show notes, as a clickable chapter URL while I'm speaking right now,
Starting point is 00:15:48 and at the top of the page at pythonbytes.fm for the episode. Thank you to DigitalOcean for supporting Python Bytes. Indeed, indeed. Now over to you. Well, so I want to talk about testing a little bit. So Hugo van Kemenade wrote a post called Run Coverage on Tests. This is something that I've taught everybody to do, or taught everybody whose ear I can get to do this, because it's important. I'm glad
Starting point is 00:16:23 that Hugo wrote a post about it. And this was just going to be an extra. However, I was blown away by a few things here and I'm like, oh, we got to make this highlighted a little bit more. So the classic reason why you should run coverage on your tests is because of the copy, paste, modify problem with pytest, because with pytest the name of the test just sort of shows up in the reporting, so it's very easy to take an old test and, since you're not calling
Starting point is 00:16:54 the test function yourself, it's easy to copy, paste, modify and forget to change the name, and if you do that, the second test just hides the first one. And that is the classic reason why I tell people to run coverage so that you don't do this. And it's hard to figure out otherwise. But a couple of cool things that I learned from this post is that Ruff's rule F811 catches that, apparently. So if you run Ruff on your test code also, with at least F811 turned on, it checks for variables,
Starting point is 00:17:38 which the function name is officially a variable, that are defined and redefined or otherwise shadowed and unused. So if you defined a variable, didn't use it, and redefined it, that's similar to defining a test twice. So I didn't know that. So I will make sure that I've got Ruff running on my test code also. So that's cool. Also a tip to say, hey, if you're really just copy, paste, modifying a test, maybe think about pytest parametrization, which I agree, you might want to just parametrize the test. But this is also pretty common. It's so common to copy paste. The second example is the one that I
Starting point is 00:18:16 really was like, Oh my gosh, we have to cover this because I would have never I wouldn't have caught this. So his second example is this weird. OK, so it's like a, he's got an image, some image stuff that he's testing. And it's like, I don't know, it's, I guess I'm just, it's a little bit of a complicated thing. You've got some colors and images and you're trying to test in the, in the end, you're asserting whether or not images are similar of a reloaded image versus an expected image. And I don't really understand what's going on here, but this isn't terrible. But one of the things in here is that when he ran coverage, it said that these two lines at the bottom
Starting point is 00:18:57 are not being run at all. And he's got some images, a set of images that he's iterating over, and apparently there's nothing in them. So what's going on? And the thing that's going on apparently is his images are a generator is what's going on. So he's got some images that are being set up
Starting point is 00:19:21 as a generator, so like a, you know, for ... in. A generator comprehension. Yeah, generator comprehension. And then he passes that to a function that consumes the generator, and then when he tries to use it in the assert section, there's nothing there anymore. And I would have never, like, obviously,
Starting point is 00:19:43 this is highlighted because I use generators and generator comprehensions way more than I used to, because they're cool. And they're efficient. And if you've got huge things, they use up less memory. So I'm using them more and more. And I'm probably using them in my tests, so I better make sure that I'm not mucking up things like this. So pretty cool reason to use coverage on your test code to make sure that you don't do this. Oh yeah, the fix by the way of doing this is just instead of using a generator comprehension, just use
Starting point is 00:20:18 the list and pass it to your test code. Square brackets, not parentheses. Yeah. Problem solved. It's subtle though I agree very subtle. Oh yeah that's the difference really. Yeah parentheses for those brackets and then it works. Uh well yeah okay anyway thanks Hugo. Yeah thanks Hugo. All right let's talk about something simpler than all that SQLite distributed async business. Here's the problem, Brian. You have a program, probably in our case,
Starting point is 00:20:48 often a Python program, and you want to give it to somebody and let them run it. How do you do that? Still, to this day, there's no amazing options to do that. I know we have py2exe and py2app, but those things are, while awesome, they're not consistently reliable and all sorts of things. So this is a little bit more dev focused than that.
Starting point is 00:21:13 But if you have Docker installed on the machine, right? So if Docker is like Docker Desktop or OrbStack or whatever, and I guess even in the upcoming macOS, we're supposed to have a built-in Docker equivalent built straight into macOS, which should be interesting. Anyway, if you have any of those, there's this project called docker2exe. To executable, I believe it works on all the platforms.
Starting point is 00:21:38 I don't think it's just Windows EXE, even though, you know, yeah, it absolutely works on Windows. And Linux and Darwin, AKA macOS. So, just "to executable," right? And the idea is that if you have a Docker image with an entry point or something like that that will start an app running, you can simply build that as an executable,
Starting point is 00:21:58 either a .exe on Windows, or a .app, you have to rename it, I think, more or less. They're just executable binaries, right? So you can take that Docker image and build it into an executable, which is pretty awesome. So you could, you know, the example they give is kind of, I'm going to call it a full-on bad example, because it says, here is a bare Linux image.
Starting point is 00:22:21 You can distribute that as a binary. It's like, okay, great. That's not exactly the use case. The use case is I have something I want to give to someone and run it as a Docker thing. So maybe a better example would be like, here's copying over my source code, building a Docker image, and then turning that base Docker image into an executable
Starting point is 00:22:41 that I can distribute. But this thing would do that. It's just the example doesn't show that. So I think that is pretty neat and if you want yet another way to distribute a more durable tool to somebody, if they have Docker, then you're good to go, right? That's pretty cool. Well, do they have to have Docker? Yes. Okay.
Starting point is 00:23:03 Because basically somehow this thing just bundles up the Docker image and then just executes it using Docker in the same way that you would just say docker run, etc. The other thing that's not clear to me is if there's a way to pass through mappings, for example map port 8000 to 2722 or whatever, you know what I mean, or map this volume to this folder. Probably you can. Like, when you run the executable, can you pass that kind of stuff to it? Also, again, the reason I don't like this example here, where they just bundle up Linux: in the Dockerfile itself, you can say
Starting point is 00:23:44 expose this port and do other types of things like that in the Dockerfile. So you can bake it in, like if you're saying, my thing is a little Flask app that runs using maybe SQLite for the database. It exposes this port so that you can then, you know, it could print out a thing like click here to talk to the server and launch the web browser to a local URL. So you can do that, I'm pretty sure, with EXPOSE and other stuff in the Dockerfile. But yeah, anyway, if you want to ship a Docker image as just an executable binary, check this thing out.
Starting point is 00:24:11 Cool. Very neat. All right. I know that you have extras and I don't. Okay. I've got some Test & Code news. So Test & Code has sort of been on pause. When was the last time?
Starting point is 00:24:20 May 7th and what? It's July now. It's been on pause for a while because of life, but life has got a little bit of room in it now. So I'm going to do a little bit of Test & Code. So there's three episodes in the queue and hopefully more to come. So at least them. And so this week there will be an episode and I'm pretty excited because this week's episode is covering pytest-django with Adam
Starting point is 00:24:45 Johnson. And a lot of people have asked me about pytest-django. And since I am not a Django expert, and Adam is, Adam was willing to talk with me about it. And it's a really great discussion. So I'll be excited to get that out this week. And then also, the hosting provider I'm using for this is Transistor.fm. And Transistor has the ability to push things to YouTube, but it's a little bit wacky.
Starting point is 00:25:16 So I went ahead and turned that on. And so everything now for Test & Code, well, not everything yet. 207 episodes are now up on YouTube. The catch is they sort of show up in random order. They push, I think, 80 episodes a day or something like that. They do a lot. So by the time you probably hear this, if you
Starting point is 00:25:43 don't hear it on the first day, they'll probably be all up. But anyway, they're there, but they're just in weird order. That's a lot of content to be pushing up there. I suspect though, after it sort of bulk uploads, once it does, as you start the machine, it'll go in the right order. Yeah, well, it's because YouTube orders,
Starting point is 00:26:01 I'm pretty sure YouTube orders in the upload order of like when you uploaded it, and there's no way to backdate it to say this was actually two years ago or something like that. Yeah, and probably Transistor just batch processes them and the order in which they complete is the order in which they upload. Luckily when I did the transition to Transistor though, it just uses your old post dates. So even though I transitioned like, I don't know,
Starting point is 00:26:30 a couple of years ago or something like that, I still was able to, you know, like some of the early episodes still show up as happening in 2016, even though they got transferred later. But anyway, a lot of content. Good deal. No extras for you? No, I'm not very extra, just been working hard on stuff. I'll have things to share eventually. But right now no extras. I do have a joke I've brought for us. What do you think? Oh, yeah.
Starting point is 00:26:55 Let's go for it. Yeah, okay. So Brian, what if programmers were doctors? You know the old joke: hey doctor, my leg hurts when I lift it like this. Well, stop lifting it like that, right? There's a fun variation on that joke for programmers. And it's basically the doctor equivalent of it works on my machine.
Starting point is 00:27:20 It says, doctor, my leg hurts. That's weird, I also have a leg and it doesn't hurt. The issue must be on your side. Yeah, that's definitely true. Yeah, I like it. That's dumb but funny. I know, it's short and sweet. Yeah, I like it. Cool. Well, thanks for the joke, Michael, and thanks as always for this wonderful episode. You bet. Bye, Brian. Bye, everyone.
