Python Bytes - #417 Bugs hide from the light

Episode Date: January 21, 2025

Topics covered in this episode:
- LLM Catcher
- On PyPI Quarantine process
- RESPX
- Unpacking kwargs with custom objects
- Extras
- Joke

Watch on YouTube

About the show

Sponsored by us! Support our work through:
- Our courses at Talk Python Training
- The Complete pytest Course
- Patreon Supporters

Connect with the hosts
- Michael: @mkennedy@fosstodon.org / @mkennedy.codes (bsky)
- Brian: @brianokken@fosstodon.org / @brianokken.bsky.social
- Show: @pythonbytes@fosstodon.org / @pythonbytes.fm (bsky)

Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Monday at 10am PT. Older video versions available there too.

Finally, if you want an artisanal, hand-crafted digest of every week of the show notes in email form? Add your name and email to our friends of the show list, we'll never share it.

Michael #1: LLM Catcher
- via Pat Decker
- Large language model diagnostics for Python applications and FastAPI applications.
- Features:
  - Exception diagnosis using LLMs (Ollama or OpenAI)
  - Support for local LLMs through Ollama
  - OpenAI integration for cloud-based models
  - Multiple error handling approaches:
    - Function decorators for automatic diagnosis
    - Try/except blocks for manual control
    - Global exception handler for unhandled errors from imported modules
  - Both synchronous and asynchronous APIs
  - Flexible configuration through environment variables or config file

Brian #2: On PyPI Quarantine process
- Mike Fiedler covered the Project Lifecycle Status - Quarantine in his "Safety & Security Engineer: First Year in Review" post.
- Some more info now in Project Quarantine.
- Reports of malware in a project kick things off.
- Admins can now place a project in quarantine, making it unavailable for install but still around for analysis.
- The new process allows packages to go back to normal if the report turns out to be false.
- However: since August, the Quarantine feature has been in use, with PyPI admins marking ~140 reported projects as quarantined.
- Of these, only a single project has exited quarantine; the others have been removed.

Michael #3: RESPX
- Mock HTTPX with awesome request patterns and response side effects.
- A simple, yet powerful, utility for mocking out the HTTPX, and HTTP Core, libraries.
- Start by patching HTTPX, using respx.mock, then add request routes to mock responses.
- For a neater pytest experience, RESPX includes a respx_mock fixture.

Brian #4: Unpacking kwargs with custom objects
- Rodrigo
- A class needs to have:
  - a keys() method that returns an iterable
  - a __getitem__() method for lookup
- Then the double splat ** works on objects of that type.

Extras

Brian:
- A surprising thing about PyPI's BigQuery data - Hugovk
  - Top PyPI Packages (and therefore also Top pytest Plugins) uses a BigQuery dataset.
  - Has grabbed 30-day data of 4,000, then 5,000, then 8,000 packages.
  - Turns out 531,022 packages (the amount returned when the limit is set to a million) is the same cost.
  - So… hoping future updates to these "Top …" pages will have way more data.
- Also, was planning on recording a Test & Code episode on pytest-cov today, but haven't yet. Hopefully at least a couple of new episodes this week.
- Finally updated pythontest.com with Bluesky links on the home page and contact page.

Michael:
- Follow up from Owen (uv-secure): Thanks for the multiple shout outs! uv-secure just uses the PyPI JSON API at present to query package vulnerabilities (same as the default source for pip-audit). I do smash it asynchronously for all dependencies at once... but it still takes a few seconds.

Joke: Bugs hide from the light!

Transcript
Starting point is 00:00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds and mine. And this is episode 417, recorded January 21st, 2025. And I'm Brian Okken. And I am Michael Kennedy. And we're excited about this show today. And nothing ain't nothing going to bring us down. So, but before we get started, I want to thank everybody that has supported us through Talk Python Training, through PythonTest.com, the courses, through buying my book, our Patreon supporters, of course, you rock. And of course, many of the sponsors that have sponsored us in the past, and we love them too. But we also love people that support us directly. If you'd like to send us topics, please do so through,
Starting point is 00:00:47 there's a contact form on our website, but also you can send them to us at Blue Sky or on Mastodon. And those links are in the show notes. And if you are listening to this, thank you. And also share it with a friend. And if you'd like to join us live sometime, check out pythonbytes.fm slash live to see when the next episode's going to be filmed
Starting point is 00:01:08 and recorded. And you can join us and comment while we're going live. And thank you also for people that subscribe to the email newsletter. If you go to pythonbytes.fm, you could subscribe there as well and get the list
Starting point is 00:01:25 of all the topics directly in your inbox so you don't have to go look those up. Yeah, we're evolving the format of that a little bit, trying to make a little deeper analysis, but also skimmable. And yeah, it's a huge resource. I think it's great. Yeah. People listen as well, but it's also nice to just have that written down in one place. And we cover lots of great topics every week. And what is our first topic this week, Michael? The first topic will be the LLM catcher. The name, not terribly descriptive of what it actually does, but here's the deal. I'm sure everyone has done this at this point. I know I've done it recently
Starting point is 00:02:02 as I was yelling at the Bodo 3 API because there ain't nothing as frustrating as a little bit of little Bodo, auto-generated, no comments, no documentation, no idea what parameters go in it. Anyway, you might take those errors and pass them over to an LLM and go, please, dear chat co-pilot, anthropic, whatever, what is going on here? What am I missing? Right? And it's super helpful. But this project is a little bit different. It's like a gateway to those types of things. So here's what you get. If there is a crash, obviously you have stack traces or tracebacks, depending on the language you're in, how you say
Starting point is 00:02:41 it. They describe it here as the unsung villains of debugging. Why wrestle with the wall of cryptic error messages when you could let LLM catcher do the heavy lifting? So here's the thing. You basically, I'll go down here somewhere. What you can do is in your try accept blocks, you can say, given an exception, diagnoser.diagnose passing the exception,
Starting point is 00:03:03 and it will pass those details over to various llms and say help me understand this and print out a message that will show me how to fix it not just trace back okay okay so i don't know how much i'm excited about this yeah i think it's pretty dope i would not use it in production though you could you could if you want your logs to have messages about here's actually what happened, it's your debugging sidekick. So what you do is you can run Ollama locally, and that's the default. Or if you give it your open AI API key, it can pass it over to whatever level of model you possibly have. You know, it'd be awesome if you could have a one mini or something like that, diagnose it over at chat GPT.
Starting point is 00:03:49 So there's different ones that we'll work with, but basically when it gets an exception, it says, Hey, I'm working on this thing with fast API and I get this exception, help me figure out what's going on. So the Olama one is a local free, just running your machine version, open AI. Well, we know all know about chat GPT, right? So you can put it as a decorator on a function. You can manually do it in a try-accept block, or you can even register a global exception handler. So anytime an exception happens that's uncaught in your system,
Starting point is 00:04:16 it'll diagnose it. It has both async and async API, and you can set it up through environment variables. So it shows you how to pull down the QWEN 2.5 coder model for Olamo, which is pretty excellent. And just off it goes. Look at that. So you've got your diagnoser.catch on a risky function.
Starting point is 00:04:35 Or in your track set block, you just say diagnose or async diagnose. Because it's going to run for a while. It's going to make an API call either locally or out to chat CPT. So you don't want to necessarily block your system. So you just make a little async await. Boom. Off it goes. Yeah, there you go.
Starting point is 00:04:51 That's pretty much it. You can get formatted or unformatted information back. So if you need plain text to go in some kind of JSON field, you can do that. Or you can get it with proper formatting to make it more readable. What do you think, Brian? I'm going to withhold judgment on this until I give it a shot at some point. But yeah, you can even specify the temperature, aka the creativity you want the model to apply to your analysis.
Starting point is 00:05:16 That's funny. Yeah, it's an open AI thing. I kind of like it to like on any exception, just upload my entire code base and rewrite my code to fix the error exactly don't diagnose it just fix it why are you with it man why am i even in the way here yeah so look at it yeah you can even do the the full-on o1 model okay of chat gpt which is like the really really is that the 200 one or that's the 20 one but you only get to call it like 50 times a week so not too many errors.
Starting point is 00:05:45 If you get the $200 one, you can call it all day long. Yeah, I'd like people to get the $200 one, put this in your CI, and do it over all versions of Python so that we just fill up all of the... And then we'll get an announcement of, oh, the entire West Coast is blacked out because we broke the power grid. But anyway, I think it's interesting, right? Just plug that in. Oh, the entire West Coast is blacked out because we broke the power grid. But anyway, I think it's interesting, right? Just plug that in.
Starting point is 00:06:12 Yeah, it looks like it might be kind of fun. Yeah, it does look kind of fun. And this was recommended by Pat. So thanks, Pat, for sending that in. Pat Decker. Oh, and Pat's here. Thanks, Pat. Yeah, yeah.
Starting point is 00:06:22 Over to you. Well, I kind of want to talk about bad packages a little bit. Like no Christmas presents for them or what's going on? Yeah, no wrapping paper. Actually, we are talking about wrapping. So I want to talk about the Python Packaging Index and malicious stuff. So let's scroll down here. There's in the, there was a security and safety engineer first year in review.
Starting point is 00:06:51 This was from Mike Fiedler. And he talked about a lot of stuff. But one of the things he talked about was quarantining. And this came out in August. But I just am catching up. So it's like if they catch COVID or what's going on?
Starting point is 00:07:04 No, it's, you know, it's like if they catch COVID or what's going on? No, it's like bad packages. So if somebody says there's malware in a package, it shouldn't be there, what do we do with it? And they used to have the option to investigate it and then yank it, but it just sort of makes the whole thing go away. But there's a new process, and they just recently at the end of December wrote about it. And there's it's called Project Quarantine. And in this we're linking to
Starting point is 00:07:32 an article that really talks about it. So if you're if you're worried about malicious packages and you're curious about what API is up to, go ahead and check this out. I'm not going to go through the whole thing. However, it is kind of interesting. So the idea is if we jump back down to like future improvements in automation, hopefully we'd have some sort of automated way. But like, let's say a couple people report that a package has malware in it. Administrators of PyPI can go ahead and somehow have some litmus test to say or something to say rather quickly, let's get this under control. And the quarantine doesn't delete the whole thing. It puts the, there's an API, simple API that an admin can go in and say, hey, we're going to quarantine this project.
Starting point is 00:08:20 And the package goes into quarantine. And at that point, there's a bunch of, a bunch of stuff happens. The it's not installable, but the owner can still see it. And the only owner can, can make I don't know if they can make changes, but yeah, it's not modifiable while it's in quarantine, but they can see what's going on. Administrators can look at it and, and determine whether or not there really is malware there. And possibly, it's possible that we might have some bad actors reporting packages. So we don't want people to report stuff that's fine and have things to remove just because
Starting point is 00:08:58 they're angry about it or something. But that hasn't happened yet. So this has been in place for a little while. And looking at the statistics, it's been, let's see, since August, they put this in place. There's been 140 reported packages, and they've gone into quarantine, and only one of them exited quarantine. And it's because because why was it the the there was obfuscated code in there then that's a violation of the pipe i acceptable use policy project owner was contacted they fixed it because they just i guess weren't aware that you can't do that really interesting
Starting point is 00:09:38 i didn't know that was a policy yeah well i mean if you want to ship something that you i know there are companies out there that would like we would like to obfuscate our code but we still want to make it available but we don't do it through pipeai i guess don't do it through pipeai okay yeah and i don't want to pipe obfuscated code so i understand that that's primarily a shielded malware right behind right like yeah they'll have a base 64 encoded string of something or other, and then they'll decode it and execute it. Right. Yeah. So, yeah, there's created some outreach templates.
Starting point is 00:10:12 So the full process, if you're confused, or if you have a, this is something, if you get notified by an administrator that one of your packages is in quarantine, they'll probably point you to this anyway. But, you know, check this out. I'm glad that they're working on this and um we're making the environment easier for pipey i admins to deal with but also just safer for everybody to use so it's good yeah excellent well you know i'm sure you're aware of this brian testing testing makes your code safer to use yeah and i have fully embraced the async lifestyle these days you know i talked
Starting point is 00:10:46 about rewriting talk python and court the async version of flask and i blogged about that and brought it up on the show i'm pretty sure but how are you going to call apis i'm working on some projects right now that are like all about calling apis and i'm like oh my gosh so many apis this thing calls that which you know so on if you can do that asynchronously, that'd be awesome. And I would say probably the best kind of, I'm a fan of requests, but I want async story these days is HTTPX, which has got some basically very, very similar,
Starting point is 00:11:17 not identical, but very, very similar behaviors and API patterns as requests, but also has an async variant. You create the async client and then await all your calls, which is great. So you might want to test that, right? Even asynchronously as you run code as async. So I want to introduce people to RESPX,
Starting point is 00:11:36 like response X, probably is the way you pronounce that. I'm not sure if it's HTTPX or response X, I don't know, whatever. RESPX. And what it does is it lets you mock out HTTPX response X, I don't know, whatever. RASPX. And what it does is it lets you mock out HTTPX requests. Super, super easy, however you like. So for example, if I want to make a call where I say HTTPX get,
Starting point is 00:11:59 and I want to make sure that if that URL comes in, it's going to return some particular value like a 204, you just say resp.same function call with the values, and then you just say.mock, and you set the values or the behaviors that you want it to do, and off it goes. That's pretty cool. Yeah. And it also comes as a PyTest plugin, if you want to roll that way. So then you just say resp.mock.whatever, and just call the functions. And then all the examples here, like first line, mock it, second line, call it. But, you know, probably you're testing some function that then internally is using HTTPS through like a sync with block. And right. Like there's a lot of layers going down there that you might need to work with.
Starting point is 00:12:35 And so, right. That would be a more realistic example. You call the mock and then you call your code and then something happens. Right. So that's pretty cool. You can even use mark. Brian, make sense of this pytest.mark statement here for me. What are we doing?
Starting point is 00:12:50 Well, what do you mean? Okay, so you've got pytest marking it with RESPX, so the project is defined a custom mark, and it's passing in the base URL of foo.bar, and then within it. Yeah, you don't have to, I guess, say the base URL, right? Right, because you're just passing it in. Because it's really, it's not that bad, not that hard to pass in through markers a variable to fixtures.
Starting point is 00:13:18 So that's what's going on. So you kind of pre-pair it with your mark here, okay? Yeah. Awesome, yeah. And then the fixtures pass in. Okay, cool. And that's pretty much it. There's not a a whole lot of not a lot to say about it if you need to mock out hdpx instead of using generic mock stuff you can use this library that basically has exactly
Starting point is 00:13:33 the same api as hdpx pretty cool sometimes i forget that not everybody has completely internalized the entire content of my book but well we we can work on that. We can work on it. I learned something new. Oh, really? You know what? I think if this is your next topic, I had no idea about this either. So I'm about to learn something new.
Starting point is 00:13:56 Okay, well, so this is actually something that Rodrigo also learned something new because he marked it as T TIL for today. I learned, um, and I kind of love people posting the TILs, but also I'm, I'm personally somebody that I don't think you need to prefix things with TIL for today. I learned if you just have a small blog post, go ahead and post it. I like small posts anyway. So unpacking keyword args with, or K-Kwargs. I usually just say keyword args. Do you have, do you say Kwargs?
Starting point is 00:14:30 I'm K-W-Args. K-W-Args, okay. But I know people say Kwargs, but I don't know. It sounds like I'm speaking Klingon or something. I don't do it. Yeah, it makes me think of Deep Space Nine with Quark. But unpacking keyword args with custom objects. So let's say you've got, so there's a couple of things.
Starting point is 00:14:49 Unpacking, and we're talking about the star or the double star or the splat splat or double splat, however you want to say it. So let's say you've got a dictionary and you want to pass the contents of the dictionary as arguments to a function or something. That's how we often use it, is doing a star star with a dictionary and it unpacks it into keyword arcs for a function call, which is cool. Or you can just do it. Here's an example of merging two dictionaries with this.
Starting point is 00:15:22 I don't do it like, I don't usually do this much, but cool, you can do that. There's a newer syntax where we use the pipe on dictionaries with this. I don't do it. I don't usually do this much, but cool, you can do that. There's a newer syntax where we use the pipe on dictionaries as well, and that's the same thing. There's like three or four ways to do this these days. Yeah, because with Python, there should be one obvious way to do this. And if there's not, there's four.
Starting point is 00:15:39 Or unless it's strings, then there's six. Okay, so there's a lot of times where doing this star star unpacking is like so cool and convenient. But if you have custom objects, not dictionaries, if you have your own objects, how do you deal with that? Can you do that? Yes, you can. And that's what this little tl is about. All you have to do is you have to add a keys function to your object or your class. And the keys function needs to or method needs to return an interval. And in this case, just a list is an interval, for instance. And then the example, he's got a Harry Potter class, it was returning first, middle, and last. And then a get item that presumably takes a key and returns something.
Starting point is 00:16:30 And that's all you need. And then you can do this double splat thing, and it works. Oh, that's awesome. And also, the example's good also just to remind everybody that when you're doing the get item, to go ahead and do an else clause with a key error. So if people pass in the wrong thing, they get the appropriate exception. So anyway, thanks. Indeed. Yeah, I love it. Very, very cool. All right. Different items, right?
Starting point is 00:16:59 That's it, I guess. You feel pretty extra, I can see. I do feel pretty extra. I got more extras than i had normal things so let's jump in let's do it um uh over on uh python test.com um oh a couple things i'll just kind of go backwards first off i finally fixed it i had uh x up or twitter and i don't do twitter anymore so i replaced it with a blue sky icon and also on my contact form has blue sky now so i fixed those things um also i had like uh incorrect podcast thing stuff up so i fixed my podcast data testing code by the bytes and stuff of course anyway that's not what i really want to talk about what i want to talk about is the top high test plugins. I've been researching a lot of the stuff in here
Starting point is 00:17:45 for the testing code season two. And I'm relying on, with this data, I'm relying on the top PyPI packages. And this is an excellent resource, and it uses BigQuery. And there was just a new article from the person that created this, Hugo. He wrote an article about what's going on with this. A surprising thing about PyPI is BigQuery data. And it's interesting and also kind of exciting news. he he's using the like the free uh free version of uh of uh google big cloud or big query stuff
Starting point is 00:18:29 whatever you need the google account um you get a few big query queries um and if you do it too much they kick you out um and so uh at first he started with 4 000 projects then he went bumped up to 5 000 projects and then 8,000 projects, but there's more than that. So he's like, well, I wonder how much I can do. And so this is a little test that he went through. I'm going to jump down to the, the punchline and the punchline is that, um, you can do, he went up to, uh, tried a million packages, uh, and there aren't a million packages, and there aren't a million packages, but it returned 531,000 packages, and it was the same bytes processed as even just doing one for 30 days. So it doesn't matter.
Starting point is 00:19:18 It turned out it didn't really matter how many packages to query. What it was was how the date time, the date spread. So if you did like five days, it was way cheaper than 15 days, which is way cheaper than 30 days. And it's relatively linear. So it looks like what he's going to do is change it so that we get like a ton of package data,
Starting point is 00:19:43 like as much as we can get, 531,000. But he's not going to, he's probably going to report that in smaller chunks too because a lot of people- Daily or something, yeah. But a lot of people aren't going to want to see 531,000. The top 8,000 is probably sufficient. You really got to zoom in to see them all at once.
Starting point is 00:20:02 So I'm excited because when, when I'm using, I'm using the 8,000 dataset and the top high test packages, there are currently 133 in the top 8,000. And I'd like to have a bigger list. So if I've got the top like 10,000 or 20,000, I could probably get a bigger list of packages anyway. So that's, that's it.
Starting point is 00:20:29 It's just an interesting thing if you're doing BigQuery data. It's the date that is the big effector of the price. Right, because it probably counts the number of downloads for each day or per download individually, whereas if there's only 500,000 packages, there's that. But there's way more downloads than there are packages. The other thing that, yeah, and the other thing that might change is,
Starting point is 00:20:50 I think is going to change is, it cost more to filter on just pip packages. And now we're getting a lot of UV, people using UV to download stuff from PyPI. And so he wants to include that too. So it'll probably, I think he's going to change it so that the data is from
Starting point is 00:21:12 everything instead of just PIPs. That makes sense. Yeah, it definitely does. Yeah. Anyway. Awesome. Cool. Do you have any extras? I do. Not too many, but let's do it. So, Owen Lamont, remember we talked about UV-secure, the project he created, and it scans your lockpods.
Starting point is 00:21:29 And I was speculating what API it was using. He wrote in to say, thanks for the shout outs. It just uses the PyPI JSON API at present to query for package vulnerabilities. Same thing that pip audit does. He does work at asynchronous later, try to make it a little faster, but it's just the simple API there.
Starting point is 00:21:47 So that's what that is, not something like Snyk or some other more advanced threat modeling setup. And yeah, that's it. That's all I got for my extras. All right, cool. How about a joke? Oh, I've got a joke.
Starting point is 00:21:58 People, if they like puns and stuff, this is gonna be good. It's at angle bracket slash angle bracket code puns or codepuns.com. So we've all written bad code. And I know that sometimes testing stuff that's going to be good it's at angle bracket slash angle bracket code puns or code puns.com so we've all written bad code and i know that sometimes testing will shake out the bugs brian but do you know why programmers prefer dark mode i think this is not totally wrong i think we should switch it i think it's a foul fallacy here why uh i guess i'll read it as it is why do programmers prefer dark mode because Because light attracts bugs.
Starting point is 00:22:25 I guess if you're talking moths, but if you're talking cockroaches, it's the other way around. But here's the thing. That's a great joke, but you can click more puns and they just keep going. My love for coding is like a recursive function.
Starting point is 00:22:37 That's not very good one. That's fine. Why did the for loop stop running? It took a break, Simicol. How do you comfort a JavaScript bug? You console it. I see. There's a lot of good stuff here. Like, cause console.log is how you debug that thing instead of. Oh, okay. Okay. Yeah. It's the print, print debugging equivalent. Not a JavaScript or okay. Well, you certainly can't console your JavaScript bugs when you
Starting point is 00:23:01 create them. All right. Why do you not want to function as a customer? Because they return a lot of items. Come on. Anyway, you can go to codepuns.com and click through until they can't take it anymore. Yeah. That's probably why you want to see customer because they only return one item. That's true, right? Yeah.
Starting point is 00:23:24 Well, good stuff. Yeah. All Yeah. Well, good stuff. Yeah. Good stuff. All right. Well, thanks again. Thanks for showing up for Python Bytes. And thanks, everybody, for listening. Yeah, you bet.
Starting point is 00:23:34 Thanks for being here. Bye, everyone. Bye.
