Python Bytes - #356 Ripping from PyPI

Episode Date: October 10, 2023

Topics covered in this episode: Psycopg 3; dacite; RIP: Fast, barebones pip implementation in Rust; Flaky Tests follow up; Extras; Joke. See the full show notes for this episode on the website at pyth...onbytes.fm/356

Transcript
Starting point is 00:00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 356, recorded October 10th, 2023. I'm Michael Kennedy. And I'm Brian Okken. And this episode is brought to you by us. Courses over at Talk Python Training, the complete pytest course from Brian, Patreon supporters, and find us over on Mastodon. That's probably the best way to chat with us these days.
Starting point is 00:00:23 And also be part of the live show, pythonbytes.fm/live, usually Tuesdays at 11 Pacific time. And you can also see all the old versions there as well. So Brian, what are you going to start us with today? Well, I thought I would start with a, it's kind of going to have a theme of getting information from Mastodon, people letting us know. So I saw today a post by the Psycopg group, so Psycopg, for Postgres, and the post says, it feels weird, but it's time to stop considering Psycopg 2
Starting point is 00:00:59 the present and Psycopg 3 the future. We've entered the time where Psycopg 3 is the present and 2 is the respectable past. Updated the feature page and a few other resources on the website to reflect this. So I thought I would check it out. Awesome, that's big news.
Starting point is 00:01:17 Postgres is clearly the biggest database that people use in the Python space according to the survey we talked about recently. So yeah, this is relevant, right? Yeah, and this is a great library. I've been using the version 2 for a long time. And I guess I haven't been paying attention to the 3, but it's been out for a while.
Starting point is 00:01:34 So the 3 page basically talks about that it's a new implementation of the most used, reliable, and feature-rich Postgres adapter for Python. And there are some differences, though. So apparently most of 2 was written in C, but 3 is written in a combination. So a lot of it's in Python, and some of the speedups are in C. And so we're going to link to both the 3 page and the features page. And the features page has a nice comparison
Starting point is 00:02:11 of 2 versus 3. So really, the recommendation is they're still going to maintain 2, but you should mostly think about 3 for new projects. So if you already have an existing project that's running 2, go ahead and leave it running with that. I couldn't find where, or if, they've announced an end of life. So I don't think that's even planned at this point for 2. Eventually, probably. So 2 has been around since 2006, and 3 had its first release in 2021. But they've got a whole bunch of cool things in here.
Starting point is 00:02:49 So some of the things I thought were pretty cool: we've got native asyncio support. That looks pretty nice. Native support for more Python types, such as enums, and more Postgres types. So one of the cool Postgres types was multirange. So that's supported now. That's pretty nice. And then I don't know what parameter bindings are actually, but apparently 2 has client-side parameter bindings
Starting point is 00:03:16 and 3 defaults to server-side parameter bindings. But you can still do client-side if you want to. So lots of fun stuff in here: an advanced connection pool, static typing support. Yeah, so I say why not try 3 and only go back to 2 if you really need to. So yeah, this is super exciting. Those are a lot of great features. I think the async support was previously done through a separate library, and now it's just part of it. That's pretty cool.
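To make the async support and parameter binding mentioned above a little more concrete, here is a minimal sketch using Psycopg 3's async connection. The `users` table, its columns, the function name, and the connection string are hypothetical and only for illustration. The import sits inside the function so the sketch loads without the driver installed; a real project would import `psycopg` at the top of the module.

```python
import asyncio
from typing import Optional


async def fetch_user_name(conninfo: str, user_id: int) -> Optional[str]:
    """Return one user's name, or None when no row matches.

    Assumes a hypothetical table users(id, name).
    """
    # Lazy import: keeps this sketch loadable without psycopg installed.
    import psycopg

    # Async support is part of the main psycopg package in version 3.
    async with await psycopg.AsyncConnection.connect(conninfo) as aconn:
        async with aconn.cursor() as cur:
            # The value travels separately from the SQL text instead of
            # being pasted into the string by hand.
            await cur.execute(
                "SELECT name FROM users WHERE id = %s", (user_id,)
            )
            row = await cur.fetchone()
            return row[0] if row else None


# Example call (needs a reachable database; the conninfo is made up):
#   asyncio.run(fetch_user_name("dbname=example", 1))
```

The `%s` placeholder plus a separate tuple of values is what "parameter binding" refers to here: the driver, or the server in Psycopg 3's default mode, combines the two safely.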
Starting point is 00:03:51 What is the present? What is the past? What is the future? We'll come back to that in our extras. We'll have some fun there. Okay. And then also our audience is awesome. Mike says, hey, I migrated PyPI from Psycopg 2 to 3 in June.
Starting point is 00:04:07 It was not too hard, but it took some time to do safely. So, hey, when you pip install things or look things up on PyPI, it's already using this. Awesome. Yeah.
Starting point is 00:04:16 Oh, that's cool. Nice. Indeed. Indeed. All right. Shall we convert some data? That's what I got next.
Starting point is 00:04:23 Sure. Let's talk about it. So we all know about Pydantic, right? And Pydantic is cool. So Pydantic lets us create classes that derive from a Pydantic base type, BaseModel, and it will do conversions and parsing of JSON and so on. But maybe you know about data classes, right? These are built in, they're not anything separate, they're part of Python itself. And so being able to use them is pretty cool. And Raymond Peck commented on one of our YouTube videos
Starting point is 00:04:53 and said, hey, I use this thing called dacite. Hopefully that's the right pronunciation, dacite. And the idea is it allows you to create Python data classes in a similar way, right? So simple creation of data classes from dictionaries. So APIs and other things. And you just create data classes as you normally would. Data classes are just classes with, you know, class-level typed fields, right?
Starting point is 00:05:21 Like class User, name: str, age: int, and so on. Then you just put the dataclass wrapper on it, right? And it gives it a bunch of nice things like hashability, comparability, et cetera, right, constructors. But if you've got a dictionary that has sub-objects that themselves should be data classes, and maybe a list of other things that should be data classes, actually turning that JSON into one of these more complicated versions is a hassle, right? You also might need to do data conversion as well, right, to make sure that the types match. So that's what you can use this for. And it's pretty neat. You can just say from_dict, dacite.from_dict, and you say what data class it's going to go to and what the dictionary is, and like a little factory method, out pops one of these objects. Cool, right? Yeah, that's pretty cool. Yeah. And
Starting point is 00:06:10 let's see, features include nested structures, like I described, basic type checking, optional fields, unions. So if you say it can be an int or a string, it'll check that it's one of those two, but not a float or something. Forward references, collections, and interestingly, custom type hooks. So for example, you can say anytime that you're going to take a string, actually call str.lower, .strip on it, if it's not null, right, if it's not None, things like that, right. So you can actually get into the parsing side a little bit if you need to. So that's pretty neat. What else have we got here? Let me scroll down, and it talks, of course, about all the things I said. I think that's probably it. I guess one thing that's worth pointing out, right in the docs, they say, it's important to mention that it's not a data validation library. Now, when I first read that, I'm like, but there's all these
Starting point is 00:07:00 types, right? Is it supposed to find out when I say the thing takes a string and an int and a bool for its three fields? Like, it will actually do that. It does that, but you can't say the string must match this regular expression and be this many characters and the int has to be positive. Those types of things it doesn't do, right? So it's just dictionary to object, possibly a complex data class object,
Starting point is 00:07:24 but with type validation, and that's about it. But still, I think that's super, super useful. It's a kind of partway validator, I think. Yeah. Yeah, more like a proper parser without the validation itself. Yeah.
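The dictionary-to-dataclass idea discussed above can be sketched with only the standard library. This is not dacite's implementation: `from_dict_sketch`, `User`, and `Address` are hypothetical names, and the sketch only handles nested dataclasses plus a basic type check, with no unions, collections, or type hooks. With dacite itself the equivalent call would be `dacite.from_dict(data_class=User, data=payload)`.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Dict, Type, TypeVar, get_type_hints

T = TypeVar("T")


def from_dict_sketch(data_class: Type[T], data: Dict[str, Any]) -> T:
    """Build data_class from a dict, recursing into nested dataclasses."""
    hints = get_type_hints(data_class)
    kwargs: Dict[str, Any] = {}
    for field in fields(data_class):
        if field.name not in data:
            continue  # fall back to the dataclass default, if there is one
        value = data[field.name]
        expected = hints[field.name]
        if is_dataclass(expected) and isinstance(value, dict):
            # A nested dict becomes a nested dataclass instance.
            value = from_dict_sketch(expected, value)
        elif isinstance(expected, type) and not isinstance(value, expected):
            # Type check only: no ranges, lengths, or regex rules here.
            raise TypeError(
                f"{field.name!r} should be {expected.__name__}, "
                f"got {type(value).__name__}"
            )
        kwargs[field.name] = value
    return data_class(**kwargs)


@dataclass
class Address:
    city: str


@dataclass
class User:
    name: str
    age: int
    address: Address


payload = {"name": "Ada", "age": 36, "address": {"city": "London"}}
user = from_dict_sketch(User, payload)
```

Passing `"age": "36"` instead of `36` raises `TypeError` in this sketch, which mirrors the "type checking, but not validation" distinction from the discussion: the type is checked, but nothing stops a negative age.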
Starting point is 00:07:37 Yeah, indeed. So anyway, that's what I got for people. It was news to me, so check it out. I think that's pretty neat. So I am going to continue on with a topic of, I guess, the ever rustification of Python projects. And this one is one that we use every day, pip. So this came to us from Owen Lamont, I think.
Starting point is 00:08:05 Thanks, Owen. So he said, hey, you guys might be interested in this. It's under the project group prefix-dev, and it's rip, for a pip written in Rust. And I was ready to go try it. It's not ready to try yet, but it's still pretty exciting. So kind of the headline: fast, barebones pip implementation in Rust. And it's not just an installer, though. So what does it have so far? You can download and aggressively cache PyPI metadata, and resolve PyPI packages using a project called Resolvo, which is kind of a
Starting point is 00:08:47 Rust thing. And then still on the planned list is actually installing the files. So I'm just chuckling because, yeah, I just jumped the gun, but this is new. So it's fine that it's published early. So first commits look like about two weeks ago. So I'm pretty excited about this. I think it'd be fun to try it out and look at possibly different resolvers and how they handle it versus normal pip. So kind of neat. Yeah, when I saw this too, I was pretty excited. So thanks, Owen, for sending this in. Yeah, it's cool. And Mike says, let her rip.
Starting point is 00:09:28 I love it. Yeah. So it looks like maybe we should swap these last two topics. But I don't know. Let's go with your next topic. Well, I do think that this one would have been pretty good for you to cover. But too bad. I'm already on it.
Starting point is 00:09:44 So here we go. I guess the only stronger tie-in to me is it's in response to a Talk Python episode. So this one comes to us from Marwin, and thank you, Marwin, for sending it in and writing this article called How Not to Footgun Yourself When Writing Tests: A Showcase of Flaky Tests. And it says,
Starting point is 00:10:02 I was writing this article after listening to Talk Python with Gregory Kapfhammer and Owen Parry talking about flaky tests. So that was the subject of that. They basically talked about all of their experience here, which is cool. A definition and really a lot of examples of flaky tests. Brian, did you get to check any of these out? I haven't looked at this yet. No. Well, we'll do it live. Okay. So the first one is really about concurrency, and it said, well, look, I've got a bunch of tests. Maybe I could speed them up by using threading and run a bunch of them. Oh yeah. That'd be fun. However, there's a real simple example of like, hey, I've got an account and I can transfer money from one account to the other. So first account dot withdraw this amount, and then second account deposit that amount, right?
Starting point is 00:10:49 And how could that go wrong? So do a bunch of those. And then, hey, if we want to make those faster, let's run them in some threads, right? Rather than using, say, one of the pytest plugins, like, more properly, right? This is more to highlight what might go wrong, you know? And it turns out that we have the GIL,
Starting point is 00:11:05 and I think Marwin's right. I think people do think that the GIL will just kind of save you from concurrency, right? Because only one thing can run at a time. So how are you going to have a problem? Well, it's one fine thing. It's one kind of bytecode at a time. Exactly, one Python bytecode.
Starting point is 00:11:23 But here's the thing. If your program ever enters into a temporarily invalid state, ever, you may need some kind of concurrency locks or something. And in my reading of Python code, I don't see this very often. And I think actually a lot of people should be doing more locks, honestly. So even in this example, I withdraw some money, and now for just a moment the program is in a temporarily invalid state until it's deposited into the other account, right? Yeah. So there's this moment, like if the GIL says, okay, you ran enough, we're going to switch to the other one, then the other one
Starting point is 00:12:01 reads that state. That's going to be trouble, right? So they were talking about, well, how do you actually check this? And here's something I actually didn't even know. Look, you can actually make that switching back and forth more aggressive. You can control the switching that the GIL does, how much Python on one thread it'll do before it switches to another, by setting the switch interval. And here they set it to one tenth of a millisecond. Oh wow. And then they do a bunch of work and then they put it back. And that's pretty interesting. Did you know you could do that? I didn't. This is pretty cool. I know, this might be worth covering the article right here, just for that, you know? Yeah,
Starting point is 00:12:40 good for testing these race conditions. Yes, exactly. Like, make it worse. And also run it on more cores, potentially. I don't know. Probably that doesn't matter too much. Okay, so to avoid boilerplate, you can reach out to the pytest-repeat plugin. Weren't you just talking about this? I know you're doing some stuff with it.
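The bank-transfer example and the switch-interval trick described above might look roughly like the sketch below. It is not the article's code: `Account`, `transfer`, and `run_transfers` are hypothetical names, and the lock is the fix being suggested in the discussion rather than something taken from the article. `sys.getswitchinterval` and `sys.setswitchinterval` are the real standard-library calls for reading and changing how often the interpreter considers switching threads.

```python
import sys
import threading
from typing import Tuple


class Account:
    def __init__(self, balance: int) -> None:
        self.balance = balance


# One lock guards every transfer, so no thread can observe the moment
# between the withdrawal and the deposit.
_transfer_lock = threading.Lock()


def transfer(src: Account, dst: Account, amount: int) -> None:
    with _transfer_lock:
        src.balance -= amount  # temporarily invalid state begins here
        dst.balance += amount  # and ends here


def run_transfers(n_threads: int = 8, per_thread: int = 1000) -> Tuple[int, int]:
    """Move money back and forth from many threads; return final balances."""
    a, b = Account(10_000), Account(10_000)

    def worker() -> None:
        for _ in range(per_thread):
            transfer(a, b, 1)
            transfer(b, a, 1)

    old_interval = sys.getswitchinterval()
    # One tenth of a millisecond: request thread switches far more often
    # than the default, to shake out race conditions.
    sys.setswitchinterval(1e-4)
    try:
        threads = [threading.Thread(target=worker) for _ in range(n_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        sys.setswitchinterval(old_interval)  # always put it back
    return a.balance, b.balance
```

With the lock in place both balances end where they started. Removing the `with _transfer_lock:` line is the experiment the discussion points at: the result may then differ from run to run, which is exactly what makes such a test flaky.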
Starting point is 00:12:57 Yeah, I'm one of the maintainers on it now. There's my picture. Yeah, I feel like you had actually just mentioned it. Maybe it was the Git article or something. But anyway, recently, I thought you were just talking about that. So yeah, exactly. Also worth pointing out, a similar and more straightforward plugin, possibly, for this job is pytest-flakefinder, which is meant to find flaky tests. Oh yeah. Okay, so let's just hang out here for a second. One of the differences they're saying is that you can repeat your test multiple times with repeat, or with flakefinder you can repeat your whole suite. That's one of the things I need to change for repeat, because you can do the same thing with repeat. You can run the whole suite. It's just kind of hidden in two lines of the readme, and it needs to be more bolded that you can change the scope and repeat the whole thing. Oh, nice. Nice.
Starting point is 00:13:49 All right, randomness. For example, algorithms that are non-deterministic, like heuristic ones. So that's pretty interesting. So they do, what is this, like a distance algorithm or something that's heuristic? So they say, you know, they're testing with a NumPy "close" check, like, are these vectors close, and it says basically fix this by actually, you know, computing the tolerance. And they use a little statistics, like, yeah, probably more statistics than I know, but let's say three standard deviations away, or something like that.
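One way to read "computing the tolerance" is to derive the allowed error from the expected spread of the result instead of hard-coding a number. The sketch below assumes the quantity under test is the mean of `n_samples` independent draws with a known standard deviation. The function names and that statistical setup are assumptions for illustration, not the article's code, and `math.isclose` stands in for the NumPy check.

```python
import math


def mean_tolerance(std_dev: float, n_samples: int, n_sigmas: float = 3.0) -> float:
    """Half-width of an n-sigma band around the mean of n_samples draws."""
    # The standard error of a mean shrinks with the square root of the
    # number of samples.
    return n_sigmas * std_dev / math.sqrt(n_samples)


def assert_mean_close(
    observed: float, expected: float, std_dev: float, n_samples: int
) -> None:
    """Fail only when observed lies outside the three-sigma band."""
    tol = mean_tolerance(std_dev, n_samples)
    assert math.isclose(observed, expected, rel_tol=0.0, abs_tol=tol), (
        f"{observed} is not within {tol} of {expected}"
    )
```

A three-sigma band still lets a correct implementation fail occasionally, so fixing the random seed or widening the band are reasonable companions to this approach.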
Starting point is 00:14:23 It's interesting. Obviously, floating point arithmetic is always trouble, loss of precision is always trouble. But what they talk about here that's interesting is using integers. Like, integers in Python are arbitrarily large, which I think probably complicates C interoperability every now and then, but otherwise it's a good thing generally. However, if you're doing NumPy, NumPy has C backing for a lot of its types, right? Like int32 and so on. So if you specify a particular data type in there when you create your array, right, dtype is np.int32, then you do have to care about the 2.14 billion limit, right? Yeah.
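The 2.14 billion limit can be shown without NumPy. The sketch below uses `ctypes.c_int32` from the standard library as a stand-in for a fixed-width 32-bit integer; it is an illustration of the wrap-around, not NumPy's own behavior in every case, and `wrap_int32` is a hypothetical helper name.

```python
import ctypes

INT32_MAX = 2**31 - 1  # 2_147_483_647, the "2.14 billion" limit


def wrap_int32(value: int) -> int:
    """Store a Python int in a 32-bit signed slot and read it back."""
    # ctypes does no overflow checking; the value simply wraps around.
    return ctypes.c_int32(value).value


# Plain Python ints just keep growing.
python_result = INT32_MAX + 1

# The same sum in a 32-bit slot wraps to the most negative value.
wrapped_result = wrap_int32(INT32_MAX + 1)

# Order of operations matters: the intermediate product overflows even
# though the final answer (10_000_000) would fit comfortably.
a, b, c = 100_000, 100_000, 1_000
overflowed = wrap_int32(a * b) // c    # multiply first, then divide
safe = wrap_int32(wrap_int32(a // c) * b)  # divide first, stays in range
```

The last two lines show why reordering a calculation so that intermediate values stay small can matter when the storage type is fixed-width.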
Starting point is 00:15:10 I mean, you probably know that all the time from C, right? You've got to worry about variable sizes and signed, unsigned, shorts and whatever. Yeah, and be careful about the order of operations so you don't overflow in the middle of a set of operations. Yeah. Let's see. There are some interesting things about fuzzing your data, like sending a bunch of crazy data or even using Hypothesis to try to find edge cases. Timeouts for external systems, be super explicit about those. So there's just, you know, there are a bunch
Starting point is 00:15:38 of cool examples, and this is a really properly long article here. So I think it really highlights a lot of good examples. A follow-up to that podcast episode, but just good for testing as well. Yeah, I can't wait to read that more closely and listen to that episode. I have to admit, I haven't listened to it yet. Yeah, it's a good one. Blaze out in the audience wonders if we have to reinvent these corner cases for Rust. I imagine we probably do, Blaze.
Starting point is 00:16:03 Good point. Yeah, possibly. How extra are you feeling, Brian? I imagine we probably do, Blaze. Good point. Yeah, possibly. How extra are you feeling, Brian? I'm feeling pretty extra. Actually, myself, not too much. I've just been actually doing a lot of personal projects, so I haven't been doing a lot
Starting point is 00:16:17 of work projects to announce. However, those are wrapping up. The personal stuff's wrapping up. So I hope to get more Python People and Python Test episodes out soon, and more course chapters coming. So everything in due time. Nice. How about you? I have some extras as well. First, I just got back from PyBay last night, so that was a lot of fun. PyBay is always a good time. If I see him, I get the video to play even. So really cool environment, and, you know, nice to meet a lot of people there. So for those of you I met, great to meet you. Also, I just want
Starting point is 00:16:50 to give a shout out to Spark Mail. I just started using Spark Mail to try to kind of unify some stuff. What a cool app for macOS email. So people, if you're, like, fed up... I was using different web front ends for different things. And it was like, ah, they're all a little bit different. One has E for archive, one has A for archive, but like ProtonMail, the A for archive only periodically works. Sometimes it works.
Starting point is 00:17:16 And you're like, why is it so frustrating? Like maybe I could just use one thing and all. And that was really fun. So also I think a big part of the development team is in Ukraine. So happy to be supporting those folks as well. Nice. Somewhere it says like made from, you know, made from like hello from Ukraine or something like that, which is cool.
Starting point is 00:17:36 However, one of the challenges is one of my personal email domains is actually backed by ProtonMail. I think I talked about that before. But ProtonMail has end-to-end encryption. And so you can't talk to it with a third-party email client, right? Because it can't decrypt it. It doesn't use IMAP, at least not directly. So if you use ProtonMail and you want to have, you know, a proper, like a standard email client, you can install this thing called Proton... what's it called? Bridge. ProtonMail Bridge is its name. And what it is, is it runs locally on your computer. It does all the end-to-end encryption and then it has a password-protected, but not
Starting point is 00:18:16 end-to-end encrypted, IMAP thing that just runs on localhost. So you just attach to localhost for your IMAP, and then you have ProtonMail plugged into, you know, their example is Outlook. It just made me get a little queasy just thinking about it. But, you know, it also works on Spark Mail and other nice things. So I had been using Superhuman, which was really nice, but that's only Gmail, which is such a hassle. So this works for anything, which makes me super happy. Yeah, I don't think I'm using... What do you do for email?
Starting point is 00:18:43 I just use the web clients, but mostly it's Fastmail. Yeah, nice. That's what I had been doing for 10 years, but I just kind of like, there were just too many and they were, I don't know, weird. And I'm like, let me try this.
Starting point is 00:18:55 I really like it. I think I will check it out. One of the things you brought up, Outlook, I thought it was... I have to use Outlook for work and it still drives me crazy that Ctrl+F is not find. It's forward. Oh my gosh.
Starting point is 00:19:10 Yes. It's terrible. Yeah. This thing is nice. Like it has sort of digital wellbeing stuff where it will only show you, you can have it timeout. So it brings you to this thing like, Hey, check your email, your email, just like two or three times a day. Show me on like this little list here. That'll just show like, say people that are important to me, but nobody else.
Starting point is 00:19:27 You can like block senders. Like, it's pretty sweet. Nice. Cool. Yeah. Yeah. Okay. Next, what I hinted at before is I ran across this YouTube channel called Dust.
Starting point is 00:19:40 Okay. Man, are they making amazing science fiction. Have you seen this? Just, you shared it with me last week. It was pretty cool. Yeah. It's just this independent channel, and they are posting, if you like short sci-fi, like 10, 20 minute sci-fi stories, the production quality is just off the chart. So I recommend to people who are actually interested in this FTL, Faster Than Light, which is about faster-than-light travel. And it's pretty neat. Like the graphics and stuff, it's surprisingly good for what it is. So people can check that out. And also one called
Starting point is 00:20:15 uh, called Oceanus, which is like about this, uh, sort of underwater world. And yeah, it's like this one's 30 minutes. It's long, but anyway, if people want, you know, short form science fiction, this is pretty awesome. I'll link to it in the show notes. Oh,
Starting point is 00:20:31 that's pretty cool. Yeah. Blaze out there says FTL is a great short. I totally agree. It's very, very well done. Yeah. And it's not always the Hollywood,
Starting point is 00:20:40 like of course the good person has to, you know, the hero has to triumph at the end. Of course. It's just a matter of how. Yeah. You never know. It's some of these are pretty open-ended as you might expect a 10 minute show to be.
Starting point is 00:20:51 Well, you know, I think, you know, there's some half hour shows on TV that really are only like 15 minutes if you take the commercials. I know a lot of the comments on, if you look at like the FTL one, for example, the comments are like, this is a better show than Hollywood studios make with millions of dollars and large teams. Like, how are you all doing this? So anyway, I thought people might appreciate this given our audience is probably a little bit techie. Yeah. Cool.
Starting point is 00:21:15 Right. Have you ever done this, like, rewrite your software, like some old junky thing, wrote it in some old code and you're going to rewrite it as the new awesomeness? Frequently, yes. So this is the joke. So there's an amazing video, music video, which is a parody of American Pie. And for those of you who are not familiar with American Pie, it's a really great song. Oh, you should sing it. But it's eight... No, I'm not singing it. It's eight and a half minutes long. And so this guy, Dylan Beattie, he's really talented.
Starting point is 00:21:48 And he redid one that basically is like a journey through all the follies of his different perspectives through his programming career. And it starts out in, like, assembly. Then it goes, I don't know what the next one is. Is it VB6 or something? And then, oh, it's just an amazing, amazing thing. But eight and a half minutes, I'm not going to play it. So I'm just going to say, go watch the video. I'm sure it will connect with you. What do you think? It's very good. And then check out his channel, because there's a bunch of great nerdy videos on his channel. So it's good. Yeah. If we scroll down here, what we find in the recommended: You Give REST a Bad Name. That's funny. That was a
Starting point is 00:22:25 good one. Yeah. The Bug in the JavaScript, I think we featured before, but it's like, starting to think I might need a drink because the bug is in the JavaScript. That's pretty good. Yeah. Huh. Fun. Yeah, fun. Anyway, so this is an entry point into quite a bit of time of programmer fun ideas. Okay. So that's a programmer one. I've had a, like, a dad joke, science joke that I wanted to share. Because I ran across it recently and I just thought it was so funny. So it's just a comment.
Starting point is 00:22:59 There are more hydrogen atoms in a single molecule of water than there are stars in the entire solar system. And I talked to several people about it and they just looked at me blankly and said, that can't be true. I'm like, sure. There's two hydrogen atoms in a molecule of water and there's one star in our solar system. That's awesome. And those two hydrogen atoms. Did the hydrogen atoms come from stars? I don't know. Were they just the stars?
Starting point is 00:23:27 Anything larger than that should have come from stars. Yeah, that's awesome. I love it. It does make you think, because like if you think galaxy, universe, whatever, right? Yeah. But solar system, I mean, solar, singular. Yeah. And I had, it was funny.
Starting point is 00:23:40 Some of the comments were like trying to calculate the volume of the water and how many atoms might be there. And I'm like, no, it's not atoms. It's a single molecule of water, not a glass of it. So pretty funny. Yeah, that's how I first started. Like, well, how large is the glass? How many? Okay, how many molars is that?
Starting point is 00:23:59 And how many? Oh, wait. That's not what it says at all. That's irrelevant. Yeah. I love it. Anyway. All right.
Starting point is 00:24:04 Well. Cool. Well, once again, great chatting with you weekly. Yep. You as well. And thanks to everyone for listening. See y'all later. Bye.
