Python Bytes - #271 CPython: Async Task Groups in Python 3.11

Episode Date: February 16, 2022

Topics covered in this episode: fastapi-events; Ways I Use Testing as a Data Scientist; py-overload; Next-generation seaborn interface; Compile CPython to Web Assembly; Extras; Joke. See the full show notes for this episode on the website at pythonbytes.fm/271

Transcript
Starting point is 00:00:00 Hey there, thanks for listening. Before we jump into this episode, I just want to remind you that this episode is brought to you by us over at Talk Python Training and Brian through his pytest book. So if you want to get hands-on and learn something with Python, be sure to consider our courses over at Talk Python Training.
Starting point is 00:00:17 Visit them via pythonbytes.fm/courses. And if you're looking to do testing and get better with pytest, check out Brian's book at pythonbytes.fm/pytest. Enjoy the episode. Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 271. Really? Wow. Recorded February 16th, 2022. I'm Brian Okken. Hi, I'm Michael Kennedy. And I'm Steve Dower.
Starting point is 00:00:44 Welcome, Steve. So who's Steve Dower? Who's Steve Dower? Yeah. A number of things. Probably most interesting to this audience is I'm a core developer on CPython, one of our Windows experts. So I spend a lot of my time focusing on making Python run better on Windows. I also work at Microsoft, where I also spend a lot of my time making Python run better on Windows. So I'm kind of a bit of a one-trick pony, I guess. But I feel like it's good work, and it helps a lot of people. So if I have a problem with Python on Windows, it's your fault. If there's solutions for Python on Windows, then it's my fault. I'll let other people own the problems. So if I go to the Windows Store, I can
Starting point is 00:01:28 now install Python from there, and you were part of that, right? Oh, I should have had that up on screen, shouldn't I? Yeah, the request actually came from people within Microsoft who were like, hey, why can't we get Python up on the store? And my response to all of these is like, well, if the community is willing to do it, which is half me and is half the people who would have to take over if I stopped doing it, then yeah, we'll go ahead and do it.
Starting point is 00:01:55 And so I got actual work time for that. That was a contribution from Microsoft for that one. But yeah, the community was on board and it's going really well. That's also the one that we tied into the default Python.exe that's on every Windows machine now. So if you go to a brand new machine
Starting point is 00:02:12 and just type in Python, you'll get straight to the PSF's Python. Microsoft is not doing it anymore. We just contributed the change, and now I switch hats and do it with the other hat on. So, you know, it's real Python, right? It's exactly the same as what you get from python.org. It's just delivered, you know, easily: fast install, automatic updates, and a couple of edge issues that we're working on
Starting point is 00:02:37 bringing down. So yeah. Fantastic. Automatic updates. I know this wasn't one of the topics, but now I think I might have to rethink how I'm installing Python on my desktop at work. So that's a cool idea. I have only had Store installs on my own machines since 3.8, I think. Apart from testing, I haven't actually used the regular installer on my machines. Of course, that makes sense. I mean, you know, it's always testing, right? Every time I'm using Python, I'm testing. Chris May out there says,
Starting point is 00:03:11 thank you so much for making my work life in Windows easier. Anytime. Well, Michael, why don't you kick us off with a story or topic? I have got a good one.
Starting point is 00:03:20 So I'm a big fan of FastAPI, and FastAPI being built on Starlette. So by the transitive property, I'm also a fan of Starlette. And there's this thing I want to cover called fastapi-events. So when a request comes in to a particular API endpoint, or if you convert it over to a web app, to a web page sort of request or something, you might want to dispatch that out to, say, WebSocket listeners or something along those lines. So there's this cool project called fastapi-events.
Starting point is 00:03:49 It's pretty small and new. So I'm going to try to give it some visibility. It's only got 36 stars. It's pretty new. But the idea is that you can go through and basically create this middleware handler that will let you say when a request comes in, here's the way when an event is raised, here's the thing that's going to handle it. And then in some API endpoint, you can say dispatch, give the event a name and some dictionary data to be passed along. I suppose it doesn't have to be a dictionary,
Starting point is 00:04:15 it could be whatever. And then in other parts of your code, you can say, I want to just hear about this event that happens no matter what API endpoint received it, no matter where in like how deep down in the code it was received and so on. So then way down here, you just put a little handler decorator on there and say, I want to capture all the events that start with, you know, some substring like cat star for like category, whatever, or this one is actually
Starting point is 00:04:41 literally about cats. And then you can just go through and write these functions that will then handle that, and, you know, you can do whatever you want. You can also pass them off to queues. Like, you can use SQS, the Simple Queue Service from AWS, I believe that is, as the endpoint instead of it just being your app, right? So if you've got, like, lots of scale-out and stuff like that. Wow, cool. So just like a neat way to do logging, or even distributed logging, I guess, if you've got forwarding handlers in there. Yeah, it seems like it, right? Or if, you know, you want to sort of build up, like, here's the request transaction and here we're at this stage, or you could maybe do, like, visibility into long-running workflows with this kind of thing, or something along those lines, I would think. So yeah, there's
Starting point is 00:05:23 also an echo handler for debugging. I kind of like that. Like if I just need to see what is happening, it'll just print whatever's happening. It'll just start printing out all the behaviors that you're logging. So, and then when you want to stop doing that, you just take away the handler and you don't have to search the entire code base for print and find everywhere that you added it in for debugging. Exactly.
Starting point is 00:05:44 Alvaro out there says this looks similar to Django events. Yeah, I suspect it is similar. Anyway, pretty short and simple, but if you're looking for a way to sort of put notifications in a structured way into a FastAPI app, well, here you go. Oh, I'm thinking of a whole bunch of more abusive ways to use this. Yeah, you can write some really impressive spaghetti code with this. Yeah, I'm sure that you can.
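The dispatch-and-handle pattern described for fastapi-events can be sketched with the standard library alone. This is not the fastapi-events API; the names `register`, `dispatch`, and `on_cat_event` are hypothetical, and the wildcard matching is an assumption based on the "cat star" description in the discussion.

```python
from fnmatch import fnmatch

# Registry of (pattern, handler) pairs. Hypothetical names throughout;
# this only mimics the idea, not the real fastapi-events interface.
_handlers = []


def register(pattern):
    """Decorator that subscribes a function to event names matching pattern."""
    def decorator(func):
        _handlers.append((pattern, func))
        return func
    return decorator


def dispatch(event_name, payload):
    """Call every registered handler whose pattern matches event_name."""
    for pattern, func in _handlers:
        if fnmatch(event_name, pattern):
            func(event_name, payload)


seen = []


@register("cat*")
def on_cat_event(name, payload):
    # Record the event; a real handler might log it or forward it to a queue.
    seen.append((name, payload))


dispatch("cat_adopted", {"id": 1})   # matches "cat*"
dispatch("dog_adopted", {"id": 2})   # no handler registered for this name
```

The code that raises an event does not need to know who is listening, which is the decoupling the hosts describe.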
Starting point is 00:06:14 Get the cloud involved in everything. Yeah. So let's switch gears a little bit and talk about testing. Imagine that. I've got a testing topic. So I'm pretty excited. I've been asked a lot about testing pipelines, testing data science stuff. And that's not something I do day to day. So I'm really glad to find people talking about it. So we've got an article from Peter Baumgartner, Ways I Use Testing as a Data Scientist.
Starting point is 00:06:46 And I actually, I just really love this article. It's great. To start with, he starts off with what he uses testing for. As a data scientist, he uses testing to make sure things work, to document his understanding, and to prevent future errors. Well, that seems straightforward, but the reason why he wrote this up is apparently because there's a lot of testing stuff out on the web, but it's geared towards
Starting point is 00:07:15 test engineers or software developers. And he's like, I'm not a software developer. I'm doing something else. I'm doing analysis. I'm not a software person, even though, yeah, you are. But he wanted to write this up in a context where data people might understand it better. For instance, he doesn't even start off with writing tests. His advice is like, if you're doing notebooks or other code, just use assert a lot. So he's using assert all over the place, and he says to use it for as many intermediate calculations and processes as you can, as it makes sense. Because in doing things like checking obvious stuff, like he's got an example of a table count where he's counting up all the yeses.
Starting point is 00:08:09 Well, you can do a little bit of math just to make sure the math works. So like all the yeses and nos and missings should all add up to the same count. Go ahead and throw an assert in there because sometimes it doesn't. And in this example, he said that he actually caught an error because he was looking at two different data frames, so they didn't add up to the same count. So you can catch things like that, just double-checking yourself on things as you go, as you're developing. One of the cool points he has in here is that he has a habit, when he's using notebooks, that whenever he's visually inspecting the output, if you're visually looking at the data that comes out,
Starting point is 00:08:52 maybe write an assert statement to do that analysis so that it's always checked. And this is a cool use of putting asserts in notebooks. I like this idea. The article goes on. It's pretty extensive, talking about checking the data, using Hypothesis to, well, not the data at this part, but your assumptions around the data. So using Hypothesis to check your assumptions, and Hypothesis will show you things that maybe you didn't consider, like NaNs, are you handling those correctly? Empty series or empty data structures
Starting point is 00:09:31 that are going into your code, are you handling those? I mean, Hypothesis does take some handholding, but it does make you think about really what is the shape of the data going in, and do you need to limit what Hypothesis is looking at, or do you need to change your code to handle more things? Hypothesis is great. I've used that for a couple of parsing projects or combining projects. I spent way too long adding all the strategies to be able to test a URL parser that I was calling into. But it's fantastic for finding kind of things that you would not have thought of. Yeah.
Starting point is 00:10:16 I mean, it's finding things, and yeah, that aspect of it seems like the point of it. But the real value I get out of Hypothesis is making sure I really understand the data that's going to come in and thinking through those. It goes on to talk about actually testing your data using things like Pandera, which I wasn't familiar with, and another package called Great Expectations, to look at putting schemas around the data coming in and making sure that the data always matches the schema. Going on to talk about arrange, act, assert and using pytest.
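The habit of asserting on intermediate counts, described above for the yes/no/missing table, might look roughly like this in a notebook cell. The data and variable names are made up for illustration, and plain Python stands in for a data frame.

```python
from collections import Counter

# Hypothetical survey responses; None stands in for a missing answer.
responses = ["yes", "no", "yes", None, "no", "yes", None]

# Tally each category, labelling missing answers explicitly.
counts = Counter("missing" if r is None else r for r in responses)

# The categories should account for every row. If they do not, something
# upstream (a filter, a join, a mislabelled value) has gone wrong.
assert counts["yes"] + counts["no"] + counts["missing"] == len(responses)

# No unexpected labels should have crept in.
assert set(counts) <= {"yes", "no", "missing"}
```

When the totals disagree, the assert fails right at the cell that introduced the problem instead of several steps later.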
Starting point is 00:10:46 pytest comes in with, he's only really writing formal tests when he's writing libraries for other people. But with all these other packages to be able to test with data science, I think this is a great addition to the data science community. Yeah.
Starting point is 00:11:00 Alvaro talks about how this is, you know, often referred to as defensive programming. And then, you know, I feel for him a little bit. He says, at work, we use this with our Fortran code. So there's that. But I do think this is a really interesting way of thinking about defensive code. I think of writing defensive code as like, oh, I'm going to have a bunch of if statements to verify this thing's not None or verify that this is the right type and that it has
Starting point is 00:11:23 a reasonable value and raise exceptions. I haven't really thought so much of it for like notebooks. So that's pretty interesting. And one of the neat things is, if you're actually putting asserts in your code, you can write tests against your code that don't even have any asserts in them. Because the asserts will run within your code, the test will still fail and catch it. So it's kind of cool. Yeah, very cool. Good stuff. Yeah. Steve, I am super excited to hear about what you've got coming up, because this is brand new. Being a core developer,
Starting point is 00:11:56 I feel it is appropriate that you break this news here. I mean, I'm not gonna lie, when it came to, you know, what am I going to talk about, it was: what's the most recently accepted PEP that was somewhat controversial? And I think just as you kind of look down to the section on rejected ideas, which is considerably longer than the accepted ideas, you could probably get a bit of a sense for just what went on with exception groups. And I know, Michael, you just had a conversation. You've learned all about them. So you can take over when I run out here. I'll share my thoughts with it. But yeah, go ahead. I'd love to hear about it. This is sort of inspired by Trio, right? The end goal kind of is. So this is an interesting PEP. And we've got a few of these on the go at the moment. It's kind of like a
Starting point is 00:12:42 stepping stone towards a better programming model or a stepping stone towards better libraries. So it's something that I think in my opinion, very few kind of application developers, kind of the last developer in the chain is often not going to use them and they're not going to need them. But as you go further in towards the lower levels of libraries, especially people writing async schedulers, are going to find incredible value out of them.
Starting point is 00:13:10 Essentially what the idea is, is that when you're running multiple tasks in parallel, if some of them fail, we don't currently have a neat way to capture the exceptions from all of the ones that failed. There's some approaches that would be like, wait for all of them to complete and wrap it in a list. And then you get some exception that contains a list of exceptions, but that's lost a whole lot of context. You can get just
Starting point is 00:13:34 whichever exception happens first, but then you lose all the other exceptions. And there's just been no real way to handle it. So an exception group essentially does bundle up all the exceptions internally in some way. But the really interesting thing is the except star syntax, which I'm going to have to scroll a long way down to find where that comes up. But this is really clever because if you're in that situation where say you're running 10 parallel processes, So here's kind of the first example of it. Then exceptions are no longer control flow at this level. Because if you've run 10 things and you're waiting for 10 things to complete,
Starting point is 00:14:12 you're not actually doing control flow with the exceptions anymore. What you're doing is handling the exception, but then the control flow is going to go back to where it was anyway, because you're going to be doing something different. So for example, if a file doesn't open, then you would want to do something different, right? You're going to stop going on and trying to read from the file. But if you've tried to open 10 files
Starting point is 00:14:36 and three of them failed, then at the inner level, for each file that may have failed, you'll do something different. At the outer level, all you're really going to do is say, hey, this task failed because a file couldn't be opened. And maybe you do something else, but it's at the outside level. So except* takes that exception group and it's going to give you a chance to handle each exception essentially on its own. It will group them together. So in this example, if five tasks report SpamError, then you'll get into this except SpamError block with all five of them at once. Which is just, what is that, a list of SpamError exceptions, something like that? Or a tuple, something like that. I think it's a tuple, I think
Starting point is 00:15:17 with the star syntax, I believe. Something iterable, basically, yeah. Yeah. Something you can iterate over to see the exceptions, but it's really just: this happened at some point, and you process it. And if the group actually contains multiple types of exceptions, then each handler that matches is going to be called for all the exceptions that match that. So you could have this try block raise an exception group that has some spam errors,
Starting point is 00:15:46 it has some foo errors, it has some bar errors, and all three except star blocks are going to get called with the exceptions that match those, which is a bit, it's definitely going to confuse a lot of people. It confuses me, which is why I was keen to actually spend a bit more time digging into it and trying to figure out what's really valuable about this. And I do think the most valuable one is really where the error is CancelledError. Because if for whatever reason, five of your tasks have been canceled, then you need to capture that and do something with that outside of it. But it doesn't necessarily mean you want to throw away the five successful results.
Starting point is 00:16:24 And so you do kind of want to keep a bit of everything going on. And like I say, it's a building block. On its own, this isn't enough to do anything new and useful. The next thing that comes along is task groups. And that's being worked on by, I expect a lot of the same people who worked on exception groups, because with task groups, now you can actually start, there we go, Guido's just merged task groups. Excellent. Then now you can actually run the task group.
Starting point is 00:16:55 And if the group raises any errors, then you'll catch them through an exception group. And so that enables a whole lot of new uses and new ways to use asyncio, or just async generally, in the standard library. As you say, Trio's already had something like this for a while. Yeah, their nursery thing. Yeah. And so that is now being standardized so libraries can kind of share their implementations and work together on it.
Starting point is 00:17:23 So one of the reasons you need this is if I start two web requests and three database queries, and then I go to wait on them, you know, then if several of them fail, the error state captured in totality is a tree of errors that represent, well, this task started this other task, which then had this error, this other one, right? So you need some way to deal with a group of errors that could happen kind of all at once, right? In one of these task groups that gets kicked off. Yeah.
Starting point is 00:17:53 So the new task group thing is super cool. So you say async with TaskGroup as tg, and there's two things that are neat about it. One is, right now, if you fire off a bunch of tasks in async and await style, they're basically unrelated. Like if one fails, that means nothing for the others, right? They're just like, well, here's a bunch of stuff that happened. And this creates a relationship between them, right? So that if one fails, I think it might not schedule new ones, something like that. It's brand new. I'm just seeing the tweets. So I think that that's the story. I believe that was the story with Trio. The other thing that's interesting here, in this example, which I'll link to, from Yury, that he posted when he tweeted about the news,
Starting point is 00:18:31 is notice that the first one says task group, create task for some task, and then await something that creates another task. There's nowhere where you say, store all those values into some list of tasks, then go to the tasks and iterate them and wait for them or gather them or whatever the heck it was you had to do before. This now makes tasks fire and forget. I can say run this, run this. And within that, it could do more of those types of things. And then you just block at the with context manager level to wait for all the tasks to finish, which I think is a real big improvement. Because right now you've got to like constantly juggle, well, I've got to return a task from
Starting point is 00:19:07 this so I can go wait on it later and all those sort of oddities. And this cleans up a lot of that. And of course, being Python, I don't know exactly how the syntax works, but being Python, that tg object, the task group, doesn't actually disappear at the end of the with block. So if that's got results stored into it, then you still have access to those and all of the information about the task group, even after you've waited for it to complete running. Oh yeah, that's cool. Yeah. So I think this is a nice addition to asyncio and Python. This is cool. And apparently 3.11 is coming. Yeah, coming in 3.11. I do see a
Starting point is 00:19:40 question from Sam Morley in the chat there. Is there a way to short circuit so that you don't re-catch certain exceptions? My understanding, and Michael, if you've got a better one, correct me, is that the except blocks work in the same way as regular ones. And the first one that matches a particular exception will handle it. And the later ones don't, even if they would also match. So if SpamError is a subclass of FooError, but there's another subclass of FooError,
Starting point is 00:20:09 then spam errors will get handled by the SpamError handler. The FooError handler will handle all the other ones apart from the SpamError subclass. Yeah. I don't know much about except* other than it was basically a requirement for the task group stuff to be implemented properly. So when one came in, then the other could come in. Yeah, it's the
Starting point is 00:20:29 only feasible way to actually do something as a result of an exception group. Otherwise you do end up with, you know, a very generic exception, and then you write a for loop over all the exceptions that it handled and try and figure it out yourself. So you'd end up rewriting the code and it was just not going to be feasible. It needed to be syntax. And so it is. Yeah, right on. Very, very exciting and very timely. Thanks, Steve.
Starting point is 00:20:54 I'm kind of glad that I put off learning how to do async code until 3.11. This looks easier. It's a good plan and a good time for asyncio. Well, cool. All right. I guess I'm up with the next one, huh, Brian? Yeah. Let's see what you got. I have got some other interesting things. I'm here about showing off the underappreciated projects or the new projects. Just a couple of
Starting point is 00:21:16 stars here. And we've talked about overloading before, but I thought this was a clean way to do it that people could think about. And Steve, I'd definitely love to hear your thoughts on this. So FelixTheC created this library called py-overload. And the idea is basically, once you have type information, then you can have method or function overloading. The idea being like, okay, I have a function called boo or whatever. And you can say, if it takes an integer, I want this implementation to run. If it takes a string, I want some other implementation to run, right? That's sort of the traditional C++, C# definition of it, right? But in Python, we don't have that really, because the language started without types. So how are you going to figure out the type to overload it, you know, right? That just, like, doesn't make any sense.
Starting point is 00:21:57 So traditionally you could use isinstance: we're going to do one thing or another, is it a single thing or is it a list of those things, what are we going to do? But with this one, you can put just @overload and then whatever the signature is. You can say it has no parameters, or it has like two integers, or it has three integers, or it has, like, a list of them, whatever. And there's even a way, somewhere down here, to say like, if none of them match, call this particular one. So basically, it's just straight function overloading in Python, if that's the thing you want. Steve, does this make you cringe or do you like it? Well, you know, I'm not going to lie. I'm not as into static typing in my Python code as a lot of other people. And there's a
Starting point is 00:22:46 lot of, you know, complicated reasons, but I think for a situation like this, I mean, if I was writing a function that took a string or an int, the very first line would be to convert it to whichever type I actually want. And then the rest of the function is going to look identical. Sure. In that case, where there might be a parse type of thing, for sure, I think you wouldn't really do an overload. That would be insane. And my kind of gut feel, and, you know, I'm always open to examples proving me wrong, in which case, you know, I would write the isinstance code that's in those examples. You know, my kind of gut feeling is that if you're doing two drastically different things
Starting point is 00:23:22 in the function based on the type, you need two functions. And once you've got two separate functions, if the people calling don't know what they're passing you, then they've got a problem. And it's not so much my responsibility to fix it with overloading. That said, overloading is really cool. And I am the exact opposite person when it comes to C and C++. I will do all the craziest possible stuff with overloading in those languages because
Starting point is 00:23:48 I think it fits the language and it's a lot of fun. And there's definitely occasions and value for having it in Python. We do have the singledispatch decorator, which has been part of Python for a while, which will do this on the very first parameter. This, you know, very trivially extending it to the whole function signature is really cool. So, you know, if I needed to do this, I would probably want to use a library like this. I would probably reconsider my API design choices up to that point, but I can understand the attraction of getting to reuse the
Starting point is 00:24:28 name and not make the person calling it think too hard about what's actually going to run. Yeah. The place where this sort of seems interesting to me is there's a lot of tricks and juggling people do with, like, *args and **kwargs, where, like, okay, depending on how you pass it stuff, we'll do a bunch of things. Yeah. And I'm always looking for a way to, like, not do that. Yeah. How can I remove that? Like, it's completely opaque. I have to do a Google search and read the docs to figure out what is at all possible here. Well, one of these days, I'm probably going to take all of the kind of patterns for that kind of thing that I've collected and turn it into a book. But writing a book just feels like way too much work.
Starting point is 00:25:08 So, not anytime soon. Sorry. My colleagues at work can ping me at any time and I'll give them a pattern for what they're trying to do. But I do have quite a set of, oh, you're trying to make stuff weirdly work in this way. Here's a nice way that you can enable that without having to resort to type checks and everything. Yeah, yeah. I've been using Python for a long time. And I do remember one of the first things that I noticed is I couldn't do overloading. And at the time, so this was many years ago,
Starting point is 00:25:41 I was using a lot of overloading in my C and C++ code. And I was like, oh, I can't do overloading. But one of the things I've noticed is that, instead of wishing that I had overloading in Python, I don't really use it in C and C++ anymore. It's gone the other way. Yeah, I'd rather be more explicit and just have two functions that, maybe they're similarly named, but they have an appendix that's different, so that if you have different data, you pass it. And I'm with you, Michael. I'd rather have people go, well, which one do I need? I'll look it up, than just passing the wrong
Starting point is 00:26:22 data type. Because, you know, sometimes, if they haven't converted the data, like string versus number is a scary one for me, because I'm often getting my numbers from an API or something and they come in as a string. If you forgot to convert it and you passed it to the wrong thing, and you're really doing something completely different, that's not a good thing. But I got bit by that one just yesterday, updating one of my CI builds to use Python 3.1, I mean 3.10. But, you know, exactly: is it a string, is it a number?
Starting point is 00:26:56 Interesting. Yeah. Yes. But yeah, certainly that conversion would be where, you know, it would be worrying. The other one is, is it a string or is it a list of strings? And that's the one that bites us in Python all the time. And I don't even know how you resolve an overloaded function based on, is it a string or can I iterate it? Well, like in that case, actually, I would rather just have that be part of the function, at the top of it, if it can handle both, to check the type and iterate or not. But, you know, yeah. Well, all right, let me close this out with two quick thoughts. First, I think this is interesting because it's one of the things that's possible with modern
Starting point is 00:27:37 Python. Like, once we've added typing, now you could consider this as a thing, whereas previously it really was highly impractical, I think, as a way to do it. So I think that's kind of cool. And then two, I think it might be an entryway for people who are not yet where Brian is, and I'll put myself in there as well, going like, actually, these things I thought I need, I don't need those. Right. There's a lot of stuff I thought I needed and I haven't used it for three years.
Starting point is 00:28:02 So maybe I actually don't need it. But that's not how you maybe first approach solving your first problem in Python when you're coming from C++ or C#, whatever. This might be a gateway. So anyway, those are my two thoughts.
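The standard library's `functools.singledispatch`, mentioned above as dispatching on the first parameter only, gives a feel for type-based overloading without a third-party package. This sketch does not use py-overload; the `describe` function is a made-up example.

```python
from functools import singledispatch


@singledispatch
def describe(value):
    # Fallback used when no registered type matches.
    return f"something else: {value!r}"


@describe.register
def _(value: int):
    # Chosen when the first argument is an int.
    return f"an integer: {value}"


@describe.register
def _(value: str):
    # Chosen when the first argument is a str.
    return f"a string: {value!r}"
```

Calling `describe(3)` and `describe("3")` runs different implementations under one name, which is exactly the string-versus-number situation the hosts are wary of.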
Starting point is 00:28:15 One more thought from Dean: after Python 3.11, do we get Python 95? There was, you know, there was a Windows 3.12, so I think Python gets to do a 3.12 as well. I think it was only available in China. Interesting. And I'd like to follow on with that, Dean, very funny. I believe that Windows 10 was named, you let me know if you
Starting point is 00:28:41 know different, Steve, Windows 10 was named Windows 10 because there used to be the check for "Windows 9" as the starting string for 95 and 98. So it can't be 9, because then you're going to be 95. So we've got to kick on past it. There were some embarrassingly big language runtimes out there still doing that check that really struggled with Windows 9, and it showed up in enough places that, yeah, I think it just made sense for everyone to just skip it. Not skipping 13. We're skipping 9. It's too unlucky. All right.
Starting point is 00:29:17 Awesome. Brian, what you got for us? Oh, what do I have next? I have the next generation Seaborn interface. So Seaborn is a really awesome plotting library built on Matplotlib. And I, you know, actually, I don't use it that much, but I've always been intrigued by it and kind of watching what plotting libraries do and stuff. And one of the things I was curious about, which I'm really grateful for this article, is some of the history behind it. So the article starts off next generation seaboard interface, talks about the background and goals. But some of the great things in here, let me grab some notes. This work grew out of a
Starting point is 00:29:59 long-running effort to refactor Seaborn internals so that functions, you know, anyway, where I wanted to get at was he was developing a refactor of the internals. And he's like, wait a second, if I want to refactor it, maybe I should expose more stuff. And some of the background was Seaborn was originally conceived of as a toolbox of domain-specific statistical graphics to be used alongside Matplotlib. So the intent was people would use both Seaborn and Matplotlib together. However, people are doing things differently. A lot of people just grab Seaborn by itself. Some people even just learn Seaborn before they even learn Matplotlib, which is an interesting thing. And that's how I thought you
Starting point is 00:30:43 were supposed to be doing this. But the concept was, and then over time, there's a whole bunch of features that have been added to Seaborn to where it's really slick looking, but to do the same thing by hand in Matplotlib is a lot of work. So there's some things where, if Seaborn's almost there, but you need to tweak it a little bit and you have to do things manually, well, then you have to just do everything by yourself. And it's a lot of work. So the idea around this rewrite of the API is let's rework some of the internals so that a lot of the little subcomponents that go inside of a plot are exposed. That way people can get access to more fine-tuned configuration, so they don't really have to just do everything by hand.
Starting point is 00:31:29 It's either all or nothing, Seaborn or Matplotlib. You can kind of do both more easily, which is a kind of a cool idea. There's a whole bunch of great details in here that talk about some of the API changes. Basically, it's exposing the internal, if you create a plot, there's nothing there and it won't show up. You have to create layers on the plot. And then within the layers, you've got marks and, and different components that go into it. I kind of like this idea of building things up, but what I really like is the public aspect of this. So you've got a, you've got a library that's out in the open. It's being used by a lot of people already. And somebody's saying,
Starting point is 00:32:06 maybe we should tweak the API and do something different. And just going ahead and doing that in the open saying, hey, we're going to do this. There's a note at the top or I'm thinking about doing this note at the top saying it's a work in progress. Don't depend on these examples because things might change.
Starting point is 00:32:20 But this is the direction we're trying to go, trying to get feedback from people. And I think this is a lot of things that a lot of people struggle with when they're maintaining packages that have been around for a long time is, I want to do things a little different, but am I going to break everybody? And talking through it. So anyway, this is a great read, especially if you're a data plotting kind of person. Yeah, very nice. I always want to do more with visualization and I'm sure that I have some good data I could pull up.
Starting point is 00:32:48 Yeah. I end up basically just writing APIs on websites these days, but I really should be pulling this up and doing some of these graphs and I'm really happy these are around. Steve, how about you? See, Seaborn's great.
Starting point is 00:33:00 It's always, like, back when I first discovered it, one of its major selling points was simply importing Seaborn would magically make your default Matplotlib charts look nicer. I love it. It's like the Bootstrap of Matplotlib. It really was. They just apply their style by default and every Matplotlib chart suddenly looks nicer, which, you know, Matplotlib's done their own styling work now. So it's less valuable for that. I do like this API. It looks good. And as Dean's pointing out in chat, Matplotlib has an object-oriented plotting API similar to this, possibly identical, just like everyone else. I've never learned the object-oriented API,
Starting point is 00:33:41 but it is there. And, you know, that's the modern one. I know a lot of people say Matplotlib is impenetrable and kind of hard to build things up, but it does have a really nice API there. It's just not the pyplot one that kind of imitates MATLAB's old API. And so, you know, having it there is really nice, and Seaborn having their own is also great. Another nice thing to read about in this is he does a hat tip to ggplot, or ggplot2 or whatever it's called, saying that, yes, a lot of this is going to look similar to ggplot, but it isn't that I'm trying to copy it, or maybe that's definitely an influence, but Seaborn is important because
Starting point is 00:34:33 we think about things differently in Python than we do in R. But also a hat tip to another library that is a wrapper around ggplot; if you just want that, you can do that in Python too, that's available. So it is interesting. We think of these as competing libraries, but they're really not competing with each other. They're working together to push plotting forward. So, yeah. Nice. Dean out there points out you can do plt.style.use with seaborn or ggplot. Let me throw out... Oh, yeah, go ahead, Steve. ggplot is certainly the one to copy from.
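The style-sheet tip from chat can be sketched as follows. This is a minimal example assuming Matplotlib is installed; the data and title are made up, and only the `ggplot` style sheet is used because the seaborn-named style sheets were renamed in later Matplotlib releases.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt

# "ggplot" is one of the style sheets that ships with Matplotlib.
plt.style.use("ggplot")

fig, ax = plt.subplots()
bars = ax.bar(["a", "b", "c"], [3, 1, 2])  # made-up data for illustration
ax.set_title("Bar chart with the ggplot style sheet")
fig.canvas.draw()  # render the figure in memory
```

`plt.style.available` lists the style names a given Matplotlib installation accepts, which is a quick way to see what can be passed to `plt.style.use`.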
Starting point is 00:35:12 I mean, there's a reason that one is as universally popular as any plotting library can possibly be. It's probably competing with Excel for popularity of plotting data, realistically. It's a really nice API and it looks good and everyone's familiar with it. And so, you know, there's nothing wrong with copying from ggplot. Nice. I got one more shout-out to throw into this conversation: the XKCD plotting style for Matplotlib. So you've got, I mean, this is fantastic. It really does look like what XKCD, you know, the comic, would do for these. So this is fantastic. I love it.
Starting point is 00:35:57 What I love is I actually see this. I see this in papers and stuff like that. People just go ahead and use the XKCD style for serious stuff. And it's awesome. I love it. I think there's actually some value to having, like, cartoony-looking graphics, like UI sketches and graphs, to say, look, this is speculative. Don't read too much into it. I'm trying to give you an idea rather than an exact thing. And I think that's sort of like cartoon-looking UI sketches, and this also plays into that.
Starting point is 00:36:25 Yeah. Right? Steve, you got the last one? I got the last one, yeah. So this is another kind of recent delivery from the CPython core team. We can now compile CPython to WebAssembly. Wow. So, and to a lot of people,
Starting point is 00:36:39 that probably means very little, but I guess the brief, brief summary is WebAssembly is kind of what the JavaScript in your browser compiles to before it runs. So it's skipped that initial step of being JavaScript and it's now ready to run in the browser. So it's a lower level. There are tool chains out there that can compile all sorts of languages directly to WebAssembly. And so in this case, we've taken, I believe we use one of the, I don't know the exact tool chain that's used and it may not matter, but it basically takes the C code and compiles that to WebAssembly, gives you a package that can be brought into an Electron app or a Node.js app or a web browser, modern web browser and be
Starting point is 00:37:22 run in the browser. So this page is a little bit dated. There's been a bit more work since then, but I found this is the best overview of where things are kind of at. Long list of C extensions that don't work, probably unsurprising. Like, the browser doesn't have a lot of this stuff in it. Yeah, you don't have, what, no Tk, all the different APIs, the Win32 API, whatever it was delegating to. No tkinter? What? No. Yeah, no tkinter, no subprocess.
Starting point is 00:37:52 ctypes apparently you can do. I've heard there is a libffi port to the Emscripten kind of platform. So how this kind of works is when you take WebAssembly into a browser, it has access to nothing. Like, it starts off in a really enclosed kind of box of things that it can do. And that doesn't actually work for Python at all, because a very early thing that it tries to do is search the file system for what files it should be loading. So we actually build it as part of another platform. Emscripten is one platform that kind of polyfills
Starting point is 00:38:29 a whole lot of native-looking APIs so that code that's compiled on top of Emscripten is able to use it. And this little demo, which I just hit start, REPL, this is running on that. So this is a build of Python 3.11, Alpha 4, built with Clang, running on Emscripten. And I believe I can do this.
Starting point is 00:38:51 Do like os.listdir. And it thinks there's a file system there. Now, that's not my file system. That's in memory. It can be changed to browser storage. But this is entirely in the browser. Like there's nothing downloaded. There's nothing running on my machine here.
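The directory listing tried in the demo is presumably the standard `os.listdir` call. A minimal sketch of what one might type into such a REPL is below; it runs the same way on an ordinary CPython install, where it lists the real current directory rather than an in-memory one. The exact platform string reported by a browser build is not stated in the episode, so it is simply printed rather than assumed.

```python
import os
import sys

# Report which platform this interpreter believes it is running on.
print(sys.platform)

# List the entries in the current working directory. In the browser demo
# this would be an in-memory file system provided by the host platform.
entries = os.listdir(".")
print(sorted(entries))
```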
Starting point is 00:39:04 There's nothing running in the cloud. It's literally in the browser. I can probably freeze my browser with this. Like I can do an infinite loop and do it, do it. Let's see if this cuts me off. I'll just let that run. Hit clear.
Starting point is 00:39:17 What happens if you hit clear? Start it again. Are we going to start again? Yeah, now it's done. A refresh. Yeah. So, yeah. And there's a second one that the actual build as it was committed supports, which is WASI, W-A-S-I. That's a slightly different approach to adding all the functionality around a WebAssembly module.
Starting point is 00:39:40 So it's a little bit more flexible, a little bit more controlled. Emscripten is really like, give me a POSIX system inside my browser, all in memory. And so we have two options. And these are available in the main branch. At the moment, we're not shipping pre-built modules for WebAssembly. That might be a possibility. If that's something that you'd like to see, then I guess go to discuss.python.org and post about it. It's probably a post there. I should have looked for a post there. But we're not currently doing pre-built releases,
Starting point is 00:40:14 but I think we could. I think this is one of these options where the WebAssembly build is totally portable. And so if we build it, we can distribute it. And then websites that want to do something like this could just download it from our servers and run it. So I think there's a lot of potential here. And it's at the potential stage, right? This is another stepping stone to bigger and better things. Our responsibility as the core team is to enable it. And now we really
Starting point is 00:40:44 want people to come in and pick this up and do awesome things with it. Firstly, so we can figure out what gaps still need to be filled, but also just to expand the growth and the reach of Python, to bring it into places that currently doesn't exist or can't work and give it new life in new places, open it up to new people. This is fantastic. Congratulations. And so the work for this primarily done by Katie Bell,
Starting point is 00:41:10 Christian Heimes and Ethan Smith. So I think Christian got to do all the merge commits, but it's definitely been a number of people working on this for a while. Those are the primary three. I'm really excited for the possibility for this. I think one of the things that could be amazing, obviously running it in the front end is a thing that could be done.
Starting point is 00:41:31 I saw the documentation said it was about 10 megs to download it. I'm sure you can put that on like a CDN. So you kind of hit it once somewhere for a particular version of Python. That's pretty good. You know, we all have pretty fast things these days. Yeah.
Starting point is 00:41:45 It's still bigger than Zoom. Yeah. What gets me really excited though is putting that into an Electron.js app. Yeah, absolutely. Right. Because Electron.js is a really interesting way to bring web technologies cross-platform as much as I, I like, oh, I said an Electron app. Still, it's, it's really opened up the possibility for a lot of things, but it really has meant, okay, you're doing TypeScript, you're doing JavaScript,
Starting point is 00:42:08 and you just have to go full on in that world. So here you could still do like your front end and whatever, but having the core logic of that desktop app being in Python running in this, that's exciting if that can be put together. I should also add two things. Pyodide is a project that people have probably heard of before,
Starting point is 00:42:27 which has been working on this for considerably longer than the core team has. And so I think a lot of the patches that needed to happen have come from them. And they now get to spend more time focusing on the data science stack, which, because they've got ports of NumPy and Pandas and other libraries to actually do data science in the browser. And the other interesting thing that I saw was someone from CondaForge
Starting point is 00:42:50 suggesting that they could elevate WASM builds to their kind of automated level. And so all of conda-forge may suddenly become available to use in the browser on top of a build of Python like this. Wow. That would unlock so much. That would be incredible. Interesting.
Starting point is 00:43:08 I imagine initially it would unlock a lot of bug reports. But we need to work through those first. Yeah, I was just thinking of take the top 1,000 most popular packages. Could you get 90% of those compiled to other WebAssembly things that then could be included and then imported here somehow. Exactly. And the top 1,000 with native code,
Starting point is 00:43:29 because it's only the native code, right? The Python code still compiles in the browser, just like it would in the CPython interpreter. It's only the native code that has to be ported and built. And so once that's done, then we're up and running. So the top 1,000 is probably more than you need. Yeah, absolutely. All right, awesome. I'm looking forward to seeing where this goes. So many neat options. There's just cool ways to, say, ship the Python runtime
Starting point is 00:43:55 to places where maybe it would have been hard to get. Now you drop this WASM file plus something that can run WASM, and then now you've got a deployable, shippable, yeah, CPython runtime. Without tkinter and a few things, but still. You might not miss it, depending on what you're doing. I mean, most apps are not tkinter apps, is all I'm saying. I'm not trying to bang on it. No, no, but every time it comes up that it's still there, I'm like, really? We still have that? Okay, don't ask me what I've been spending my week working on, Brian. It's not going to make you happy. Are you creating a tkinter-based killer for Textual? No, no, unfortunately not.
Starting point is 00:44:37 Awesome. All right. Well, Brian, are we at extras? We are at extras. Do you want to kick us off? I will kick us off. So I've got a couple of things that I think are interesting. Let's start with this one. So we've talked about Oh My Zsh, right? Yeah, a lot. Love it. I just came across, realizing that actually this is a Portland company that puts together the sort of core maintainers of that. So I just thought it was funny to give a quick shout-out to Planet Argon. They're not really in the Python space, but they're in Portland, which I thought was kind of fun. And then what is this? This next one comes to us, I think, via PyCoders. That's where I got this. Django just reformatted all of Django with Black.
Starting point is 00:45:27 And I know I was just having a discussion with somebody like, oh, your code doesn't follow PEP 8. Oh, I don't want it to follow PEP 8. Yeah, but if people are going to use your code, like literally you got imported, then it probably should follow, like it should not come up with all sorts of warnings. And so I thought it was interesting
Starting point is 00:45:42 that Django just said, everything, make it black. Steve, what do you think about that? I'm totally on board with just using black on everything. I don't agree 100% with the style, but I agree 100% with not arguing about it. So it's close enough. It's close enough. Yeah, plus there's enough tweaks that make it good.
Starting point is 00:46:06 I'm really grateful that you can tweak the line length, for instance. Yes. Because, I mean, here's an example: what if I want it really short? So, no, seriously, for formatting the code for the pytest book, I wanted them all quite a bit shorter so that they fit better in a book format. And I could use Black to cover that and convert everything with Black to make them like that. So it was great. Nice. Awesome. All right.
Starting point is 00:46:33 And the final one is I have been doing some stuff with more fun things on YouTube, trying to put these little short videos together. So here's a, how long is it? Six-minute, 44-second video on using timedelta to get, like, how many weeks are in some time span. So, you know, the cool tricks you can do there. So people should check that out. That's my latest Python short thing.
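One way to count the weeks in a time span with `timedelta` is to divide one `timedelta` by another. The sketch below is not taken from the video; the dates are arbitrary examples chosen for illustration.

```python
from datetime import date, timedelta

# Arbitrary example dates (assumptions, not from the episode).
start = date(2022, 1, 1)
end = date(2022, 2, 16)

# Subtracting two dates gives a timedelta.
span = end - start

# Floor-dividing by a one-week timedelta gives whole weeks as an int.
whole_weeks = span // timedelta(weeks=1)

# True division gives the span in weeks as a float.
weeks_as_float = span / timedelta(weeks=1)

print(span.days, whole_weeks, round(weeks_as_float, 2))
```

Dividing by a `timedelta` keeps the unit explicit, so the same pattern works for hours or days by changing the divisor.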
Starting point is 00:46:54 And yeah, that's it for my extras. Okay, so I've got a quick one. I don't have a graph or something to throw up, but I was looking at the Git history of a repo and trying to figure out whether I included one of my co-workers' branches in it, if I merged it yet or not, things like that. And I was on the command line, and I'm like, can I just do this with the command line? Apparently, I didn't
Starting point is 00:47:20 know this exists. So apparently git log --graph just shows you the Git graph, your branch history or the branch graph, on the command line. And I didn't know it was there until today. I started using it, tweeted about it. And then a whole bunch of people said, oh, you should use these flags, too. That makes it even nicer. So it's fun to learn something old as a new thing. And then somebody else told me, how about gitk? So gitk is a graphical browser of your repository that just comes with most Git installs that I didn't know was there. I'm like,
Starting point is 00:48:01 do I need to install it? I'll just type it and see what happens. And it popped up this graphical interface. I'm like, this is great. This is exactly what I wanted. So gitk is pretty cool. I didn't know about that one. I've seen the command. I've never actually run it to see what happens. So I was not feeling quite brave enough.
Starting point is 00:48:20 Did it mean Git kill or was it something else? I was just most Git commands scare me until I've run them the first 100 times or so. How about you? Yeah, I got a couple of extras. Can I get my screen back up there? I was feeling a little bad about being a bit self-serving here, but then Michael just promoted his video series. So I don't feel too bad anymore.
Starting point is 00:48:40 Get it on, man. This is the Python 3.11 Alpha 5 download page. And we have a new addition this time around, which is this Windows installer for ARM64. So ARM64 is not a massive, massive platform for Windows yet, but it's growing and we want to have Python support on it. So the builds have been running in the background for a while, but we've never actually released it.
Starting point is 00:49:01 We're hoping to get it out with 3.11. That is going to depend largely on: do people use it, do they love it, do they hate it? My experience so far with it has been that it is noticeably faster, at least on the ARM64 devices I've had access to, compared to the Intel devices, which is really, really cool. And, like, the test suite is kind of 30 to 50 percent faster, which is huge. Huge, really. So I think there's a lot of potential here. I may just have had awesome hardware, I'm not sure. It was a virtual machine, so it's kind of hard to tell. Yeah. But this is fantastic. This is new. If you have an ARM64 device like a Surface Pro X, or there's a couple out there from other manufacturers. I'm
Starting point is 00:49:45 running Windows 11 ARM on my MacBook Pro through Parallels. Then please download and install it and let me know how it goes. If you get it through the Windows Store, which is currently still not public, you need to get the link basically from one of my tweets to the Windows Store, you'll automatically get the ARM64 version on ARM64 as well. So this installer is the traditional one. Otherwise, you get it through the store. The other thing, which I wasn't going to do, and then I spent a bit of time working on this,
Starting point is 00:50:16 a couple of years back at the core dev sprints, I forget who I was chatting with. I was chatting with one of the other core devs about everyone typing from collections import deque and misspelling it. So a deque, D-E-Q-U-E, is a double-ended queue, a very useful data type for certain purposes, but people would type it deck, as in D-E-C-K, because it's phonetically what it sounds like. Exactly. And so as a bit of a joke, I made a package that when you installed it, it would give you from collections import deck.
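For reference, the real standard-library class is `collections.deque`. A short sketch of why it is called double-ended: items can be added and removed cheaply at both ends. The values used are arbitrary.

```python
from collections import deque

# Start with three items.
d = deque([1, 2, 3])

# Add to both ends.
d.appendleft(0)   # deque is now 0, 1, 2, 3
d.append(4)       # deque is now 0, 1, 2, 3, 4

# Remove from both ends.
first = d.popleft()  # takes the leftmost item
last = d.pop()       # takes the rightmost item

print(first, last, list(d))
```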
Starting point is 00:50:49 And obviously the thing that that collection should be is a deck, a double-ended queue of 52 cards representing the cards in a normal deck. And over time, for various reasons, it's just kind of grown. And I recently added support for calculating poker hands to it. And so now you could build a game with this. It uses enums. It's got shuffling, dealing. Jokers are optional. And you can calculate a poker hand, and yeah, you can even compare them: poker hand one greater than poker hand two. I spent a lot of time, most of my work on this over the last week was writing the tests
Starting point is 00:51:27 that proved how incorrect that function was until I wrote the tests for it. But now at this point, yeah, you can look at the values it gives back. It's actually a tuple with an enumeration saying what the hand is, and then a selection of the card values in a way that makes the tuples comparable. So you can actually look and see, you know, it's a pair of aces. It'll have the number 14 there for the ace. And the next highest card was a 10. So if someone else has a pair of aces and their next highest card was a nine, then you're still going to compare higher.
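The comparable-tuple idea described here can be sketched with an `IntEnum` and plain tuples. This is not the package's actual API: the `HandRank` enum, its members, and the hand values below are hypothetical names made up to illustrate how tuple comparison does the work.

```python
from enum import IntEnum


class HandRank(IntEnum):
    """Hypothetical hand categories, ordered from weakest to strongest."""
    HIGH_CARD = 1
    PAIR = 2
    TWO_PAIR = 3


# Each hand is (category, then card values in tie-breaking order).
# An ace is represented as 14, as mentioned in the conversation.
pair_of_aces_ten_kicker = (HandRank.PAIR, 14, 10)
pair_of_aces_nine_kicker = (HandRank.PAIR, 14, 9)
two_pair_fives_and_threes = (HandRank.TWO_PAIR, 5, 3, 2)

# Tuples compare element by element, so the category decides first and
# the card values only break ties within the same category.
print(pair_of_aces_ten_kicker > pair_of_aces_nine_kicker)
print(two_pair_fives_and_threes > pair_of_aces_ten_kicker)
```

Because the ordering falls out of built-in tuple comparison, `max()` and `sorted()` work on a list of such hands without any custom comparison code.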
Starting point is 00:51:58 So I'm pretty proud of that function. Yeah, that's clever. But yeah, this is, and it's code style Black. Nice. Oh, very nice. Yeah. It's one short file. And it does still override deck in the collections module for you.
Starting point is 00:52:14 I love it. It doesn't... so deque is still there, right? Like, D-E-Q-U-E is untouched. But if you try and import D-E-C-K from collections, then you'll get it. Nice. Hey, one other quick thing to shout out. We're hiring contractors to help develop features for PyPI.org.
Starting point is 00:52:32 It says at the top of PyPI.org. Do you know anything about this? I guess if people want to work on PyPI.org, that's pretty neat. Yeah, no, they have funding. And there is a post that describes the surveys i believe this is the organizational accounts project they're looking at yeah organization accounts so uh if if like me you are kind of the one of the primary python people at your company then you'll spend a lot
Starting point is 00:52:59 of time helping people publish packages to PyPI, if that's the business you're in. Certainly is for us. There's a lot of packages from Microsoft up on PyPI. And the kind of corporate account for that is, it does exist. We have a user account with 483 projects. This is all manually curated right now because PyPI just doesn't have the functionality to kind of hand out permissions to it safely. The teams and all that kind of stuff.
Starting point is 00:53:26 Yeah. So I believe the idea of this is to add that functionality to PyPI. So I would love it if someone comes along and does this. I believe we've contributed some of the funding towards this. So, you know. Yeah, it looks like it. So Steve, I've got a 3.11 question for you. So 3.11 is in alpha.
Starting point is 00:53:46 So what does that mean really? Does that mean I can like start using 3.11 or should I wait? It means you can. It means we still may change stuff that will break you and we won't apologize. Okay, but if my code runs,
Starting point is 00:54:02 can I trust it? Or? Yeah, if all of your tests pass, then you should be able to trust it fairly well. Certainly existing code. There will be new features available in the alpha that have not been as thoroughly tested yet or may change again. But again, if you're running existing code, you won't be using those.
Starting point is 00:54:21 So that won't matter. But yeah, it's totally viable to use. You can specify 3.11-dev on GitHub Actions. I believe it compiles from source when you do that now. They don't have a build there. They should for beta. Beta is when we really want people to start doing stuff. At this point, alpha is so that people can test the new features, kind of targeted testing on anything new that we've put in. Beta is when we really want people to start porting libraries, especially kind of the core libraries, to be able to work with it and just test it. Because if existing code doesn't work on the beta, we want to hear about that so we can fix it in the runtime and not force you to fix it
Starting point is 00:55:01 in your code. Okay. But if I'm like a package maintainer, I can start, if it's got GitHub actions for it, I can start testing, having my CI test against 3.11 then too. Absolutely. You will likely want to mark it as it's okay if it fails. Okay. Yeah, awesome. Okay, thanks.
Starting point is 00:55:17 Should we do a joke? Let's do a joke. Let's do a joke. All right, so this one, coming from the programming humor one, and it's a, like you talked about the visualization stuff, right? And this one, it says, there's a search that says how to get labels on MATLAB bar charts to be horizontal. Look what the result came back from Google was, says you're not alone. Help is available. If
Starting point is 00:55:42 you're experiencing difficult thoughts, please call 116-123. Or if this is an emergency, call 999. And the underlying bit here is: it isn't that drastic, Google, but thanks. And I believe it might also work on Bing. I'm not sure. If you scroll down, I think there's a Bing equivalent down here. Yeah. Not just Google. Bing thinks you're an emergency as well. Yeah, that's awesome. Yeah, so it's not
Starting point is 00:56:13 that much of an emergency. I'll go to Stack Overflow. Nice to know that the big search engines are looking out for our mental health. That's right. People become very upset after failing to get those bar charts. This is not the emergency, but it's coming next when you realize what the answer is. Something, I don't know. Nice. Anyway, that's the joke I found for us, guys. Well, thanks everybody.
Starting point is 00:56:37 Thanks, Steve, for coming. And thanks, Michael, again. Yeah. Thanks for having me. Thanks, all. Bye. Bye. Bye, Ron. Thanks for listening to Python Bytes. Follow the show on Twitter via at Python Bytes. That's Python Bytes as in B-Y-T-E-S. Get the full show notes over at pythonbytes.fm. If you have a news item we should cover,
Starting point is 00:56:56 just visit pythonbytes.fm and click Submit in the nav bar. We're always on the lookout for sharing something cool. If you want to join us for the live recording, just visit the website and click Livestream to get notified of when our next episode goes live. That's usually happening at noon Pacific on Wednesdays over at YouTube. On behalf of myself and Brian Okken, this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and colleagues.
