Python Bytes - #271 CPython: Async Task Groups in Python 3.11
Episode Date: February 16, 2022
Topics covered in this episode: fastapi-events; Ways I Use Testing as a Data Scientist; py-overload; Next-generation seaborn interface; Compile CPython to Web Assembly; Extras; Joke
See the full show notes for this episode on the website at pythonbytes.fm/271
Transcript
Hey there, thanks for listening.
Before we jump into this episode,
I just want to remind you that this episode
is brought to you by us over at Talk Python Training
and Brian through his pytest book.
So if you want to get hands-on
and learn something with Python,
be sure to consider our courses over at Talk Python Training.
Visit them via pythonbytes.fm slash courses.
And if you're looking to do testing
and get better with pytest,
check out Brian's book at pythonbytes.fm slash PyTest. Enjoy the episode.
Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.
This is episode 271. Really? Wow. Recorded February 16th, 2022. I'm Brian Okken.
Hi, I'm Michael Kennedy.
And I'm Steve Dower.
Welcome, Steve. So who's
Steve Dower? Who's Steve Dower? Yeah. A number of things. Probably most interesting to this
audience is I'm a core developer on CPython, one of our Windows experts. So I spend a lot of my
time focusing on making Python run better on Windows. I also work at Microsoft, where I also
spend a lot of my time making Python run better
on Windows. So I'm kind of a bit of a one-trick pony, I guess. But I feel like it's good work,
and it helps a lot of people. So if I have a problem with Python on Windows, it's your fault?
If there are solutions for Python on Windows, then it's my fault. I'll let other people own the problems. So if I go to the Windows Store, I can
now install Python from there, and you were part of that, right? Oh, I should have
had that up on screen, shouldn't I? Yeah, that was actually, the
request came from people within Microsoft who were like, hey, why can't we get Python up on
the store?
And my response to all of these is like,
well, if the community is willing to do it,
which is half me and is half the people who would have to take over if I stopped doing it,
then yeah, we'll go ahead and do it.
And so I got actual work time for that.
That was a contribution from Microsoft for that one.
But yeah, the community was on board
and it's going really well.
That's also the one that we tied
into the default python.exe
that's on every Windows machine now.
So if you go to a brand new machine
and just type in Python,
you'll get straight to the PSF's Python.
It's not, Microsoft is not doing it anymore.
We just contributed the change
and now I switch hats
and do it with the other hat on.
So, you know, it's real Python, right? It's exactly the same as what you get from python.org. It's just delivered,
you know, easily: fast install, automatic updates, and a couple of edge issues that we're working on
bringing down. So, yeah. Fantastic. Automatic updates? I know this wasn't one of the topics, but
now I think I might have to rethink how I'm installing Python on my desktop at work. So that's a cool idea. I have only had
store installs on my own machines since 3.8, I think. Apart from testing, I haven't
actually used the regular installer on my, I mean... Of course, that makes sense. I mean, you know,
it's always testing, right?
Every time I'm using Python,
I'm testing.
Chris May out there says,
thank you so much
for making my work life
in Windows easier.
Anytime.
Well, Michael,
why don't you kick us off
with a story or topic?
I have got a good one.
So I'm a big fan of FastAPI
and FastAPI being built on Starlette.
So by the transitive property, I'm also a fan of Starlette.
And there's this thing I want to cover called fastapi-events.
So when a request comes in to a particular API endpoint,
or if you convert it over to a web app, to a web page sort of request or something,
you might want to dispatch that out to say like WebSocket
listeners or something along those lines. So there's this cool project called fastapi-events.
It's pretty small and new. So I'm going to try to give it some visibility. It's only got 36 stars.
It's pretty new. But the idea is that you can go through and basically create this middleware
handler that will let you say, when a request comes in and an event is raised,
here's the thing that's going to handle it.
And then in some API endpoint, you can say dispatch,
give the event a name and some dictionary data
to be passed along.
I suppose it doesn't have to be a dictionary,
it could be whatever.
And then in other parts of your code,
you can say, I want to just hear about this event
that happens no matter what API endpoint received it,
no matter where in like how deep
down in the code it was received and so on. So then way down here, you just put a little
handler decorator on there and say, I want to capture all the events that start with,
you know, some substring like cat star for like category, whatever, or this one is actually
literally about cats. And then you can just go through and write
these functions that will then handle that. And, you know, you can do whatever you want. You can
also pass them off to queues. Like, you can use SQS, the Simple Queue Service from AWS, I believe
that is, as the endpoint instead of it just being your app, right? So if you've got, like, lots of scale
out and stuff like that. Wow, cool. So it's just a neat way to do logging, or even distributed logging, I guess. If you've got forwarding handlers in there, you can just... Yeah, yeah, it seems
like it, right? Or if, you know, you want to sort of build up, like, here's the request transaction
and here we're at this stage. Or you could maybe do, like, visibility into long-running
workflows with this kind of thing, or something along those lines, I would think. So yeah. There's
also an echo handler for debugging.
I kind of like that.
Like if I just need to see what is happening, it'll just print whatever's happening.
It'll just start printing out all the behaviors that you're logging.
So, and then when you want to stop doing that, you just take away the handler and you don't
have to search the entire code base for print and find everywhere that you added it in for
debugging.
Exactly.
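As a rough illustration of the pattern described here (dispatch a named event with a payload, and let handlers subscribe by a name pattern such as cat*), here is a minimal standard-library sketch. It is not the fastapi-events API: the LocalEvents class, its register and dispatch methods, and the event names are all hypothetical, made up for this example.

```python
from fnmatch import fnmatchcase


class LocalEvents:
    """Hypothetical in-process event bus; not part of fastapi-events."""

    def __init__(self):
        self._handlers = []  # list of (pattern, function) pairs

    def register(self, pattern):
        """Decorator that subscribes a function to event names matching pattern."""
        def decorator(func):
            self._handlers.append((pattern, func))
            return func
        return decorator

    def dispatch(self, name, payload=None):
        """Call every handler whose pattern matches; return how many ran."""
        called = 0
        for pattern, func in self._handlers:
            if fnmatchcase(name, pattern):
                func(name, payload)
                called += 1
        return called


events = LocalEvents()
seen = []


@events.register("cat*")
def handle_cat_events(name, payload):
    # Receives every event whose name starts with "cat".
    seen.append((name, payload))


@events.register("*")
def echo(name, payload):
    # Debugging aid in the spirit of an echo handler: print everything.
    print(f"event={name} payload={payload}")
```

Removing the echo registration turns the debug output off in one place, which is the convenience mentioned in the conversation.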
Alvaro out there says this looks similar to Django events.
Yeah, I suspect it is similar.
Anyway, pretty short and simple,
but if you're looking for a way to sort of put notifications
in a structured way into a FastAPI app, well, here you go.
Oh, I'm thinking of a whole bunch of more abusive ways to use this.
Yeah, you can write some really impressive spaghetti code with this.
Yeah, I'm sure that you can.
Get the cloud involved in everything.
Yeah.
So let's switch gears a little bit and talk about testing.
Imagine that.
I've got a testing topic.
So I'm pretty excited. I've been asked a lot about testing pipelines and testing data science stuff, and that's not
something I do day to day. So I'm really glad to find people talking about it. So we've got
an article from Peter Baumgartner, Ways I Use Testing as a Data Scientist.
And I actually, I just really love this article.
It's great.
To start with, he starts off with what he uses testing for.
As a data scientist, he uses testing to make sure things work, to document his understanding,
and to prevent future errors.
Well, that seems straightforward, but the
reason why he wrote this up is apparently because there's a lot of
testing stuff out on the web, but it's geared towards
test engineers or software developers. And he's like, I'm not a software developer.
I'm doing something else. I'm doing analysis. I'm not a software
person. Even though, yeah, yeah, you are. But the point is to write this up in a context where
data people might understand it better. For instance, he doesn't even start off with
writing tests. His advice is, if you're doing notebooks or other code, just use assert a lot.
So he's using assert all over the place. He says to use it for as many
intermediate calculations and processes as makes sense, for doing things
like checking obvious stuff. He's got an example of a table count where he's counting up all the yeses.
Well, you can do a little bit of math just to make sure the math works.
So like all the yeses and nos and missings should all add up to the same count.
Go ahead and throw an assert in there because sometimes it doesn't.
And in this example, he said that he actually caught an error because he was looking at two different data frames, so they didn't add up to the same count. So you can
catch things like that, just double checking yourself on things as you go along, as you're developing.
One of the cool quotes he has in here is that he has a habit, when he's using notebooks: whenever he's visually
inspecting the output,
if you're visually looking at the data that comes out,
maybe write an assert statement to do that analysis so that it's
always checked.
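The yes/no/missing check described here can be sketched in a few lines. This is only an illustration with made-up data and a hypothetical summarize_answers function, not code from the article; it uses the standard library rather than a data frame.

```python
from collections import Counter


def summarize_answers(answers):
    """Count yes / no / missing answers and sanity-check the totals.

    `answers` is a hypothetical list where None stands for a missing value.
    """
    counts = Counter("missing" if a is None else a for a in answers)
    yes, no, missing = counts["yes"], counts["no"], counts["missing"]
    # The three categories should account for every row. If they do not,
    # something unexpected is in the data (or we counted the wrong table).
    assert yes + no + missing == len(answers), "unexpected answer values present"
    return {"yes": yes, "no": no, "missing": missing}
```

One caveat worth knowing: assert statements are skipped when Python runs with the -O flag, so they suit notebooks and development checks more than production validation.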
And this is a cool use of putting asserts in notebooks.
I like this idea.
The article goes on.
It's pretty extensive, talking about checking the data using Hypothesis. Well, not the data at this part, but your assumptions around the data.
So using Hypothesis to check your assumptions, and Hypothesis will show you things that maybe you didn't
consider, like NaNs: are you handling those correctly? Empty series or empty data structures
that are going into your code: are you handling those? I mean, Hypothesis does
take some handholding, but it does make you think about really what is the shape of the data going
in. And do you
need to limit what Hypothesis is looking at, or do you need to change your code to handle more
things? Hypothesis is great. I've used that for a couple of parsing projects or combining projects.
I spent way too long adding all the strategies to be able to test a URL parser that I was calling into.
But it's fantastic for finding kinds of things that you would not have thought of.
Yeah.
I mean, it's finding things, yeah, and that aspect of it seems like the point of it.
But the real value I get out of Hypothesis is thinking, making sure I really understand
the data that's going to come in and thinking through those.
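To make the "what is the shape of the data going in" point concrete, here is a small sketch that checks a function against the kinds of inputs mentioned (NaNs, empty input). It deliberately does not use Hypothesis: the EDGE_CASES list is a hand-picked stand-in for what a property-based tool would generate, and safe_mean is a hypothetical function invented for the example.

```python
import math


def safe_mean(values):
    """Mean of the non-NaN values, or None if nothing usable is left."""
    clean = [v for v in values if not math.isnan(v)]
    if not clean:
        return None
    return sum(clean) / len(clean)


# Hand-picked inputs of the kind a property-based tool tends to surface.
EDGE_CASES = [
    [],                           # empty input
    [float("nan")],               # only NaN
    [1.0, float("nan"), 3.0],     # NaN mixed with real values
    [0.0],                        # single value
    [-1.5, 1.5],                  # values that cancel out
]


def check_edge_cases():
    """Property: the result is either None or a real (non-NaN) number."""
    for case in EDGE_CASES:
        result = safe_mean(case)
        assert result is None or not math.isnan(result), case
    return True
```

With Hypothesis the list of cases would be replaced by generated inputs, but the property being asserted would look much the same.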
It goes on to talk about actually testing your data using things like Pandera,
which I wasn't familiar with, and another package called Great Expectations,
to look at putting schemas around the data coming in
and making sure that the data always matches the schema.
It goes on to talk about arrange, act, assert and using pytest.
With pytest,
he's only really writing formal tests
when he's writing libraries for other people.
But with all these other packages
to be able to test with data science,
I think this is a great addition
to the data science community.
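For readers who have not seen the arrange, act, assert layout, a minimal sketch follows. The normalize function and the test name are hypothetical; the test is a plain function that pytest would collect, though nothing in it depends on pytest.

```python
def normalize(values):
    """Scale values so that they sum to 1 (hypothetical example function)."""
    total = sum(values)
    assert total != 0, "cannot normalize values that sum to zero"
    return [v / total for v in values]


def test_normalize_sums_to_one():
    # Arrange: set up the input data.
    values = [1, 1, 2]

    # Act: call the code under test.
    result = normalize(values)

    # Assert: check the outcome.
    assert result == [0.25, 0.25, 0.5]
    assert sum(result) == 1.0
```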
Yeah.
Alvaro talks about how this is,
you know, often referred to as defensive programming.
And then, you know, I feel for him a little bit.
He says, at work, we use this with our Fortran code.
So there's that.
But I do think this is a really interesting way of thinking about defensive code.
I think of writing defensive code as like, oh, I'm going to have a bunch of if statements
to verify this thing's not None or verify that this is the right type and that it has
a reasonable value and raise exceptions.
I haven't really thought so much of it for like notebooks.
So that's pretty interesting.
And one of the neat things is,
if you're actually putting asserts in your code,
you can write tests against your code that don't even have any asserts in them,
because the asserts will happen within your code and the test will still fail and catch it. So it's kind of cool. Yeah, very cool. Good stuff. Steve, I am super
excited to hear about what you've got coming up, because this is brand new. Being a core developer,
I feel it is appropriate that you break this news here. I mean, I'm not going to lie, when it came
to, you know, what am I going to talk about, what's the most recently accepted PEP that was somewhat controversial?
And I think just as you kind of look down to the section on rejected ideas, which is considerably longer than the accepted ideas, you could probably get a bit of a sense for just what went on with exception groups.
And I know, Michael, you just had a conversation. You've
learned all about them. So you can take over when I run out here.
I'll share my thoughts with it. But yeah, go ahead. I'd love to hear about it. This is
sort of inspired by Trio, right? The end goal kind of is. So this is an
interesting PEP. And we've got a few of these on the go at the moment. It's kind of like a
stepping stone towards a better programming model or a stepping stone
towards better libraries.
So it's something that I think in my opinion, very few kind of application developers, kind
of the last developer in the chain is often not going to use them and they're not going
to need them.
But as you go further in towards the lower levels of libraries,
especially people writing async schedulers,
are going to find incredible value out of them.
Essentially what the idea is,
is that when you're running multiple tasks in parallel,
if some of them fail,
we don't currently have a neat way
to capture the exceptions from all of the ones that failed.
There's some approaches that would be like,
wait for all of them to complete and wrap it in a list. And then you get some exception
that contains a list of exceptions, but that's lost a whole lot of context. You can get just
whichever exception happens first, but then you lose all the other exceptions. And there's just
been no real way to handle it. So an exception group essentially does bundle up all the exceptions
internally in some
way. But the really interesting thing is the except star syntax, which I'm going to have to
scroll a long way down to find where that comes up. But this is really clever because if you're
in that situation where, say, you're running 10 parallel processes (so here's kind of the first example of it), then exceptions are no longer control flow at this level.
Because if you've run 10 things
and you're waiting for 10 things to complete,
you're not actually doing control flow
with the exceptions anymore.
What you're doing is handling the exception,
but then the control flow is going to go back
to where it was anyway,
because you're going to be doing something different. So for
example, if a file doesn't open, then you would want to do something different, right? You're
going to stop going on and trying to read from the file. But if you've tried to open 10 files
and three of them failed, at the outside level, so at the inner level for each file that may have
failed, you'll do something different. At the outer level, all you're really going to do is say, hey, this task failed because a file couldn't be opened.
And maybe you do something else, but it's at the outside level.
So except star takes that exception group and it's going to give you a chance to handle each exception essentially on its own.
It will group them together.
So in this example, if five tasks report spam error, then you'll get into this except
spam error block with all five of them at once. Which is just, what is that, a list of spam
error exceptions? Something like that, or a tuple. I think it's, I think
it's a tuple, I think, with the star syntax, I believe. Something iterable, basically, yeah. Yeah. Something you can iterate over to see the exceptions,
but it's really just this happened at some point,
and you process it.
And if the group actually contains multiple types of exceptions,
then each handler that matches is going to be called
for all the exceptions that match that.
So you could have this try block raise an exception group
that has some spam errors,
it has some foo errors, it has some bar errors, and all three except star blocks are going to
get called with the exceptions that match those, which is a bit, it's definitely going to confuse
a lot of people. It confuses me, which is why I was keen to actually spend a bit more time digging
into it and trying to figure out what's really valuable about this.
And I do think the most valuable one is really where the error is a cancelled error.
Because if for whatever reason, five of your tasks have been canceled,
then you need to capture that and do something with that outside of it.
But it doesn't necessarily mean you want to throw away the five successful results.
And so you do kind of want to
keep a bit of everything going on. And like I say, it's a building block. On its own, this isn't
enough to do anything new and useful. The next thing that comes along is task groups. And that's
being worked on by, I expect a lot of the same people who worked on exception groups, because with task groups,
now you can actually start,
there we go, Guido's just merged task groups.
Excellent.
Then now you can actually run the task group.
And if the group raises any errors,
then you'll catch them through an exception group.
And so that enables a whole lot of new uses
and new ways to use asyncio, or just async
generally, in the standard library. As you say, Trio's already had something like this for a while.
Yeah, from their nursery thing.
Yeah. And so that is now being standardized so libraries can kind of share their implementations
and work together on it.
So one of the reasons you need this is if I start two web requests and three database queries,
and then I go to wait on them, you know, then if several of them fail,
the error state captured in totality is a tree of errors that represent,
well, this task started this other task, which then had this error, this other one, right?
So you need some way to deal with a group of errors that could happen kind of all at
once, right?
In one of these task groups that gets kicked off.
Yeah.
So the new task group thing is super cool.
So you say async with task group as TG, and there's two things that are neat about it.
One is right now, if you fire off a bunch of tasks in async and await style,
they're basically unrelated. Like if one fails, that means nothing for the other, right? They're just like, well, here's a bunch of stuff that happened. And this creates a relationship between
them, right? So that if one fails, I think it might not schedule new ones, something like that.
It's brand new. I'm just seeing the tweets. So I think that that's the story. I believe that
was the story of Trio. The other thing that's interesting here is that,
in this example, which I'll link to, from Yury, that he posted when he tweeted about the news,
notice that the first one says task group, create task for some task, and then await
something that creates another task. There's nowhere where you say, store all those values
into like some lists of tasks, then go to
the task and iterate them and wait for them or gather them or whatever the heck it was you had
to do before. This now makes tasks fire and forget, I can say run this, run this. And within that,
it could do more of those types of things. And then you just block at the with context manager
level to wait for all the tasks to finish, which I think is a real big improvement. Because right
now you've got to like constantly juggle, well, I've got to return a task from
this so I can go wait on it later and all those sort of oddities.
And this cleans up a lot of that.
And of course, being Python, I don't know exactly how the syntax works, but being Python,
that TG object, the task group doesn't actually disappear at the end of the with block.
So if that's got results stored into it, then you still
have access to those and all of the information about the task group, even after you've waited
for it to complete running. Oh yeah, that's cool. Yeah. So I think this is a nice addition to
asyncio and Python. This is cool. And apparently 3.11 is coming. Yeah, coming in 3.11. I do see a
question from Sam Morley in the chat there. Is there a way to short circuit so that you don't
re-catch certain exceptions?
My understanding, and Michael, if you've got a better one, correct me,
is that the except blocks work in the same way as regular ones.
And the first one that matches a particular exception will handle it.
And the later ones don't, even if they would also match.
So if the spam error is a subclass of foo error,
but there's another subclass of foo error,
then spam errors will get handled
by the spam error handler.
The foo error handler will handle all the other ones
apart from the spam error subclass.
Yeah.
I don't know much about the except star
other than it was basically a requirement
for the task group stuff to be implemented properly. So when one came in, then the other could come in. Yeah, it's the
only feasible way to actually do something as a result of an exception group. Otherwise, you do
end up with, you know, a very generic exception, and then you write a for loop over all the exceptions
that it handled and try and figure it out yourself. So you'd end up rewriting the code, and it was just not going to be feasible.
It needed to be syntax.
And so it is.
Yeah, right on.
Very, very exciting and very timely.
Thanks, Steve.
I'm kind of glad that I put off learning how to do async code until 3.11.
This looks easier.
It's a good band and a good time for asyncio.
Well, cool.
All right.
I guess I'm up with the
next one, huh, Brian? Yeah. Let's see what you got. I have got some other interesting things.
I'm here showing off the underappreciated projects or the new projects. Just a couple of
stars here. And we've talked about overloading before, but I thought this was a clean way to
do it that people could think about. And Steve, I'd definitely love to hear your thoughts on this. So Felix the cat created this library called py-overload. And the idea is basically,
once you have type information, then you can have method or function overloading. The idea being
like, okay, I have a function called boo or whatever, and you can say, if it takes an
integer, I want this implementation to run. If it takes a string, I want some other implementation
to run, right? That's sort of the traditional C++, C# definition of it, right? But
in Python, we don't have that really, because the language started without types. So how are you going
to figure out the type to overload it, you know? Right, that just doesn't make any sense.
So traditionally you could use isinstance: we're
going to do one thing or another, is it a single thing or is it a list of those things, what are we going to do? But with this one, you can put just @overload
and then whatever the signature is. You can say it has no parameters,
or it has two integers, or it has three integers, or it has a list of them, whatever.
And there's even a way to say, if none of them match, call this particular one. So basically, it's just straight function
overloading in Python, if that's the thing you want. Steve, does this make you cringe or do you
like it? Well, you know, I'm not going to lie, I'm not the most into static typing in my Python
code, as a lot of other people are. And there's a
lot of, you know, complicated reasons. But I think for a situation like this,
I mean, if I was writing a function that took a string or an int,
the very first line would convert to whichever type I actually want. And then the
rest of the function is going to look identical. Sure. In that case, where
there might be a parse type of thing, for sure, I think you wouldn't really do an overload. That would be
insane. And my kind of gut feel, and, you know, I'm always open to examples proving me wrong,
in which case, you know, I would write the isinstance code that's in those examples.
You know, my kind of gut feeling is that if you're doing two drastically different things
in the function based on the type, you need two functions.
And once you've got two separate functions,
if the people calling don't know what they're passing you,
then they've got a problem.
And it's not so much my responsibility to fix it with overloading.
That said, overloading is really cool.
And I am the exact opposite person when it comes to C and C++.
I will do all the craziest possible stuff with overloading in those languages because
I think it fits the language and it's a lot of fun.
And there's definitely occasions and value for having it in Python.
We do have the singledispatch decorator, which has been part of Python for a while, which
will do this on the very first parameter.
This, you know, very trivially extending it to the whole function
signature is really cool. So, you know, if I needed to do this, I would probably want to use
a library like this. I would probably reconsider my API
design choices up to that point, but I can understand the attraction of getting to reuse the
name and not make the person calling it think too hard about what's actually going to run.
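For reference, the standard library decorator mentioned here is functools.singledispatch, which picks an implementation based on the type of the first argument only. A minimal sketch, with a hypothetical describe function:

```python
from functools import singledispatch


@singledispatch
def describe(value):
    # Fallback used when no registered type matches.
    return f"unsupported: {type(value).__name__}"


@describe.register
def _(value: int):
    # Chosen when the first argument is an int.
    return f"int {value}"


@describe.register
def _(value: str):
    # Chosen when the first argument is a str.
    return f"str {value!r}"
```

This covers the one-parameter case; dispatching on a whole signature, as discussed for py-overload, is beyond what singledispatch does.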
Yeah. The place where this sort of seems interesting to me is there's a lot of tricks
and juggling people do with, like, star args and star star kwargs, where it's like, okay, depending on how you
pass it stuff, we'll do a bunch of things. Yeah. And I'm always looking for a way to not do that.
Yeah, how can I remove that? Like, it's completely opaque. I have to do a Google
search and read the docs to figure out what is at all possible here. Well, one of these days I'm
probably going to take all of the kind of patterns for that kind of thing that I've collected and turn it into a book.
But writing a book just feels like way too much work.
So, not anytime soon.
Sorry.
My colleagues at work can ping me at any time and I'll give them a pattern for what they're trying to do.
But that's...
I do have quite a set of, oh, you're trying to make stuff weirdly work in this way. Here's a nice way that you can
enable that without having to resort to type checks and everything.
Yeah, yeah. I've been using Python for a long time. And I do remember one of the first things
that I noticed is I couldn't do overloading. And at the time, so this was many years ago,
I was using a lot of overloading in my C and C++ code. And, and I was
like, oh, I can't do overloading. But one of the things I've noticed is that, instead of
wishing that I had overloading in Python, I've noticed that I don't really use it in C and
C++ anymore. It's gone the other way. Yeah, I'd rather be more explicit
and just have
two functions that maybe are similarly named, but they have an appendix that's
different, so that if you have different data, you pass it. And I'm with you, Michael, I'd
rather have people go, well, which one do I need? I'll look it up, than just passing the wrong
data type. Because, you know, sometimes if they haven't converted the data,
like string versus number is a scary one for me
because I'm often getting my numbers from an API or something
and they come in as a string. If you forgot to convert it
and you passed it to the wrong thing,
you're really doing something completely different.
That's not a good thing. But I got bit by that one just yesterday, updating one of my CI builds
to use Python 3.1, I mean 3.10. But, you know, exactly: is it a string, is it a number?
Interesting. Yeah, yes. But yeah, certainly that conversion would be where, you know, it would be
worrying. The other one is, is it
a string or is it a list of strings? And that's the one that bites us in Python all the time.
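The string versus list-of-strings trap comes from the fact that a str is itself iterable, so code that loops over its argument silently iterates over characters. A small sketch of the problem and of one common way to handle it with an explicit check at the top of the function; normalize_names is a hypothetical helper:

```python
def normalize_names(names):
    """Accept one name or an iterable of names; always return a list."""
    if isinstance(names, str):
        # A lone string is a single name, not a sequence of characters.
        return [names]
    return list(names)


# Without the check, a single string is split into its characters.
characters = list("abc")
```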
And I don't even know how you resolve an overloaded function based on is it a string or can I iterate
it. Well, in that case, actually, I would rather just have that be part of the function, at
the top of it, if it can handle both, to check the type and
iterate or not. But, you know, yeah. Well, all right, let me close this out with two quick thoughts.
First, I think this is interesting because it's one of the things that's possible with modern
Python. Like, once we've added typing, now you could consider this as a thing, whereas previously it really was highly impractical,
I think, as a way to do it.
So I think that's kind of cool.
And then two, I think it might be an entryway for people who are not yet where Brian is, and I'll
put myself in there as well,
going like, actually, these things I thought I need, I don't need those.
Right.
There's a lot of stuff I thought I needed and I haven't used it for three years.
So maybe I actually don't need it.
But that's not how you maybe first
approach solving your first problem in Python when you're coming from C++ or
whatever,
C#,
whatever. This might be a gateway.
So anyway,
those are my two thoughts.
One more thought from Dean: after Python 3.11,
do we get Python 95?
There was,
you know,
there was a Windows 3.12.
So I think Python gets to do a 3.12 as well.
I think it was only available in China. Interesting. And I'd like to follow
on with that. Dean, very funny. I believe that Windows 10 was named, you let me know if you
know different, Steve, Windows 10 was named Windows 10 because there used to be the check for "Windows 9" as the starting string for 95 and 98. So you can't be nine, because
then you're going to be 95. So we've got to kick on past it. There were some embarrassingly big
language runtimes out there still doing that check that really struggled with Windows 9, and it showed up in enough places that, yeah,
I think it just made sense for everyone to just skip it.
Not skipping 13.
We're skipping 9.
It's too unlucky.
All right.
Awesome.
Brian, what you got for us?
Oh, what do I have next?
I have the next generation Seaborn interface. So Seaborn is a really awesome plotting library built on Matplotlib.
And I, you know, actually, I don't use it that much, but I've always been intrigued by it and kind of watching what plotting libraries do and stuff.
And one of the things I was curious about, which I'm really grateful for this article, is some of the history behind
it. So the article starts off, Next-generation seaborn interface, and talks about the background
and goals. But some of the great things in here, let me grab some notes. This work grew out of a
long-running effort to refactor the seaborn internals so that functions, you know, anyway, where I wanted
to get at was he was developing a refactor of the internals. And he's like, wait a second,
if I want to refactor it, maybe I should expose more stuff. And some of the background was that
Seaborn was originally conceived of as a toolbox of domain-specific statistical graphics
to be used alongside Matplotlib. So the intent was
people would use both Seaborn and Matplotlib together. However, people are doing things
differently. A lot of people just grab Seaborn by itself. Some people even just learn Seaborn
before they even learn Matplotlib, which is an interesting thing. And that's how I thought you
were supposed to be doing this. But the concept was, and then over time, there's a whole bunch of features that
have been added to Seaborn to where it's like really slick looking, but to do the same thing
by hand in Matplotlib is a lot of work. So there's some things that like, if you, Seaborn's almost
there, but you need to tweak it a little bit and you have to do things manually, well, then you have to just do everything by yourself. And it's a lot of work.
So the idea around this rewrite of the API is, let's rework some of the internals
so that a lot of the little subcomponents that go inside of a plot are exposed. That way
people can get access to more fine-tuned configuration within them,
so they don't really have to just do everything by hand.
It's not all or nothing, Seaborn or Matplotlib.
You can kind of do both more easily, which is a kind of a cool idea.
There's a whole bunch of great details in here that talk about some of the API changes.
Basically, it's exposing the internals. If you create a plot, there's nothing there and it won't show up.
You have to create layers on the plot. And then within the layers, you've got marks and,
and different components that go into it. I kind of like this idea of building things up,
but what I really like is the public aspect of this. So you've got a, you've got a library that's
out in the open. It's being used by a lot of people already. And somebody's saying,
maybe we should tweak the API and do something different.
And just going ahead and doing that in the open saying,
hey, we're going to do this.
There's a note at the top,
a kind of "I'm thinking about doing this" note at the top, saying
it's a work in progress.
Don't depend on these examples
because things might change.
But this is the direction we're trying to go,
trying to get feedback from people.
And I think this is a lot of things that a lot of people struggle with when they're maintaining packages that have been around for a long time is, I want to do things a little different, but am I going to break everybody?
And talking through it.
So anyway, this is a great read, especially if you're a data plotting kind of person.
Yeah, very nice.
I always want to do more with visualization,
and I'm sure that I have some good data I could pull up.
Yeah.
I end up basically just writing APIs
on websites these days,
but I really should be pulling this up
and doing some of these graphs
and I'm really happy these are around.
Steve, how about you?
See, Seaborn's great.
Back when I first discovered it,
one of its major selling points was simply that
importing Seaborn would magically make your default Matplotlib charts look nicer.
I love it. It's like the Bootstrap of Matplotlib.
It really was. They just apply their style by default and every Matplotlib chart suddenly looks nicer.
Matplotlib's done their own styling work now, so it's less valuable for that. I do like this API. It looks good. And as Dean's
pointing out in chat, Matplotlib has an object-oriented plotting API similar to this,
possibly identical. Just like everyone else, I've never learned the object-oriented API,
but it is there. And, you know, that's the modern one. I know a lot
of people say Matplotlib is impenetrable and kind of hard to build things up in, but it does have a
really nice API there. It's just not the pyplot one that kind of imitates MATLAB's old API.
And so, you know, having it there is really nice. And Seaborn having their own is
also great. Another nice thing to read about in this is that he does a hat tip to
ggplot, or ggplot2, or whatever it's called,
saying that, yes, a lot of this is going to look similar to ggplot, but it isn't that I'm trying to copy it. Or
maybe that's definitely an influence. But Seaborn is important because
we think about things differently in Python than we do in R, and just having it would be...
But there's also a hat tip to another library that is a wrapper around ggplot. If you just want
that, you can do that in Python too. That's available. So it is interesting.
We think of these as competing libraries, but they're really not competing with each other.
They're working together to push plotting forward. Yeah, nice. Dean out there points out you
can do plt.style.use with seaborn or ggplot.
Let me throw out. Oh, yeah, go ahead, Steve.
ggplot is certainly the one to copy from.
I mean, there's a reason that one is as universally popular as any plotting library can possibly be.
It's probably competing with Excel for popularity of plotting
data, realistically. It's a really nice API and it looks good and everyone's familiar with
it. And so, you know, there's nothing wrong with copying from ggplot. Nice. I got one more shout-out
to throw into this conversation: the XKCD plotting style for Matplotlib. So you've got, I mean, this is fantastic.
It really does look like what XKCD, you know, the comic, would do for these.
So this is fantastic.
I love it.
What I love is I actually see this.
I see this in papers and stuff like that.
People just go ahead and use the XKCD style for serious stuff.
And it just is,
it's awesome. I love it. I think there's actually some value to having like cartoony looking
graphics, like UI sketches and graphs to say like, look, this is speculative. This is just like,
don't read too much into it. I'm trying to give you an idea rather than an exact thing. And I
think sort of a UI, like cartoony-looking sketches, and this also plays into that.
Yeah.
Right?
Steve, you got the last one?
I got the last one, yeah.
So this is another kind of recent delivery from the CPython core team.
We can now compile CPython to WebAssembly.
Wow.
So, and to a lot of people,
that probably means very little,
but I guess the brief, brief summary is
WebAssembly is kind
of what the JavaScript in your browser compiles to before it runs. So it's skipped that initial
step of being JavaScript and it's now ready to run in the browser. So it's a lower level.
There are tool chains out there that can compile all sorts of languages directly to WebAssembly. And so in this case, we've taken,
I believe we use one of the, I don't know the exact tool chain that's used and it may not matter,
but it basically takes the C code and compiles that to WebAssembly, and gives you a package that can be brought into an Electron app or a Node.js app or a modern web browser and be
run in the browser. So this page is a
little bit dated. There's been a bit more work since then, but I found this is the best overview
of where things are kind of at. There's a long list of C extensions that don't work. Probably unsurprising,
like, the browser doesn't have a lot of this stuff in it. Yeah, you don't have, what, no Tk, all the
different APIs, the Win32 API, whatever it was delegating to.
No Tkinter? What?
No.
Yeah, no Tkinter, no subprocess.
ctypes apparently you can do.
I've heard there is a libffi port to the Emscripten kind of platform.
So how this kind of works is, when you take WebAssembly into a browser,
it has access to nothing. It starts off in a really enclosed kind of box of things that it can
do. And that doesn't actually work for Python at all, because a very early thing that
it tries to do is search the file system for what files it should be loading. So
we actually build it as part of another platform.
Emscripten is one platform that kind of polyfills
a whole lot of native-looking APIs
so that code that's compiled on top of Emscripten
is able to use it.
And this little demo, where I just hit Start REPL, is running on that.
So this is a build of Python 3.11 alpha 4,
built with Clang, running on Emscripten.
And I believe I can do this.
Do, like, os.listdir.
And it thinks there's a file system there.
Now, that's not my file system.
That's in memory.
It can be changed to browser storage.
But this is entirely in the browser.
Like there's nothing downloaded.
There's nothing running on my machine here.
There's nothing running in the cloud.
It's literally in the browser.
I can probably freeze my browser with this.
Like I can do an infinite loop and do it,
do it.
Let's see if this cuts me off.
I'll just let that run.
Hit clear.
What happens if you hit clear?
Start it again.
Are we going to start again?
Yeah,
now it's done.
A refresh.
Yeah. So there's a second one that the actual build as it was committed supports, which is WASI, W-A-S-I.
That's a slightly different approach to adding all the functionality around a WebAssembly module.
So it's a little bit more flexible, a little bit more controlled. Emscripten is really like,
give me a POSIX system inside my browser, all in memory. And so we have two options.
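For anyone who wants to try something like the demo at a REPL, here is a minimal sketch. It assumes CPython 3.11 or later, where sys.platform reports "emscripten" or "wasi" on the two WebAssembly builds. The helper names describe_runtime and list_root are made up for this illustration and are not part of CPython.

```python
import os
import sys

# Values sys.platform reports on the two WebAssembly builds (CPython 3.11+).
WASM_PLATFORMS = ("emscripten", "wasi")


def describe_runtime(platform=None):
    """Return a short label saying whether this looks like a WebAssembly build."""
    platform = sys.platform if platform is None else platform
    if platform in WASM_PLATFORMS:
        return f"WebAssembly build ({platform})"
    return f"native build ({platform})"


def list_root(path="."):
    """List a directory, the same os.listdir call used in the demo.

    On Emscripten the entries come from whatever file system the platform
    provides (in memory by default), not from the host machine's disk.
    """
    return sorted(os.listdir(path))


print(describe_runtime())
print(list_root())
```

On a regular desktop interpreter this prints a native label and your real directory listing. On an Emscripten or WASI build the same code should report the WebAssembly platform instead.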
And these are available in the main branch. At the moment, we're not shipping pre-built
modules for WebAssembly. That might be a possibility. If that's something that you'd
like to see, then I guess go to discuss.python.org and post about it.
There's probably a post there.
I should have looked for a post there.
But we're not currently doing pre-built releases,
but I think we could.
I think this is one of these options
where the WebAssembly build is totally portable.
And so if we build it, we can distribute it.
And then websites that want to do
something like this could just download it from our servers and run it. So I think there's a lot
of potential here. And it's at the potential stage, right? This is another stepping stone
to bigger and better things. Our responsibility as the core team is to enable it. And now we really
want people to come in and pick this up
and do awesome things with it.
Firstly, so we can figure out what gaps still need to be filled,
but also just to expand the growth and the reach of Python,
to bring it into places where it currently doesn't exist or can't work
and give it new life in new places, open it up to new people.
This is fantastic. Congratulations.
And so the work for this was primarily done by Katie Bell,
Christian Heimes, and Ethan Smith.
So I think Christian got to do all the merge commits,
but it's definitely been a number of people working on this for a while.
Those are the primary three.
I'm really excited for the possibility for this.
I think one of the things that could be amazing,
obviously running it in the front end
is a thing that could be done.
I saw the documentation said it was about 10 megs
to download it.
I'm sure you can put that on like a CDN.
So you kind of hit it once somewhere
for a particular version of Python.
That's pretty good.
You know, we all have pretty fast things these days.
Yeah.
It's still bigger than Zoom.
Yeah. What gets me really excited though is putting that into an Electron.js app.
Yeah, absolutely.
Right. Because Electron.js is a really interesting way to bring web technologies cross-platform. As much as I'm like, oh, it's an Electron app, still, it's really opened
up the possibility for a lot of things,
but it really has meant,
okay, you're doing TypeScript,
you're doing JavaScript,
and you just have to go full on in that world.
So here you could still do
like your front end and whatever,
but having the core logic of that desktop app
being in Python running in this,
that's exciting if that can be put together.
I should also add two things.
Pyodide is a project that people have probably heard of before,
which has been working on this for considerably longer
than the core team has.
And so I think a lot of the patches that needed to happen
have come from them.
And they now get to spend more time focusing on the data science stack,
because they've got ports of NumPy and pandas
and other libraries to actually do data science in the browser.
And the other interesting thing that I saw was someone from conda-forge
suggesting that they could elevate WASM builds
to their kind of automated level.
And so all of conda-forge may suddenly become available
to use in the browser on top of a build of Python like this.
Wow.
That would unlock so much.
That would be incredible.
Interesting.
I imagine initially it would unlock a lot of bug reports.
But we need to work through those first.
Yeah, I was just thinking of take the top 1,000 most popular
packages.
Could you get 90% of those compiled to other WebAssembly
things that could then be included and imported here somehow?
Exactly.
And the top 1,000 with native code,
because it's only the native code, right?
The Python code still compiles in the browser,
just like it would in the CPython interpreter.
It's only the native code that has to be ported and built.
And so once that's done, then we're up and running.
So the top 1,000 is probably more than
you need. Yeah, absolutely. All right, awesome. I'm looking forward to seeing where this goes.
So many neat options. There's just cool ways to, say, ship the Python runtime
to places where maybe it would have been hard to get. Now you drop this WASM file plus something
that can run WASM, and now you've got a deployable, shippable, yeah,
CPython runtime. Without Tkinter and a few things, but still. You might not miss it, depending
on what you're doing. I mean, most apps are not Tkinter apps, is all I'm saying.
I'm not trying to bang on it. No, no, but every time it comes up that it's still
there, I'm like, really? We still have that? Okay. Don't ask me what I've been spending my
week working on, Brian. It's not going to make you happy. Are you creating a Tkinter-based
killer for Textual? No, no, unfortunately not.
Awesome. All right. Well, Brian, are we at extras? We are at extras. Do you want to kick us off?
I will kick us off. So I've got a couple of things that I think are interesting. Let's start with this
one. So we've talked about Oh My Zsh, right? Yeah, a lot. Love it. I just came across this,
realizing that actually this is a Portland company that puts together the sort of core
maintainers of that. So I just thought it was funny to give a quick shout-out to
Planet Argon. They're not really in the Python space, but they're in Portland, which I thought
was kind of fun. And then this next one comes to us, I think, via
PyCoders. That's where I got this. Django just reformatted all of Django with Black.
And I know I was just having a discussion with somebody like,
oh, your code doesn't follow PEP 8.
Oh, I don't want it to follow PEP 8.
Yeah, but if people are going to use your code,
like literally, it gets imported,
then it probably should follow,
like it should not come up with all sorts of warnings.
And so I thought it was interesting
that Django just said, everything, make it black.
Steve, what do you think about that?
I'm totally on board with just using black on everything.
I don't agree 100% with the style,
but I agree 100% with not arguing about it.
So it's close enough.
It's close enough.
Yeah, plus there's enough tweaks that make it good.
I'm really grateful that you can tweak the line length, for instance. Yes. Because, I mean, here's an example: what
if I want it really short? No, seriously, for formatting the code for the
pytest book, I wanted the lines all quite a bit shorter so that they fit better in a book format.
And I could use Black to cover that and convert everything with Black to make them like that.
So it was great.
Nice.
Awesome.
All right.
And the final one is I have been doing some stuff with more fun things on YouTube, trying
to put these little short videos together.
So here's a, how long is it?
A six-minute, 44-second video on using timedelta
to get like how many weeks are in some time span.
So, you know, the cool tricks you can do there.
So people should check that out.
That's my latest Python short thing.
And yeah, that's it for my extras.
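For readers who have not seen the trick, here is one way it might look. This is only a sketch of the general idea and not necessarily what the video shows. It relies on the standard library behavior that dividing one timedelta by another with floor division gives an integer; the helper names and the example dates are made up for illustration.

```python
from datetime import date, timedelta

ONE_WEEK = timedelta(weeks=1)


def whole_weeks(span):
    """Number of complete weeks in a timedelta (floor division gives an int)."""
    return span // ONE_WEEK


def weeks_and_days(span):
    """Split a timedelta into complete weeks plus leftover days."""
    weeks, remainder = divmod(span, ONE_WEEK)
    return weeks, remainder.days


# Hypothetical example: the span from New Year's Day 2022 to this episode's date.
span = date(2022, 2, 16) - date(2022, 1, 1)
print(whole_weeks(span))
print(weeks_and_days(span))
```

Subtracting two dates already gives a timedelta, so no manual day counting is needed before dividing by a week.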
Okay, so I've got a quick one.
I don't have a graph or
something to throw up, but
I was looking
at the Git history of a repo and trying to figure out whether I'd included one of my
co-workers' branches in it, if I'd merged it yet or not, things like that. And I was on the command
line, and I'm like, can I just do this with the command line? Apparently, I didn't
know this existed. So apparently git log --graph just shows you the Git graph, your branch history or the branch graph, on the command line.
And I didn't know it was there until today.
I started using it, tweeted about it.
And then a whole bunch of people said, oh, you should use these flags, too.
That makes it even nicer.
So it's fun to learn something old as a new thing.
And then somebody else told me, how about gitk? So gitk is a graphical browser of
your repository that just comes with most Git installs, and I didn't know it was there. I'm like,
do I need to install it? I'll just type it and see what happens. And it popped up this graphical interface.
I'm like, this is great.
This is exactly what I wanted.
So gitk is pretty cool.
I didn't know about that one.
I've seen the command.
I've never actually run it to see what happens.
So I was not feeling quite brave enough.
Did it mean Git kill or was it something else?
I mean, most Git commands scare me until I've run them the first 100 times or so.
How about you?
Yeah, I got a couple of extras.
Can I get my screen back up there?
I was feeling a little bad about being a bit self-serving here,
but then Michael just promoted his video series.
So I don't feel too bad anymore.
Get it on, man.
This is the Python 3.11 Alpha 5 download page.
And we have a new addition this time around,
which is this Windows installer for ARM64.
So ARM64 is not a massive, massive platform for Windows yet,
but it's growing and we want to have Python support on it.
So the builds have been running in the background for a while,
but we've never actually released it.
We're hoping to get it out with 3.11.
That is going to depend largely on: do people use it, do they love it, do they hate it? My experience so far with it
has been that it is noticeably faster, at least on the ARM64 devices I've had access to, compared
to the Intel devices, which is really, really cool. The test suite is kind of 30 to 50 percent faster,
which is huge. Huge, really. So I think there's a lot of potential here. I may just have had awesome
hardware, I'm not sure. It was a virtual machine, so it's kind of hard to tell. Yeah. But this is
fantastic. This is new. If you have an ARM64 device, like a Surface Pro X, or there's a couple
out there from other manufacturers. I'm
running Windows 11 ARM on my MacBook Pro through Parallels. Then please download
and install it and let me know how it goes. If you get it through the Windows Store, which is
currently still not public, so you need to get the link basically from one of my tweets
to the Windows Store, you'll automatically get the ARM64 version on ARM64 as well.
So this installer is the traditional one.
Otherwise, you get it through the store.
The other thing, which I wasn't going to do,
and then I spent a bit of time working on this,
a couple of years back at the Core Dev Sprints,
I forget who I was chatting with.
I was chatting with one of the other Core Devs
about everyone typing from collections import deque and misspelling it.
And it's like, you know, a deque, D-E-Q-U-E, is a double-ended queue, a very useful data type for
certain purposes. But people would type it D-E-C-K, as in a deck of cards, because that's phonetically
what it sounds like. Exactly. And so as a bit of a joke, I made a package that, when you installed it,
it would give you from collections import deck.
And obviously the thing that that collection should be is a deck,
a double-ended queue of 52 cards representing the cards in a normal deck.
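To make the joke concrete, here is a rough sketch of a 52-card deck held in a standard collections.deque. This is not the code or the API of the package being described. Suit, Card, new_deck, shuffled, and deal are all hypothetical names chosen for this illustration.

```python
import random
from collections import deque, namedtuple
from enum import Enum


class Suit(Enum):
    CLUBS = "clubs"
    DIAMONDS = "diamonds"
    HEARTS = "hearts"
    SPADES = "spades"


# Ranks run 2..14, with 11-14 standing for jack, queen, king, ace.
Card = namedtuple("Card", ["rank", "suit"])


def new_deck():
    """Build an unshuffled 52-card deck as a deque."""
    return deque(Card(rank, suit) for suit in Suit for rank in range(2, 15))


def shuffled(deck, seed=None):
    """Return a shuffled copy; deque has no shuffle method, so go via a list."""
    cards = list(deck)
    random.Random(seed).shuffle(cards)
    return deque(cards)


def deal(deck, count):
    """Remove and return `count` cards from the top (left end) of the deck."""
    return [deck.popleft() for _ in range(count)]


deck = shuffled(new_deck(), seed=271)
hand = deal(deck, 5)
bottom_card = deck.pop()      # deques pop cheaply from either end
deck.appendleft(bottom_card)  # ...and push cheaply onto either end
print(len(hand), len(deck))
```

The deque is a reasonable fit here because dealing from the top and moving cards to or from the bottom are both operations on the ends of the sequence.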
And over time, for various reasons, it's just kind of grown.
And I recently added support for calculating poker
hands to it and so now you could build a game with this uh it does it uses enums it's got
shuffling dealing uh jokers are optional um and you can calculate a poker hand and yeah
you can even compare them poker hand one greater than poker hand two i i spent a lot of time uh
most of my work on this over the last week was writing the tests
that proved how incorrect that function was until I wrote the tests for it.
But now at this point, yeah, you can look at the values it gives back.
It's actually a tuple with an enumeration saying what the hand is, and then a selection
of the card values in a way that makes the tuples comparable. So you can actually look and see, you know, it's a pair of aces.
It'll have the number 14 there for the ace.
And the next highest card was a 10.
So if someone else has a pair of aces and their next highest card was a nine,
then you're still going to compare higher.
So I'm pretty proud of that function.
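The comparison described here falls out of how Python compares tuples, element by element from left to right. The following is a small sketch of that idea, not the package's actual function. HandRank and its members are hypothetical, and an IntEnum is assumed so that the hand categories themselves are orderable.

```python
from enum import IntEnum


class HandRank(IntEnum):
    # Only a few categories are listed; a real ranking would have more.
    HIGH_CARD = 1
    PAIR = 2
    TWO_PAIR = 3
    THREE_OF_A_KIND = 4


# Layout: (hand category, value of the cards making the hand, then kickers high to low)
pair_of_aces_ten_kicker = (HandRank.PAIR, 14, 10)
pair_of_aces_nine_kicker = (HandRank.PAIR, 14, 9)
three_twos = (HandRank.THREE_OF_A_KIND, 2, 13)

# Same category and same pair, so the kicker decides it.
print(pair_of_aces_ten_kicker > pair_of_aces_nine_kicker)

# A higher category wins regardless of the card values that follow.
print(three_twos > pair_of_aces_ten_kicker)

# Because the tuples are orderable, max() picks the winning hand directly.
best = max([pair_of_aces_nine_kicker, three_twos, pair_of_aces_ten_kicker])
print(best == three_twos)
```

Putting the category first and the tie-breaking card values after it is what lets ordinary tuple comparison do all the work.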
Yeah, that's clever.
But yeah, this is, and it's code style black.
Nice.
Oh, very nice.
Yeah.
It's one short file.
And it does still override deck in the collections module for you.
I love it.
So deque is still there, right?
Like D-E-Q-U-E is untouched.
But if you try and import D-E-C-K from collections, then you'll get it.
Nice.
Hey, one other quick thing to shout out.
We're hiring contractors to help develop features
for PyPI.org.
It says at the top of PyPI.org.
Do you know anything about this?
I guess if people want to work on PyPI.org,
that's pretty neat.
Yeah, no, they have funding.
And there is a post that describes the surveys. I believe this is the
organization accounts project they're looking at. Yeah, organization accounts. So if, like me,
you are kind of one of the primary Python people at your company, then you'll spend a lot
of time helping people publish packages to PyPI, if that's the business you're in. Certainly is for us.
There's a lot of packages from Microsoft up on PyPI.
And the kind of corporate account for that is,
it does exist.
We have a user account with 483 projects.
This is all manually curated right now because PyPI just doesn't have the functionality
to kind of hand out permissions to it safely.
The teams and all that kind of stuff.
Yeah.
So I believe the idea of this is to add that functionality to PyPI.
So I would love it if someone comes along and does this.
I believe we've contributed some of the funding towards this.
So, you know.
Yeah, it looks like it.
So Steve, I've got a 3.11 question for you.
So 3.11 is in alpha.
So what does that mean really?
Does that mean I can like start using 3.11
or should I wait?
It means you can.
It means we still may change stuff
that will break you
and we won't apologize.
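If you are not sure whether the interpreter you are running is one of these pre-releases, the standard library can tell you through sys.version_info.releaselevel, which is "alpha", "beta", "candidate", or "final". The sketch below assumes nothing beyond that; is_prerelease and describe are hypothetical helper names, and the alpha5 object is a hand-built stand-in used only to show the output shape.

```python
import sys
from types import SimpleNamespace


def is_prerelease(version_info=None):
    """True when the interpreter is an alpha, beta, or release candidate."""
    info = sys.version_info if version_info is None else version_info
    return info.releaselevel != "final"


def describe(version_info=None):
    """Format the version, adding the release level for pre-releases."""
    info = sys.version_info if version_info is None else version_info
    label = f"{info.major}.{info.minor}.{info.micro}"
    if is_prerelease(info):
        label += f" ({info.releaselevel} {info.serial})"
    return label


# Stand-in for what 3.11 alpha 5 would report; built by hand for illustration.
alpha5 = SimpleNamespace(major=3, minor=11, micro=0, releaselevel="alpha", serial=5)
print(describe(alpha5))
print(describe())
```

A test suite could use a check like this to skip or loosen tests that are known to be unstable on pre-release interpreters.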
Okay, but if my code runs,
can I trust it?
Or?
Yeah, if all of your tests pass,
then you should be able to trust it fairly well.
Certainly existing code.
There will be new features available in the alpha
that have not been as thoroughly tested yet or may change again.
But again, if you're running existing code, you won't be using those.
So that won't matter.
But yeah, it's totally viable to use.
You can specify 3.11-dev on GitHub Actions. I believe it compiles from source when you do that now. They don't have a
build there. They should for beta. Beta is when we really want people to start doing stuff. At this
point, alpha is so that people can test the new features, kind of targeted testing on anything
new that we've put in. Beta is when we really want people to start porting libraries, especially kind of the core
libraries to be able to work with it and just test it. Because if existing code doesn't work
on the beta, we want to hear about that so we can fix it in the runtime and not force you to fix it
in your code. Okay. But if I'm like a package maintainer, I can start, if it's got GitHub actions for it,
I can start testing,
having my CI test against 3.11 then too.
Absolutely.
You will likely want to mark it as it's okay if it fails.
Okay.
Yeah, awesome.
Okay, thanks.
Should we do a joke?
Let's do a joke.
Let's do a joke.
All right, so this one,
coming from the programming humor one,
like you talked about the visualization stuff, right? And this one, it says,
there's a search that says how to get labels on MATLAB bar charts to be horizontal.
Look at what the result that came back from Google was. It says, you're not alone. Help is available. If
you're experiencing difficult thoughts, please call 116-123.
Or if this is an emergency, call 999.
And the underlying bit here is it isn't that drastic, Google, but thanks.
And I believe it might also work on Bing.
I'm not sure.
If you scroll down, I think there's a Bing equivalent down here.
Yeah.
Not just Google. Bing thinks you're an emergency as well. Yeah, that's awesome. Yeah, so it's not
that much of an emergency. I'll go to Stack Overflow. Nice to know that the big search
engines are looking out for our mental health. That's right. People become very upset after
failing to get those bar charts.
This is not the emergency,
but it's coming next when you realize what the answer is. Something, I don't know.
Nice.
Anyway, that's the joke I found for us, guys.
Well, thanks everybody.
Thanks, Steve, for coming. And thanks,
Michael, again. Yeah. Thanks for having me.
Thanks, all. Bye. Bye. Bye, Ron.
Thanks for listening to Python Bytes.
Follow the show on Twitter via at Python Bytes.
That's Python Bytes as in B-Y-T-E-S.
Get the full show notes over at pythonbytes.fm.
If you have a news item we should cover,
just visit pythonbytes.fm and click Submit in the nav bar.
We're always on the lookout for sharing something cool.
If you want to join us for the live recording,
just visit the website and click Livestream to get notified of when our next episode goes live. That's usually
happening at noon Pacific on Wednesdays over at YouTube. On behalf of myself and Brian Okken,
this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and
colleagues.