Python Bytes - #256 And the best open source project prize goes to ...
Episode Date: October 29, 2021
Topics covered in this episode:
* It's episode 2^8 (nearly 5 years of podcasting)
* Where does all the effort go?: Looking at Python core developer activity
* Why you shouldn't invoke setup.py directly, by Paul Ganssle (from Talk Python: Unlock the mysteries of time, Python's datetime that is!)
* OpenTelemetry is going stable soon
* Understanding all of Python, through its builtins
* FastAPI, Dask, and more Python goodies win best open source titles
* Notes From the Meeting On Python GIL Removal Between Python Core and Sam Gross
* Extras
* Joke
See the full show notes for this episode on the website at pythonbytes.fm/256
Transcript
Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.
This is episode 256, or as Anthony Shaw likes to put it, 2 to the 8th, recorded October 27th, 2021.
Again, unless you're Anthony, which is probably like a totally different day in the future, because he's in Australia.
I'm Michael Kennedy.
And I'm Brian Okken.
And I'm Anthony Shaw. Hello.
Hey, Anthony. How is the 28th? Is the next day going to be good or things are okay?
Yeah, it's pretty sunny today.
It's nice.
Yeah, right on.
Okay, so the world hangs together for one more day.
Fantastic.
You've been here before.
You've been on TalkPython a bunch of times,
friend of the show, all sorts of stuff.
So I'm sure many people know you,
but just tell people a bit about yourself.
You're doing more techie things these days.
You're a little
closer to the code, maybe?
Yeah. So earlier this year I started working at Microsoft, and I work with Nina Zakharenko on Python inside Microsoft. A lot of what I'm doing at the moment is just running around breaking things, sometimes on purpose, and seeing how we can improve our experience, working with VS Code and Azure and a whole bunch of other stuff.
It's been a while. The last episode I was on was episode 100, I think.
Wow, you're hitting the big numbers.
So yeah, this two to the eighth is a significant milestone, I think.
It is. It's pretty cool.
Yeah, awesome. Well, we're happy to have you here. Thanks for being here. Also, something to do with a puppy I've seen on Twitter.
Oh, yeah, I got a puppy as well.
He's not golden something.
He's a border collie, but he's kind of golden colored.
And he's not in the room at the moment.
He's not allowed in here while I'm recording.
I thought it would be a bit chaotic.
Yeah, my puppy sometimes is here, but it's very
bizarre the way that puppies socialize around COVID. Instead of us being gone and then we come
home, she now knows and understands the expressions I make to end a Zoom call. So she'll sit quietly
for an hour. And as soon as I say goodbye in Zoom, she's like, we're ready to go. Let's go. It's
super bizarre. But yeah, that's the world we live. So enjoy the new puppy. Brian,
you want to kick us off with our first topic here? Lukasz Langa, he's, what is he again? The
developer in residence? Yes. Anyway, he wrote an article called Where does all the effort go?:
Looking at Python core developer activity. And I really liked this article. He not only talks about what's going on
with developers and who's doing what. To start off with, he talks about how he got this data.
So this is also sort of a data processing, information scraping sort of article. He's looking at the GitHub repository data for CPython, of course,
and specifically pull request data. He's even using Datasette,
which is nice. We've covered that on the show. And he even lists the SQL queries that he used to
get some of this data. So some of the neat data that he's got. Oh,
also, the data is from the time when CPython moved to GitHub, so that's
February 10, 2017, and he mentions that it's up through October 9th, which is when he
pulled the data. But all the information is there. So you could grab it yourself if you want,
even the little scripts he's got for modifying some of the data. So some of the interesting things: the top parts of CPython that are modified. It's probably not
that surprising that ceval.c is involved in 259 merge requests. It's the top merged file.
ceval.c.
Yeah, that's where the bytecode processor is.
So yeah, that's the center point, or the tunnel
everything flows through.
Does that make sense?
Yeah.
And then it goes on and looks at which contributors have merged.
And this is an interesting thing.
Or had been involved in PRs. He lists the top 50 people, but it includes some bots, which is interesting.
I was going to ask that. I thought bedevere is probably going to be up there, or miss-islington.
Yeah, both bots, by the way. I'd actually love for either me or Michael or somebody to talk to one
of the core Python people about the different bots that are used and why they're used,
because that's an interesting thing, large projects using bots to help out with the work.
Yeah, that's interesting.
Anyway, among the non-bots, there's a couple of people that stand out, Victor Stinner and Serhiy Storchaka,
so I apologize for messing up your name,
but they're really up there,
so that's pretty interesting that they're involved a lot.
And then there's a description here,
a nice note that Lukasz writes:
clearly it pays to be a bot or a release manager, since this naturally
causes you to make a lot of commits. Victor and Serhiy are neither of these things and still
generate amazing activity. Kudos. And also, it's not a competition, but it's still interesting to
see who makes all these recent changes. By the way, that top PR thing was only since
the beginning of January 2020.
So taking a look at the more recent stuff.
And then one of the things that's interesting looking at who contributed where, I didn't know this.
There's an experts index.
So that was linked.
Oh, it's asleep.
An experts index that is part of the Python developer's guide. I didn't know
this was here. It kind of lists some parts of the system, but there's blanks.
And so Lukasz also listed the script and pulled out the top five
contributors to each file, which is kind of an amazing list,
you know, the top five people for every file within CPython. So this is kind of neat,
because if you're going to do a PR, or you're working on a fix or something, and you're a
little confused by some of the code, one of these people might be able to help you out. So it's kind
of a neat list. At the bottom of the
article, he also talks about some of the takeaways from this. I
don't have this right off the top of my head. Merging, how long it takes to merge a PR.
So, uh, it's hard to draw information from this data because it's all over the map. The standard
deviations are pretty large,
but if a core developer merges their own PR,
it takes on average about seven days to get through the process,
give or take 42 days.
And then a core developer authoring a PR
which is merged by somebody else,
it takes longer,
about 20 days,
give or take 78.
And then for a community author,
it's up to 20 days,
give or take 80. But I mean, I work on commercial projects that are not really that much faster than this, so it's not
too bad.
Yeah. What do you think of this article? Yeah, Anthony, what do you think of this? You spent a lot
of time inside the CPython code. I mean, you did write a book, CPython Internals, which people can
check out, right?
Yeah, that's how you have to write a book about CPython source code. So it's interesting. First of all, I'm super excited about Lukasz
being the new developer in residence. I think he's got the right approach, and he's already made
really promising progress, I think, in terms of trying to make the community
contribution process a bit slicker. Just watching the GitHub repository,
core developers working on the repository and
making changes and stuff, from the outside it looks fairly seamless. My own personal experience
has been, if your PR gets responded to
within the first week, then it will probably get merged pretty quickly, and if it doesn't, then it just
kind of ends up in the pile. And I've had ones in there for like three years.
Right, the average
was seven plus, but it could go out like another 40 days, and it's probably either really
quick or really far.
Well, yeah,
that metric is how long they take to get merged, which I guess requires that they are merged.
Yeah. Oh, yeah.
So that's all I mean. There's basically just loads of people
contributing stuff, and there aren't enough people with enough time to sift through it all, and
it just makes it really tricky, and the project needs to continue marching forward.
And there's people who are dedicated
to working with the core developers.
But some of the community contributions
are really valuable.
I think that's what's promising to me
is that Lukasz is kind of looking at that
and not just taking this role on
as I'm going to be 100% core developer.
Yeah. Because yeah, I think there's already lots of other people on the team who are making some amazing contributions. You
know, Pablo has been working on the new parser, and now he's working on this, like, stack list
changes in 3.11. Yeah, there's so many things going on at the moment in CPython, so it's really
encouraging to see.
Yeah, it's super encouraging. I think Lukasz is doing a good job sort of smoothing out the
edges to just make it easier for everyone to go faster, which I think a lot of times in teams,
not specifically here, but in general, there's these people who are kind of,
oh, that's the person you can ask to make the CI work again when you break it. This is the person
you ask to set up a new machine, who remembers how to do that. And you don't necessarily get
direct credit for doing that work, but without them it's just way harder, and I feel like he's doing
that for CPython behind the scenes.
Yeah, the experts index is really helpful if you want to
get involved in bug triaging. So that's something that people are open to help with.
If you go on bugs.python.org and you want to help
to triage bugs, often what you have to do is look at it, make sure that the person who's
reported it has filled in enough information, and then basically add people on the experts index
to something called the nosy list, which is like a CC list, basically, on the bug. And then, yeah,
it's just kind of directing it to the right people.
Once you've done that for a while, then you get given a triage flag on your user,
and if you've been doing that for even longer, then you could be promoted up to a core developer.
And there's a few people who've gone through that route over the last couple of years.
All right, Anthony, while you're talking, I got two things to share out of the audience. Dmitry Figol, hey,
Dmitry. Great to see you here.
Dmitry says, thanks for inviting Anthony.
He's someone I look up to.
Very nice.
Thanks to meet you.
Good to see you.
Yeah.
And Waylon, who was recently on TalkPython.
Hey, Waylon.
Says, what a great lineup here.
Also kind of for you.
And also Henry Schreiner.
Hey, Henry.
Also recently on TalkPython.
Says, both PRs I've been involved with to CPython got in in about a day, I believe, which,
that's pretty amazing.
That's pretty good.
Yeah, that's great. Yeah, so before we move off from
this one, Brian, it's a good pick. One thing I just want to point out as well is all these cool stats and
these graphs and everything we're seeing here apply to CPython because it's on GitHub, right?
Yes.
But you can run the same code and run Datasette from Simon Willison against it,
but against a different repo, I would imagine, right?
Oh, yeah.
Yeah.
So if you run a project, you could probably do a similar analysis for your project.
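For listeners who want to try that on their own repository, here is a minimal sketch of the three measurements discussed above: most-touched files, top contributors per file, and merge time with its spread. The records are made-up sample data (the authors and durations are hypothetical); a real analysis would load pull request data exported from GitHub rather than a hard-coded list, and this is not the script from the article.

```python
from collections import Counter, defaultdict
from statistics import mean, stdev

# Hypothetical pull-request records, invented for illustration only.
pull_requests = [
    {"author": "alice", "files": ["Python/ceval.c", "Lib/ast.py"], "days_to_merge": 2},
    {"author": "bob", "files": ["Python/ceval.c"], "days_to_merge": 1},
    {"author": "alice", "files": ["Python/ceval.c"], "days_to_merge": 90},
    {"author": "carol", "files": ["Lib/ast.py"], "days_to_merge": 3},
]

# How many merged PRs touched each file.
files_touched = Counter(path for pr in pull_requests for path in pr["files"])

# For each file, how many PRs each author contributed.
contributors = defaultdict(Counter)
for pr in pull_requests:
    for path in pr["files"]:
        contributors[path][pr["author"]] += 1

durations = [pr["days_to_merge"] for pr in pull_requests]
average, spread = mean(durations), stdev(durations)

print("Most-touched file:", files_touched.most_common(1))
print("Top contributors to ceval.c:", contributors["Python/ceval.c"].most_common(5))
print(f"Merge time: {average:.1f} days, give or take {spread:.1f}")
```

One slow pull request is enough to push the standard deviation above the mean, which is the same "seven days, give or take 42" shape described earlier.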
That's a good idea.
Yeah.
All right.
Speaking of good ideas, and it's interesting that Henry is out in the audience,
because I feel like we might have been responsible for this article.
Clearly, we did not write it.
We may have triggered it, is what I'm saying.
Mostly me, and not in the positive way, right?
So this is a cool article by Paul Ganssle,
who was also over on TalkPython
talking about the mysteries of datetime and stuff.
There's all sorts of cool things.
He maintains the dateutil package
and setuptools projects and so on.
Over on episode 271. So he wrote an article that said why
you shouldn't invoke setup.py directly. And the reason I think I might have somehow had something
to do with this is Henry was on talking about cibuildwheel and all the proper ways to build packages. I
said, oh, you can run python setup.py, space, you know, wheel or bdist or something. They're like,
no, no, no, you could, but please don't do that. And then here we have this article like two days later. So I don't know if
that was part of that conversation, but it's a really good article talking about the state of
building Python packages. And it says, you know, look, for a long time, setuptools and distutils
were the only game in town when it came to creating Python packages. Right. So you could do something like invoke python setup.py bdist, sdist, wheel, and so on.
Wait, I see.
So Paul is actually in the audience.
Real time.
Fantastic.
Hey, Paul says, I think I did it because Matthew Feickert asked for it on Twitter and I got
sniped.
Yeah, perfect.
Okay, good.
So I'm just a coincidence.
Fantastic.
But yeah, so my, the reason this is extra interesting to me, and thank you, Paul, for writing it,
is I was still doing this python setup.py with various commands.
And I was talking to Henry.
He's like, no, you shouldn't do that.
You should do it this other way.
I'm like, well, he said, well, OK, well, how should I do this?
You should use build, the build package.
What is this build package you speak of?
So we've talked about pyproject.toml
a bunch of times.
We've talked about things like flit and stuff
that will use it, right?
This all comes from PEP 517.
And there is a package called build.
You can pip install build.
And then you do things like python -m, for module, build.
And you can say, I want an sdist,
I want a wheel and things like that. And this acts
as a front end to things like setuptools, to the various backends that do building for Python.
Yeah. All these different things that understand it. Right. So it says all direct invocations,
Paul says, all direct invocations of setup.py are effectively deprecated in favor of purpose-built standard-based CLI tools like pip, build, and tox.
So this is quite a long article.
There's a lot to go through.
It has some interesting history.
So in the early days, there wasn't even distutils.
And then in Python 2, distutils got added, and then setuptools came along. And then while they work, there's still problems.
Like, for example, you might have dependencies that you have to install to run the setup.
But the way you install stuff and figure out what you depend upon is by running the setup.
So what do you do?
So an example of that would be Cython, right?
So you might have to import Cython, and in the invocation of calling
setup, you tell it how to Cythonize the PYX files, right? But that's obviously not going to work
because you're going to have to have Cython installed, but how do you express that? You know,
it's like this chicken and egg problem, right? So let me pull up my notes here. Yeah. So basically
one of the big questions was why am I not seeing deprecation
warnings? Let me go down a little further. Yeah. So if I'm not supposed to do this, why isn't it
screaming from the top of its terminal? Stop, stop, stop. Why are you doing this? Right? So
there's a lot of commands that still have indirect uses of distutils and stuff. So it's a little
tricky to deprecate it, but you should consider it deprecated.
At the end of the day,
it's better to replace your setup.py commands
with tools like build
instead of setup.py sdist or bdist_wheel,
or tox and nox
instead of setup.py test and other commands
backed by projects intended to support that.
Yeah, that sound good to you guys?
Where were you on this?
Brian, you go.
Well, I don't have opinions. I mean, I've kind of indirectly used build, but I
basically just use Flit. I'm not writing things with C extensions, so for pure Python stuff
I just do a flit build or whatever. That works fine.
Yeah, so that's kind of, I mean, that's using
the pyproject.toml stuff,
right?
Yeah.
Yeah, Anthony?
If I'm starting a project now, then I use pyproject.toml and the
project doesn't have a setup.py. There were some reasons why I had to add one in the past, but
that's mostly fixed now. So I'm either using Flit or something similar like Poetry. And I've worked on projects years and years ago
where the setup.py just ended up being a script to run ad hoc commands. Like there
was a setup.py test, and then there's like, lint, and...
Yeah, what does that have to do with
installing software, right?
Nothing. It just ended up being an entry point to do things, and one happens to be install, but there's a bunch of other stuff
you might randomly do. And it's fine that it's being deprecated, but, you know, CPython
still does that. The setup.py in CPython is still used in that way, called and invoked
directly in the source code. So yeah, it's good that
it'll be deprecated, but I don't think the tooling is quite ready yet.
He's not really saying to get
rid of setup.py. Just don't use it, don't run it directly.
Yeah, find something better. pip
should do that. pip should do the discovery for you for PEP 517 and run the correct steps for you.
So yeah, absolutely. So a couple
comments out in the live stream is that while recommending build it's uh nearly impossible to
google to find it and race as i love and hate the name so authoritative so ungoogleable and a bit
hard to use in conversation but yeah yeah for sure yes So I think if you want to take away from this conversation,
right at the top, there's a TLDR section that Paul put in.
Click on the summary, takes you down to a summary,
and you can go to a table and it says,
I was about to type this.
What should I do instead?
I was about to type setup.py sdist.
What should you type?
python -m build, having build installed.
Or if I was going to type setup.py bdist_wheel,
I should type python -m build --wheel
or something like that.
setup.py test.
Oh, maybe pytest or tox or nox.
We covered Nox recently with Prason,
which was really fun, I believe, episode.
setup.py install.
No, that's pip install.
python setup.py develop.
No, that's pip install -e.
And then as well as upload, it goes back to Twine.
So yeah, anyway, I think this is the most actionable bit here.
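As a quick reference, the replacements read out above can be written as a small lookup table. This is only a sketch of the list as it was described in the episode, not a copy of the table in Paul's article; the `suggest` helper is hypothetical, and the `.` and `dist/*` arguments are common usage added for illustration.

```python
import shlex

# Replacements as read out in the episode. The "." and "dist/*" arguments
# are assumptions added here so the commands are complete.
REPLACEMENTS = {
    "sdist": "python -m build",
    "bdist_wheel": "python -m build --wheel",
    "test": "pytest (or tox, or nox)",
    "install": "pip install .",
    "develop": "pip install -e .",
    "upload": "twine upload dist/*",
}


def suggest(command_line):
    """Return a suggested replacement for each setup.py subcommand found."""
    words = shlex.split(command_line)
    if "setup.py" not in words:
        return []
    subcommands = words[words.index("setup.py") + 1:]
    return [REPLACEMENTS.get(sub, f"no suggestion for {sub!r}") for sub in subcommands]


for old in ("python setup.py sdist", "python setup.py develop"):
    print(old, "->", suggest(old))
```

Note that plain `python -m build` produces both an sdist and a wheel by default, which is why it can stand in for both of the old commands.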
Yeah, it's good.
Yeah, indeed.
All right.
Well, Anthony, let's talk about keeping an eye on things.
Yeah, so I wanted to highlight a project which has been in the works for a while,
but they've just recently finalized the
specification. So this is called OpenTelemetry. It's a part of the Cloud Native Computing Foundation,
the CNCF, and it's a cross-language event tracing, performance tracing, logging, sampling
framework for applications, in particular for distributed applications. So if
you've got an application which is spread across multiple microservices and you want to
trace things or monitor performance or whatever across all of the stack...
It's a super hard problem, right? Maybe you've got a Docker container running this thing, that Docker
container calls some other service on a different Docker container, and maybe the logs are even transient. What are you going to do to know
if something went wrong, and where?
Yeah, exactly. If you've got an application that's split
into multiple microservices and one of those services has a fault, it's
really hard to know where that fault came from. So if it just says error, blah, blah, blah, blah, you're like, okay, so what triggered that error?
Which request from a user at the front end
or like how did that error happen in the first place
and how can I fix it?
And also like identifying, I guess,
tracking performance across your application
and looking at that.
So there's been attempts at doing this in the past.
OpenTracing and OpenCensus were the two
projects beforehand. So this new project, OpenTelemetry, is a merger of OpenTracing and
OpenCensus. There's engineers from some big companies working on this, including Microsoft,
Amazon, Splunk, Google, Elastic, New Relic, and a whole bunch of others as well, including actually
full-time engineers from some of those companies working on this. So yeah, I've been working with
an engineer at Microsoft who works full-time on this project. He works on, actually there's a few
people who work full-time on this, but the person who works full-time just on the Python components
to this. So the SDK basically allows you to
instrument lots of different frameworks. So you can basically drop it into Flask or Django or
Starlette, so if you're using FastAPI, and you can sort of instantly get capture of what requests
are going into the application, when there's been a crash, like where that exception's gone,
all the logging information.
You can look at performance records and stuff.
I've been sharing some examples
of where I've wrapped it around a FastAPI app,
and then I can see performance
of what's the average request time
for each of these parts of the application,
and where is that time spent,
even down to like-
Can you say like, this is the data layer
section and this is the business logic and here's the organization or whatever?
Exactly. So I can kind of see almost like a call stack, but across the actual components of the app. So
here's where it came into FastAPI, here's where it went into the database, here's how long the
query took, here's how long the ORM took to remodel it, here's how long
Jinja took to build the template. So you can kind of see a breakdown of all the different
components and how things are being pulled together. So there's two parts of OpenTelemetry,
actually more than two parts. I am actually really appreciative that, even though there are lots of engineers from big companies, this hasn't been
over-engineered yet, and I'm really hoping it doesn't.
Is there a factory factory method in here?
Yeah, exactly. Especially because it's so generic, there's a real danger of it being
just over-engineered. So if you go on the website and go to registry and then pick Python on the
right hand side you'll see the kind of different extensions you can get. So instrumentation is basically like,
this is the thing I want to monitor. And it could be like ASGI or async Postgres, for example,
database, Celery, Django, Elasticsearch, Flask. There's a stack of app stacks that you can just
drop it into and it will give you all the tracing information. And then there's these things called exporters, which is
basically like, once it's got the information, it can send it to somewhere, like Datadog or New
Relic or Azure and AWS, obviously, and Google monitoring as well. And actually, something I just
worked on recently: if you just want to
hack around with it, there's an exporter for Rich that just basically prints it on the console
so you can see everything that's happening.
And it's all colored, probably?
Yeah, so it's all kind of color coded. It's really nice,
actually. So yeah, I'm really excited about this.
I've been mostly trying it with FastAPI, as there aren't really many frameworks
for setting up like decent monitoring
and tracing in FastAPI applications.
And yeah, I think it's really promising.
So I suggest you check it out.
And if you see a framework that needs support or something,
then this is all open source
and they're all accepting contributions as well. And it's fairly straightforward to add support.
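To make the ideas of instrumentation, spans, and exporters concrete, here is a toy, standard-library-only sketch. It is deliberately not the OpenTelemetry API: the `span` helper and the `exported` list are hypothetical stand-ins that only illustrate the concept of timing nested pieces of work under one shared trace ID and handing the finished records to something else.

```python
import time
import uuid
from contextlib import contextmanager
from contextvars import ContextVar

# The span that is currently active, if any (hypothetical stand-in for tracing context).
_current_span = ContextVar("current_span", default=None)

# Stand-in for an exporter: finished spans are simply collected in a list.
exported = []


@contextmanager
def span(name):
    """Time a block of work and record it under the active trace."""
    parent = _current_span.get()
    record = {
        "name": name,
        # Child spans reuse the parent's trace ID; a root span creates a new one.
        "trace_id": parent["trace_id"] if parent else uuid.uuid4().hex,
        "parent": parent["name"] if parent else None,
    }
    token = _current_span.set(record)
    start = time.perf_counter()
    try:
        yield record
    finally:
        record["seconds"] = time.perf_counter() - start
        _current_span.reset(token)
        exported.append(record)


def handle_request():
    with span("http request"):
        with span("database query"):
            time.sleep(0.01)
        with span("render template"):
            time.sleep(0.005)


handle_request()
for record in exported:
    print(record["trace_id"][:8], record["name"], f"{record['seconds']:.3f}s")
```

Every line printed carries the same trace ID prefix, which is the property that lets a real tracing system stitch together work done in different components.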
Yeah, it's got Postgres, MySQL, MongoDB, Pyramid, Redis, all sorts of good stuff in here.
Another thing maybe worth pointing out here is because this crosses languages, right? There's a
Python one, but there's also a .NET one, there's a Swift one and so on, which means there might be scenarios
where I've got like, say, a mobile app written in Swift
and then I've got the backend written in Python
and FastAPI or something.
And you want to put those together.
Because it goes across those languages,
theoretically, that's a thing that could happen.
Absolutely, yeah.
And you can pull that all together.
And it would give a request a trace ID. So when a
request comes into the front end, a trace ID could carry across the different stacks as well, which
is pretty cool.
Yeah, very cool. This is neat. Awesome, thanks for covering it. Now, before
we move on, Brian, we have a sponsor for this episode.
That's cool. Yay. Thanks to Shortcut.
Shortcut, formerly known as Clubhouse. They're a really
cool project management tool, and they ask the question: have you ever really been happy with
project management? You know, how's your Jira or whatever, right? How much are you
loving it? So they basically say most are either way too simple for growing engineering
teams to manage everything or too complex and just throw in the kitchen sink and you don't want to work with it.
You've got to constantly tweak it to make it work for you.
So Shortcut, who used to be known as Clubhouse, is different.
They try to be simple.
It's project management built specifically for software teams.
It's fast, intuitive, flexible, many other nice positive adjectives.
So some of the highlights are team-based workflows,
individual teams can use Shortcut's default workflows or customize them to match the way they work. Also organization-wide goals and roadmaps. So these workflows automatically get
tied into larger goals and feed into a bigger system outside the team. Good source control
integration: GitHub, GitLab, Bitbucket, all those types of things. One thing that I really love is the web app has hotkeys. So it's keyboard friendly, just like
PyCharm and VS Code, whatever, right? I don't know why more web apps don't have hotkeys. It's not
particularly hard, but they do, which is great. Iteration planning, so you can set your priorities
and let Shortcut run the schedule. You get nice little burndown charts and so on. So check them
out at shortcut.com slash Python bytes, because you shouldn't have to project
manage your project management. That does not sound fun. So let them do it. It's their job. Now, before
we move off to the next topic, Robert Robinson in the audience, hey Robert, says this OpenTelemetry sounds
interesting, wants to try it out. I do as well. I feel like this is the kind of stuff that you just keep putting off integrating into your system, and then once you finally do, you're
like, oh, look how awesome this is. We can see what's going on. And actually, did you know this part
was crashing? No, I didn't know that. Nobody looked at the log and it was just eating even the exception,
right? Yeah, tricky, tricky. All right, Brian, you got the next one.
So Python's got a few built-ins, not a ton, but quite a few. So
this is, there's an article from Tushar Sadhwani called Understanding all of Python,
through its builtins. And he's got a pretty ambitious goal here, to understand
everything. But I actually really enjoyed
even the first part of it. So I started reading it. I especially want to give
a shout out to him. He's been fairly involved on Twitter, answering questions and being involved
in conversations. So that's a good way to get noticed. It starts off talking about scope. So
built-ins are not just things that Python has built in. They also have a relevance
to the scoping rules, and he calls it the LEGB scoping rules. So if Python
sees a symbol, first it looks in the local scope, then the enclosing scope, then the global scope, and
then the built-in scope. And built-ins really are just anything that's in the builtins module.
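The lookup order just described can be seen in a few lines. This is a minimal sketch; the function and variable names are made up for illustration.

```python
import builtins

name = "global"  # Global scope: a module-level name


def outer():
    name = "enclosing"  # Enclosing scope, as seen from inner()

    def inner():
        name = "local"  # Local scope wins first
        return name

    # Here, outer()'s own `name` is found before the module-level one.
    return inner(), name


def uses_builtin():
    # No local, enclosing, or global `len` exists in this file,
    # so the lookup falls through to the builtins module.
    return len is builtins.len


print(outer())
print(name)
print(uses_builtin())
```

Because built-ins are the last stop, defining your own `len` at module level would shadow the built-in one for everything in that module.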
So that discussion, it's a really pretty good discussion,
and it's good especially for newbies to understand, but even advanced beginners sometimes
don't quite understand what's going on here. Yeah. Brian and Anthony, you both come from C style languages historically, right? Or
at least you've spent a lot of time there, right? Brian, you do a lot of C++. Anthony, I know you've
done some C sharp and stuff. Did the scoping story of Python confuse you and kind of leave
you a little uncertain in the beginning? Yes, definitely. Especially coming from C++ where it's very well defined.
And if it's in the curly braces, it's alive afterwards, it's gone, right? Like,
wait a minute, that's not the story at all. Right. And also, you've got so many nested
curly braces, it could be anywhere. And it's not really, it seems like, actually,
we just don't do that too much in Python, but Anthony would probably know better than me. If I've got multiple nested
curly braces,
we don't have curly braces, but
multiple nested indentations,
does the scope sort of
look in outer and outer and outer
ones? Is that what nonlocal means?
There's a nonlocal
keyword, which
is like a whole other thing.
That's a completely different thing.
Okay. I think, I don't know, capture, basically.
Yeah. But yeah, the difference in global really
freaked me out, because what was pounded into our heads everywhere is to never use global
variables. But global is different. The global namespace is not a global variable. It's more like a module level.
Yeah, yeah.
Or like a static variable in a class,
maybe would be what other people might call it.
Yeah, it's not a dangerous thing in Python.
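A short sketch of the two keywords that came up: `global` rebinds a module-level name, while `nonlocal` rebinds a name in the enclosing function. The names here are hypothetical examples.

```python
counter = 0  # "global" in Python means global to this module only


def bump_module_counter():
    global counter  # rebind the module-level name instead of creating a local
    counter += 1


def make_counter():
    count = 0

    def increment():
        nonlocal count  # rebind the name in the enclosing function's scope
        count += 1
        return count

    return increment


bump_module_counter()
tick = make_counter()
first = tick()
second = tick()
print(counter, first, second)
```

Neither keyword is needed just to read an outer name; they only matter when a function assigns to it.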
So I didn't mean to derail you that much,
but I think it's interesting to think about
the built-in scope, the global scope,
these different scopes,
because it's such a different world
from the intuition you get
coming from all the C languages. Yeah. Also just sort of just really enjoyed
looking at the language through the scope of built-ins. It's an interesting take on it.
One of the, I will pull out a few things that he mentions, and one is all the constants. I guess
I'd never counted them before, but there's five. There's five constants in Python: True, False, None, Ellipsis,
and NotImplemented. I do like Ellipsis. We talked about that the other day, or I guess
one or two weeks ago, using dot, dot, dot instead of pass.
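Here is a small sketch of two of those constants in use: `...` as a placeholder body, and `NotImplemented` as the value a comparison method returns when it does not know how to handle the other operand. The `Money` class is a made-up example.

```python
def not_written_yet():
    ...  # an expression statement that does nothing, used like `pass`


class Money:
    def __init__(self, cents):
        self.cents = cents

    def __eq__(self, other):
        if not isinstance(other, Money):
            # Tells Python this comparison is unsupported here,
            # so it can try the other operand before giving up.
            return NotImplemented
        return self.cents == other.cents


print(... is Ellipsis)
print(not_written_yet())
print(Money(5) == Money(5), Money(5) == 5)
```

Returning `NotImplemented` (rather than raising `NotImplementedError`) is what lets `Money(5) == 5` fall back gracefully to `False`.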
Are you going to start doing that?
I've already started doing that.
Have you? I'm all about it. I think I'm up for it as well.
I guess I don't think I've ever used NotImplemented or even looked for it.
But interesting discussion. Also, I just like looking around. So here's a section on compile,
exec and eval. It's not an alphabetical listing of everything. It's more grouping them together.
It's quite a big article. But I would suggest people just skim through the list, because
it's got a good table of contents at the top. You can just sort of skim through what he's talking about
and pick a couple and go read about them. You'll probably learn something. So anyway,
a good shout out to Tushar for writing this. Yeah, this looks super handy.
Yeah. Some of the built-ins are super handy. I often have a Python REPL open just to do
things that would otherwise be annoying to
do on a calculator, like converting hex to integers and vice versa. There's a hex built-in, which is
really helpful, actually.
Yeah, I use hex a lot, because I'm often looking
at data elements in a packet or something like that and trying to convert those. So yeah, very
nice.
Nice one. Before we move on, Anthony, how do you feel about dot, dot, dot?
They should have called it yada, yada, yada.
Yeah, that's way better than Ellipsis. Come on.
Yeah, I'll use it for type stubs, and that's it.
So yeah, there's times we use pass, right? And I feel like, you know what, dot, dot, dot kind of says,
I'm not ready to put stuff here yet.
I think, instead of Ellipsis,
we should call it dun, dun, dun.
Exactly.
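Going back to the calculator-style conversions mentioned a moment ago, this is roughly what that looks like in a REPL. The packet bytes are made up for illustration.

```python
packet = bytes([0xDE, 0xAD, 0x00, 0x2A])  # made-up payload, not real data

print(hex(255))                            # '0xff'
print(int("0xff", 16))                     # 255
print(int("ff", 16))                       # 255, the prefix is optional
print(packet.hex())                        # 'dead002a'
print(int.from_bytes(packet[2:], "big"))   # 42, last two bytes as a big-endian int
print(f"{42:#06x}")                        # '0x002a', zero-padded with prefix
```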
All right.
How about we hand out some awards?
Okay.
Best open source software of 2021.
Now, who gets to vote on this?
Who gets to say? Well, InfoWorld in this example.
So this is according to InfoWorld,
but there may be other rules.
But I found this to be pretty interesting, actually.
I heard about it, learned about it because Sebastian Ramirez from FastAPI said,
yay, we've been voted
one of the best open source projects.
So this is called the InfoWorld Bossie 2021 awards.
But what I thought was interesting
is going through here,
there were 30 different projects that won awards. I'm like, oh, that's interesting. Oh, I didn't
know about that. Oh, check this out. Yeah. So I wanted to touch on a couple. So there's some
things that may or may not be interesting to you, like Svelte, which is a JavaScript front end,
like Vue or React. That's not interesting to me. But Minikube, Minikube is pretty interesting.
Minikube is a way to run like a baby Kubernetes cluster right on your computer. Just say Minikube start and guess what? You've got a cool little cluster running. So that might be really helpful
for Python people. Let's see. Pixie, zoom back a little here. Number five is FastAPI.
We're all fans of FastAPI.
I think it's really awesome that it won.
And worth maybe giving a quick shout out to how they described it: Django and Flask
have been leading the Python web frameworks for years.
FastAPI now deserves to be mentioned in the same breath.
I agree.
It calls out the main features,
which are that it's a truly modern Python web framework
written from the ground up using type hinting,
async, and high-speed components by default. That's true. And I also really like that they pointed out that while
its name indicates it's primarily for APIs, it's also really good at writing more conventional
websites with, like, Jinja templates or even Chameleon templates. So way to go. Anthony, you
want to add, or Brian want to add anything?
Well, I just think that you're partly to thank for people considering FastAPI for not just APIs
because you've been beating that drum a little bit as well.
Yeah, thanks a bunch.
I even created some decorators that make it really easy
to render templates as response values and stuff.
Yeah, it's fun.
Anthony?
Yeah, I tried out the Chameleon thing.
The one you wrote, actually.
Yeah, because I'm working on this FastAPI course with you
at the moment. Yeah, that's going to be fun. So yeah, I'm a big fan of FastAPI. I think it's brilliant,
and a testament to Sebastian, really, because he builds on something which is quite
complicated, but he makes it seem so effortless. And just working with FastAPI, the
documentation is excellent.
The framework itself is
really logical.
And, you know, it's really easy to use.
I've been keeping an eye on the popularity of the different frameworks and stuff
over the last few years.
And Django and Flask are kind of neck and neck
and have been for a while.
And FastAPI now is the third most popular
according to the metrics that I've...
Yeah, out of nowhere to third most popular.
Yeah. And I know JetBrains are doing the latest PSF developer survey, so
we'll see what happens in this year's numbers. But I'd imagine FastAPI would
still be the third most popular. So
yeah, which is brilliant. I think it's a good, solid pick.
In terms of writing full apps with it at the moment, there's still a lot you have to do for
templating. You pretty much have to build in a whole bunch of other templating stuff,
and picking an ORM at the moment isn't easy, but there are some
brilliant ones to have a play with.
Yeah, there's a couple interesting ones
I want to give a shout out to.
Yeah, similar ones that even integrate with Pydantic, which is sort of the natural
exchange of FastAPI.
So you want to give a shout out to Tortoise, you say?
Yeah, that's my favorite so far.
I've tried out six different ones so far.
And Tortoise, I think, is my favorite at the moment.
Right on.
Well, maybe next year we'll be talking about the award for SQLModel,
which is built on top of Pydantic plus SQLAlchemy, by Sebastian as well.
So who knows?
A lot of good ones out there.
It's good to see a lot of the excitement and new ideas coming along there.
All right, what else we got?
Crystal, don't care.
Windows Terminal, I think is actually pretty interesting. Windows has traditionally been not on par with its terminal
experience. And I think, you know, the Windows Terminal, PowerShell 7, Oh My Posh, all these
things come together, nerd fonts, to make it quite an amazing place to be actually.
Windows Terminal is an open source project?
It didn't start out that way, but now it is.
Yeah. Okay.
Yeah. Yeah.
So that's a good one.
OBS Studio, if you're doing video stuff, that's amazing.
There's a bunch of stuff in here that may apply to people that you can all check out that are interesting,
but I don't want to cover them.
Dask, though.
Dask is a big data science one.
Scale computation, like pandas operations and whatnot,
across cores, across clusters, across compute that's larger than the RAM you have by streaming it off disk, and all sorts of interesting
stuff. I have no idea why my browser is jumping up and down. We'll have to ignore that. I'm not
in control of it. I'm sorry. You know, I'll tell you why this is happening. I'm looking
up and I see I'm not running my VPN, which would block ads.
And so there's some kind of ad off the screen
that's just running.
And if I turn on my VPN, we'd be good.
All right.
BlazingSQL is another great one.
RAPIDS from NVIDIA.
And I feel like there's one more
I want to give a shout out to.
Hugging Face.
I don't know anything about that.
Now that was it.
So just going through that list,
I thought it called out a lot of neat projects
in addition to just FastAPI.
Yeah.
Yeah. Any of those jump out at you guys that I've just screamed by?
Lots of stacks that I don't use, so...
Same. Yeah, there was a bunch of ML stuff, though,
which I don't use, but I think would be relevant to people who are listening, maybe.
Well, we're not
to extras yet, Michael.
No, no, I know. I just closed it because the jumping was driving me insane.
Okay. All right. Anthony, you got the last main one, right?
All right. Yeah. So I think
Łukasz is taking up like half of this episode, so we're going to get back to Łukasz's blog and
evolve the discussion that was started last week.
Yeah. Let's put it mildly: I'm excited about this. I
think if this happens, it's probably going to be the biggest thing to happen in CPython in the
last five years, in my opinion.
And this being the GIL removal?
This being the GIL removal.
But not the Gilectomy.
Not the Gilectomy, not exactly. So, no GIL, or let's just go with nogil.
So almost seemingly out of nowhere, Sam Gross, who works for Facebook, basically
submitted to the core developers this research paper and a working branch of a GIL-less Python. And just to quickly recap, I guess, on what that means:
this article is pretty heavy in technical detail, and the stuff
that's being discussed in the article, again, is pretty complicated, and I actually didn't understand
a lot of it. And I've written a book on the Python compiler. If you read this and it's confusing, don't worry.
So the GIL is basically the global interpreter lock.
And it exists as a way of making Python thread safe when it comes to keeping reference counts of specific objects.
So if you create a Python object, for example, there's a counter of how many things are referencing it,
because you don't want to just destroy an object. Say you're working through a
list of objects, for example, but then one of the items in the list just disappears, has been
deallocated, or, because everything is a pointer in Python, that pointer just goes
nowhere.
Or actually there's a magic pointer that Python uses when it deallocates objects,
which I know from a very painful experience.
So you don't want that to happen.
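The counter being described can be observed from Python itself. This is a small sketch, assuming CPython; the `refcount_delta` helper is made up for illustration. It compares counts before and after rather than printing absolute numbers, since the absolute value reported by `sys.getrefcount` includes temporary references and varies between versions.

```python
import sys


def refcount_delta(extra_refs):
    obj = object()
    before = sys.getrefcount(obj)
    holders = [obj] * extra_refs   # each list slot holds one more reference
    during = sys.getrefcount(obj)
    del holders                    # dropping the list releases those references
    after = sys.getrefcount(obj)
    return during - before, after - before


print(refcount_delta(3))  # (3, 0) on CPython
```

When the count for an object reaches zero, CPython frees it immediately, which is why an unsynchronized update to this counter from two threads would be dangerous.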
And if you've got multiple threads
kind of working with the same objects all at once,
it's incredibly hard to keep track of what's happening.
Threading is great because
you can have multiple threads working on a computer and the operating system can do the scheduling of which threads
run on which cores and which CPUs, et cetera. So in theory, it's a way of making your Python
applications a lot faster if you write them to be multi-threaded. But Python's basically built in this lock, which says,
okay, in the evaluation loop, in ceval, don't let anyone else run an instruction whilst this thread
is running an instruction.
Yeah, with the exception... yeah, it seems like this
is a thing to control threading, and really it's just a thing to protect memory management, but it has this huge blocking effect for threading, right?
Yeah.
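A rough way to see that blocking effect for yourself is sketched below. The job sizes and the `count_down` and `timed` helpers are arbitrary choices for illustration. On a standard CPython build with the GIL, the four-thread run of this pure-Python loop usually takes about as long as the sequential run; the timings are printed rather than asserted because they depend on the machine.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def count_down(n):
    # Pure-Python loop: the running thread holds the GIL for every bytecode step.
    while n > 0:
        n -= 1
    return n


def timed(fn):
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


JOBS = [200_000] * 4

sequential, t_seq = timed(lambda: [count_down(n) for n in JOBS])
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded, t_thr = timed(lambda: list(pool.map(count_down, JOBS)))

print(sequential == threaded)  # same answers either way
print(f"sequential {t_seq:.3f}s, 4 threads {t_thr:.3f}s")
```

Threads still help for I/O-bound work, because the GIL is released while a thread waits on the network or disk.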
So it's the thing to basically make the reference counter thread safe.
Without locking.
So it's fast.
Without locking.
Yep.
So you don't have to wait to do an increment.
So to give you an idea,
if you run the GC by hand,
you'll just see how many tens of thousands of objects are
just created all the time in Python applications. So what Sam had put together,
I say seemingly out of nowhere, but if you go through the article and what he proposed, he's
actually been working on this almost full time for two years, which is astonishing, and it's a real feat of engineering, to be
honest. So what he's proposed is a way of removing the GIL so that there's essentially
almost two ways of keeping references to objects. One of them is specific to the local
thread, and then there's also another reference count,
which is for other threads. So why is that important? Well, let's say, for example, you've got
a Python dictionary with values in it, and then you have multiple threads all working on the same
dictionary. That's a complicated problem to solve. How do you make sure that
the references to the keys
or the values don't disappear? And it does actually go into detail about how that's been
handled. And also objects like Python dictionaries are not thread safe at the moment either. So,
you know, if you have two threads working on a dictionary, adding values, for example, to a dictionary, do you have to lock the hash table?
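At the application level, the usual answer today is an explicit lock around any read-modify-write on shared data. The sketch below shows only that everyday pattern, with made-up names (`counts`, `record`); it says nothing about how the nogil branch protects the dictionary internals.

```python
import threading

counts = {}
counts_lock = threading.Lock()


def record(word, times):
    for _ in range(times):
        # Read-modify-write on a shared dict: without the lock, two threads
        # could both read the old value and one of the updates would be lost.
        with counts_lock:
            counts[word] = counts.get(word, 0) + 1


threads = [threading.Thread(target=record, args=("spam", 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counts["spam"])  # 40000
```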
Anyone who's worked with multi-threading in low-level languages knows
the complexities of doing this. So what he's proposing is that, well, in his
prototype, he basically replaced the Python memory allocator with another one called mimalloc, which is a sort of thread-safe memory allocator.
It's actually a Microsoft project,
but I think it could have been
any other thread-safe memory allocator.
Writing memory allocators is very involved
for them to be performant and efficient.
And then basically objects get tied to the thread that created them.
And then there's a non-atomic local reference count with the owner thread.
And then there's basically a separate mechanism for what would be slower,
basically reference counting from other threads.
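To make the owner-thread idea a bit more concrete, here is a toy model in plain Python. It is only an illustration of the two-counter split described above, with a hypothetical `ToyRefCount` class and a lock standing in for atomic operations; it is not the data layout or the code from Sam's branch.

```python
import threading


class ToyRefCount:
    """Toy model: a plain counter for the owning thread and a synchronized
    counter for every other thread."""

    def __init__(self):
        self.owner = threading.get_ident()     # thread that created the object
        self.local = 1                         # touched only by the owner, no lock
        self.shared = 0                        # touched by all other threads
        self._shared_lock = threading.Lock()   # stand-in for atomic increments

    def incref(self):
        if threading.get_ident() == self.owner:
            self.local += 1                    # fast path: non-atomic increment
        else:
            with self._shared_lock:            # slow path: synchronized increment
                self.shared += 1

    def total(self):
        return self.local + self.shared


rc = ToyRefCount()
rc.incref()                                    # owner thread takes the fast path

workers = [threading.Thread(target=rc.incref) for _ in range(3)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(rc.local, rc.shared, rc.total())         # 2 3 5
```

The point of the split is that most objects are only ever touched by the thread that created them, so the common case avoids synchronization entirely.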
So single-threaded performance is equivalent with this proposal, but
there's still a performance impact of multiple threads working on the same object,
which is to be expected.
Yeah, there's always a little overhead for that.
Yeah. But to give you
an idea, in his note he implemented a few common problems as multi-threaded implementations, and he said
if you give it 20 threads, it runs 19.84 times faster than it would in just regular CPython.
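To put that figure in perspective, here is a quick back-of-the-envelope calculation. Using Amdahl's law is an assumption made for this sketch, not something from the article, and it treats the 19.84x number as if it were a speedup over a single-threaded run of the same code.

```python
def parallel_efficiency(speedup, threads):
    # Fraction of the ideal linear speedup that was actually achieved.
    return speedup / threads


def amdahl_serial_fraction(speedup, threads):
    # Amdahl's law: speedup = 1 / (s + (1 - s) / threads), solved for s.
    return (threads / speedup - 1) / (threads - 1)


eff = parallel_efficiency(19.84, 20)
serial = amdahl_serial_fraction(19.84, 20)

print(f"efficiency: {eff:.1%}")                  # 99.2%
print(f"implied serial fraction: {serial:.3%}")  # 0.042%
```

Under those assumptions, almost none of the work in that benchmark was serialized, which is what makes the number so striking.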
So yeah, for certain types of problems this can have an enormous impact on performance.
But it is really complicated, and that's why I think it's an interesting discussion,
to see, okay, how do we get from this is a cool idea to this actually being released and
being used by, you know, millions of people. And, I don't know, Python's running on
satellites in space and stuff. How do we go from a fork that someone's been hacking around with to
something that's production ready? And this is kind of what the article goes into. So,
you know, how would this work? Would it be a feature flag? Which version would we target?
And at the moment it's targeting 3.9 alpha 3, actually, so it wasn't even the release of 3.9. So he needs to do some work to update that
to the latest version of 3.9, which is 3.9.7. And then I think the target release, if the core
developers agree to explore this, would be 3.11, or... I don't think anyone wants
to touch the Python 4 topic.
But 11 is like a year away.
Is that even possible,
or would it most likely be a couple years out?
Yeah, it seems pretty soon to me.
And subinterpreters, for example,
is like an experimental feature.
I think the issue with this is that the volume of changes is so broad
that it's quite hard to have it in
as a feature toggle.
So subinterpreters was in,
there's like a hidden package that you can use
and it's experimental.
Whereas this is like changing...
Everything.
Yeah.
Well, not everything,
but it's a pretty wide sweeping change,
and changing the memory allocator is a massive change. So the question
is more how can we introduce this softly, I think, and have it as a feature flag. And what
would this break? And the main thing is that C extensions haven't really had to worry about
thread safety because the GIL kind of handles that for them. So C extensions essentially would
need to... if they use the mechanisms that are here, that's fine, but C extensions often have other objects
which they haven't used the reference counter for. So they've basically allocated
their own objects and variables and stuff like that, which would not be thread safe, and
they have not had these kinds of collision issues in the past. So introducing this would potentially
break some C extensions. So, you know, how could that be introduced gently? I think what was
interesting in the article is there's a mention of NumPy, and NumPy has actually done a lot of
its own work already on basically making it thread safe
and more scalable.
But one of the tricky ones is pybind11,
which is called out in here:
anyone who's using pybind11
potentially might have to do some refactoring
to support this, if it was supported.
And then in closing,
Łukasz, who wrote this review or post, said, you know, the team had been really impressed with Sam's work and invited him to join
the CPython project as a core developer. He's interested, and Łukasz is going to mentor him.
So I think that's brilliant.
Oh yeah, that's brilliant.
Just to come up with this over
even two years is a really short amount of time
for a problem that people have been trying to solve
for well over a decade.
So yeah, very exciting.
Yeah, this is great.
I think we have a record number of core developers
in the audience right now.
Yeah.
So some great comments from Steve Dower.
Hey, Steve.
The big thing needed here is a path forward for native extensions.
They could all need rewriting or else importing them could re-enable the GIL.
That discussion is happening now.
It's very early.
And Henry Schreiner also has similar comments that they're considering that.
But yeah.
And Henry also says we would be up for refactoring pybind11 if needed, I believe.
That's also interesting. But this is exciting. There's a lot of stuff coming
here. I think another thing, in addition to the nogil, is I got the sense
that Sam had added several other optimizations that were independently
worth adding to Python.
One of the things, I know there's a lot of tension
around whether or not to do 4.0.
But if it ends up being that all of the extensions possibly need tweaking,
then it's an API change.
And I think the shift to 4 might not be terrible.
Yeah.
Well, we should just go to Python 5,
so no one's worried about 4 and we'll skip the whole conversation.
It'll be fine.
We'll do it like AngularJS. We'll just make a big fuss about going from one to two, and then
all of a sudden they're on, like, version 10 or something. Yeah, we'll just go crazy.
Yeah. No, this is fantastic. I'm actually having Guido van Rossum and Mark Shannon, I believe,
on Monday on Talk Python to talk about performance in the future and stuff.
And I'm sure we'll talk about this stuff a little bit.
Yeah, so it should be a lot of fun.
This was Guido's suggestion when I asked internally
if anyone wanted to share anything.
This is what he sent over.
Okay, fantastic.
Yeah, so I'll try to take that up with him again.
All right.
Well, Brian, does that bring us to our extras?
We are at extras.
Do you have any extras?
Yeah, no, you go first.
Tell us about PyCon.
Well, the call for proposals is open for US PyCon.
I'm pretty excited about that.
I already wrote down like six ideas of things I might want to talk about.
So and of course, there's no guarantee.
No matter who you are, there's no guarantee that you're going to get in.
But it's fun.
It's fun to come up with proposals anyway.
And it's fun.
I'm definitely going.
So I'm pretty excited about that. And anybody else going to propose? Anthony, are you going to try to talk there?
Yeah, I've been thinking about what I'm going to put forward, and I want to put together a talk on
performance anti-patterns.
That'd be fun.
I'll propose that for next year.
Yeah. Because of your name, like Ant, anti.
Also,
if anybody doesn't know,
I wrote a book and then I rewrote it.
And
I'm finished with it, actually.
So it's not out yet,
but I'm pretty excited that I'm finished.
Beta seven is out and has all chapters in it.
So if you're waiting for it to be done, it's done.
It's not in print form yet.
That's going to happen in January or February.
So I'm pretty excited to get that done.
I'm hoping for my copy at PyCon, Brian.
I'm pretty sure I paid for the last one as well.
I actually, I paid you in cash.
So I'm going to give you a copy of my book.
I'll bring at least.
Maybe we can do a swap.
Yeah, that'd be great.
Yeah.
Anthony, I got your book over there.
I'm not sure what I can trade it for though.
That's awesome.
Congratulations, Brian.
Thanks.
Anthony, you got any extras you want to share?
Yeah, I'll be shipping fairly soon
the JIT compiler that I've been working on, called Pyjion.
It'll be going version one in two weeks. So it's a Python 3.10 JIT compiler. You basically just drop it
into CPython and turn it on and then run your code, and it just JIT compiles it in the background.
And in some cases it makes it a lot faster, and in other cases it makes no
difference. But yeah, in some of the benchmarks I've been doing, like floating
point math and integer math, it makes a massive difference.
So, like, the scientific
side of things, right?
Yeah. So stuff that you would otherwise think, oh, I'm going to redo this in
Cython or something, you don't have to add all the extra stuff. You just kind of
turn it on. And yeah, the n-body benchmark is now 60% faster than standard CPython.
That's great.
And yeah, on some of the other benchmarks I've got 60% upwards.
That's super cool. So this work with Sam and the nogil, does that throw a
spanner in the works?
It would make my life quite hard for a few weeks if it gets merged.
So yeah, that could be interesting. And I'm also working on another secret project, but
I'll share that in a few weeks. There's a comment in the chat: Pyjion does use scikit-build. Yeah, I did want to call that out when we were talking
about setup.py earlier. So Pyjion is all C++, and it uses CMake, which
generates makefiles. And it uses scikit-build, which is a CMake extension, I guess,
around Python extension modules. So that's how it compiles. It's really cool.
scikit-build. Yeah, and I recommended using build earlier. Henry, on our episode together, mentioned
that if you have external non-Python code, like C code or Fortran or whatever, then instead of build, scikit-build
would be a good option to build the binary bits for that. This is the other question I wanted to
ask and Steve Dower beat me to it. He states it as an assertion. I was going to ask you the question.
I bet once Pyjion ships, you'll get people interested in helping add optimizations. Yeah.
So it's one thing to JIT compile. It's another to just then straight up run it versus go, oh,
we can inline this method. Oh, and I see we can do this. And then, like, we could actually reuse this field
because it's not used below, and early free, all that kind of stuff. Where's the optimization
on that one?
Yeah, on the documentation page there's an optimization
section, and I've kind of written up a lot of the optimizations and how they work,
assertions that they make and compromises and
stuff like that. So yeah, if you're interested, there's some info on there. But yeah,
I'd love more help on this. The learning curve on the project is quite steep, but I'm
trying to make it easier. I mean, it's a compiler. And I just added ARM support as well, so
Apple M1, and
I tested Linux ARM64,
and in theory
Windows ARM, but I don't have access to any
machines to test the Windows one,
and I could only test
the Apple one remotely.
If you need a periodic test, you can reach out.
I got Windows 11 running on ARM.
Oh really? Okay. Yeah, maybe I'll take you up on that.
Very cool. All right, I have a couple to throw out there as well. The Python Software
Foundation on Twitter, the PSF, says: we're happy to announce the Python Developers Survey 2021.
Take part in it. This is the one that is hosted, and then the data analysis is done, by
JetBrains, but not influenced by JetBrains. So I'll link to that in the show notes.
Be sure to get out there and take that.
Henry, in the audience, has something as well.
He shared a feature the other day on Twitter.
He said, after the Python Bytes mention on yesterday's show,
I asked for a new feature and it's already in:
pipx run pypi-command-line wheels.
And basically this is added to pypi-command-line,
and it'll tell you all sorts of cool stuff, like the details of the wheels. So you could run,
basically, pipx run pypi-command-line wheels numpy, however you run that, and it'll tell you,
like for NumPy on macOS 10 universal, does it have a signature? Is there a binary distribution? What versions
are supported? How old
is it? How big is it? Same thing for Linux
architectures, ARM,
on Windows, and so on and so on.
So you get just this cool graph,
using Rich, of tables
of tables telling you about the
status of wheels on different platforms
straight out of PyPI, which I thought was cool.
Nice. Yeah, so that's pretty good.
So, Henry, thanks for making that happen.
Also, on the last episode, out in the
YouTube comments, not live,
we got a message from,
I want to make sure I get the attribution,
from Bahram, who
said, we talked about,
what is it? tbump. tbump.
That was it. tbump for bumping the versions.
He said, oh, that's cool. I use bump2version, which is another option to do some similar types of things. It can work with
or without source control, all kinds of stuff. So a fun one to check out. And Brian, you sound
really good this time. Last time I thought maybe a bee had gotten into your microphone.
What was the story of that?
It's a long story. Basically, I had to
throw out a mic. So I had a bad mic and a bad cable, but I have a new...
It's tough when the two
things that are connected together are both broken at the same time.
The buzzing, I think, was definitely
my cable. I think there was a feedback thing going on, like you're getting an SMS. But then
I was examining everything in my audio chain, and
I just got rid of the stuff that wasn't working.
So yeah. Oh, you sound great. New mic's even better
than before. So like a phoenix, you're back.
Nice image, too. Yeah. And then have you got your Mac Pro
yet?
No, I just bought a Mac a couple years ago. I'm not going to buy another one right now.
Anthony, are you using one of these to test your own version?
No, I don't have a spare four thousand
dollars for another laptop. And also I was like, I don't really need a laptop because
I never leave the house.
So yeah, that is a big problem. I mean, I am so loving my Mac mini and my
4K monitor that I'm just like, I don't want to leave. All right, well, that's it for the extras. I think it's time for a joke. Maybe Robert's got the first one out
there: can't complain about Brian, it's all about the hair. You've got to see the live stream for that
one, but yeah, I agree with that. Next Halloween I want to go as Cousin Itt, so I've got a
ways to go. Anthony, are you up for doing this joke?
Yeah. Yeah, I got it on my screen.
Oh, you got it on yours?
Yeah, I'll put it on yours.
All right.
Okay, okay.
So it's a picture, so I'll have to describe it.
I couldn't stop laughing at this when I saw it.
So this is Frodo explaining to Gollum.
And there's Gollum sitting at a computer looking quite confused,
looking at a picture of the ring.
And it says, buy now, one ETH.
As in Ethereum, right?
Yeah, as in Ethereum. And Frodo is basically trying to convince Gollum to buy
an NFT of the ring instead of actually having the ring. And underneath: my digital precious.
so underneath it says so you can't own the precious physically, but you can pay to have your name listed as its owner in an online distributed database.
It's only what is that like 400 US dollars, 500 Australian, something like that.
I know that's a lot for a listing.
I don't own any NFTs yet, nor have I sold any.
I don't plan to either.
Man, I feel like we're missing an opportunity
to brand some of our former episodes. Maybe I could just take screenshots of Brian laughing
at different times out of the live stream and then turn it into a stream of NFTs that
we'll retire upon.
Oh yeah, let's do that.
Oh, fantastic. All right, that was a good one. Thanks, Anthony,
and thanks for being here on this big episode 256.
Yeah, I feel like we've maybe gone slightly
over. This is not really a Python byte. This week is more of a Python lunch sandwich.
Yeah, it's a proper meal, a Python dinner. But it was a good one. We talked about a lot of stuff, and
a bunch of great people in the audience gave us really good inside information on where
things are going. So thanks, everyone. Thanks, Brian.
Yeah. All right. Bye, y'all. Thank you.