Python Bytes - #256 And the best open source project prize goes to ...

Episode Date: October 29, 2021

Topics covered in this episode:

* It's episode 2^8 (nearly 5 years of podcasting)
* Where does all the effort go? Looking at Python core developer activity
* Why you shouldn't invoke setup.py directly, by Paul Ganssle (from Talk Python: Unlock the mysteries of time, Python's datetime that is!)
* OpenTelemetry is going stable soon
* Understanding all of Python, through its builtins
* FastAPI, Dask, and more Python goodies win best open source titles
* Notes From the Meeting On Python GIL Removal Between Python Core and Sam Gross
* Extras
* Joke

See the full show notes for this episode on the website at pythonbytes.fm/256

Transcript
Starting point is 00:00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 256, or as Anthony Shaw likes to put it, 2 to the 8th, recorded October 27th, 2021. Again, unless you're Anthony, for whom it's probably a totally different day in the future, because he's in Australia. I'm Michael Kennedy. And I'm Brian Okken. And I'm Anthony Shaw. Hello. Hey, Anthony. How is the 28th? Is the next day going to be good? Are things okay? Yeah, it's pretty sunny today.
Starting point is 00:00:28 It's nice. Yeah, right on. Okay, so the world hangs together for one more day. Fantastic. You've been here before. You've been on TalkPython a bunch of times, friend of the show, all sorts of stuff. So I'm sure many people know you,
Starting point is 00:00:39 but just tell people a bit about yourself. You're doing more techie things these days. You're a little closer to the code, maybe. Yeah, so earlier this year I started working at Microsoft, and I work with Nina Zakharenko on Python inside Microsoft. A lot of what I'm doing at the moment is just running around breaking things, sometimes on purpose, just seeing how we can improve our experience working with VS Code and Azure and a whole bunch of other stuff. So yeah, it's been a while since
Starting point is 00:01:10 the last episode, which was episode 100, I think. Wow, you're hitting the big numbers. So yeah, this 2 to the 8th is a significant milestone. I think it is. It's pretty cool. Yeah, awesome. Well, we're happy to have you here. Thanks for being here. Also, something to do with a puppy I've seen on Twitter. Oh, yeah, I got a puppy as well. He's not golden anything. He's a border collie, but he's kind of golden colored. And he's not in the room at the moment. He's not allowed in here while I'm recording.
Starting point is 00:01:40 I thought it would be a bit chaotic. Yeah, my puppy sometimes is here, but it's very bizarre the way that puppies socialize around COVID. Instead of us being gone and then coming home, she now knows and understands the expressions I make to end a Zoom call. So she'll sit quietly for an hour, and as soon as I say goodbye in Zoom, she's like, we're ready to go. Let's go. It's super bizarre. But yeah, that's the world we live in. So enjoy the new puppy. Brian, you want to kick us off with our first topic here? Lukasz Langa, he's, what is he again? The developer in residence? Yes. Anyway, he wrote an article called Where Does All the Effort Go?
Starting point is 00:02:17 Looking at Python Core Developer Activity. And I really liked this article. He not only talks about what's going on with the developers and who's doing what, but to start off with, he talks about how he got this data. So this is also sort of a data-processing, information-scraping article. He's looking at the GitHub repository data for CPython, of course, and specifically pull request data. He's even using Datasette, which is nice. We've covered that on the show. And he even lists the SQL queries that he ran to get some of this data. Oh, also, since it's Git, the data is from the time when CPython moved to GitHub, so that's February 10, 2017, and he mentions that it goes up through October 9th, which is when he
Starting point is 00:03:19 pulled the data. But all the information is there, so you could grab it yourself if you want, even the little scripts he's got for modifying some of the data. Some of the interesting things: among the top parts of CPython that get modified, it's probably not that surprising that ceval.c is involved in 259 merge requests. It's the top merged file. ceval.c. Yeah, that's where the bytecode processor is. So yeah, that's the center point, or the tunnel everything flows through. Does that make sense?
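For a picture of what ceval.c actually processes, the standard library's dis module will show you the bytecode that interpreter loop steps through. A quick, illustrative sketch:

```python
import dis
import io

def add(a, b):
    return a + b

# Capture the disassembly; this listing is what ceval.c's
# evaluation loop walks through at runtime.
buf = io.StringIO()
dis.dis(add, file=buf)
listing = buf.getvalue()
print(listing)
```

Exact opcode names vary a little between Python versions, but you'll see the load, add, and return steps spelled out.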
Starting point is 00:03:55 Yeah. And then it goes on and looks at which contributors have merged, and this is an interesting thing, or have been involved in PRs. He lists the top 50 people, but it includes some bots, which is interesting. I was going to ask about that. I thought Bedevere was probably going to be up there, or miss-islington. Yeah, both bots, by the way. I'd actually love to have either me or Michael or somebody talk to one of the CPython people about the different bots that are used and why they're used,
Starting point is 00:04:31 because that's an interesting thing, large projects using bots to help out with the work. Yeah, that's interesting. Anyway, among the non-bots, there's a couple of people that stand out, Victor Stinner and Serhiy Storchaka, and I apologize for messing up your name, but they're really up there, so that's pretty interesting that they're involved a lot. And then there's a nice note that Lukasz writes: clearly it pays to be a bot or a release manager, since this naturally
Starting point is 00:05:06 causes you to make a lot of commits. Victor and Serhiy are neither of these things and still generate an amazing amount of activity, kudos. And also, it's not a competition, but it's still interesting to see who makes all these recent changes. By the way, that top-PR thing was only since the beginning of January 2020, so taking a look at the more recent stuff. And then one of the things that's interesting, looking at who contributed where, I didn't know this: there's an experts index. So that was linked.
Starting point is 00:05:39 Oh, it's asleep. An experts index that is part of the Python Developer's Guide. I didn't know this was here. It lists some parts of the system, but there are blanks. And so Lukasz also wrote a script and pulled out the top five contributors to each file, which is kind of an amazing list of, you know, the top five people for every file within CPython. This is kind of neat, because if you're going to do a PR, or you're working on a fix or something, and you're a little confused by some of the code, one of these people might be able to help you out. So it's kind
Starting point is 00:06:21 of a neat list. At the bottom of the article, he also talks about some of the takeaways from this. I don't have this right off the top of my head. Merging, how long it takes to merge a PR. It's hard to draw conclusions from this data because it's all over the map, the standard deviations are pretty large. But if a core developer merges their own PR, it takes on average about seven days to get through the process, give or take 42 days.
Starting point is 00:06:54 And then for a core-developer-authored PR which is merged by somebody else, it takes longer, about 20 days, give or take 78. And then a community author, it's up to 20 days,
Starting point is 00:07:07 give or take 80. But I mean, I work on commercial projects that are not really that much faster than this, so it's not too bad. Yeah. What do you think of this article, Anthony? You spent a lot of time inside the CPython code. I mean, you did write a book, CPython Internals, which people can check out, right? Yeah, I did write a book about the CPython source code. So it's interesting. First of all, I'm super excited about Lukasz being the new developer in residence. I think he's got the right approach, and he's already made really promising progress, I think, in terms of trying to make the community contribution process a bit slicker. Yeah, and that stat at the bottom, like, just watching the GitHub repository, core developers working on the repository and
Starting point is 00:07:54 making changes and stuff, from the outside it looks fairly seamless. My own personal experience has been, if your PR gets responded to within the first week, then it'll probably get merged pretty quickly, and if it doesn't, then it just kind of ends up in the pile. And I've had ones in there for, like, three years. Right, the average was seven days, but it could go out, like, another 40 days, so it's probably either really quick or really far out. Well, yeah, that metric is how long they take to get merged, which I guess requires that they are merged. Oh yeah. I mean, there's basically just loads of people
Starting point is 00:08:35 contributing stuff, and there aren't enough people with enough time to sift through it all, and it just makes it really tricky. And the project needs to continue marching forward, and there are people who are dedicated to working with the core developers. But some of the community contributions are really valuable. I think what's promising to me is that Lukasz is kind of looking at that
Starting point is 00:08:58 and not just taking this role on as, I'm going to be a 100% core developer. Yeah. Because I think there are already lots of other people on the team who are making some amazing contributions. You know, Pablo has been working on the new parser, and now he's working on these stack changes in 3.11. There's so many things going on at the moment in CPython, so it's really encouraging to see. Yeah, it's super encouraging. I think Lukasz is doing a good job sort of smoothing out the edges to just make it easier for everyone to go faster, which I think a lot of times in teams,
Starting point is 00:09:35 not specifically here, but in general, there are these people who are kind of, oh, that's the person you can ask to make the CI work again when you break it, this is the person you ask to set up a new machine who remembers how to do that. And you don't necessarily get direct credit for doing that work, but without them, it's just way harder. And I feel like he's doing that for CPython behind the scenes. Yeah, the experts index is really helpful if you want to get involved in bug triaging. That's something that people are open to help with. If you go on bugs.python.org and you want to help triage bugs, often what you have to do is look at a bug, make sure that the person who's
Starting point is 00:10:11 reported it has filled in enough information, and then basically add people from the experts index to something called the nosy list, which is like a CC list on the bug. And then it's just kind of directing it to the right people. Once you've done that for a while, then you get given, like, a triage flag on your user, and if you've been doing that for even longer, then you could be promoted up to a core developer. And there's a few people who've gone through that route over the last couple of years. All right, Anthony, while you were talking, I got two things to share out of the audience. Dimitri Figo. Hey, Dimitri. Great to see you here.
Starting point is 00:10:45 Dimitri says, thanks for inviting Anthony, he's someone I look up to. Very nice. Nice to meet you. Good to see you. Yeah. And Waylon, who was recently on TalkPython. Hey, Waylon.
Starting point is 00:10:54 Says, what a great lineup here. Also kind of for you. And also Henry Schreiner. Hey, Henry. Also recently on TalkPython. Says, both PRs I've been involved with to CPython got in in about a day, I believe. That's pretty amazing. That's pretty good. Yeah, that's great. So before we move off from this one, Brian, it's a good pick. One thing I just want to point out as well is all these cool stats and
Starting point is 00:11:15 these graphs and everything we're seeing here apply to CPython because it's on GitHub, right? Yes. But you can run the same code and run Datasette from Simon Willison against it, but against a different repo, I would imagine, right? Oh, yeah. Yeah. So if you run a project, you could probably do a similar analysis for your project. That's a good idea. Yeah.
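The analysis itself boils down to SQL over a table of pull request records, which is exactly the shape of thing Datasette exposes. A toy sketch with the standard library's sqlite3 and a hypothetical, much-simplified schema (the real data set has many more columns):

```python
import sqlite3

# Hypothetical schema -- the real data has more columns,
# but the shape of the query is the same.
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE pull_requests (
        author TEXT,
        merged_by TEXT,
        file_touched TEXT
    )
""")
con.executemany(
    "INSERT INTO pull_requests VALUES (?, ?, ?)",
    [
        ("vstinner", "vstinner", "Python/ceval.c"),
        ("serhiy-storchaka", "vstinner", "Python/ceval.c"),
        ("community-dev", "miss-islington", "Lib/datetime.py"),
    ],
)

# Which files show up in the most merged PRs?
rows = con.execute("""
    SELECT file_touched, COUNT(*) AS prs
    FROM pull_requests
    GROUP BY file_touched
    ORDER BY prs DESC
""").fetchall()
print(rows)
```

Point the same kind of query at your own repo's PR data and you get the same "who touches what" picture for your project.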
Starting point is 00:11:34 All right. Speaking of good ideas, and it's interesting that Henry is out in the audience, because I feel like we might have been responsible for this article. Clearly, we did not write it. We may have triggered it, is what I'm saying. Mostly me, and not in a positive way, right? So this is a cool article by Paul Ganssle, who was also over on TalkPython
Starting point is 00:11:54 talking about the mysteries of datetime and stuff. There's all sorts of cool things. He maintains the dateutil package and the setuptools project and so on. That was over on episode 271. So he wrote an article called Why You Shouldn't Invoke setup.py Directly. And the reason I think I might have somehow had something to do with this is Henry was on talking about cibuildwheel and all the proper ways to build packages, and I said, oh, you can run python setup.py space, you know, wheel or bdist or something. They're like,
Starting point is 00:12:23 no, no, no. You could, but please don't do that. And then here we have this article, like, two days later. So I don't know if that was part of that conversation, but it's a really good article talking about the state of building Python packages. And it says, you know, look, for a long time, setuptools and distutils were the only game in town when it came to creating Python packages. Right. So you could do something like invoke python setup.py bdist, sdist, wheel, and so on. Wait, I see. So Paul is actually in the audience. Real time. Fantastic.
Starting point is 00:12:53 Hey, Paul says, I think I did it because Matthew Feickert asked for it on Twitter and I got sniped. Yeah, perfect. Okay, good. So it's just a coincidence. Fantastic. But yeah, the reason this is extra interesting to me, and thank you, Paul, for writing it, is I was still doing this python setup.py various-commands thing.
Starting point is 00:13:11 And I was talking to Henry. He's like, no, you shouldn't do that. You should do it this other way. I'm like, well, okay, how should I do this? You should use build, the build package. What is this build package you speak of? So we've talked about pyproject.toml a bunch of times.
Starting point is 00:13:26 We've talked about things like Flit and stuff that will use it, right? This all comes from PEP 517. And there is a package called build. You can pip install build, and then you do things like python -m, for module, build. And you can say, I want an sdist, I want a wheel, and things like that. And this acts
Starting point is 00:13:45 as a front end to things like setuptools, to the various backends that do building for Python. Yeah. All these different things that understand it. Right. So Paul says, all direct invocations of setup.py are effectively deprecated in favor of purpose-built, standards-based CLI tools like pip, build, and tox.
Starting point is 00:14:26 So this is quite a long article. There's a lot to go through. It has some interesting history. In the early days, there wasn't even distutils. Then in Python 2, distutils got added, and then setuptools came along. And while they work, there are still problems. Like, for example, you might have dependencies that you have to install to run the setup, but the way you install stuff and figure out what you depend upon is by running the setup. So what do you do? An example of that would be Cython, right? You might have to import Cython, and in the invocation of calling setup, you tell it how to Cythonize the .pyx files, right? But that's obviously not going to work, because you're going to have to have Cython installed, but how do you express that? You know, it's like this chicken-and-egg problem, right? So let me pull up my notes here. Yeah. So basically
Starting point is 00:15:01 one of the big questions was, why am I not seeing deprecation warnings? Let me go down a little further. Yeah. So if I'm not supposed to do this, why isn't it screaming from the top of its terminal, stop, stop, stop, why are you doing this? Right? There's a lot of commands that still have indirect uses of distutils and stuff, so it's a little tricky to deprecate it, but you should consider it deprecated. At the end of the day, it's better to replace your setup.py commands with tools like build
Starting point is 00:15:30 instead of setup.py sdist or bdist_wheel, or tox and nox instead of setup.py test, and other commands backed by projects intended to support that. Yeah, does that sound good to you guys? Where are you on this, Brian? Well, I don't really have opinions. I mean, I've kind of indirectly used build, but I basically just use Flit. I'm not writing things with C extensions, so for pure Python stuff,
Starting point is 00:15:58 I just do a flit build or whatever, and that works fine. Yeah, so that's kind of, I mean, that's using the pyproject.toml stuff, right? Yeah. Anthony? If I'm starting a project now, then I use pyproject.toml, and the project doesn't have a setup.py. There were some reasons why I had to add one in the past, but that's mostly fixed now. So I'm either using Flit or something similar like Poetry. And I've worked on projects years and years ago where the setup.py just ended up being a script to run ad hoc commands, like there was a setup.py test, and then there's, like, a lint, and, yeah. What does that have to do with installing software, right? Nothing. It just ended up being an entry point to do things, and one of them happens to be install, but there's a bunch of other stuff
Starting point is 00:16:50 you might randomly do. Yeah. And it's fine that it's being deprecated, but, you know, CPython still does that. The setup.py in CPython is still used in that way and called and invoked directly in the source code. So it's good that it'll be deprecated, but I don't think the tooling is quite ready yet. He's not really saying to get rid of setup.py, just don't use it, don't run it directly. Yeah, find something better. Pip should do that. Pip should do the discovery for you, for PEP 517, and run the correct steps for you. Yeah, absolutely. So a couple of comments out in the live stream: while recommending build, it's nearly impossible to Google to find it. And, I love and hate the name. So authoritative, so ungoogleable, and a bit
Starting point is 00:17:40 hard to use in conversation. But yeah, for sure. So I think if you want the takeaway from this conversation, right at the top, there's a TLDR section that Paul put in. Click on the summary, it takes you down to a summary, and you can go to a table, and it says: I was about to type this, what should I do instead? I was about to type setup.py sdist. What should you type?
Starting point is 00:18:00 python -m build, having build installed. Or if I was going to type setup.py bdist_wheel, I should type python -m build --wheel, or something like that. setup.py test? Oh, maybe pytest or tox or nox. We covered nox recently with Prason, which was a really fun episode, I believe.
Starting point is 00:18:18 setup.py install? No, that's pip install. python setup.py develop? No, that's pip install -e. And then, as well, upload goes over to Twine. So yeah, anyway, I think this is the most actionable bit here. Yeah, it's good. Yeah, indeed.
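As an aside, the Cython chicken-and-egg problem from earlier is what PEP 518's pyproject.toml [build-system] table addresses: build-time dependencies are declared as data, so a frontend like pip or build can install them before invoking the backend. A minimal sketch (the exact requirements shown are illustrative):

```toml
[build-system]
requires = ["setuptools", "wheel", "Cython"]
build-backend = "setuptools.build_meta"
```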
Starting point is 00:18:34 All right. Well, Anthony, let's talk about keeping an eye on things. Yeah, so I wanted to highlight a project which has been in the works for a while, but they've just recently finalized the specification. This is called OpenTelemetry. It's part of the Cloud Native Computing Foundation, the CNCF, and it's a cross-language event tracing, performance tracing, logging, and sampling framework for applications, in particular for distributed applications. So if you've got an application which is spread across multiple microservices and you want to
Starting point is 00:19:12 trace things or monitor performance or whatever across all of the stack, it's a super hard problem, right? Maybe you've got a Docker container running this thing, that Docker container calls some other service on a different Docker container, and maybe the logs are even transient. What are you going to do to know if something went wrong, and where? Yeah, exactly. And if you've got an application that's split into multiple microservices and one of those services has a fault, it's really hard to know where that fault came from. Like, if it just says error, blah, blah, blah, you're like, okay, so what triggered that error? Which request from a user at the front end? Or, like, how did that error happen in the first place,
Starting point is 00:19:52 and how can I fix it? And also, I guess, tracking performance across your application and looking at that. So there have been attempts at doing this in the past. OpenTracing and OpenCensus were the two projects beforehand, so this new project, OpenTelemetry, is a merger of OpenTracing and OpenCensus. There are engineers from some big companies working on this, including Microsoft,
Starting point is 00:20:18 Amazon, Splunk, Google, Elastic, New Relic, and a whole bunch of others as well, including, actually, full-time engineers from some of those companies working on this. So yeah, I've been working with an engineer at Microsoft who works full-time on this project. Actually, there's a few people who work full-time on this, but this person works full-time just on the Python components of it. So the SDK basically allows you to instrument lots of different frameworks. You can basically drop it into Flask or Django or Starlette, so if you're using FastAPI, and you can sort of instantly capture what requests are going into the application, when there's been a crash, like, where that exception's gone,
Starting point is 00:21:05 all the logging information. You can look at performance records and stuff. I've been sharing some examples where I've wrapped it around a FastAPI app, and then I can see performance: what's the average request time for each of these parts of the application, and where is that time spent,
Starting point is 00:21:23 even down to, like... Can you say, like, this is the data layer section, and this is the business logic, and here's the organization or whatever? Exactly. So I can kind of see almost like a call stack, but across the actual components of the app. So here's where it came into FastAPI, here's where it went into the database, like, here's how long the query took, here's how long the ORM took to remodel it, here's how long Jinja took to build the template. So you can kind of see a breakdown of all the different components and how things are being pulled together. So there's two parts of OpenTelemetry,
Starting point is 00:21:56 actually more than two parts. I'm actually really appreciative that, even though there are lots of engineers from big companies, this hasn't been over-engineered yet, and I'm really hoping it isn't. Is there a factory factory method in here? Yeah, exactly. Especially because it's so generic, there's a real danger of it being over-engineered. So if you go on the website and go to Registry and then pick Python on the right-hand side, you'll see the different extensions you can get. So instrumentation is basically, this is the thing I want to monitor. And it could be, like, ASGI or async Postgres, for example, or Celery, Django, Elasticsearch, Flask. There's a stack of app stacks that you can just drop it into, and it will give
Starting point is 00:22:45 you all the tracing information. And then there's these things called exporters, which is basically, once it's got the information, it can send it somewhere, like Datadog or New Relic or Azure and AWS, obviously, and Google monitoring as well. And actually, something I just worked on recently, if you just want to hack around with it, there's an exporter for Rich that basically prints it on the console so you can see everything that's happening. And it's all color, probably? Yeah, it's all kind of color-coded. It's really nice, actually. So yeah, I'm really excited about this. I've been mostly trying it with FastAPI, as there aren't really many frameworks
Starting point is 00:23:27 for setting up like decent monitoring and tracing in FastAPI applications. And yeah, I think it's really promising. So I suggest you check it out. And if you see a framework that needs support or something, then this is all open source and they're all accepting contributions as well. And it's fairly straightforward to add support. Yeah, it's got Postgres, MySQL, MongoDB, Pyramid, Redis, all sorts of good stuff in here.
Starting point is 00:23:54 Another thing maybe worth pointing out here is that this crosses languages, right? There's a Python one, but there's also a .NET one, there's a Swift one, and so on, which means there might be scenarios where I've got, say, a mobile app written in Swift, and then I've got the backend written in Python and FastAPI or something, and you want to put those together. Because it goes across those languages, theoretically, that's a thing that could happen.
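The core idea here, a trace ID minted at the edge of the system and carried implicitly through nested spans, can be sketched in plain Python. This is a toy illustration of the concept only, not the OpenTelemetry API:

```python
import contextvars
import time
import uuid

# The current request's trace id travels implicitly via a context variable.
current_trace_id = contextvars.ContextVar("trace_id", default=None)

class span:
    """Toy span: records wall-clock time and tags output with the trace id."""

    def __init__(self, name):
        self.name = name

    def __enter__(self):
        if current_trace_id.get() is None:
            # First span in the request: mint a trace id at the edge.
            current_trace_id.set(uuid.uuid4().hex)
        self.start = time.perf_counter()
        return self

    def __exit__(self, *exc):
        elapsed = time.perf_counter() - self.start
        print(f"trace={current_trace_id.get()} span={self.name} took={elapsed:.6f}s")

def handle_request():
    # Nested spans mimic FastAPI -> database -> template rendering.
    with span("http.request"):
        with span("db.query"):
            time.sleep(0.01)
        with span("template.render"):
            time.sleep(0.005)
    return current_trace_id.get()

trace_id = handle_request()
```

A real SDK adds context propagation across process boundaries (for example via HTTP headers), which is how one trace ID follows a request through Swift, Python, .NET, and so on.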
Starting point is 00:24:19 Absolutely, yeah. And you can pull that all together. It would give a request a trace ID, so when a request comes into the front end, a trace ID could carry across the different stacks as well, which is pretty cool. Yeah, very cool. This is neat. Awesome. Thanks for covering it. Now, before we move on, Brian, we have a sponsor for this episode. That's cool. Yay. Thanks to Shortcut, formerly known as Clubhouse. They're a really cool project management tool, and they ask the question, have you ever really been happy with
Starting point is 00:24:51 project management? You know, how's your Jira or whatever, right? How much are you loving it? So they basically say most are either way too simple for growing engineering teams to manage everything, or too complex, and they just throw in the kitchen sink and you don't want to work with it. You've got to constantly tweak it to make it work for you. So Shortcut, who used to be known as Clubhouse, is different. They try to be simple. It's project management built specifically for software teams. It's fast, intuitive, flexible, many other nice positive adjectives.
Starting point is 00:25:22 So some of the highlights are team-based workflows. Individual teams can use Shortcut's default workflows or customize them to match the way they work. Also, organization-wide goals and roadmaps, so these workflows automatically get tied into larger goals and feed into, like, a bigger system outside the team. Good source control integration: GitHub, GitLab, Bitbucket, all those types of things. One thing that I really love is the web app has hotkeys, so it's keyboard-friendly, just like your editor, VS Code, whatever, right? I don't know why more web apps don't have hotkeys. It's not particularly hard, but they do, which is great. Iteration planning, so you can set your priorities and let Shortcut run the schedule. You get nice little burndown charts and so on. So check them out at shortcut.com/pythonbytes, because you shouldn't have to project
Starting point is 00:26:09 manage your project management. That does not sound fun, so let them do it. It's their job. Now, before we move off to the next topic, Robert Robinson in the audience, hey Robert, says this OpenTelemetry sounds interesting, wants to try it out. I do as well. I feel like this is the kind of stuff that you just keep putting off integrating into your system, and then once you finally do, you're like, oh, look how awesome this is, we can see what's going on. And actually, did you know this part was crashing? No, I didn't know that. Nobody looked at the log, and it was just eating even the exception, right? Yeah, tricky, tricky. All right, Brian, you got the next one. So Python's got a few built-ins, not a ton, but quite a few. There's an article from Tushar Sadhwani called Understanding All of Python, Through Its Built-ins. And he's got a pretty ambitious goal here, to understand
Starting point is 00:27:02 everything. But I actually really enjoyed even the first part of it, so I started reading it. A shout-out to him: he's been fairly involved on Twitter, answering questions and being involved in conversations, so that's a good way to get noticed. It starts off talking about scope. So built-ins are not just things that Python has built in; the term also has a relevance to the scoping rules, and he calls them the LEGB scoping rules. When Python sees a symbol, first it looks in the local scope, then the enclosing scope, then the global scope, and then the built-in scope. And built-ins really are just anything that's in the builtins module.
Starting point is 00:27:52 And actually, that discussion is a really pretty good one. It's especially good for newbies to understand, but even advanced beginners sometimes don't quite understand what's going on here. Yeah. Brian and Anthony, you both come from C-style languages historically, right? Or at least you've spent a lot of time there. Brian, you do a lot of C++. Anthony, I know you've done some C# and stuff. Did the scoping story of Python confuse you and kind of leave you a little uncertain in the beginning? Yes, definitely. Especially coming from C++, where it's very well defined: if it's in the curly braces, it's alive, and afterwards, it's gone, right? Like, wait a minute, that's not the story at all. Right. And also, you've got so many nested
Starting point is 00:28:34 curly braces, it could be anywhere. It seems like, actually, we just don't do that too much in Python, but Anthony would probably know better than me. If I've got multiple nested, well, we don't have curly braces, but multiple nested indentations, does the scope sort of look in outer and outer and outer ones? Is that what nonlocal means?
Starting point is 00:28:59 There's a nonlocal keyword, which is like a whole other thing. That's a completely different thing. Okay. It's for closures capturing variables, basically. Yeah. But the difference in global really freaked me out, because it was really pounded into our heads everywhere to never use global variables. But global is different. The global namespace is not a global variable. It's more like module level. Yeah, yeah.
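The LEGB lookup and the nonlocal keyword being discussed can be seen in a small self-contained sketch:

```python
# L-E-G-B: local, enclosing, global, built-in -- checked in that order.
x = "global"          # module-level ("global") scope

def outer():
    x = "enclosing"
    def inner():
        x = "local"   # shadows both the enclosing and the global x
        return x
    return inner(), x

print(outer())         # ('local', 'enclosing')
print(x)               # 'global'

# If no local/enclosing/global name matches, Python falls through to
# the builtins module -- that's where len, print, etc. actually live.
import builtins
assert builtins.len is len

# nonlocal rebinds the nearest enclosing name instead of creating a local:
def counter():
    n = 0
    def bump():
        nonlocal n
        n += 1
        return n
    return bump

bump = counter()
print(bump(), bump())  # 1 2
```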
Starting point is 00:29:27 Or like a static variable in a class, maybe would be what other people might call it. Yeah, it's not a dangerous thing in Python. So I didn't mean to derail you that much, but I think it's interesting to think about the built-in scope, the global scope, these different scopes, because it's such a different world
Starting point is 00:29:42 from the intuition you get coming from all the C languages. Yeah. Also, I just really enjoyed looking at the language through the lens of built-ins. It's an interesting take on it. I'll pull out a few things that he mentions, and one is all the constants. I guess I'd never counted them before, but there's five constants in Python: True, False, None, Ellipsis, and NotImplemented. I do like Ellipsis. We talked about that the other day, or I guess one or two weeks ago, using dot, dot, dot instead of pass. Are you going to start doing that?
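For reference, those five constants are easy to poke at from a REPL (a small sketch; the `Version` class is just an illustrative example, not from the article):

```python
# The five built-in constants Brian lists: True, False, None,
# Ellipsis, and NotImplemented.
print(... is Ellipsis)        # True -- dot-dot-dot is the Ellipsis singleton

def todo():
    ...                       # works like `pass` as a placeholder body

# NotImplemented is what comparison dunders return to say
# "ask the other operand" rather than raising:
class Version:
    def __init__(self, n):
        self.n = n

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented   # lets Python try the reflected comparison
        return self.n == other.n

print(Version(1) == Version(1))   # True
print(Version(1) == "1")          # False -- both sides declined, so not equal
```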
Starting point is 00:30:16 I've already started doing that. Have you? I'm all about it. I think I'm up for it as well. I guess I don't think I've ever used NotImplemented or even looked for it. But interesting discussion. Also, I just like looking around. So here's a section on compile, exec, and eval. It's not an alphabetical listing of everything; it's more grouping them together. It's quite a big article. But I would suggest people just skim through the list, because it's got a good table of contents at the top. You can just sort of skim through what he's talking about and pick a couple and go read about it. You'll probably learn something. So, anyway,
Starting point is 00:30:54 a good shout out to Tushar for writing this. Yeah, this looks super handy. Yeah, some of the built-ins are super handy. I often have a Python REPL open just to do things that would otherwise be annoying to do on a calculator, like converting hex integers and vice versa. There's a hex built-in which is really helpful, actually. Yeah, I use hex a lot because I'm often looking at data elements in a packet or something like that and trying to convert those. So yeah, very nice. Nice one. Before we move on, Anthony, how do you feel about dot, dot, dot? They should have called it yada, yada, yada. Yeah, I think that would be way better than ellipsis. Come on. Yeah, I'll
use it for type stubs, and that's it. So yeah, there's times we use pass, right? And I feel like, you know what, dot, dot, dot kind of says, I'm not ready to put stuff here yet. I think instead of ellipsis, we should start calling it dun, dun, dun. Exactly. All right. How about we hand out some awards? Okay.
Starting point is 00:32:00 Best open source software of 2021. Now, who gets to vote on this? Who gets to say? Well, InfoWorld in this example. So this is according to InfoWorld, but there may be other rules. But I found this to be pretty interesting, actually.
Starting point is 00:32:13 I heard about it, learned about it because Sebastian Ramirez from FastAPI said, yay, we've been voted one of the best open source projects. So this is called the InfoWorld Bossie 2021 awards. But what I thought was interesting is going through here, there were 30 different projects that won awards. I'm like, oh, that's interesting. Oh, I didn't know about that. Oh, check this out. Yeah. So I wanted to touch on a couple. So there's some
Starting point is 00:32:34 things that may or may not be interesting to you, like Svelte, which is a JavaScript front end, like Vue or React. That's not interesting to me. But Minikube, Minikube is pretty interesting. Minikube is a way to run like a baby Kubernetes cluster right on your computer. Just say Minikube start and guess what? You've got a cool little cluster running. So that might be really helpful for Python people. Let's see. Pixie, zoom back a little here. Number five is FastAPI. We're all fans of FastAPI. I think it's really awesome that it won. And worth maybe giving a quick shout out to how they described it as Django and Flask have been leading the Python web frameworks for years.
Starting point is 00:33:13 FastAPI now deserves to be mentioned in the same breath. I agree. It calls out the main features, which are that it's a truly modern Python web framework, written from the ground up using type hinting, async, and high-speed components by default. That's true, and I also really like that they pointed out that while its name indicates it's primarily for APIs, it's also really good at writing more conventional websites with, like, Jinja templates or even Chameleon templates. So way to go. Anthony, you
want to add, or Brian, want to add anything? Well, I just think that you're partly to thank for people considering FastAPI for not just APIs, because you've been beating that drum a little bit as well. Yeah, thanks a bunch. I even created some decorators that make it really easy to render templates as response values and stuff. Yeah, it's fun. Anthony? Yeah, I tried out the Chameleon thing.
Starting point is 00:34:01 The one you wrote, actually. Yeah, because I'm working on this FastAPI course with you at the moment. Yeah, that's going to be fun. So yeah, I'm a big fan of FastAPI. I think it's brilliant, and a testament to Sebastian, really, because he builds on something which is quite complicated, but he makes it seem so effortless. And just working with FastAPI, the documentation is excellent. The framework itself is just really logical.
Starting point is 00:34:30 And, you know, it's really easy to use. I've been keeping an eye on the popularity of the different frameworks and stuff over the last few years, and Django and Flask are kind of neck and neck and have been for a while. And FastAPI now is the third most popular, according to the metrics that I've seen. Yeah, out of nowhere to third most popular. Yeah.
And I know JetBrains are doing the latest PSF Developer Survey, so we'll see kind of what happens in this year's numbers, but I'd imagine FastAPI would still be the third most popular, which is brilliant. So yeah, I think it's a good, solid pick. In terms of writing, like, full apps with it at the moment, there's still a lot you have to do for templating. You pretty much have to build in a whole bunch of other templating stuff, and picking an ORM at the moment isn't easy, but there are some brilliant ones to have a play with.
Starting point is 00:35:27 Yeah, there are a couple of interesting ones I want to give a shout out to that even integrate with Pydantic, which is sort of the natural exchange format of FastAPI. So you want to give a shout out to Tortoise, you say? Yeah, that's my favorite so far. I've tried out six different ones, and Tortoise, I think, is my favorite at the moment.
Starting point is 00:35:46 Right on. Well, maybe next year we'll be talking about the award for SQLModel, which is built on top of Pydantic plus SQLAlchemy, by Sebastian as well. So who knows? A lot of good ones out there. It's good to see a lot of the excitement and new ideas coming along there. All right, what else we got? Crystal, don't care.
Starting point is 00:36:04 Windows Terminal, I think is actually pretty interesting. Windows has traditionally been not on par with its terminal experience. And I think, you know, the Windows Terminal, PowerShell 7, Oh My Posh, all these things come together, nerd fonts, to make it quite an amazing place to be actually. Windows Terminal is an open source project? It didn't start out that way, but now it is. Yeah. Okay. Yeah. Yeah. So that's a good one.
Starting point is 00:36:29 OBS Studio, if you're doing video stuff, that's amazing. There's a bunch of stuff in here that may apply to people, that you can all check out, that's interesting, but I don't want to cover it all. Dask, though. Dask is a big data science one: scale computation, like Pandas operations and whatnot, across cores, across clusters, across compute that's larger than the RAM you have by streaming it off disk, and all sorts of interesting stuff. I have no idea why my browser is jumping up and down. We'll have to ignore that. I'm not
in control of it, I'm sorry. I'll tell you why this is happening. I'm looking up and I see I'm not running my VPN, which would block ads. So there's some kind of ad off the screen that's just running, and if I turn on my VPN, we'd be good. All right. BlazingSQL is another great one. RAPIDS from NVIDIA.
Starting point is 00:37:15 And I feel like there's one more I want to give a shout out to. Hugging Face. I don't know anything about that. That was it. So just going through that list, I thought it called out a lot of neat projects in addition to just FastAPI.
Starting point is 00:37:24 Yeah. Any of those jump out at you guys, or did they just scream by? Lots of stacks that I don't use. Same. Yeah, there was a bunch of ML stuff, though, which I don't use, but I think would be relevant to people who are listening, maybe. Well, we're not to extras yet, Michael. No, I know. I just closed it because the jumping was driving me insane. Okay. All right, Anthony, you got the last main one, right? All right. Yeah, so I think Łukasz is taking up like half of this episode, so we're going to get back to Łukasz's blog and evolve the discussion that was started last week. Yeah. I'm, let's put it mildly, excited about this. I think if this happens, it's probably going to be the biggest thing to happen in CPython in the last five years, in my opinion. And this being the GIL removal. This being the GIL removal, but not the Gilectomy. Not the Gilectomy, not exactly. Yeah, so no-GIL. Let's just go with no-GIL. So almost seemingly out of nowhere, Sam Gross, who works for Facebook, basically submitted to the core developers this research paper and a working branch of a GIL-less Python. And just to quickly recap, I guess, on what that means: this article is pretty heavy in technical detail, and the stuff being discussed in it is pretty complicated. I actually didn't understand a lot of it, and I've written a book on the Python compiler. So if you read this and it's confusing, don't worry. The GIL is basically the global interpreter lock.
Starting point is 00:39:10 And it exists as a way of making Python thread safe when it comes to keeping reference counts of specific objects. So if you create a Python object, for example, there's a counter of how many things are referencing it, because you don't want to just destroy an object while, say, you're working through a list of objects, and then one of the items in the list just disappears, has been deallocated. Because everything is a pointer in Python, that pointer would just go nowhere. Or actually, there's a magic pointer that Python uses when it deallocates objects, which I know from a very painful experience.
Starting point is 00:39:53 So you don't want that to happen. And if you've got multiple threads kind of working with the same objects all at once, it's incredibly hard to keep track of what's happening. Threading is great because you can have multiple threads working on a computer, and the operating system can do the scheduling of which thread runs on which cores and which CPUs, et cetera. So in theory, it's a way of making your Python
Starting point is 00:40:18 applications a lot faster if you write them to be multi-threaded. But Python's basically built in this lock which says, okay, in the evaluation loop in ceval, don't let anyone else run an instruction whilst this thread is running its instruction. Yeah, it seems like this is a thing to control threading, and really it's just a thing to protect memory management, but it has this huge blocking effect for threading, right? Yeah. So it's the thing to basically make the reference counter thread safe. Without locking. So it's fast.
Starting point is 00:40:55 Without locking. Yep. So you don't have to wait to do an increment. So to give you an idea, like if you run the GC by hand, you'll just see how many tens of thousands of objects are created all the time in Python applications. So, what Sam had put together. I say seemingly out of nowhere, but if you go through the article and what he proposed, he's
Starting point is 00:41:18 actually been working on this almost full time for two years, which is astonishing, and it's a real feat of engineering, to be honest. So what he's proposed is a way of removing the GIL so that there are essentially two ways of keeping references to objects: one of them is specific to the local thread, and then there's another reference count which is for other threads. So why is that important? Well, let's say, for example, you've got a Python dictionary with values in it, and then you have multiple threads working on the same dictionary. That's a complicated problem to solve. How do you make sure that the references to the keys
Starting point is 00:42:06 or the values don't disappear? And it does actually go into detail about how that's been handled. Also, objects like Python dictionaries are not thread safe at the moment either. So if you have two threads working on a dictionary, adding values to it, for example, do you have to lock the hash table? Anyone who's worked with multi-threading in low-level languages knows the complexities of doing this. So what he's proposing, well, in his prototype he basically replaced the Python memory allocator with another one called mimalloc, which is a thread-safe memory allocator. It's actually a Microsoft project, but I think it could have been
Starting point is 00:42:55 any other thread-safe memory allocator. Writing memory allocators is very involved if you want them to be performant and efficient. And then basically objects get tied to the thread that created them, and there's a non-atomic local reference count with the owner thread, and then there's a separate mechanism for what would be slower reference counting from other threads. So single-threaded performance is equivalent with this proposal, but when
Starting point is 00:43:28 you're... there's still a performance impact of multiple threads working on the same object, which is to be expected. Yeah, there's always a little overhead for that. Yeah, but to give you an idea, in his note he implemented a few common problems as multi-threaded implementations, and he said if you give it 20 threads, it runs 19.84 times faster than it would in just regular CPython. So for certain types of problems, this can have an enormous impact on performance. But it is really complicated, and that's why I think it's an interesting discussion to see, okay, how do we get from this is a cool idea to this actually being released and being used by, you know, millions of people? I don't know, Python's running on, like, satellites in space and stuff. How do we go from a fork that someone's been hacking around with to something that's production ready? And this is kind of what the article goes into. So, you know, how would this work? Would it be a feature flag? Which version would we target? At the moment it's targeting 3.9 alpha 3, actually, so not even the release of 3.9, so he needs to do some work to update that to the latest version of 3.9, which is 3.9.7. And then I think the target release, if the core developers agree to explore this, would be 3.11. I don't think anyone wants to touch the Python 4 topic. But 3.11 is like a year away.
Starting point is 00:45:09 Is that even possible, or would it most likely be a couple years out? Yeah, it seems pretty soon to me. And like subinterpreters, for example, is like an experimental feature. I think the issue with this is that the volume of changes is so broad that it's quite hard to kind of like have it in as a feature toggle.
Starting point is 00:45:27 So like Subinterpreters was in, there's like a hidden package that you can use and it's experimental. Whereas this is like changing- Everything. Yeah. Well, not everything, but like it's a pretty wide sweeping change
Starting point is 00:45:41 and changing the memory allocator is a massive change. So the question is more, how can we introduce this softly, I think, and have it either as a feature flag? And what would this break? The main thing is that C extensions haven't really had to worry about thread safety, because the GIL kind of handles that for them. So if C extensions use the mechanisms that are here, that's fine, but C extensions often have other objects which they haven't used the reference counter for. They've basically allocated their own objects and variables and stuff like that, which would not be thread safe, and they haven't had these kinds of collision issues in the past. So introducing this would potentially break some C extensions. So, you know, how could that be introduced gently? I think what was interesting in the article is there's a mention of NumPy, and NumPy has actually done a lot of its own work already on basically making itself thread safe and more scalable. But one of the tricky ones is PyBind11, which is called out in here: anyone who's using PyBind11 potentially might have to do some refactoring
Starting point is 00:46:59 to support this, if it was supported. And then in closing, Łukasz, who wrote this review, sort of said, you know, the team had been really impressed with Sam's work and invited him to join the CPython project as a core developer, and he's interested. Łukasz is going to mentor him, so I think that's brilliant. Oh yeah, that's brilliant. Just to come up with this, even two years is a really short amount of time for a problem that people have been trying to solve for well over a decade.
Starting point is 00:47:31 So yeah, very exciting. Yeah, this is great. I think we have a record number of core developers in the audience right now. Yeah. So some great comments from Steve Dower. Hey, Steve. The big thing needed here is a path forward for native extensions.
Starting point is 00:47:46 They could all need rewriting or else importing them could re-enable the GIL. That discussion is happening now. It's very early. And Henry Schreiner also has similar comments that they're considering that. But yeah. And Henry also says we would be up for refactoring PyBind 11 if needed, I believe. It's also interesting. But this is exciting. There's a lot of stuff coming here. I think another thing in addition to the no-gill is I got the sense
Starting point is 00:48:12 that Sam had added several other optimizations that were independently worth adding to Python. One of the things, I know there's a lot of tension around whether or not to do 4.0. But if it ends up being that all of the extensions possibly need tweaking, then it's an API change, and I think the shift to 4 might not be terrible. Yeah.
Starting point is 00:48:39 Well, we should just go to Python 5, so no one's worried about 4, and we'll skip the whole conversation. It'll be fine. We'll do it like AngularJS: we'll just make a big fuss about going from one to two, and then all of a sudden they're on, like, version 10 or something. Yeah, we'll just go crazy. Yeah. No, this is fantastic. I'm actually having Guido van Rossum and Mark Shannon, I believe, on Monday on Talk Python to talk about performance in the future and stuff. And I'm sure we'll talk about this stuff a little bit.
Starting point is 00:49:07 Yeah, so it should be a lot of fun. This was Guido's suggestion when I asked internally if anyone wanted to share anything. This is what he sent over. Okay, fantastic. Yeah, so I'll try to take that up with him again. All right. Well, Brian, does that bring us to our extras?
Starting point is 00:49:20 We are at extras. Do you have any extras? Yeah, no, you go first. Tell us about PyCon. Well, the call for proposals is open for US PyCon. I'm pretty excited about that. I already wrote down like six ideas of things I might want to talk about. So and of course, there's no guarantee.
Starting point is 00:49:37 No matter who you are, there's no guarantee that you're going to get in. But it's fun to come up with proposals anyway. I'm definitely going, so I'm pretty excited about that. And is anybody else going to propose? Anthony, are you going to try to talk there? Yeah, I've been thinking about what I'm going to put forward, and I want to put together a talk on performance anti-patterns. That'd be fun. Propose that for next year. Yeah. Because of your name, like ant, anti.
Starting point is 00:50:07 Um, also, uh, um, if anybody doesn't know, I wrote a book and then I rewrote it. Um, and it's, I,
Starting point is 00:50:17 I, I'm finished with it actually. So it's not out yet, but I'm pretty excited that I'm finished. Uh, the, all the betas, there's beta seven out has all chapters in it.
Starting point is 00:50:26 So if you're waiting for it to be done, it's done. It's not in print form yet. That's going to happen in January or February. So I'm pretty excited to get that done. I'm hoping for my copy at PyCon, Brian. I'm pretty sure I paid for the last one as well. I actually, I paid you in cash. So I'm going to give you a copy of my book.
Starting point is 00:50:44 I'll bring at least. Maybe we can do a swap. Yeah, that'd be great. Yeah. Anthony, I got your book over there. I'm not sure what I can trade it for though. That's awesome. Congratulations, Brian.
Starting point is 00:50:55 Thanks. Anthony, you got any extras you want to share? Yeah, I'll be shipping fairly soon the JIT compiler that I've been working on, called Pyjion. It'll be going version one in two weeks. So it's a Python 3.10 JIT compiler. You basically just drop it into CPython and turn it on, then run your code, and it just JIT compiles it in the background. In some cases it makes it a lot faster, and in other cases it makes no difference. But yeah, some of the benchmarks I've been doing, like floating point math and integer math, it makes a massive difference. So, like, the scientific side of things, right? Yeah. So stuff that you would otherwise think, oh, I'm going to redo this in Cython or something. You don't have to add all the extra stuff; you just kind of turn it on. And yeah, the n-body benchmark is now 60% faster than standard CPython. That's great. And on some of the other benchmarks I've got 60% and upwards. That's super cool. So this work with Sam and the no-GIL, does that throw a spanner in the works? It would make my life quite hard for a few weeks if it gets merged. Yeah, so that could be interesting. And I'm also working on another secret project, but
Starting point is 00:52:20 I'll share that in a few weeks. Yeah, there's a comment in the chat: Pyjion does use scikit-build, which, yeah, I did want to call out when we were talking about setup.py earlier. Pyjion is all C++, and it uses CMake, which generates make files, and it uses scikit-build, which is a CMake extension, I guess, around Python extension modules. So that's how it compiles. It's really cool, scikit-build. Yeah, and I recommended using build earlier. Henry, on our episode together, mentioned that if you have external non-Python code, like C code or Fortran or whatever, then instead of build, scikit-build would be a good option to build the binary bits. This is the other question I wanted to ask, and Steve Dower beat me to it. He states it as an assertion. I was going to ask you the question.
Starting point is 00:53:16 I bet once Pyjion ships, you'll get people interested in helping add optimizations. Yeah. So it's one thing to JIT compile. It's another to not just straight up run it, but go, oh, we can inline this method. Oh, and I see we can do this. And then, like, we could actually reuse this field because it's not used below, and early-free, all that kind of stuff. Where's the optimization at on that one? Yeah, on the documentation page there's an optimizations section, and I've written up a lot of the optimizations and how they work, the assertions that they make, and compromises and stuff like that. So yeah, if you're interested, there's some info on there. But yeah, I'd love more help on this. The learning curve on the project is quite steep, but I'm trying to make it easier. I mean, it's a compiler, so, yeah. And I just added ARM support as well, so Apple M1, and I tested Linux ARM64, and in theory Windows ARM, but I don't have access to any machines to test the Windows one, and I could only test the Apple one remotely. If you need a periodic test, you can reach out. I've got Windows 11 running on ARM. Oh, really? Okay, yeah, maybe I'll take you up on that. Very cool. All right, I have a couple of extras there as well. Python Software Foundation on Twitter, the PSF, announced: we're happy to announce the Python Developer Survey 2021, take part in it. This is the one that is hosted, and the data analysis done, by JetBrains, but not influenced by JetBrains. So I'll link to that in the show notes.
Starting point is 00:54:45 Be sure to get out there and take that. Henry, in the audience, I have something as well: the feature from what you said the other day on Twitter. I said, after Python Bytes mentioned it on yesterday's show, I asked for a new feature, and it's already in: pipx run pypi-command-line wheels. Basically, this was added to pypi-command-line, and it'll tell you all sorts of cool stuff, like the details of the wheel. So you could run pipx, basically run pypi-command-line wheels numpy, however you run that, and it'll tell you, like, for NumPy on macOS universal: does it have a signature? Is there a binary distribution? What versions are supported? How old is it? How big is it? Same thing for Linux, the ARM architecture, on Windows, and so on and so on. So you get this cool graph, using Rich, of tables of tables telling you about the status of wheels on different platforms straight out of PyPI, which I thought was cool. Nice. Yeah, so that's pretty good. So, Henry, thanks for making that happen. Also, on the last episode, out in the YouTube comments, not live, we got a message from, I want to make sure I get the attribution right, from Bahram, who said, we talked about, what is it? T-Bump.
Starting point is 00:55:53 I want to make sure I get the attribution, from Bahram, and said, we talked about, what is it? T-Bump. T-Bump. That was it. T-Bump for bumping the versions. He said, oh, that's cool. I use bump to version, which is another option to do some similar types of things uh can work with with or without um source control all kinds of stuff so fun one to check out and um brian you sound really good this this time like last time i thought maybe a b had gotten into your microphone
Starting point is 00:56:20 what was the story of that um it's a long story basically i had to throw a mic uh so i had a bad mic um and a bad cable but i have a new it's tough when the two things that are connected together are both broken at the same time the buzzing i think was definitely my cable i think there was a feedback thing going on you're getting an sms um but the then and then i was examining everything in my in my audio chain and uh i just got rid of the stuff that wasn't working so yeah oh you sound great new mics even better than before so like a phoenix you're back nice image too yeah and then have you got your mac pro yet no i just bought a mac a couple years ago i'm not gonna buy another one right now anthony are you using one of these to test your own version no i don't have a spare four thousand
Starting point is 00:57:08 dollars yes for another another laptop and also i was like i don't really need a laptop because i never leave the house so like yeah that is a big problem i mean i am so loving my mac mini and my 4k monitor that i'm just like i don't want to leave i don't really all right well that's it for the extras i think it's time for a joke maybe robert's got the first one out there um can't complain about brian it's all about the hair you got to see the live stream for that one but yeah i agree with that uh next next halloween i want to go as cousin it so i got a ways to go anthony are you up for doing this joke yeah Yeah. Yeah, I got it on my screen. Oh, you got it on yours? Yeah, I'll put it on yours.
Starting point is 00:57:47 All right. Okay, okay. So it's a picture, so I'll have to describe it. I couldn't stop laughing at this when I saw it. So this is Frodo explaining to Gollum. And there's Gollum sitting at a computer looking quite confused, looking at a picture of the ring. And it says, buy now one ETH.
Starting point is 00:58:07 As in Ethereum, right? Yeah right yeah yeah as in ethereum and frodo is basically um trying to convince golem to buy an nft of the ring instead of actually having the ring and underneath my digital precious so underneath it says so you can't own the precious physically, but you can pay to have your name listed as its owner in an online distributed database. It's only what is that like 400 US dollars, 500 Australian, something like that. I know that's a lot for a listing. I don't own any NFTs yet, nor have I sold any. I don't plan to either. Man, I feel like we're missing an opportunity
Starting point is 00:58:47 to brand some of our former episodes maybe like i could just take screenshots of brian laughing at different times out of the live stream and then like turn it into a stream of nfts that we'll retire upon oh yeah let's let's do that yeah oh fantastic all right oh that was a good one thanks anthony and thanks for being here on this big episode 256 yeah i feel like we've maybe gone slightly over this this is not really a python bite this week is more of a python lunch sandwich yeah it's a proper meal a python dinner but it was a good one we talked about a lot of stuff and a bunch of great people in the audience gave us like really good inside information on where things are going so yeah so thanks everyone thanks brian yeah all right bye y'all thank you

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.