Python Bytes - #206 Python dropping old operating systems is normal!

Episode Date: November 8, 2020

Topics covered in this episode:
Making Enums (as always, arguably) more Pythonic
Python 3.10 will be up to 10% faster
Python 3.9 and no more Windows 7
Writing Robust Bash Shell Scripts
Ideas for 5x... faster CPython
CPython core developer sprints
Extras
Joke

See the full show notes for this episode on the website at pythonbytes.fm/206

Transcript
Starting point is 00:00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 206. Wow. Recorded October 28th, 2020. I am Brian Okken. And I'm Michael Kennedy. Yeah, and we have a special guest today, Steve Dower. Hi, thanks for having me on. Hey, Steve. Thanks for coming. It's great to have you here. I also want to throw out that this is sponsored by the Techmeme Ride Home podcast.
Starting point is 00:00:21 Check them out at pythonbytes.fm slash ride. Steve, I'm sure many listeners know you, but maybe just give us the quick rundown. You do cool stuff at Microsoft, getting Python better on Windows, and you're a core developer. Yeah, so my main day job is for Microsoft, where I work as basically a Python engineer, kind of a wide-ranging resource to the company. So I haven't shipped anything of my own in a while, but I've had my fingers in a lot of Python-related things that have gone out recently. So it's a lot of fun, a lot of bouncing around between different teams, getting to work with a lot of different people.
Starting point is 00:00:54 And yeah, as you say, I'm a CPython core developer, one of the Windows experts on the team. I'm responsible for the builds that go up on python.org and just generally keeping Python running well on Windows. Yeah, that's awesome. And you've done some interesting talks, like Python is okay on Windows, actually, and talked about some of the popularity of it and how we as a community shouldn't discount it just because we might happen to be on a Mac or use Linux or whatever, right? A lot of people do Python on Windows. Yeah, the estimates vary, and every time I get new numbers they seem to show up slightly different, so it's real hard to get a good fix on how many Python developers there even are in the world. I did get some numbers recently that I had a few people double check, because they were saying there's like 20 million installs of Python on Windows in the entire
Starting point is 00:01:39 ecosystem. Wow. Which sounded like too many to me, so I had them double check, and then I had someone else double check, and they all came back saying, yeah, it's about that. So I'm like, okay, there's a lot of Python on Windows out there, but yeah, it doesn't show up in conferences, doesn't show up on Twitter that much. And a lot of people just look at the packages that don't work and go, well, I guess it doesn't exist on Windows because otherwise this package would work. And so, you know, chicken and egg problem, right? Yeah. There's a lot of chicken and egg problems in the Python space. I mean, it's a beautiful place, but there are some of these weird chicken and egg ones. Yeah, it's weird. I've been using Python on Windows since I started Python. So have I. But one thing I haven't been using very much is enums. So that was an attempt at a transition. So
Starting point is 00:02:20 why not brian tell us more i've tried. Many times I've tried to use enums. And I actually, just to be honest, I don't very much. And partly because I'm used to using enums in C, in C++, and they just will act like symbols in C and C++. They work pretty good. There is some weirdness with enums in Python. And I'm going to highlight an article called Making Enums, as always, Arguably More Pythonic by Harry Percival. He starts it off by saying, I hate enums. So Harry's a funny guy. And this is a fairly hilarious look at why enums are frustrating sometimes. And then he presents a semi-reasonable workaround, I think, to make them more usable. So what's wrong with enums?
Starting point is 00:03:06 Well, he gives an example, just a simple enum with string values. And if you try to compare one of the enum elements to the value you gave it, like a similar string, it's not equal. It doesn't compare. You can use dot value on the enum member, but that's just weird. He also said he kind of thinks it'd be neat if you could do a random choice of all the enum values. I think that would be neat. And you can't directly convert them to a list. Just interacting with the enum type itself, or the class itself, has problems. In the documentation there is a suggestion that you can,
Starting point is 00:03:46 instead of strings, use int and do an int enum, and it works a little better. And if you like it like that but want strings, you can make your own string enum class. I'm not sure why they didn't just build this into the default or one of the standard types anyway, but string enum is not there. But there's an example and it sort of fixes a lot of stuff, but not everything. It doesn't,
Starting point is 00:04:11 still doesn't allow for those direct comparisons. So the solution that Harry came up with is kind of like the solution the documentation gives: derive from both the enum type and str when you're creating your enum class, but then also define this little snippet of a dunder str method so that the str method works better. And at that point, most of this stuff works that he wants to work. It still doesn't do random choice, but apparently he's gotten over that a little bit.
Starting point is 00:04:43 So I actually think this is still just a really small snippet, like two lines of extra code to add to your enum types to make them a little bit more usable. So I think it's reasonable. If you're judging by value add per character, this is awesome, because it makes working with the enumeration so much nicer. You can do the natural things that you would expect, especially around testing and comparison. And it's like, also derive from str and just add a dunder str method and you're good.
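The workaround described above can be sketched roughly as follows. This is a minimal illustration rather than the article's exact code: the `Color` and `Fruit` classes and their members are made-up names, and the only technique assumed is the one mentioned in the discussion, deriving from both `str` and `Enum` and adding a small `__str__` method.

```python
from enum import Enum


class Color(str, Enum):
    """Hypothetical example enum; the names are for illustration only."""

    RED = "red"
    ORANGE = "orange"

    def __str__(self) -> str:
        # Return the plain string value instead of "Color.RED".
        return str.__str__(self)


class Fruit(str, Enum):
    """A second hypothetical enum that shares one value with Color."""

    ORANGE = "orange"

    def __str__(self) -> str:
        return str.__str__(self)


# Members now compare equal to their plain string values.
assert Color.RED == "red"
assert str(Color.RED) == "red"
# Looking a member up by value still returns the enum member itself.
assert Color("red") is Color.RED
# Trade-off: members of different str-based enums with the same value
# also compare equal to each other.
assert Color.ORANGE == Fruit.ORANGE
```

The last assertion shows the cost of the convenience: once members behave like strings, equality is decided by the string value alone, not by which enum the member belongs to.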
Starting point is 00:05:14 Steve, what do you think about this? Yeah, I'd just add that the gotcha that you have by doing this is now you can have two values from different enums compare as equal, which as I recall from the original discussions was the reason this wasn't put in by default. Say you've got a color enumeration and a fruit enumeration. Is orange the same between the two? And I think the decision was made for enums to say no, it's not. If it's a fruit orange, it's different from the color orange. Making this change or using an int enum is going to make them equal. So as long as you're prepared to deal with that, which to be quite honest, every time I've reached
Starting point is 00:05:48 for enums, I am much happier with string literals and quite comfortable with them matching equal. Yeah. But that is just one thing to watch out for. Usually it's about constraining the list of things. Like, I know there's five things, and I want to make sure I don't somehow mix up what those five are. I just want to go, you know, class dot, and here's the five, let my editor tell me which one of those it is. It seems like it's all that. So what do you think about if you added dunder eq and dunder ne, so the comparison would say it has to be this type of enumeration and the str value has to match? Yeah, that would certainly deal with that gotcha. Again, when these were being designed, basically anything that gets designed and added to Python has a very large group of
Starting point is 00:06:29 very smart people work through it, and, you know, as a result, things always get missed. So it's possible that one was just missed. It's also possible that someone did figure out a reason why that was also risky. And, you know, risky when you're developing one of the most popular languages in the world is just anything that might surprise anyone. Someone who has deliberately designed around using enums everywhere is probably not going to be surprised. But someone who is using code where the developer had a class with static variables and turned it into an enum, and now stuff breaks because of the defaults that were chosen for enum, that's the kind of thing that you're trying to avoid in, you know, in a language that has
Starting point is 00:07:09 anywhere between, you know, five and 20 million kind of regular users. But as a workaround, I mean, if you know where your enum is going to be used, there's a reason you can derive from str, and it's exactly for stuff like this. Yeah. Thanks, Harry, for putting that out there. That's quite a neat little bit of advice. And I'm so glad that we have Steve here, because I picked some sort of semi-internal pieces, and I'm going to make some statements about them, and then Steve can correct me to make it more accurate. And also we get the core developer's perspective. Not that you represent all core developers, but at least a slice. All right. So this next one I want to cover is that Python 3.10 will be up to 10% faster. And this is 4.5 years in the making. So Yury Selivanov long ago did some work on optimizing, I think, what was it?
Starting point is 00:07:58 Load attribute, or was it load method, call method? So some of these load operations down at the CPython ceval level. And then Pablo Galindo, who's also a core developer and the Python 3.10 and 3.11 release manager, picked up this work. And now we have load method, call method, and load global going faster. So basically, there's some optimizations around those opcodes that make this much faster. And this idea apparently first originated in PyPy, P-Y, P-Y. So I'm pretty excited to see that, you know, some simple internals that don't seem to
Starting point is 00:08:41 change a whole lot of stuff might make this a lot faster. What do you think, Steve? This one is real. Like, I was so excited about this when it was first being proposed. The basis of the idea is that Python keeps everything in dictionaries, which means every time you look up dot name of anything, it takes that name, makes it a string, gets the hash value, looks it up in a dictionary, finds the value, maybe wraps that up in some extra stuff (if it's going to be a method, it's not stored as a method, you turn it into a method when you know what it's being dotted through to get to), and then returns that. That's a whole lot of work,
Starting point is 00:09:14 and if you're regularly calling the same method over and over again, why not cache it? That's the heart of this, right? It does that cache around load attr, right? Yeah, it does that cache. And the insight that Yury had, that he made work (and in fact, I think someone else had suggested it earlier and hadn't gotten it to work), was what happens when things change. Because again, as I say, we're designing a language for many, many people who do all sorts of weird things, and if you cache a method lookup and then someone tries to monkey patch it, you know, we've now broken their code for the sake of an optimization, which, you know, is a no-no in Python. Correctness beats performance in every case. That's just the trade-off that the language chooses to make.
Starting point is 00:09:52 That's almost always what you want. When would you want to be faster and wrong rather than slower and right? I'd be happy with faster and no monkey patching, but... Yes, yes, sure. Faster with fewer or restricted capabilities might be a really good trade-off, but faster and wrong is not a good one. Yeah, we did some benchmarking and basically found that there was a way to track all dictionary updates in the entire runtime with a version tag that was not going to instantly overflow and not going to break everything. So it became really easy to say, has this dictionary changed since I last looked at it, with one single value comparison. And so it looks at that value.
Starting point is 00:10:31 If it has changed, it's going to do the full lookup again. 99.999% of the time it hasn't changed. So it can give you the cached value, and you've saved a big dictionary lookup, possibly error handling, the descriptor protocol, all of this extra stuff that just adds so much weight to every operation. Yeah, and that's everywhere. I mean, that's everywhere in the language. Absolutely everywhere. That's fantastic. One of the things when I was first getting into Python that made me sort of have a sad face was realizing that having function calls was pretty expensive, right?
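The version-tag idea described a moment ago can be illustrated with a toy model in pure Python. This is only a sketch of the concept, not how CPython implements it (the real mechanism lives in C inside the interpreter); `VersionedDict` and `CachedLookup` are hypothetical names invented for this example, and only item assignment and deletion are tracked.

```python
class VersionedDict(dict):
    """Toy dict that bumps a version counter on item assignment or deletion."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.version = 0

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.version += 1

    def __delitem__(self, key):
        super().__delitem__(key)
        self.version += 1


class CachedLookup:
    """Caches one name lookup and revalidates it with a single comparison."""

    def __init__(self, namespace, name):
        self.namespace = namespace
        self.name = name
        self.cached_version = None
        self.cached_value = None
        self.slow_lookups = 0  # counts how often the full lookup ran

    def get(self):
        # Fast path: one integer comparison tells us nothing has changed.
        if self.cached_version != self.namespace.version:
            # Slow path: do the full dictionary lookup and refresh the cache.
            self.cached_value = self.namespace[self.name]
            self.cached_version = self.namespace.version
            self.slow_lookups += 1
        return self.cached_value


namespace = VersionedDict(greet=lambda: "hello")
lookup = CachedLookup(namespace, "greet")

results = [lookup.get()() for _ in range(1000)]
lookups_before_patch = lookup.slow_lookups

# Monkey patching changes the dict, so the next get() notices and refreshes.
namespace["greet"] = lambda: "patched"
patched_result = lookup.get()()
```

In this toy run the full lookup happens once for the first thousand calls and once more after the patch, which is the correctness property the discussion emphasizes: the cache never hands back a stale value after the dictionary changes.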
Starting point is 00:11:03 Like having a big call chain, like actually the act of calling a function has a fair amount of overhead. When I wanted to break my code into like a bunch of small functions to make it real readable, I'm like, this part needs to go a little faster. Maybe that's not what I want.
Starting point is 00:11:17 You know, and so hopefully this helps with that as well. Yeah, that and vectorcall is another optimization that we got in recently. I think that might've been PEP 590, actually, was vectorcall. Also designed to make that faster. Just removing some of the internal steps for doing a function call. Fantastic. And like Brian said, this is everywhere.
Starting point is 00:11:37 So everyone's going to benefit. This is fantastic. Yeah. Well, we would like to thank our sponsor. So this episode is brought to you by the Techmeme Ride Home podcast. This is a great podcast. For more than two years and nearly 700 episodes, the Techmeme Ride Home has been Silicon Valley's favorite tech news podcast.
Starting point is 00:11:57 The Techmeme Ride Home is a daily podcast, only 15 to 20 minutes long, and every day by 5 p.m. Eastern. It's all the latest tech news, but it's more than just headlines. You could get a robot to read you headlines. The Techmeme Ride Home is all the context around the latest news of the day. It's all the top stories, the top posts and tweets and conversations about those stories, as well as behind-the-scenes analysis. The Techmeme Ride Home is TLDR as a service. The folks at Techmeme are online all day reading everything so they can catch you up. Search your podcast app now for Ride Home and subscribe to the Techmeme Ride Home podcast
Starting point is 00:12:36 or visit pythonbytes.fm slash ride to subscribe. Yeah, thanks for sponsoring the show. And Brian, every day. We do this once a week and it's a lot of work. These guys are on it. I could totally do this every day. If I didn't have another job, I would not have any problem with catching up on Python news daily. Actually sounds quite lovely. I wonder how that podcast is doing now that not so many people are having to commute to and from work. That sounds like one of the things where you hope that you've given people
Starting point is 00:13:08 the excuse to tell their employer, on my commute between 4 and 5 p.m., I can't do it. I've logged off and gone to listen to a podcast during that time. I realized I was missing out, so I started listening to podcasts while I'm doing my laundry. I listen when I'm doing yard work, stuff like that. Yeah, I recently broke out some other older podcasts just to catch up on stuff with a big mess I had around the home. It's a long story with a new puppy that's not worth going into.
Starting point is 00:13:38 Maybe I'll tell you guys after the show. It's pretty outrageous. Anyway, yeah, and it's so enjoyable. But what I've actually found is shows like Python Bytes and Ride Home, that have a bunch of short little things that you can just drop in and out of, match a lot better now that people are not commuting so much. You can do that 10 minutes while you're folding laundry and get a whole segment. So I think it varies, but actually it's pretty interesting. I think that they and us here are well positioned. Like, Talk Python has more of a dip than Python Bytes. Have you surveyed your listeners to find out what they're doing while they listen to you? No, not really. Everyone
Starting point is 00:14:13 should tweet at these guys on Twitter what you were doing when you were hearing this podcast. Yeah, that's awesome. I've got a few anecdotes, but nothing like a survey that would give me a proper answer. Steve, I think your next item is super, super interesting. And speaking of Twitter, right? Like, this whole conversation started as a, hey, Twitter message. Like, why did this happen? Well, let's ask Steve. Yeah.
Starting point is 00:14:36 As you know, I spend a decent amount of time on Twitter. My handle there is Zooba, Z-O-O-B-A, which you'll probably never find in search if you're looking for my name, but that's where I am. I really like actually searching for what people are saying about Python on Windows. It's kind of the most honest feedback you get when they think you're not listening. And so I go and listen, and one of these popped up, which was, oh, I tried to install Python 3.9, the newest release, from a little under a month ago, on my Windows 7 machine, and I couldn't install it.
Starting point is 00:15:11 And since then, I've actually seen a few more posts. Someone managed to bypass the installer completely and get it all the way onto their Windows 7 machine and then found out it wouldn't run. Oh, man. Yeah. And so the question was asked, like, why would you do this? Windows 7 is still a fairly big platform. Why would you take it out? And, you know, the answer was just a bit too long for a tweet but someone you know kindly
Starting point is 00:15:28 included python bytes in the reply and so i said hey i'll come on and talk about it so let's do this topic and the answer is multiple legal looking documents all come together and have to be read in parallel to figure out why we dropped it. Yeah, the small business owners know what I'm talking about. So one of those documents is PEP 11, one of the lowest number PEPs that we have, and it's titled Removing Support for Little Used Platforms. The title was not originally about Windows, but there is a section in that PEP
Starting point is 00:16:02 that describes Python's policy for supporting Windows. On release day of 3.x.0, all the supported versions of Windows covered by Microsoft Support Lifecycle will be supported by CPython. And that's on the 3.x.0 release date. So what that means is then you now have to go and look at Microsoft Support Lifecycle website and look up all of the different versions of Windows to see which ones are still covered by support to that date. Windows 7 fell out of extended support in January. There was quite a bit of noise about that because that means no more security patches, except Microsoft did do a couple
Starting point is 00:16:42 more security patches because some really bad stuff was found, but that's largely stopped. Essentially, it's the point when the only people who get Windows 7 support are paying for it, and I assume they're paying large amounts of money for it. I don't actually know how much it costs, but that's the point where you bought it, but you don't get the free support anymore. So CPython follows that because no one is paying the core team to support Python on all of these platforms. And so it seems like the fairest point to draw that line. Because at some point we have to say our volunteers can no longer keep a Windows 7 machine running. Even I can't keep a Windows 7 machine running safely because there's no security updates for it. How am I meant to develop Python on it? How am I meant to test Python on it? The burden there is too high for
Starting point is 00:17:30 volunteers to handle. So we just say that's the point where it goes away, because those two documents lined up. Windows 8 actually dropped off a couple of years ago, because the support lifecycle ended early for that to force everyone onto 8.1. Windows 8.1 has about three more years, so I think Python 3.12 will be the last one to support 8.1. And then it's all Windows 10, or whatever comes out in the future. Yeah, yeah. Windows 10X or whatever they call it. That's a different one. That's the Xbox. Yeah. So, you know, I think this makes a ton of sense. And two thoughts I had as you were laying out the case here. One is, if you're running on Windows 7 and you can't upgrade to even Windows 8, or more reasonably Windows 10, or one of the server equivalents, right? I'm sure there's a server equivalent, like Windows 2003 Server. I don't know how long it's supported, but whenever it goes out, it probably falls under that banner as well, right? Yeah, Windows Server is a bit more interesting. Their life cycles tend to last longer, but historically CPython has only kind of tied itself to the client operating systems. Gotcha. Oh, interesting. Okay. So to me, I feel like if you're
Starting point is 00:18:37 running code or you're running systems that old, you must be running it because it's some super legacy thing. So do you absolutely, necessarily need to have the most cutting-edge Python or whatever language? Like, it's probably that way because it's calcified, and you probably don't need, or you probably shouldn't be putting, the newest, shiniest things on it. Right? That's thought one.
Starting point is 00:19:02 What do you think? Yeah, no, I totally agree with that. If that setup that you're running is so critical that you can't upgrade the operating system, how can you upgrade a language runtime? How can you upgrade anything on that? I feel like it's in the category of, please just don't touch it. It's over there. Just don't even walk by. Just leave it alone. We cannot have it break. Just leave it over there. It probably still has a PS2
Starting point is 00:19:24 keyboard plugged into it. Oh, it might, with a little blue or pink round thing. Yeah, absolutely. The screen has probably got at least 16 colors. Yeah, and the monitor is probably really heavy. Windows 7 is not that old. Some of the... Seriously, like this stuff is old and you probably don't want to touch it, right?
Starting point is 00:19:40 Yeah, that's exactly it. It's all the motivation that you would have for updating to Python 3.9 from 3.8. And again, we're talking about a version that's only one year old, like Python 3.8 is not that old. If you desperately need to upgrade to 3.9, you even more desperately need to upgrade Windows. And there really is no question about that. The same thing applies to early versions of Ubuntu. People running Ubuntu 14, or even 16 at this point, need to be facing the same thing. And we have similar discussions around OpenSSL, where occasionally people will say, oh, I need Python 3.9 to run on OpenSSL 0.9, to which our answer is basically, that's pre-Heartbleed. It's like... Okay, I'm going to play the other side. I totally get the reasons, but I also get the questions, because the users and the developers
Starting point is 00:20:32 or whoever's wanting to install Python, they usually don't get to choose what operating system they're using, but they do get to choose which version of Python they're using. So I do get, in some cases, and in some of those cases, I totally understand where the question's coming from. Yeah, we joke about how old these machines are,
Starting point is 00:20:53 and they're really not. Like, people are setting up new machines, probably with Windows 7. They certainly were within the last year, and there's good legitimate reasons for that. And, you know, we're not, you know, we're making fun of some of the apparent contradictions, but we're definitely not making fun of the people who have you know often been
Starting point is 00:21:07 forced into these positions but the reality is we can't afford as a volunteer team to maintain python against unmaintained operating systems and so you know the advice is stay on the previous version of python that the latest version of python that works for you, it's not going to break. We're not changing it. Anything new that comes up, security fixes will still come out. At some point, there just has to be a line drawn. And that's the point where we've chosen to draw it. The other thing I want to point out that we changed in this release, which people are more excited about, is if you go to python.org to download Python for Windows, you get this real big obvious button up front
Starting point is 00:21:45 that just says download for Windows or download now or something. As of Python 3.9, that's now getting the 64-bit version rather than the 32-bit version. For a long time, it's been 32-bit. The reason for that was compatibility. We knew a 32-bit one would run anywhere. When we put Python in the Windows store, that was 64-bit only. We kind of wanted to test the waters and see, hey, will people notice that we haven't put a 32-bit version here? Turns out no one did. And so when we got to 3.9, had that change, we made it 64-bit by default.
Starting point is 00:22:15 So that has a flow-on effect to the ecosystem. A lot of, particularly, data science packages would rather just do 64-bit-only packages. Some of them certainly get theirs done first, right, and not the 32-bit ones. So we expect to see some flow-on impact from that, just broader use of 64-bit Python throughout the Windows ecosystem. Yeah, that's super cool. And the final thought I had was, you know, Django dropped Python 2, and they're like, we were able to remove so much code and it is easier for
Starting point is 00:22:45 new people to contribute because they don't have to write for two ecosystems, they write for one. NumPy did the same thing. And I feel like this is sort of the same story. You guys can just not worry about yet another older, outdated operating system and stay focused on what most people care about. One thing that someone did suggest in one discussion was why not dynamically light stuff up for the newer operating system? And the answer is we do that. And when we drop the older operating system, we get to delete about a hundred lines of code for each point where we do
Starting point is 00:23:17 that. So we get to do a cleanup. We get to say, oh, we don't have to dynamically check for this API, load this,
Starting point is 00:23:24 cache this, store that, call that, call this, forward. We can just condense that into, oh, we know that this API is there, so we can use it, and just reduce a lot of effectively dead code on newer operating systems. Nice. Is that a pre-compile hash ifdef sort of thing, or is it a runtime thing? Does it make a performance difference? It definitely makes a performance difference, though we try and minimize it. But again, there's always some impact. It tends to be in operating system calls anyway, so you expect a
Starting point is 00:23:54 bit of overhead. And so it's not going to add a significant kind of percentage overhead compared to whatever operation you're doing. But it does certainly add a lot of cognitive burden to someone who's reading the code. One example that we got to clean up recently, not in a previous version, was we had about, I think, 70 or 80 lines of code to concatenate two path segments. And this is before Python's loaded, so we have to do this with the operating system. The API call up until Windows 7, I think, so pre-Windows 7, was not secure, and it would, you know, buffer overruns, all sorts of horrible stuff. But it was the best available function there for handling certain cases, so we'd use it, but first we'd dynamically look for the newer, safer one
Starting point is 00:24:38 and call that. As soon as we dropped that, I think, we could delete all of that code and just unconditionally call the one safe path combine function. And that code got a whole lot simpler. Yeah, lovely. That's awesome. Yeah. Brian, would you say it's more robust now? Yes. I think it would be more robust.
Starting point is 00:24:58 Actually, I thought I showed up to the Bash podcast. Is this the Python podcast? Yeah, this is not Bash Bytes. Okay. I love Python, of course, but I still use Bash regularly, and I know a lot of people, like sysops people and others, who are using Bash daily as well. So I wanted to highlight this cool article. This is an article by David Pashley called Writing Robust Bash Shell Scripts. And even though I've been writing scripts for decades, I learned a whole bunch in this and I'm going to start changing the way I do things right away. The first two tips are things I'm going to do right away. First tip is to include a set dash u,
Starting point is 00:25:37 and I'd never even heard of this. What it does is it makes your Bash script exit if it encounters an uninitialized variable. So the problem without this is, let's say you're constructing a path name or something, a long path, and one of the directories or file names you have is in a variable. If that's never set, Bash normally just silently drops it, and it's just not there, and it'll still keep executing anyway, but it's not going to do what you want it to do. So yeah, I definitely want to turn this on so I don't use uninitialized variables. Similarly, if any of your script statements returns a non-true
Starting point is 00:26:19 value, and in scripts or shell work, a non-true value usually means something bad happened. If you use set dash e, that will make your script exit at any point if one of the statements returns an error value. So you don't want to just keep rolling with an error condition. So this is good. I'll cautiously add this to scripts, because I want to make sure they keep working. And then a tip just to say, expect the unexpected. There will be times where you'll have missing files or missing directories, or the directory that you're writing into is not created.
Starting point is 00:26:55 So there's ways to make sure it's there before you write to it. Especially if you're running on Windows, be prepared for spaces in file names. Variable expansion in Bash does not really isolate spaces, so you have to put quotes around the expansion to make sure that it's a single thing. And the next one is using trap, and I never actually knew how to do this before. So if you've got a Bash script that's running and
Starting point is 00:27:24 something's wrong and it won't exit, you can kill it, or there are other ways to get it to stop. But if you have the system in a state that needs some cleanup, there's a way to use a trap command to exit gracefully and clean things up. The last couple of points were be careful of race conditions and be atomic. Those are good things to do, but at least a handful of these I'll put to use right away. So it's good. Yeah,
Starting point is 00:27:50 this is neat. And a lot of the stuff I didn't really know about. So yeah, like continue on, if something went wrong, just plow on ahead. Yeah. That's,
Starting point is 00:27:58 that's cool to know that you can make it stop. Steve, do you ever do any bash? You got a WSL thing going on over there? I've certainly done a lot more bash since I started using WSL for a lot of things. I was aware that using an uninitialized variable would substitute nothing, but I'm very happy to know that there's a way to kind of turn that off because that has certainly caught me out in the past many times. And this looks like just a good article that I'm going to have to go read myself now because it has everything that you learn from doing scripts in like command prompt or PowerShell or even Python to some extent. I have not personally mapped those to bash equivalents.
Starting point is 00:28:37 So it sounds like this would be a good place for me to go through that and up my skills a little bit. My favorite thing was the find command. When I got that, it felt as powerful as a regex, and I'm kind of like, oh, I don't need to write a whole script now, I can just do one excessively long find command. Nice. Yeah, you find a lot as well. All right, this next one I don't want to spend too much time on, because I feel like you could easily just go and spend an hour on it, but for time's sake, we don't have a whole lot of time left for the episode because we have a bit of a hard stop. So I'm going to go through this and get your guys' thoughts on it real quick. There was a tweet about a GitHub repository that was a conversation on the Python mailing list, lots of places. So Anthony Shaw tweeted, calling attention to
Starting point is 00:29:26 a roadmap by Mark Shannon called ideas for making CPython five times faster. So he laid out a roadmap and a funding map and some interesting ideas, and I'm going to go through them quick, and then especially, Steve, we'll see what your thoughts are here, how reasonable this might be. So the idea is, there's going to be four different stages. And each stage, he thinks, can get a 50% speed improvement. You do that four times, that's, you know, compounding performance interest, you get five. So I think it talks about 3.9 somewhere. But anyway, I think maybe it's got to shift its numbers a little. Anyway, so Python 3.10, stage one, will be an adaptive, specializing interpreter
Starting point is 00:30:11 that will adapt to types and values during execution, exploiting type stability without the need for runtime code generation. That almost sounds a little bit like what you were talking about with the 10% increase earlier, Steve. And then 3.11, stage two, would be improved performance for integers of less than one machine word, faster calls and returns through better handling of frames, and better object memory layout.
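The "compounding performance interest" arithmetic mentioned above can be checked with a short Python sketch. It assumes exactly what was said in the episode, a 50% gain per stage over four stages, and the function name `compound_speedup` is hypothetical.

```python
def compound_speedup(per_stage_gain, stages):
    """Overall speedup when each stage multiplies speed by (1 + gain)."""
    return (1.0 + per_stage_gain) ** stages


# The four stages and target releases as listed in the episode.
releases = ["3.10", "3.11", "3.12", "3.13"]

cumulative = [compound_speedup(0.5, n) for n in range(1, len(releases) + 1)]
for release, speedup in zip(releases, cumulative):
    print(f"after Python {release}: {speedup:.4f}x")

total = cumulative[-1]  # 1.5 ** 4 = 5.0625
```

So four 50% improvements multiply out to slightly more than the headline 5x, rather than adding up to 3x.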
Starting point is 00:30:34 Stage three, Python 3.12, requires runtime code generation and a simple JIT for small regions. Python 3.13, extending the JIT to do a little bit more. And I'm linking to a long threaded conversation over on python-dev. There's a whole bunch of stuff going on here, so I encourage people to read through it, but there's just a lot of interesting implications about, like, how do we pay for this? If we pay someone to do it, people like Steve work on CPython and they don't get paid, so how is it fair to pay someone else to do it when other people are volunteering their
Starting point is 00:31:07 time? There's a lot going on here. Steve, what do you think about this? Have you been following this? I read through the original proposal. I haven't had a chance to chat with Mark directly about it. I will, I guess, start by saying that Mark is a very smart guy and he has done all of this planning off on his own in secret
Starting point is 00:31:25 and kind of come out and shared this plan with us, which, you know, it's not an ideal kind of workflow, certainly when you're part of a team. But I have certainly found in the past that when you get a very smart guy or a very smart girl goes off and disappears for a few weeks and comes back and says, I've solved it, there's a good chance they've solved it.
Starting point is 00:31:43 So I'm very interested to see where it goes. The part of the discussion that you didn't mention is, or that you hinted at, is this is kind of a proposal for the Python Software Foundation to fund the work. And part of that funding is conditional on delivery. So the way he's proposed this would work, and the implication seems to be that Mark will do the work himself and be the one getting paid for it. Yeah, that seemed like it wasn't clear from his GitHub repo.
Starting point is 00:32:09 But if you read the conversation was like, look, I'm pretty sure I can do these things. This is how much would make sense for me to spend the next couple years working on it and getting paid. How do we do a fundraiser so that I can do this for everyone? Yeah. And you know, I think under those conditions, if the PSF is able to put the budget towards it, they are in a bit of a tight spot since PyCon is normally the big fundraiser for the year and that didn't happen. On the other hand, it's also the big expense.
Starting point is 00:32:35 But financially, the PSF is not in their normal place where they'd be for the year because PyCon didn't happen in the same way. But I think if they're prepared to put funding towards this, I guess if the community consensus is that this is the most important thing for us to do, and there's certainly potential downsides to doing it. Code complexity is the big one.
Starting point is 00:32:59 And I don't actually think there's a way that you implement this or even achieve these performance gains without making the code much more complex and hence less accessible to new contributors and, you know, people in earlier stages of learning to code, at least on the C side. Yeah, yeah. So there's trade-offs. I'm very interested to see what would come about. I assume that because 3.10 is targeted for the first pass, that it's already done and he's already got the code, and he's just trying to get confirmation that he can spend the next few years heavily investing in it instead of having to go find a full-time job
Starting point is 00:33:37 and go back to doing this in the evenings. Yeah, which, you know, I'm fully supportive of. Again, it's really just a big open question of, is this the most important thing for Python to be funding right now, for CPython to be getting in particular? Someone, I forget who, raised the question of what if we put that money towards PyPy instead? You know, what could they do with it in that amount of time? And ultimately it's going to come down to someone, or probably a small group. Presumably the steering council will have some involvement from the technical side. The Python Software Foundation board will no doubt be involved
Starting point is 00:34:10 in just deciding, is this the best use of the money that we have or can go out and get for what benefits it would produce? When I look at it with the funding side, I see it as very fraught with challenges on the sort of community funding, the PSF funding. But I know there's so many huge companies out there that spend an insane amount of money
Starting point is 00:34:32 on compute and infrastructure that make a lot of money. And if they could have a 5X speed up on their code, they could probably save that money right away on infrastructure. So it seems like it could also get funded that way. But we should probably move on, just because I'm going to make sure we have time for everything else before we end up running out of time. I just do want to
Starting point is 00:34:53 call out, like, you should go check out that conversation. There's a very funny excerpt from Larry Hastings, who says, speaking as the Gilectomy guy (they were talking about borrowed references being a challenge): borrowed references are evil. The definition of a valid lifetime of a borrowed reference doesn't exist because they are a hack baked into the API that we mostly get away with because of the GIL. If I still had wishes left on my monkey's paw, I'd wish them away. Unfortunately, I used my last wish back in February,
Starting point is 00:35:17 wishing I could spend more time at home. So bad. All right, Steve, let's get a little bit more insight from you on this last one, huh? Because you were at the Core Developer Sprints, which recently happened. Yeah, so I don't know exactly what day this is going to go out, but last week from recording day,
Starting point is 00:35:35 we had the CPython Core Developer Sprints. So this is kind of a get-together, generally in-person event that the Core Development team has done for five years now. I think this is the fifth year. In the past, we've all gone down to Facebook or Microsoft, or last year we hung out at Bloomberg in London and basically spent a week in a room together coding, discussing, reviewing things, designing things, planning things, and otherwise just getting to actually meet our other contributors because we all work online. We all mostly work over email and kind of bug tracker and GitHub pull requests throughout the year. And so it's
Starting point is 00:36:17 a really good opportunity to get to meet each other, get to see who we're dealing with. It's a lot harder to be angry at someone over email when you've met them. Yes. And so it's been a really good event. This year, because we're obviously not traveling for it, we were hosted by Python Discord, which is at pythondiscord.com. There's a server that is really well managed. It's really well organized. I was impressed. I have not been there before, but it was great. They set up what felt like thousands of channels for us, far too many, but it gave us plenty of space to kind of mingle with other core devs while we were discussing and working and planning anything. We also did a Q&A, so there'll be a link in the notes for that
Starting point is 00:36:57 from YouTube that we live streamed. We had people submit questions ahead of time, everything from what situations should I use a mangled name in, like a double-underscore-led name, through to what's your least favorite part of Python? What do you most want to replace? Did you ever expect Python to get so big? And we had a lot more people involved. We normally do a panel for the host company. So we'll get kind of their employees together. And it's like part of the perk for funding the venue and typically meals and coffee and everything for the week. This time it was public on YouTube. It was all kind of over video, so everyone got a bit of a turn to jump in. So you'll get to see a lot more core developer
Starting point is 00:37:35 faces than you've probably ever seen before. You'll get to hear from a lot more of us than you have before, and a lot of interesting things. The big kind of ideas that came out of the week, kind of hard to say. A lot of us did come out feeling like we didn't get as much forward momentum on stuff as we normally would in person, but at the same time, a lot of things did move forward. I think there were about seven or eight PEPs passed up to the steering council during the week, various things. One of mine was deprecating distutils, which is an entire podcast on its own, so I might have to call you guys another time to talk about that one. Through to a proposal to change how we represent 3.10, because in a lot of places we put the version numbers back to back with no separator, and so you have, you know, 38, 39 with nothing in between. Now up to 310: is it 3.10 or is it 31.0? Yeah. Okay, how do we fix that? And
Starting point is 00:38:31 we had a lot of discussions about that. There was obviously a lot of talk about JIT, about the C API, all the usual things that we talk about. But again, because it was online, it was really good to have such a range of people involved, from different time zones and people who would not normally get to travel. Yeah, it makes it more accessible. Yeah, that's awesome. Yeah, we have core developers in countries who can't leave. They literally cannot leave their country, either because the populace is just strictly controlled or they know they would not get back in when they tried to go home.
Starting point is 00:39:03 And so they were able to participate. And that was great to see and meet some of those people. We had a few mentees come along to interact with the rest of the team, and just overall a good week. Awesome. Yeah, cool. And yeah, people can check out the YouTube stream. I definitely want to check that out. Sounds neat. All right, Brian, we've got two minutes left. Do you think we should do a joke? Yeah, let's do the joke. Oh, let's just cut to the joke so we don't miss that, right? So you and I spoke about Hacktoberfest going wrong and random PRs to config files and changing the spelling and config file settings. So there was a guy
Starting point is 00:39:39 who posted on Twitter, said, hey, let me double check the name. It was Stuart McCrudden. And he posted this cool t-shirt that he got. It says, Hacktoberfest 2020. Any PR is a good PR. And it's Lua.py. And it has import pi in vim. Then the PR just adds hash.
Starting point is 00:40:01 This imports a package. That's awesome. Steve, did you suffer from any of these? I did not. I might have done. My GitHub notifications are a mess. So yeah, I don't even know yet. Yeah, I don't see pull requests until I actually go look at the repo myself.
Starting point is 00:40:19 For the most part. Yeah, I got a bunch. I got a whole bunch. Yeah, me too. Cool. Okay, I wanted to do one. This should have been a topic, but the five most difficult programming languages in the world.
Starting point is 00:40:34 This was submitted to us by Troy Caudill, I think. It's not really a full topic, but I thought it was hilarious. This is an article where the author, Locate, I guess, actually took five programming languages, Malbolge, INTERCAL, Brain, you know, we all know that one, COW, and Whitespace, and wrote Hello World in each language. And these are hilarious.
Starting point is 00:40:57 And my favorite is Whitespace because the entire language depends on space tab and line feed for writing the program. And any non-Whitespace character is considered a comment. So this is great. That's crazy. I don't know why the APL wasn't on there.
Starting point is 00:41:13 APL is just fully insane. I'll put just in the show notes here at the bottom of this an example of APL. That right there that I put on is, I can't try to even speak this, but that is an entire thing that finds all prime numbers from one to R in that line. If you guys see at the bottom of the notes, that's insane, isn't it? That's not even intentionally bad, is it? No, it's meant to be a real programming language. It's as if the Egyptians who only wrote in
Starting point is 00:41:41 hieroglyphics decided to write a programming language. That's how I feel. It's insane. But it's a legitimate language. People try to use it. They do use it. Anyway. Not for very long, I expect. Only as long as they must,
Starting point is 00:41:56 and then they immediately stop. I just like that this Intercal example, it's so polite. It's please, please do, please do, please, oh, please give up. Yeah, apparently you have to sprinkle pleases in it or else it'll like error because you're not polite enough. But if you do too much, it also errors because you're overly polite. I like that.
Starting point is 00:42:18 We need more passive aggressive languages like that. Lovely. Cool. Well, thanks a lot, guys. It was fun. Yeah. Yeah. Thanks, Brian.
Starting point is 00:42:27 Michael. Yeah. Thanks. And thanks for being here, Steve. It's great to have your perspective. Thank you for listening to Python Bytes. Follow the show on Twitter at Python Bytes. That's Python Bytes as in B-Y-T-E-S.
Starting point is 00:42:38 And get the full show notes at pythonbytes.fm. If you have a news item you want featured, just visit pythonbytes.fm and send it our way. We're always on the lookout for sharing something cool. This is Brian Ocken, and on behalf of myself and Michael Kennedy, thank you for listening and sharing this podcast with your friends and colleagues.
