Python Bytes - #128 Will the GIL be obsolete with PEP 554?

Episode Date: May 2, 2019

Topics covered in this episode: Solving Algorithmic Problems in Python with pytest * DepHell -- project management for Python * Dask * Animations with Matplotlib * PEP 554 -- Multiple Interpreters in the Stdlib * Extras * Joke. See the full show notes for this episode on the website at pythonbytes.fm/128

Transcript
Starting point is 00:00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 128, recorded April 30th, 2019. I'm Michael Kennedy. And I'm Brian Okken. And this episode is brought to you by DigitalOcean. They're great supporters of the show. Please check them out at pythonbytes.fm slash digitalocean and get $100 free credit. Brian, I am super excited.
Starting point is 00:00:22 Are you? What about? We are on PyCon Eve. Yeah, but when is this going to go out though? I mean, I think this is going out before PyCon. At least it'll be out during PyCon, if nothing else. We're going to rush. Yeah, I'm excited that I don't have to pack my banner thing this time. I know. We're going to be at the JetBrains booth and they're going to have all that stuff set up for us. So we can just roll in like a normal attendee with a relatively small amount of gear
Starting point is 00:00:49 in tow. It'll be great. Yeah. And I want people to show up at the 6.30 at the Thursday night thing because I'm going to be recording at 6.30 live. So stop by there. Awesome. Yeah. I feel like with the JetBrains stuff and just independent of that, we're going to be doing a lot of live recording there for Test & Code, for TalkPython, and for Python Bytes. And we'll try to get the word out about that. But it's to be doing a lot of live recording there for test and code for TalkPython and for Python Bytes.
Starting point is 00:01:06 And, you know, we'll try to get the word out about that, but it's going to be a lot of fun. Yeah, for sure. So let's just jump right in the first one here. I see this one as well within your wheelhouse. Yeah, I forgot about the order. Yes, there's an article from Adam Johnson called Solving Algorithmic Problems in Python with PyTest. And yes, PyTest is definitely my wheelhouse. I like the highlighting of this.
Starting point is 00:01:28 Here's the idea. You've got coding challenges, especially algorithmic ones like Project Euler or Advent of Code or other. There's lots of coding challenges projects. And doing those goes through one example and shows how to translate the project description, the problem description and the problem description, and the specification into some quick little tests. And they're like showing it how small tests can be, just testing one aspect of the answer. So the example goes through basically, yeah, it's a little bit of a TDD practice thing of coming up with some tests to test the test case, going
Starting point is 00:02:03 through an exercise, translating all the specification into tests, and then just working through it and looking at all the failures, failure cases, especially. So one of his example was to just create a stub answer, just the function you're trying to actually write, just have it return some constant and then build that up. And it's a little bit of a TDD practice, but also just practice writing tests to do these sort of projects. And it's a short article and I really liked it. Yeah, it's a cool way to explore solving some problems with the whole Project Hewler thing. And it's also a cool way to get some experience working with PyTest. And I feel like these types of problems are pretty good entry-level PyTest-type problems.
Starting point is 00:02:46 It's not, well, we have the website and we want to make sure the user can register. And like, okay, well, how do I mock out the database call and then like stub out the email service? You know, like all that complicated stuff, right? Where here it's like a pretty constrained problem, right? Like find the minimum path, you know, amongst these bridges or, you know, whatever the problem is, like these sort of mathematical algorithmic ones. Yeah. And his example is like super easy. It's just returning the minimum value in a list as long as it's positive.
Starting point is 00:03:13 Nice. And it's easy to get your head around. And actually, I love doing little practice things like that. Cool. Yeah, I do too. And I'm a fan of Project Healer, so quite cool. Now, this next one that I have, I feel like maybe we haven't really discussed it enough, but it has to do with Python packaging. Do you think? Oh, yeah. We haven't covered packaging much.
Starting point is 00:03:33 No. So it's good that it's finally coming up on the show. This one is interesting because it's kind of like a meta packaging tool. So we've talked about the pyproject.toml. We've talked about the pipfile.lockfile, requirements.txt, poetry, all these things, right? Well, this one that I found or was actually sent to us by Dr. Igleby on Twitter, it's called dep hell as in dependency hell. So it just comes out and says it right like there's flames on the logo come on so the idea is that it will let you work in these different modes and even automatically translate between them so one of
Starting point is 00:04:11 them is like hey we have a setup.py for managing this particular project i would rather have a pyproject.toml file that expresses the same thing and you just run a command line against that project and it'll generate the pi project.toml based on the current setup in your project oh nice yeah so if you want to say switch it to poetry you can run a command it'll switch to poetry if you want to switch it over to pip env fine you run that and it'll do it so that's one of the things it does it's pretty cool in that it'll let you's one of the things it does. It's pretty cool in that it'll let you use all these different things. It doesn't really try to replace them. But it more tries to like tie them together, right? Like I've grabbed a project, but maybe it's not my
Starting point is 00:04:53 the tooling that I like, it's pretty easy to extend is really quite nice. I like this ability to translate between them. It also has some of the features of like pipenv. So it'll create like a shell, it'll install your command line utilities into their own isolated virtual environment, like pytest could be in its own environment that has nothing to do with your project, but is available to run against your project, things like that. Yeah, nice. Yeah. So yeah, not too much else to say, I guess it's based on asyncIO, so that means it only supports modern Python. That's pretty awesome. So it's super fast.
Starting point is 00:05:30 All of its network calls and stuff are made asynchronously. And yeah, it's pretty good. I think something like this would be great for a project that has a maintainer that doesn't care about pipenv or something. It's fine with requirements or whatever they're using. But there's a couple of maintainers that really't care about PIP end for something is fine with requirements or whatever they're using. But there's a couple of maintainers that really like PIP end and want to, want to have the PIP file around and the lock file. Yeah.
Starting point is 00:05:52 That's interesting. Like you could have one of them. That's the source of truth. And then use this tool to generate the others. If you'd rather work that way. Right. Yeah. You could even do this part of a automated build,
Starting point is 00:06:03 right? Delete the PIP file and then recreate it as part of the build and then check that back in. Who knows? Yeah, you could even do it as part of an automated build, right? Delete the pip file and then recreate it as part of the build and then check that back in. Who knows? Yeah. Okay, cool. Yeah, pretty cool. This next one that you found is a bit of a rant, huh? Yes, definitely. Mike Croucher, which I'm not sure what he does.
Starting point is 00:06:17 I think he writes a lot of great articles and come across his name every once in a while, so thanks, Mike. This rant, he even says it in the name, Python rant from foo import star is bad. But basically just import star is bad. And I thought this was just done, that nobody did this anymore.
Starting point is 00:06:36 But I actually see quite a bit of code that still has this in it. And I was actually in looking for different blog posts. There's some blog posts that have great advice in Python, but the example code has import star. And I'm not going to point people to that because it's just a bad practice. And his example is, for instance, is the square root function, SQRT. If you just have like result equals square root of minus one what does that mean you don't know what it means because you don't know where that came from and there's some
Starting point is 00:07:12 really confusing examples that he's showing how it may have been from the math library it may have been from numpy or cmath or scipy or simpy all of them have the same function name. And we like namespaces. But when you use import star, you throw away the namespace ability. You just import everything into your current namespace. So don't do that. Yeah, I like the hat tip to full metal jacket. This is my rant on import star. There are many like it, but this one is mine. This one is my own. know i've i totally uh like this as well i'm a big fan of having like super explicit namespaces to like really tell where something comes from in fact i typically try to shy away from from thing import something yeah not just import star just even that it's more like import this module then module dot function module dot class you know sometimes
Starting point is 00:08:06 if it's like deeply nested i might do import thing as like just the last part of that name or but you know something to give you a hint like where the heck did this come from right and like for instance the um numpy is the convention is to import numpy as np NumPy is really long to type. I know. However, you're using a ton of it. And so, yeah, there are conventions. And if you notice the conventions, you follow those. Yeah, absolutely.
Starting point is 00:08:34 I'm with it too. So sometimes when I'm refactoring some code or trying to understand it, it's frustrating when somebody has a bunch of from library import, like five different functions. Yeah. And, you know, it's not terrible if you've got something like visual studio code or you've got pycharm and you can like go to definition or you hover over it and it'll say more but if you see it in a blog post or you see it printed or you see it in like a gist or somewhere that doesn't have like understanding of the environment then you're like okay what is that right like so just you know think. Yeah, I actually kind of wish they deprecated, but that's probably never going to happen. Yeah, probably not. I talked about digitalization at the top. Let me just
Starting point is 00:09:12 tell you about something that's new and cool. Brian, are you familiar with GitHub actions? I just heard a little bit about them. Yeah, same. I haven't really done anything with them. But it's basically GitHub actions are like a series of workflows that can be triggered. When you do like a push to a repository, you create a release or you create an issue, right? And it runs a series of actions that then you can kick off CI or do other sorts of tests. Well, DigitalOcean has come out with GitHub actions for DigitalOcean. So you can do really cool things like I would like to upgrade my Kubernetes cluster anytime I push to the release branch or the master branch in my GitHub repo
Starting point is 00:09:53 and just bake that straight into GitHub. Oh, wow. Right, just the fact of you doing a commit or the integration test passing or whatever. Whatever you want to do with the GitHub Actions, you can set that up. So there's special GitHub Actions for DigitalOcean. And yeah, check them out. Really cool stuff. They have on their blog post, they have something about working with the Kubernetes service there,
Starting point is 00:10:14 and then using the GitHub actions to sort of keep it always up to date. So check them out at pythonbytes.fm slash DigitalOcean. Get $100 free credit for new users. And I feel like GitHub actions are something I just want to learn in general. How about you? Definitely. Yeah, it seems like something that could help out with workflows. Yeah, absolutely. Cool. Well, this next one I want to talk about is not super new, but I don't know how we've gone this long really without talking about it. So Dask. So Dask is a way to natively scale Python. And when I first thought about it, I first heard about it, I thought, okay, well, Dask is like this thing that takes data science workloads and runs them on clusters.
Starting point is 00:10:52 And the reason I didn't get super excited is like, well, I don't do that much data science, and I don't have enough that require clusters to run. They're usually pretty small little graphy things or something if I'm doing any data science. However, I recently had Matthew Rocklin, who's behind Dask on TalkPython in episode 207. And we talked a lot about it. And there's actually some really cool stuff. And I think more applicability to more people than I first realized. So basically, the idea is like, Dask will take the NumPy, SciPy sort of stack
Starting point is 00:11:22 and scale it out. Right? So you have NumPy, you have Pandas, you have Scikit-learn code, all that. So there's Dask versions of like NumPy arrays and Panda data frames. So there's like Dask data frames. And what you can do is just work with those, basically the same API, but instead of working just locally,
Starting point is 00:11:40 it will work with them on the cluster. So suppose you have like three terabytes of data you need to process and it can't fit into RAM. So you can't just load it up into a NumPy array or a Panda data frame, but you can tell Dask to process and it'll share it across the cluster and do all the work and the computation
Starting point is 00:11:56 and the cross server communication that you need. Isn't that cool? That is neat. So it sounds really neat. And that workload like to me doesn't really help that much because I don't have to do a whole lot with that. I know some people that'll be super valuable for, but you can also just run Dask locally on your machine
Starting point is 00:12:10 and it'll create like a little mini cluster that runs locally and it'll use like threads and processes and whatnot. And it'll even let you process more data locally than will fit into RAM and do like, you know, lazy loading and all sorts of interesting stuff there. So pretty cool it even lets you escape the gill so you get better like parallels and even on your own computer and it runs arbitrary python code not just numpy and pandas even though that is main that's its main
Starting point is 00:12:36 use that's actually pretty darn cool yeah that's what i thought so i decided to make it one of our topics for today yeah the large file, I definitely hit that occasionally. Yeah, it's pretty sweet. And I don't really want to think about it just for special cases, but being able to use running it under Dask might just speed it up. Yeah, you basically just create like a Dask client or something
Starting point is 00:12:56 and it'll like locally create a little server cluster that'll process it all. It's pretty cool. Nice. Or run on a thread pool, something like that. Last thing I thought
Starting point is 00:13:03 that was kind of interesting, maybe do this for me. Click on the Dask thing to open it up and just go to the bottom. Notice the supported by there? Wow. Isn't that cool? Supported by the NSF, supported by NVIDIA,
Starting point is 00:13:14 supported by DARPA, Anaconda Inc., things like that. So here's a really interesting example of not just a project that's cool, but an open source project that's really supported by some neat companies or organizations. So anyway, I just kind of thought that was a cool thing that jumped out at me as well as kind of the proper support this project's getting.
Starting point is 00:13:37 Oh, I definitely need to check this out. Yeah, neat. Yeah, pretty cool. It might tie in with graphing. Yeah, actually there's some pretty graphs on the Dask website. But if you don't want to, I don't know how they're using it,
Starting point is 00:13:50 but it's possible to do animations even within Matplotlib. And I'm highlighting an article by Parul Pandey, sorry if I'm getting that name wrong, called Animations in Matplotlib. I thought we'd already covered this, but we haven't yet. Just the fact that you can do animations.
Starting point is 00:14:06 And I guess I hadn't realized when I first started working with plots in Python that Matplotlib did it. And you can do lots of different ways you can simulate or do animations within Matplotlib. And the top picture of the article is this raindrop simulation. And I could just sit and watch this for like an hour. I was thinking the same thing. It's like the equivalent of white noise, but visual.
Starting point is 00:14:34 It's just like, yeah, it really does look like raindrops hitting a little pond or a puddle or something. It's quite cool. Right. So it has these random circles that appear dark. And then as they get bigger, they get lighter, and then they eventually disappear, and that just happens all over the page. And it's pretty neat. But that's all using Matplotlib animations. And there is a link to ways to do animations and the author prefers funk animation and has a tutorial for animating a sine wave the confusing part of that to me was that the x-axis doesn't really mean anything at that point because the sine wave keeps moving right but it's a pretty small concise example of how animations work yeah it's super easy so basically you create a figure set it up in matplotlib you create an initialized function that sets using data that is changing,
Starting point is 00:15:46 you can live update those. Animating turning a 3D plot, and that's really pretty. Yeah, yeah. There's a bunch of cool graphs here. And yeah, I could see if I had stuff to graph, I would be all over this. I guess there's a third-party package called Celluloid that makes some of the animations a little bit more concise. So she gives some examples of that too. Cool. Yeah, that's a good one. All right, the last one here for us is PEP554, multiple interpreters or sub-interpreters in the standard lib. Oh, wow.
Starting point is 00:16:15 So this is kind of meta and interesting and possible. So this, I don't believe, is approved yet. This is potentially possibly coming. So I don't think it's out. Yeah, proposed in Python three, nine, and we'll just see if it's, I don't see whether it's approved or not. But you know, maybe probably in three, nine coming is this pep 554, which allows for multiple sub interpreters in the standard lib. So apparently CPython already had this capability to have multiple sub interpreters run, but it was never
Starting point is 00:16:45 exposed. Right? So deep, deep down, there's some module you could use. And here's like a public API on top of that. Okay, so why do you care about it? Well, it says, the proposal introduces the standard lib interpreters module, like import interpreters. And currently, it's provisional. And it basically exposes this core functionality of sub interpreters already provided by the c api along with new functionality here's the most important part for sharing data between interpreters so the idea is you're going to set up some kind of channel which is like a queue or a name pipe or something like that to pass data back like i can't take an object i've created in one part of my program and share it with one of these sub interpreters, I've got to like, JSON serialize
Starting point is 00:17:29 it or pickle it and then bring it back or like there's no data sharing, which is really interesting for isolation. So the main use cases of this are well, one running code in isolation, like if you want to work within your process, and you don't want to kick off another process, say with multiprocessing or something, you can still run code that you don't necessarily trust with some restrictions here because it won't have access to your memory structures or anything like that. It'll just be like a little isolated Python, but you don't have to result to multiprocessing. Oh, that's cool. So that's kind of cool. Maybe plug in systems or something like that, or maybe even incompatible versions of modules. Like maybe, you know, I've run into this with doc opt all the time for some reason, like MailChimp will only use this version, but something else requires another version,
Starting point is 00:18:15 like one's less than something and some other has to be greater than that. You know what I mean? So maybe you could run like that part of the code in one of these sub interpreters and have it run on a different version. I'm not sure. That might take a little bit of juggling. But another one that the one that stood out to me, I think is pretty interesting here is the gill. The global interpreter lock is there because, you know, basically the way people perceive it is, is it blocks parallelism. But the reason it blocks parallelism is around memory management of
Starting point is 00:18:45 shared objects, right? So any PI object you have, it has to have reference counting and whatnot to keep its memory managed. Well, these sub interpreters don't share objects. So they don't share the gill, which means you could have like true computational parallelism in your code. So they all have like their own gIL. Effectively, yes. Yeah, okay. Yeah, exactly. So it's still the GIL, but if you have a bunch of them, then it doesn't really matter.
Starting point is 00:19:09 And you don't have the overhead of multiple processes or passing data from multiple processes or all that. The case that I'm thinking about is people that have tried to write their own little IDE or even a big IDE in Python to run Python. You've got this issue that you still only have one GIL, so you've got to launch another thread, you have to have another task or something.
Starting point is 00:19:31 And this would allow something like that to be easier. Yeah, I agree. I don't know what actually would come out of this, but it looks like it has some interesting potential. And it's also interesting that it basically just formalizes what was already there. So that's pretty cool too. Yeah. Awesome. All right, well well that's it for our main topics you got any extra stuff you want to throw out there?
Starting point is 00:19:50 other than I'm just super excited about PyCon yes it's PyCon Eve I'm so excited it's going to be good I'm looking forward to seeing you and everyone in Cleveland when are we going to be around? do you remember our times? middle of the day yeah after lunch so come try to find us after Do you remember our times? Middle of the day-ish. Yeah, ish.
Starting point is 00:20:05 After lunch. So come try to find us after lunch. Yeah, we won't be at the booth all the time. We're going to be doing other events like open spaces and live recordings and other places. And maybe even attend a talk. Who knows? The times that we will be there, it should be posted on the booth. So there'll be at least three hours each day that we're doing something interesting there.
Starting point is 00:20:23 You can come by and see us. Yep. And get stickers. And stickers. And stickers. Definitely find us at PyCon. How about you? You've got some big news. Yeah, I got some big news. The big news is my iOS app is finally out after negotiating, let's call it, with the App Store folks who were better than Google Play, but still it was quite the back and forth to get everything right. So finally, the TalkPython training iOS app is out.
Starting point is 00:20:47 Check it out at training.talkpython.fm slash apps. If you install it, log in, you can get two free courses in addition to the ones you might already have. So that's pretty cool. Yep, I installed it this morning. You did? You're already on top of it? Yeah, yeah, it looks great. Right on, thanks. That's very cool.
Starting point is 00:21:02 And then lastly, you know, like I've said, our listeners are awesome. Anytime we say here's something that's kind of unique, they're like, and the five other amazing ones. So here's one called blessings. We talked about bullet and we talked about cooked input and blessings is kind of in that realm of making terminal input nicer. And this is also output. It's not exactly the same, but blessings lets you do things like bold your terminal output and move the cursor around and do all sorts of cool stuff so this is from eric rose and prasen daniel sent this over and said hey you should check this out in addition to the ones you're talking about so here it is looks pretty cool oh i have the exact use case for this in mind so yay nice what are you going to do with it i want to do like i just got
Starting point is 00:21:42 finished with uh reading the TDD by example. Yeah, I know. You would have thought I would have learned that beforehand. But, yeah, I finally read it. And one of the things is a to-do list that is bold for stuff you're working on and, like, not. Nice. Anyway, it uses that.
Starting point is 00:21:59 Super cool. Yeah, there it is. Awesome. All right. We have a joke coming in from Topher Chung. Do you want to do it or should I do it? Oh, you do it. All right. Knock, knock. Race condition. Who's there?
Starting point is 00:22:14 Knock, knock. Race condition. Who's there? Oh, perfect. All right. Well, these never get old. We could do... I'm starting to notice that the pie joke well is starting to run a little dry. We've been emptying it a lot. so people have to start sending in their jokes. That was a good one. Thank you, Topher. Yeah, thanks. All right.
Starting point is 00:22:31 Well, Brian, we shall reconvene in Cleveland. Yeah, talk to you there. All right. See you, everyone there. See you, Brian. Bye. Thank you for listening to Python Bytes. Follow the show on Twitter via at Python Bytes.
Starting point is 00:22:41 That's Python Bytes as in B-Y-T-E-S. And get the full show notes at PythonBytes. That's Python Bytes as in B-Y-T-E-S. And get the full show notes at pythonbytes.fm. If you have a news item you want featured, just visit pythonbytes.fm and send it our way. We're always on the lookout for sharing something cool. On behalf of myself and Brian Auchin, this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and colleagues.
