Python Bytes - #282 Don't Embarrass Me in Front of The Wizards

Episode Date: May 3, 2022

Topics covered in this episode: pyscript; Memray from Bloomberg; pytest-parallel; Pooch: A friend for data files; Extras; Joke. See the full show notes for this episode on the website at pythonbytes.f...m/282

Transcript
Starting point is 00:00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 282, recorded May 3rd, 2022. I'm Michael Kennedy. And I am Brian Okken. It's great to have you here, Brian. It's just us, just the two of us. Yeah, just like old times. I know, but we have our friends out in the audience, so we're not entirely alone. It's great.
Starting point is 00:00:23 So, let's kick it off. I know you have a particularly exciting announcement to cover here, so let's do it. Okay, so, PyScript. This was an announcement at PyCon US by Anaconda's CEO, Peter Wang, during a keynote. I wasn't there, but everybody was tweeting about it, so it almost felt like I was there. But I haven't seen the presentation, so I can't wait until that goes online. Are the videos for the presentations at PyCon out yet, and I just missed them? I haven't looked. Is my YouTube broken?
Starting point is 00:01:06 It should be full of this stuff. Like what's up with, is it supposed to be next day or something? I don't know. I know, I know. Anyway. I would have loved to live stream it, but I didn't see an option.
Starting point is 00:01:15 So anyway, I'm looking forward to watching this one in particular when it comes out, because this is big news. So PyScript is Python in the browser. So what does that mean? It is built on top of Pyodide, which is a port of CPython based on WebAssembly. I'm pretty sure we've covered Pyodide before.
Starting point is 00:01:32 But this is a pretty neat thing. So, pyscript.net: you go to it, and it's kind of hype. It sounds neat, and you can do Python in the browser with the py-script tags, but what does that mean? If you go down to the bottom, there's a GitHub repo that you can go look at. This is what I suggest. There's a getting started guide, but what I did is just follow this: I cloned the repo, then I went into the JavaScript area and did npm install, and then did this run dev thing. It only took me like five minutes to get this far. And one of the things that it
Starting point is 00:02:20 has is an examples folder, and you can just open this up now in your local browser on localhost. And there's all these cool demos. There's a REPL, and it's kind of like Jupyter, where you can say like x equals three, and then x, and then if I do Shift+Enter, it evaluates it. How neat is that? That's pretty neat. Oh, that's awesome. Yeah. There's a to-do app here. So: make sure you listen to our podcasts. Go buy Python Testing with pytest. We'll check that one, because we know you already bought that. And then here's an example with D3 graphics. This is neat. I don't think I've ever done this. There's an Altair example, and this is pretty fun because you click around
Starting point is 00:03:02 and it changes the above. It's like an interactive thing. This is fun. We use Altair with a project at work. So this is neat. The Mandelbrot set. So all of this code is in the repo, so you can look at the examples and see exactly how the code is done. There's an HTML file and a Python file for all of these, so you can check it out. Actually, I don't know about the Python file. It's HTML with Python embedded within the HTML code. So there isn't a separate file, but you can do imports and all this sort of stuff too. Oh, I went too far, but I wanted to bring up that there's also an article that we're going to link to in the show notes called "PyScript: Unleash the Power of Python in Your Browser."
Starting point is 00:03:46 This is by Eryk Lewinson, and it runs through it. It's a pretty interesting little quick read of what it is, if you're not familiar with WebAssembly and Pyodide. So it's nice. What do you think, Michael? So excited. I am very excited. You know, there's been progress on the WebAssembly plus Python side on several occasions that gave you a sense of what's possible, but didn't give you a thing to build with.
Starting point is 00:04:15 You know what I mean? Yeah. So, for example, Pyodide is awesome, but it's kind of like, well, if I want to sort of host a Jupyter kernel in my browser, I can kind of do that, right? The WebAssembly Python itself is great, but it doesn't specify a way to have the UI of your web page interact with Python. It's just, oh, you could execute Python over here. Well, and then what? You know what I mean?
Starting point is 00:04:38 Which is still good, but there's not something where I can have a button on there that wires up to this thing in Python, and I have this list that binds in that way, and so on. Yeah. And this looks like we might be there. One of the things they talk about on the page is not just running Python in the browser and the Python ecosystem, as you pointed out, but really importantly, two more things. Python with JavaScript: bidirectional communication between Python and JavaScript objects. Yeah. So you can wire into events on the page and other DOM type of things.
Starting point is 00:05:13 Yes. And then visual application development ties in with that: use readily available curated UI components such as buttons, containers, text boxes, and more. Oh yeah. I mean, these are just little quick examples, but I'd love to see some bigger examples of things like that, like being able to connect JavaScript interaction with stuff on the Python side. It'll be neat. Yeah, it's weird to see Python written just straight in the browser, you
Starting point is 00:05:42 know? Yeah. Like here you have angle bracket py-script and just import antigravity, antigravity.fly(), and like, wait, what? Well, this is a good example. I picked this example because it does do an import. There's a path thing you can set up, so all your code doesn't have to be in HTML. It can be in a Python file, so you can debug it there, which is where you want to debug it, and then you can import and call it within Python. This is probably more where I would use it: putting most of my code somewhere else and then... Yeah, that's what I want to see. I would want to see just Python files
Starting point is 00:06:22 and effectively a script tag for it. I mean, maybe you can't do it directly as a script tag, but you could do bracket py-script and then just import and run, right? Yeah, so I haven't looked at this before. The antigravity.py that it's bringing in is bringing in some Pyodide stuff to be able to work. I'm seeing, and this is Python code, from document, or sorry, from js import document, yeah, and setInterval. And so those are the things you do
Starting point is 00:06:54 there. Let's see, are there any callbacks? I don't see any callbacks there. Oh yeah, yeah, this setInterval has a callback, self.move, for when the JavaScript interval fires. So under fly, that is hooking into a timer there. Yeah. Timer callback. So we should check that out. So where's that? So the antigravity.
Starting point is 00:07:15 I should have done this ahead of time. The antigravity one is not linked to, but I'll just bring it up. Antigravity. Based on... Wow. Oh, my gosh. This is so amazing. People have to do this. Oh, this is cool. We all know import antigravity, and we've got to know the xkcd that comes up, but yes, having it animated, it's great. It's alive. It's not just the person who asks, how are you flying, and the person says, I'm playing with Python. That thing is alive and cruising around. I love it.
Starting point is 00:07:45 Yeah. And that's based on the callback, right? That's calling Python based on the set interval timer callback in JavaScript. Yep. Yeah. And to me, that has been the missing piece. Like, how do I wire up? It's like, great if I can just execute Python
Starting point is 00:07:59 and have, you know, like a number come out. But what I want is Vue in Python, or React. I want to build the UI in Python and just not deal with JavaScript and be able to do so many more things on the front end. I mean, this opens up stuff like progressive web apps, which could be really amazing for the Python space, right? Like I'm here in Vivaldi.
Starting point is 00:08:21 If I go to my email client, just in the browser, I can right-click and install. It gets its own app that works offline. It pulls its data down into a local DB or whatever. Theoretically, you could do this, right? You could pull down the CPython WASM. You could pull down the 5K PyScript file and then just somehow use JavaScript to Python to talk to local DBs.
Starting point is 00:08:43 I mean, what if we get ORMs in Python going, oh yeah, one of our back ends is the web browser local DB or something? I mean, this is great. I'm very excited for where this might go. Sky's the limit, right? That's what that little flying character is saying, at least. Yeah. Okay. So, well, good job, Anaconda folks, and I believe this was Fabio and crew. So really, really nice. Super psyched. How am I going to follow that one up, right? I mean, come on. I'll give it a try. No, I've got some good items. They're just not flying-around amazing, Python-in-the-browser amazing. So Bloomberg has a lot of Python going on, and Bloomberg actually
Starting point is 00:09:27 has a pretty cool tech engineering blog where they talk about some of the stuff going on at Bloomberg, right? Yeah. One of the really good articles I read from them was about how to really set up and run uWSGI in production. And it was this huge, long, deep list of, here's a bunch of flags you probably never thought about and here's why you should care about them in Python. Really good stuff. So they're back with another thing that they use that is cool, called Memray. Like memory, but Memray. It is a memory profiler for Python. So if you want to understand the performance of your application, especially around memory, here's a pretty neat tool. Now, let me just get that right out of the way before I
Starting point is 00:10:11 forget: Linux only. So if you're not using Linux, just close your ears. No, just kidding. If you're on Windows, you could just run your Python app under WSL, profile it, and then go back to running on Windows. Or if you're on Mac, just do a VM or something, right? Anyway, it only runs on Linux, but because Python is so similar across the platforms, I'm sure you could just test your code there, even if that's not the main use case. All right. So you get all these different visualizations of memory usage. It can track allocations in Python code, in native extension modules like NumPy or something like that, and even within CPython itself. So you get sort of a holistic view of the memory, which is pretty awesome. Yeah. Yeah. And it'll give you different memory reports. We'll talk about them a
Starting point is 00:10:56 little bit. And you can use it as a CLI tool, kind of like timeit or whatever. You can just say memray run my app, and then when your app exits, it's like, and here's what happened. One of the things that's super challenging about complicated applications and web apps and stuff is you want to focus on a particular scenario and there's so much overhead of like startup and other things.
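To illustrate the idea of focusing on one scenario rather than the whole process, here is a minimal sketch of a scoped memory measurement. It deliberately uses the standard library's tracemalloc as a stand-in so it runs anywhere; it is not Memray's API (Memray's documentation describes its own Tracker context manager, which is not exercised here). The names measure_allocations and build_rows are hypothetical.

```python
import tracemalloc
from contextlib import contextmanager


@contextmanager
def measure_allocations(label):
    """Hypothetical helper: record memory allocated by the enclosed block only."""
    result = {"label": label, "current": 0, "peak": 0}
    tracemalloc.start()
    try:
        yield result
    finally:
        # get_traced_memory() returns (current, peak) in bytes since start()
        result["current"], result["peak"] = tracemalloc.get_traced_memory()
        tracemalloc.stop()


def build_rows(n):
    """Stand-in for the one code path we actually care about."""
    return [str(i) * 10 for i in range(n)]


# Only the block inside the "with" is measured, so interpreter startup
# and framework imports do not drown out the number we want.
with measure_allocations("build_rows") as stats:
    rows = build_rows(50_000)

print(f"{stats['label']}: peak {stats['peak'] / 1024:.0f} KiB")
```

tracemalloc only sees allocations made through Python's allocator, so unlike the tool discussed in the episode it says nothing about native extension code.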
Starting point is 00:11:19 So for example, if I just want to profile a FastAPI API call, if I just run it up and then go hit that API, all of the infrastructure starting up Uvicorn and FastAPI and Python just dwarfs whatever that little thing is, usually. So there's also a programmatic API where, you know, you could create like a context manager. I don't know if it actually is that way, but you could certainly build it if it doesn't exist. Like, with Memray profile here, and just do a little block of code and then get an answer,
Starting point is 00:11:49 which I think is pretty neat. Alvaro asks if it accepts an entry point. I suspect you could call an entry point because you just do the run on the command prompt. So you could probably pass it over. Whatever you run, yeah. Yeah, exactly. But the problem is there's still like the startup
Starting point is 00:12:06 of just CPython itself, right? Like I always find just the imports and all that is just way more overhead than, you know, it clutters it up. Anyway, let's hit some notable features of Memray. It traces every function call as opposed to sampling it. So instead of just going every millisecond, what are you doing now? What are you doing now? Let's just record that, right? It actually exactly traces so you don't
Starting point is 00:12:30 miss any functions being called, even if they're brief. It handles native calls in C++ libraries, so the entire stack is represented in the results, which is pretty cool. That's pretty neat. Yeah, that's pretty dope. Apparently it's blazing fast. There's some kind of character. I think it's a race car there. It causes minimal slowdown in the app if you're doing Python tracing. If you do the native code stuff, it's a little bit slower, it says,
Starting point is 00:12:53 but that's optional. You get a bunch of reports. We'll see those in a minute. It works on Python threads. So you can see, I know all these people watching, but you check out the webpage. There's a little thread,
Starting point is 00:13:03 like a sewing thread emoji. Or a Twitter thread. Yeah, indeed. So it also works on native threads, like C++ threads and native extensions, which it represents as an alien plus the thread icon. I love it. Alien threads, yeah.
Starting point is 00:13:19 Yeah, yeah, yeah. So let's look over here real quick. We'll look at just, I guess, the reporting, right? I mean, the running is super simple, as I said: memray run, Python file, with arguments, or memray run -m, module, with arguments. These are the places you could put your entry point and so on.
Starting point is 00:13:35 And Dean in the audience says, we've had a Rich spotting. Okay, I haven't pulled that up yet, but very nice. So there's different ways in which you can view it. And the first one that I ran across, which is pretty interesting, if you're familiar with Glances or you want to go old school, like top or one of these things, you can run in just the terminal and get,
Starting point is 00:13:54 not rich as in top, but Rich output like Glances. You can run it in a live mode where, while it's running, it'll show you what's happening with the memory. That is so awesome. That's pretty cool. Yeah. Yeah.
Starting point is 00:14:10 Yeah. So instead of just showing you a memory graph, it's like, guess what? We're running here right now with this many allocations and, and so on. Yeah. Like that looks super neat.
Starting point is 00:14:18 And if you've got the --live, and if you've got something interactive, you can interact with it and watch the memory change then. So, yeah. Yeah, yeah. You can cycle through threads. You can sort by total memory or its own memory. That's a common thing you do in profiling: this and all the stuff it calls, or just this method itself.
Starting point is 00:14:41 Sort by allocations versus memory usage, all kinds of stuff. So that's really neat. It will track the allocations across forks, as in process, subprocess. Why would you care? Because multiprocessing. If you want to track some kind of multiprocessing memory workflow, it'll actually do that. You just do --follow-fork, and it'll aggregate the stats across the different processes. Kind of insane. Let's see if we can get down here. They have the summary reporter, which is kind of a nice,
Starting point is 00:15:09 this is probably what you would expect. Flame graphs, if I can get down here somewhere, it'll show like sort of the color and the width of these bars. It'll show you how significant it is. There's a nice tree version that'll show you the biggest 10 allocations and then a call stack sort of in and out with trees and like how much memory is being allocated in each one of those and so on
Starting point is 00:15:29 That's nice. Yeah, this is a nice app, right? A nice utility. Definitely cool. Yeah, indeed, indeed. So if you want to track down memory leaks, or you're just wondering, why is my program using so much memory, fire it up, let it run for a while, see what happens. Yeah, cool. All right, back to you, Brian. Well, I want to bring up a pytest tool. I've often used pytest-xdist for parallel runs. xdist is the one that I heard about first for running pytest in parallel. So you've got, you know, tons of unit tests maybe, and you want to just speed them up. You can throw a -n 4 or something like that at it,
Starting point is 00:16:17 and it'll launch different processes and run pytest in parallel on a bunch of them. So it cuts time down. But there's overhead. And I was recommending this to somebody on Twitter, and I think it was Bruno Oliveira who suggested a couple of alternatives, and one of them was pytest-parallel, which I know I've run across, but I haven't played with it for a while, so I tried it out. And it's actually really cool. pytest-xdist does a lot.
Starting point is 00:16:56 One of the things it does is it's not just multi-process, but it can run on actual different computers. So you can launch them on... Oh, nice. Like grid computing almost. Yeah. You can SSH into different systems and have it run in parallel. But I don't usually need that kind of power. The one thing it doesn't do is threads, so it's process-based.
Starting point is 00:17:14 And pytest-parallel does both. I'm going to go down to the examples. So you can give it a number of workers, and that's how many processes it'll spin up, or how many CPUs. Now you can also give it tests per worker, and then it'll run in multi-threading mode, and you can give it auto on both of these.
Starting point is 00:17:41 And this is extremely useful. The multi-threading is turned off by default: if you just say workers equals five or something, it won't do multi-threading, and the reason is that you need to make sure your tests are thread safe, and many are not. So I tried it on a couple of my projects. Even if they're isolated, they might not be thread safe, right? Yes. That's another level of consideration. However, there's a lot of small, especially small, not really unit,
Starting point is 00:18:15 like system tests, but a lot of unit tests are just testing a little Python code. For a lot of projects, that's a big chunk of the test load. So being able to do multi-threading is really nice. But, you know, even with just multi-processing, I tried this on a few different
Starting point is 00:18:31 projects, and I tried it on Flask, and the parallel version using pytest-parallel was like three times faster than the xdist version. There was another one that Bruno mentioned, but I think these two are really solid, xdist and parallel. So if you want to speed up your test run times, I would try both on your project and just play with them and see which one's faster. On the projects I tried, parallel was at least as fast or faster than xdist. So it's kind of nice. Yeah, it's cool. This looks great.
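The thread-safety caveat can be made concrete with a small sketch. This is not pytest-parallel itself, just plain threading standing in for several test bodies running at once in one process, which is roughly the situation the multi-threaded mode (pytest-parallel's --tests-per-worker option) puts tests in. The Counter class and exercise_counter function are hypothetical.

```python
import threading


class Counter:
    """Shared state that several tests might touch at the same time."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, the read-modify-write below can interleave
        # between threads and silently lose updates.
        with self._lock:
            self.value += 1


def exercise_counter(counter, times=10_000):
    """Hypothetical test body that mutates shared state."""
    for _ in range(times):
        counter.increment()


counter = Counter()
threads = [
    threading.Thread(target=exercise_counter, args=(counter,)) for _ in range(4)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(counter.value)
```

Tests that only touch local variables need none of this, which is why small unit tests tend to be the easy candidates for running in threads.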
Starting point is 00:19:11 I like it. And having your tests run faster is always good. Do you do anything crazy? Like, do you set up your editor to auto run tests on file change or anything like that? Sometimes. One of the things that I've always... I've done it a few times, but it
Starting point is 00:19:25 always makes me nervous. It's unnerving to me that it just keeps running. One of the things that I really like that was added to pytest not too long ago is stepwise. That's not really running it all the time, but, and this would be a handy one to run all the... So what stepwise does is: you can run all your tests in stepwise, and when you run it again, it'll start at the first failing test, because it assumes you're trying to fix something.
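As a rough illustration of that behaviour, here is a toy simulation of the stepwise idea: stop at the first failure, remember it, and resume from that test on the next run. This is not how pytest implements its --stepwise option; run_stepwise and the tiny suite below are hypothetical.

```python
def run_stepwise(tests, last_failed=None):
    """Run tests in order, stepwise-style.

    tests: list of (name, callable) pairs; raising AssertionError means failure.
    last_failed: name of the test that failed on the previous run, if any.
    Returns (names_run, failed_name_or_None).
    """
    names = [name for name, _ in tests]
    # Resume from the remembered failure, or from the top if there is none.
    start = names.index(last_failed) if last_failed in names else 0
    ran = []
    for name, func in tests[start:]:
        ran.append(name)
        try:
            func()
        except AssertionError:
            return ran, name  # stop at the first failure and remember it
    return ran, None


def passing():
    pass


def broken():
    assert 1 + 1 == 3


suite = [("test_a", passing), ("test_b", broken), ("test_c", passing)]
first = run_stepwise(suite)
second = run_stepwise(suite, last_failed=first[1])

# After the bug is fixed, the run resumes at test_b and carries on.
fixed_suite = [("test_a", passing), ("test_b", passing), ("test_c", passing)]
third = run_stepwise(fixed_suite, last_failed=second[1])
print(first, second, third)
```

With real pytest the remembered failure lives in its cache between runs, so there is nothing to pass around by hand.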
Starting point is 00:19:56 It'll start at that and then run until it finds a failure. So if you haven't fixed this first failure, it'll just keep running that one until you've fixed it, and then it'll go to the next one. And so I do that a lot while I'm trying to debug something. That's cool. And hooking that up with like a watch feature, there's a bunch of ways you can watch your code too, to do that. Yeah, it's fun. Nice. Very cool. So let's do
Starting point is 00:20:21 some real-time follow-up here. First, Alvaro is being all mischievous, asking, I wonder what would happen if I install both plugins, both xdist and parallel. You could. I don't know if you can run them at the same time. I should try. On the Flask one, I installed both of them and then tried them both, but not at the same time. I'll have to try that. Fork the forks. It's going to go so fast. And then just going back to PyScript, there's like tons of excitement about PyScript. Yeah.
Starting point is 00:20:50 JL's excited. Brandon's excited. And David says, I hope someday I can say, back in my day, you couldn't just learn Python. You had to learn JavaScript too. Yeah. Indeed, indeed. Let's see.
Starting point is 00:21:07 So I got one more to cover that is going to be fun as well. And this one comes to us from former guest co-host, Michael Feickert, sorry, Matthew Feickert. And Matthew is a great supporter of the show. He sends all sorts of interesting things in to help us out, and good ideas. And this is yet another one coming from the data science side of things, saying, you know, one of the things you have to do often in, say, a Jupyter notebook is go download a file off of an API or just some link or S3 bucket or whatever,
Starting point is 00:21:36 and you want to process it. And if you use requests, wow, great. You end up making the request, verifying that it worked, reading the stream into bytes, writing the bytes to a file, picking a file name and then using that file name to open it and then say, now you can process it. Right. Yeah.
Starting point is 00:21:53 So there's this thing called pooch, a friend to fetch your data. All right, pooch, go get my files. Like a little friendly dog that also seems to hold a snake in its mouth. So that's pretty cool. Anyway, who wouldn't want a dog that can wrangle snakes to go help you with your notebooks? Anyway, the idea is you can do all of what I described with requests. You can do that in one line of code. Oh, wow.
Starting point is 00:22:17 Yeah. And you get other cool features as well. So it says, look, you can just make this one function call and it'll save it. And it'll also cache your files locally. So some of these files that data scientists especially work with are massive, right? You know, it's like a gig. And every time you run the notebook, you don't want it to download the gig again. You just want it to run more quickly. So you can set up a location for it to cache it. You can pass in a hash of the file to say, I want to get this file and I expect
Starting point is 00:22:44 it to be this MD5 or whatever the heck the hash is that they're using, so that you can be sure it doesn't change, right? So if you're doing like reproducible data science, you say, what you do is you download this file, then you apply this algorithm, then you get this picture. Well, if the data changes, I bet the picture changes, right? And so you can put in a layer of verification that it's unchanged from the last time you decided what it should be. That's pretty cool. You can do multiple protocols. So not just HTTP, HTTPS, but FTP. Oh my gosh. SFTP. Oh yeah. What else? Basic auth. It'll also automatically resolve DOIs, digital object identifiers, which are used in places like Figshare and Zenodo.
Starting point is 00:23:28 And this is about the reproducible science. Like here's the file and like we've been assigned an immutable ID that we can always refer back to it. So you can just say, here's the ID and it'll actually get the file and it'll even unzip and decompress files upon download. Neat.
Starting point is 00:23:41 Pretty neat, huh? Yeah. Yeah. Pretty straightforward. Let me see if I can find an example. I like the section on learning about it. It's called Training Your Pooch. That's cute. Oh nice, I love it. Apparently it has progress bars, post-download actions, logging, and you can get multiple files, but the main use case is just file equals pooch.retrieve, URL, done. That seems pretty nice.
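To show what that one call saves you, here is a minimal sketch of the download-once, cache, and verify-the-hash workflow written by hand with the standard library. It is not Pooch's implementation and does not import Pooch; fetch_cached and sha256_of are hypothetical helpers, and the choice of SHA-256 is an assumption made for the example.

```python
import hashlib
import shutil
import urllib.request
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_cached(url, cache_dir, fname, known_hash=None):
    """Download url to cache_dir/fname unless a valid copy is already cached."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / fname

    # Reuse the cached copy when it exists and, if a hash was given, still matches.
    if target.exists() and (known_hash is None or sha256_of(target) == known_hash):
        return target

    with urllib.request.urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)

    # Refuse to hand back data that differs from what the caller expected.
    if known_hash is not None and sha256_of(target) != known_hash:
        target.unlink()
        raise ValueError(f"hash mismatch for {url}")
    return target
```

A second call with the same arguments returns the cached path without touching the network, which is the behaviour that matters when the file is a gigabyte and the notebook is re-run often.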
Starting point is 00:24:08 Yeah, that's great. It's my data. Here it is. Oh, cool. So Pamphile Roy out in the audience says, hey folks, funny, we're adding this to SciPy, optional, to have a SciPy datasets submodule. Scikit-image is using this as well.
Starting point is 00:24:22 I had no idea. Very cool. Thanks for that extra background there. Cool. Yeah. But I think this is great. In fact, I know it sells itself. It bills itself as being for data science. I also like to download files sometimes and not go through five or six lines of code. I could use this. Yeah. Yeah. There's a lot of stuff that data science people are doing that we can use in lots of other fields. So indeed, I do think that's actually one of the really interesting aspects of Python is we have so many people from these different areas that it's not just all, you know,
Starting point is 00:24:53 CS grads doing the same thing. Yeah. Yeah, for sure. All right. Well, those are my items for today, Brian.
Starting point is 00:25:01 Nice. I don't have any extras today. Do you have any extras today. Do you have any extra stuff? I do. I do have extras. So this one I'm very, very excited about. I have a new course that I just released called Up and Running with Git,
Starting point is 00:25:18 a Pragmatic UI-Based Introduction. So I'm really excited. I just released it. I haven't really even announced it yet, but I finished getting it all public and online and turned all the GitHub repos public and all that stuff right before we jumped on the call today. And the idea is there are tons of Git courses. So why create a Git course? Well, I feel like so many of them are just like, okay, we're just going to work in the terminal or the command prompt. And you're just going to assume
Starting point is 00:25:43 that, like, that's the world of Git that you live in. Like kind of a least common denominator approach. And while that is useful, I don't think that's how most people are working, right? If you're in Visual Studio Code or PyCharm, there's great hotkeys just to do the Git stuff and see the history and whatnot. And there's other tools like Sourcetree and Tower and others.
Starting point is 00:26:01 So it kind of takes this approach of, well, let's take all the modern tools that give you the best visibility and teach you Git with that. So super fun. Which GUI tools are you using then? Which ones are you showing? Visual Studio Code.
Starting point is 00:26:14 Okay. PyCharm. Sourcetree. Okay. Those are the things. And so I've done a lot of work. I've tried to take some of my experience from doing some work on YouTube
Starting point is 00:26:22 where I was experimenting with like setup and presentations and stuff. And I think I have a really neat, polished experience for this course with like lots of cool visuals and graphics and video and stuff. So hopefully people really enjoy it. Anyway, this is my extra. I just sent this out to the world. I'm pretty excited about this. Nice. Congrats. Yeah. Thanks. Thanks so much. You have no extras. Does that mean you're ready for some humor? Yes. Always.
Starting point is 00:26:48 All right. This one, I chose this, honestly, I just chose it just because of the title. So there's Robert, is this Robert Downey Jr. looking at somebody in like some kind of wizard situation, right? Yeah. This is like Endgame or something. Okay. Yeah. I don't know the movie. Apparently I stopped watching movies at some point,
Starting point is 00:27:09 and now I'm out of touch. So anyway, the title is When Your Code Stopped Working During an Interview, or it could be a demo presentation or whatever. You want to tell us what this is about, what's going on here? So he's looking back at Bruce Banner, who's the Hulk, and says, dude, you're embarrassing me in front of the wizards. Yeah, because Banner wasn't
Starting point is 00:27:31 able to become the Hulk at the time. Dude, don't embarrass me in front of the wizards. I just love to think of programmers as kind of like the modern day wizards. We can think of things and then, poof, they kind of come into existence. Yeah. It's good. And also while working on that Git course, I had this pretty fun experience. Like right while I was recording it. Nice.
Starting point is 00:27:54 And I'm just sitting there and then... GitHub was down. How often does GitHub itself go down? But no, oh no, there's the Octocat falling with a 500 sign in its hands. Which of course made me redo that section of the course. Yeah. I like the expression on your face for that. It's like.
Starting point is 00:28:13 Yes, exactly. People seem to really like that tweet. I'll put it in the show notes. People can check it out. Anyway, dude, don't embarrass me in front of the wizards. That's what I got for you. Yeah. Good.
Starting point is 00:28:24 Good. Good. Well, thanks. Thanks a lot again. It's a great show. Yeah, sure was. Thanks. Thanks, Brian. Thanks for everyone who came. Bye.
