Python Bytes - #282 Don't Embarrass Me in Front of The Wizards
Episode Date: May 3, 2022
Topics covered in this episode: pyscript, Memray from Bloomberg, pytest-parallel, Pooch: A friend for data files, Extras, Joke
See the full show notes for this episode on the website at pythonbytes.f...m/282
Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.
This is episode 282, recorded May 3rd, 2022.
I'm Michael Kennedy.
And I am Brian Okken.
It's great to have you here, Brian. It's just us, just the two of us.
Yeah, just like old times.
I know, but we have our friends out in the audience, so we're not entirely alone.
It's great.
So, let's kick it off. I know you have a particularly exciting announcement to cover here, so let's go do it.
Okay, so, PyScript. This was an announcement at PyCon US by Anaconda's CEO, Peter Wang, during a keynote. I wasn't there, but everybody was tweeting about it, so it almost felt like I was there. But I haven't seen the presentation, so I can't wait until that goes online. I have not seen the videos for the presentations at PyCon out yet. Are they out yet and I just missed it?
I haven't looked. Is my YouTube broken?
It should be full of this stuff.
Like what's up with,
is it supposed to be next day or something?
I don't know.
I know, I know.
I would have loved to live stream it,
but I didn't see an option.
So anyway,
I'm looking forward to watching this one
in particular when it comes out,
because this is big news.
So PyScript is Python in the browser.
So what does that mean?
It is built on top of Pyodide, which is a port of CPython based on WebAssembly.
I'm pretty sure we've covered Pyodide before.
But so this is a pretty neat thing.
So you go to the site, and it's kind of hype: it sounds neat, and you can do Python in the browser with the py-script tags. But what does that mean?
If you go down to the bottom, there's a GitHub repo that you can go look at, and this is what I suggest. There's a getting started guide, but what I did is I cloned the repo, went into the JavaScript area, did npm install, and then did this npm run dev thing. It only took me about five minutes to get this far.
One of the things it has is an examples folder, and you can just open this up in your local browser on localhost. And there are all these cool demos. There's a REPL, kind of like Jupyter, where you can say x = 3, and then x, and then if I do Shift+Enter, it evaluates it. How neat is that?
That's pretty neat. Oh, that's awesome.
Yeah. There's a to-do app here: make sure you listen to our podcasts, go buy Python Testing with pytest. We'll check that one, because we know you already bought that.
And then here's an example with D3 graphics. This is neat. I don't think I've ever done this. There's an Altair example, and this is pretty fun because you click around and it changes the chart above. It's an interactive thing. We use Altair with a project at work, so this is neat. The Mandelbrot set.
All of this code is in the repo, so you can look at the examples and see exactly how the code is done. There's an HTML file and a Python file for all of these, so you can check it out. Actually, I don't know about the Python file. It's HTML with the Python embedded within the HTML code. So there isn't a separate file, but you can do imports and all this sort of stuff too.
Oh, I went too far, but I wanted to bring up that there's also an article that we're going to link to in the show notes called "PyScript: Unleash the Power of Python in Your Browser."
This is by Eryk Lewinson, and it runs through it.
It's a pretty interesting little quick read of what it is if you're not familiar with
WebAssembly and Pyodide.
So it's nice.
What do you think, Michael?
So excited.
I am very excited.
You know, there's been progress on the WebAssembly plus Python side on several occasions that gave you a sense of what's possible, but didn't give you a thing to build with.
You know what I mean?
So, for example, Pyodide is awesome, but it's kind of like, well, if I want to sort of host a Jupyter kernel in my browser, I can kind of do that.
The WebAssembly Python itself is great, but it doesn't specify a way to have the UI of your web page interact with Python.
It's just, oh, you could execute Python over here.
Well, and then what?
You know what I mean?
Which is still good, but there's not something where I can have a button on there that wires up to this thing in Python, and I have this list that binds in that way, and so on.
And this looks like we might be there.
Like one of the things they talk about on the page is not just running Python in the browser and the Python ecosystem, as you pointed out.
But really importantly, two more things.
Python with JavaScript, bidirectional communication between Python and JavaScript objects.
So you can wire into like events on the page and other DOM type of things.
And then visual application development ties in with that: use readily available,
curated UI components such as buttons, containers, text boxes, and more.
Oh yeah.
Yeah. I mean, these are just little quick examples, but I'd love to see some bigger examples of things like that, like JavaScript interaction with stuff on the Python side. It'll be neat.
Yeah. It's weird to see Python written just straight in the browser, you know? Like here you have angle bracket py-script, and just import antigravity, antigravity.fly(), and like, wait, what?
Well, so this is a good example. I picked this example because, for one, it does do an import. There's a path thing you can set up, so all your code doesn't have to be in HTML. It can be in a Python file, so you can debug it there, which is where you want to debug it, and then you can import and call it within Python. This is probably more where I would use it: putting most of my code somewhere else and then...
Yeah, that's what I want to see. I would want to see just Python files and effectively a script tag for it.
I mean, maybe you can't do it directly as a script tag,
but you could do bracket py-script and then just import and run, right?
Yeah, so I haven't looked at this before.
So what it's bringing in is some Pyodide stuff to be able to work.
I'm seeing some, this is Python code: from js import document, and setInterval. So those are the things you do there.
Let's see, are there any callbacks? I don't see any callbacks there.
Oh yeah, this setInterval has a callback: self.move runs when the JavaScript interval fires.
So under fly, that is hooking into a timer there.
Timer callback.
So we should check that out.
So where's that?
So the antigravity.
I should have done this ahead of time.
The antigravity one is not linked to, but I'll just bring it up.
Based on.
Wow. Oh, my gosh. This is so amazing. People have to do this.
Oh, this is cool. We all know import antigravity, and we've got to know the XKCD that comes up. But yes, having it animated, it's great. It's alive. It's not just the person who says, "How are you flying?" and the other person says, "I'm playing with Python." That thing is alive and cruising around. I love it.
And that's based on the callback, right?
That's calling Python based on the set interval
timer callback in JavaScript.
Yep. Yeah.
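The wiring described here, where a Python method is handed to a JavaScript timer and then called on every tick, can be sketched in plain Python. This is only an illustration of the pattern, not the code from the PyScript example: `FakeTimer` is a hypothetical stand-in for the browser's `setInterval` so the sketch runs outside a browser, and `Flyer`, `move`, and `fly` are illustrative names.

```python
class FakeTimer:
    """Hypothetical stand-in for the browser's setInterval."""

    def __init__(self):
        self.callbacks = []

    def set_interval(self, callback, milliseconds):
        # A real browser would call `callback` every `milliseconds`;
        # here we only remember it and fire it manually via tick().
        self.callbacks.append(callback)
        return len(self.callbacks) - 1

    def tick(self):
        for callback in self.callbacks:
            callback()


class Flyer:
    def __init__(self, timer):
        self.timer = timer
        self.x = 0

    def move(self):
        # Called once per timer tick; nudges the character to the right.
        self.x += 5

    def fly(self):
        # Hand the bound method to the timer, as the example does with self.move.
        self.timer.set_interval(self.move, 100)


timer = FakeTimer()
flyer = Flyer(timer)
flyer.fly()
for _ in range(3):
    timer.tick()
print(flyer.x)  # prints 15
```

In the browser, the timer would come from the page itself (the discussion mentions `from js import document` and `setInterval`), so the Python side only has to supply the callback.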
And to me, that has been the missing piece.
Like, how do I wire up?
It's like, great if I can just execute Python
and have, you know, like a number come out.
But what I want is Vue in Python, or React.
I want to build the UI in Python
and just not deal with JavaScript
and be able to do so many more things on the front end.
I mean, this opens up stuff like progressive web apps,
which could be really amazing for the Python space, right?
Like I'm here in Vivaldi.
If I go to my email client,
just in the browser, I can right-click and install.
It gets its own app that works offline.
It pulls its data down into local DB or whatever.
Theoretically, you could do this, right?
You could pull down the CPython WASM.
You could pull down the 5K PyScript file and then just somehow use JavaScript to Python
to talk to local DBs.
I mean, what if we get, like, ORMs in Python going, "Oh yeah, one of our back ends is the web browser's local DB," or something?
I mean, this is great. I'm very excited for where this might go.
Sky's the limit, right? That's what that little flying character is saying, at least.
Yeah. Okay, so, well, good job, Anaconda folks. I believe this was Fabio and crew. So really, really nice. I was super psyched.
How am I going to follow that one up, right? I mean, come on.
I'll give it a try. No, I've got some good items. They're just not flying-around amazing, Python-in-the-browser amazing.
So Bloomberg has a lot of Python going on, and Bloomberg actually has a pretty cool tech engineering blog where they talk about some of the stuff going on at Bloomberg, right?
Yeah.
One of the really good articles I read from them was about how to really set up and run uWSGI in production. It was this huge, long, deep list of, here's a bunch of flags you probably never thought about, and here's why you should care about them in Python. Really good stuff.
So they're back with another thing that they use that is cool, called Memray. Like memory, but Memray.
It is a memory profiler for Python. So if you want to understand the performance of your application, especially
around memory, here's a pretty neat tool. Now, let me just get that right out of the way before I forget: Linux only. So if you're not using Linux, just close your ears. No, just kidding. If you're on Windows, you could just run your Python app under WSL, profile it, and then go back to running on Windows. Or if you're on Mac, just do a VM or something, right? Anyway, it only runs on Linux, but because Python is so similar across the platforms, I'm sure you could just test your code there, even if that's not the main use case.
All right. So you get all these different visualizations of memory usage. It can track allocations in Python code, in native extension modules like NumPy or something like that, and even within CPython itself. So you get sort of a holistic view of the memory, which is pretty awesome.
Yeah.
Yeah. And it'll give you different memory reports. We'll talk about them a little bit. And you can use it as a CLI tool, kind of like timeit or whatever. You can just say memray run my app. And then when your app exits, it's like,
and here's what happened.
One of the things that's super challenging
about complicated applications and web apps and stuff
is you want to focus on a particular scenario
and there's so much overhead of like startup
and other things.
So for example, if I just want to profile
a FastAPI API call, if I just say run it, and then I go hit that API, all of the infrastructure starting up, Uvicorn and FastAPI and Python, usually just dwarfs whatever that little thing is.
So there's also a programmable API, so you could create a context manager. I don't know if it actually is that way,
but you could certainly build it if it doesn't exist. Like, "with Memray, profile here,"
and just do a little block of code
and then get an answer,
which I think is pretty neat.
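Memray does ship a context manager along these lines, `memray.Tracker`. Below is a minimal sketch of profiling just one block of code, assuming Memray is installed on a supported platform. `build_payload` and the output file name are hypothetical stand-ins for the work behind a single API call, and the `nullcontext` fallback is there only so the sketch still runs where Memray is missing.

```python
import contextlib
import os
import tempfile

try:
    import memray  # third-party; described in the episode as Linux-only
except ImportError:
    memray = None


def tracked(output_path):
    """Record allocations to output_path when Memray is available."""
    if memray is None:
        # Fallback for this sketch only: run the block without profiling.
        return contextlib.nullcontext()
    return memray.Tracker(output_path)


def build_payload(count):
    # Hypothetical stand-in for the body of one API call.
    return [str(index) * 10 for index in range(count)]


# Tracker refuses to overwrite an existing file, so write into a fresh directory.
output_path = os.path.join(tempfile.mkdtemp(), "api_call.bin")

with tracked(output_path):
    payload = build_payload(10_000)
```

Only the allocations made inside the `with` block are recorded, which sidesteps the startup noise mentioned above. The resulting file can then be passed to one of the reporters, for example `memray flamegraph api_call.bin`.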
Alvaro asks if it accepts an entry point.
I suspect you could call an entry point
because you just do the run on the command prompt.
So you could probably pass it over.
Whatever you run, yeah.
Yeah, exactly.
But the problem is there's still like the startup
of just CPython itself, right?
Like I always find just the imports
and all that is just way more overhead
than, you know, it clutters it up.
Anyway, let's hit some notable features of Memray.
It traces every function call as opposed to sampling it.
So instead of just going every millisecond, what are you doing now?
What are you doing now? Let's just record that, right? It actually traces exactly, so you don't
miss any functions being called, even if they're brief. It handles native calls in C++ libraries, so
the entire stack is represented in the results, which is pretty cool. That's pretty neat. Yeah,
that's pretty dope. Apparently it's blazing fast. There's some kind of character.
I think it's a race car there.
It causes minimal slowdown in the app
if you're doing Python tracing.
If you do the native code stuff,
it's a little bit slower, it says,
but that's optional.
You get a bunch of reports.
We'll see those in a minute.
It works on Python threads.
So you can see,
I know all these people watching,
but you check out the webpage.
There's a little thread,
like a sewing thread emoji.
Or a Twitter thread.
Yeah, indeed.
So it also works on native threads,
like C++ threads and native extensions,
which it represents as an alien plus the thread icon.
I love it.
Alien threads, yeah.
Yeah, yeah, yeah.
So let's look over here real quick.
We'll look at just, I guess the reporting, right?
I mean, the running is super simple, as I said.
memray run, Python file, with arguments,
or memray run -m, module, with arguments.
These are the places you could put your entry point
and so on.
And Dean in the audience says,
we've had a Rich spotting.
Okay, I haven't pulled that up yet, but very nice.
So there's different ways in which you can view it.
And the first one that I ran across, which is pretty interesting,
if you're familiar with Glances, or you want to go old school,
like top or one of these things,
is that you can run it in just the terminal and get,
not really like top,
but rich output like Glances.
You can run it in a live mode where, while it's running,
it'll show you what's happening with the memory.
That is so awesome.
That's pretty cool.
So instead of just showing you a memory graph,
it's like,
guess what?
We're running here right now with this many allocations and,
and so on.
Like that looks super neat.
And you've got the --live flag,
and if you've got something interactive,
you can interact with it and watch the memory change then.
Yeah, yeah.
You can cycle through threads.
You can sort by total memory or its own memory.
That's a common thing you do in profiling: this and all the stuff it calls, or just this method itself.
Sort by allocations versus memory usage, all kinds of stuff.
So that's really neat.
It will track the allocations across forks,
as in process, subprocess. Why would you care? Because multiprocessing. If you want to track some kind of multiprocessing memory workflow, it'll actually do that. You just do --follow-fork,
and it'll aggregate the stats across the different processes. Kind of insane.
Let's see if we can get down here.
You can do, they have the summary reporter,
which is kind of a nice just,
this is probably what you would expect.
Flame graphs, if I can get down here somewhere,
it'll show like sort of the color
and the width of these bars.
It'll show you how significant it is.
There's a nice tree version
that'll show you the biggest 10 allocations
and then a call stack sort of in and out with trees, and how much memory is being allocated in each one of those, and so on.
That's nice.
Yeah, this is a nice app, right? Nice utility.
Definitely cool.
Yeah, indeed. So if you want to track down memory leaks, or you're just wondering why your program is using so much memory, fire it up, let it run for a while, see what happens.
Yeah, cool.
All right, back to you, Brian.
Well, I want to bring up a pytest tool. I've often used pytest-xdist for running tests in parallel. xdist is the one that I heard about first for running pytest in parallel. So you've got tons of unit tests, maybe, and you want to just speed them up. You can throw a -n 4 or something like that at it, and it'll launch different processes and run pytest in parallel on a bunch of them.
So it cuts time down.
But there's overhead.
And I was recommending this to somebody on Twitter,
and I think it was Bruno Oliveira who suggested a couple of alternatives, and one of them was pytest-parallel, which I know I've run across,
but I haven't played with it for a while, so I tried it out.
And it's actually really cool.
So pytest-xdist does a lot.
One of the things it does is it's not just multiprocess; it can run on actual different computers.
So you can launch them on.
Oh, nice.
Like grid computing almost.
You can SSH into different systems and have it run in parallel.
But I don't usually need that kind of power.
The one thing it doesn't do is threads, so it's process-based.
And pytest-parallel does both.
I'm going to go down to the examples.
So you can give it a number of workers, and that's how many processes it'll spin up, or how many CPUs.
Now you can also give it tests per worker,
and then it'll run in multithreading mode,
and you can give it auto on both of these.
And this is extremely useful. By default, the threading is turned off. If you just say workers equals five or something, it won't do multithreading. And the reason is that you need to make sure your tests are thread safe, and many are not. So I tried it on a couple of my projects.
Even if they're isolated, they might not be thread safe, right?
That's another level of consideration.
However, there are a lot of small tests.
Not really system tests,
but a lot of unit tests
are just testing a little Python code,
and for a lot of projects,
that's a big chunk of the test load.
So being able to do multithreading
is really nice. But even with just multiprocessing, I tried this on a few different
projects. I tried it on Flask, and the parallel version using
pytest-parallel was like three times faster than the xdist version.
There was another one that Bruno mentioned, but I think these two are really solid, xdist and
parallel. So if you want to speed up your test run times, I would try both on your project and just
play with them and see which one's faster. On the projects I tried, parallel was at least as fast or faster than xdist.
So it's kind of nice.
Yeah, it's cool.
This looks great.
I like it.
And having your test run faster is always good.
Do you do anything crazy?
Like, do you set up your editor to auto run tests
on file change or anything like that?
One of the things that I've always...
I've done it a few times, but it
always makes me nervous. It's unnerving to me that it just keeps running.
One of the things that I really like that was added to pytest not too long ago
is stepwise. That's not really running it all the time, but it
would be a handy one to run that way. So what stepwise does is,
you can run all your tests in stepwise
and when you run it again,
it'll start at the first failing test
because it assumes you're trying to fix something.
It'll start at that
and then run until it finds a failure.
So if you haven't fixed this first failure,
it'll just keep running that one
until you fixed it
and then it'll go to the next one. And so I do that a lot while I'm trying to debug something.
That's cool.
And hooking that up with a watch feature, there's a bunch of ways you can watch your code too, to do that.
Yeah, it's fun.
Nice, very cool. So let's do some real-time follow-up here. First, Alvaro is being all mischievous, asking, "I wonder what would happen if I install both plugins, both xdist and parallel?"
You could. I don't know if you can run them at the same time. I should try. On the Flask one, I installed both of them and then tried them both, but not at the same time. I'll have to try that.
Fork the forks. It's going to go so fast.
And then just going back to PyScript,
there's like tons of excitement about PyScript.
JL's excited.
Brandon's excited.
And David says, I hope someday I can say,
back in my day, you couldn't just learn Python.
You had to learn JavaScript too.
Indeed, indeed.
Let's see.
So I got one more to cover that is going to be fun as well. And this one comes to us from former guest co-host, Michael Feickert, sorry, Matthew Feickert. And Matthew is a
great supporter of the show. He sends all sorts of interesting things in to help us out and good
ideas. And this is yet another one coming from the data science side of things,
saying, you know,
one of the things you have to do often
in say a Jupyter notebook
is go download a file off of an API
or just some link or S3 bucket or whatever,
and you want to process it.
And if you use requests,
wow, great.
You end up making the request,
verifying that it worked,
reading the stream into bytes, writing the bytes to a file, picking a file name and then using that file name to open it and then say, now you can process it.
So there's this thing called pooch, a friend to fetch your data.
All right, pooch, go get my files.
Like a little friendly dog that also seems to hold a snake in its mouth.
So that's pretty cool.
Anyway, who wouldn't want a dog that can wrangle snakes to go help you with your notebooks?
Anyway, the idea is you can do all of what I described with requests.
You can do that in one line of code.
Oh, wow.
And you get other cool features as well.
So it says, look, you can just make this one function call and it'll save it.
And it'll also
cache your files locally. So some of these files that data scientists especially work with are
massive, right? You know, it's like a gig. And every time you run the notebook, you don't want
it to download the gig again. You just want it to run more quickly. So you can set up a location
for it to cache it. You can pass in a hash of the file to say, I want to get this file and I expect
it to be this MD5 or
whatever the heck the hash is that they're using so that you can be sure it doesn't change, right?
So if you're doing like reproducible data science, you say, what you do is you download this file,
then you apply this algorithm, then you get this picture. Well, if the data changes, I bet the
picture changes, right? And so you can put in a layer of verification that it's unchanged from the last
time you decided what it should be. That's pretty cool. You can do multiple protocols. So not just
HTTP and HTTPS, but FTP, oh my gosh, SFTP. What else? Basic auth. It'll also automatically
resolve DOIs, digital object identifiers, which are used in places like Figshare and Zenodo.
And this is about the reproducible science.
Like here's the file
and like we've been assigned an immutable ID
that we can always refer back to it.
So you can just say, here's the ID
and it'll actually get the file
and it'll even unzip and decompress files upon download.
Pretty neat, huh?
Pretty straightforward.
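The hash check is the part that makes a download reproducible. Here is a minimal standard-library sketch of the idea, not Pooch's actual implementation: `sha256_of` and `verify` are hypothetical helpers, and the temporary file stands in for a downloaded dataset.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large datasets never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hash):
    """Return path if its contents match expected_hash, otherwise raise."""
    actual_hash = sha256_of(path)
    if actual_hash != expected_hash:
        raise ValueError(
            f"{path} changed: expected {expected_hash}, got {actual_hash}"
        )
    return path


# Stand-in for a downloaded data file.
data_path = os.path.join(tempfile.mkdtemp(), "data.csv")
with open(data_path, "wb") as handle:
    handle.write(b"x,y\n1,2\n")

# The hash you record when you first decide what the data should be.
recorded_hash = sha256_of(data_path)

# Later runs pass only while the file is byte-for-byte unchanged.
verified_path = verify(data_path, recorded_hash)
```

If the upstream file is ever edited, `verify` raises instead of silently feeding different data into the analysis.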
Let me see if I can find an example of... I like the section on learning about it. It's called
"Training your Pooch."
That's cute.
Oh, nice. I love it. Apparently it has progress bars, post-download
actions, logging, and you can get multiple files. But the main use case is just file equals pooch.retrieve, URL, done.
That seems pretty nice.
Yeah, that's great.
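A hedged sketch of that one-line use case. `pooch.retrieve` is Pooch's function for fetching a single file. The `fetch` wrapper and its parameter names are hypothetical, and the import sits inside the function so the sketch can be loaded even where Pooch is not installed.

```python
def fetch(url, known_hash=None, cache_dir=None):
    """Download url once, cache it locally, and return the local file path."""
    import pooch  # third-party: pip install pooch

    return pooch.retrieve(
        url=url,
        # Something like "sha256:<hex digest>" or "md5:<hex digest>";
        # None skips verification.
        known_hash=known_hash,
        # None lets Pooch choose its default cache folder.
        path=cache_dir,
    )
```

A second call with the same URL and hash should return the cached copy instead of downloading again. Passing a `processor` such as `pooch.Unzip()` to `pooch.retrieve` is how the unzip-on-download behavior is requested.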
It's my data.
Here it is.
Oh, cool.
So Pamphile Roy out in the audience says,
"Hey folks, funny, we're adding this to SciPy as an optional dependency
to have a scipy.datasets submodule.
scikit-image is using this as well."
I had no idea.
Very cool.
Thanks for that extra background there.
Cool. Yeah.
But I think this is great. In fact, I know it bills itself as being for data science. I also like to download files sometimes and not go through five or six lines of code. I could use this.
Yeah. There's a lot of stuff that data science people are doing that we can use in lots of other fields.
Indeed. I do think that's actually one of the really interesting aspects of Python: we have so many people from these different areas that it's not just all,
you know,
CS grads doing the same thing.
For sure.
All right,
those are my items for today.
I don't have any extras today.
Do you have any extra stuff?
I do.
I do have extras.
So this one I'm very, very excited about.
I have a new course that I just released called Up and Running with Git:
A Pragmatic, UI-Based Introduction.
So I'm really excited.
I just released it.
I haven't really even announced it yet,
but I finished getting it all public and online and turned all the GitHub repos public and all
that stuff right before we jumped on the call today. And the idea is there are tons of Git
courses. So why create a Git course? Well, I feel like so many of them are just like, okay,
we're just going to work in the terminal or the command prompt. And you're just going to assume
that like, that's the world of Git that you live in.
Like kind of a least common denominator approach.
And while that is useful,
like I don't think that's how most people are working, right?
If you're in Visual Studio Code or PyCharm,
like there's great hotkeys just to do the Git stuff
and see the history and whatnot.
And there are other tools like SourceTree and Tower and others.
So it kind of takes this approach of like,
well, let's take all the modern tools
that give you the best visibility
and teach you Git with that.
So super fun.
Which GUI tools are you using then?
Which ones are you showing?
Visual Studio Code.
Those are the things.
And so I've done a lot of work.
I've tried to take some of my experience
from doing some work on YouTube
where I was experimenting with like setup and
presentations and stuff. And I think I have a really neat, polished experience for this course
with like lots of cool visuals and graphics and video and stuff. So hopefully people really enjoy
it. Anyway, this is my extra. I just sent this out to the world. I'm pretty excited about this.
Nice. Congrats. Yeah. Thanks. Thanks so much. You have no extras.
Does that mean you're ready for some humor?
All right.
This one, I chose this, honestly, I just chose it just because of the title.
So there's Robert, is this Robert Downey Jr. looking at somebody in like some kind of wizard
situation, right?
This is like Endgame or something.
Okay. Yeah. I don't know the movie.
Apparently I stopped watching movies at some point,
and now I'm out of touch.
So anyway, the title is
When Your Code Stopped Working During an Interview,
or it could be a demo presentation or whatever.
You want to tell us what this is about,
what's going on here?
So he's looking back at Bruce Banner, who's the Hulk, and
says, "Dude, you're embarrassing me in front of the wizards."
Yeah, because Banner wasn't able to become the Hulk at the time. So, "Don't embarrass me in front of the wizards."
I just love to think of programmers as kind of the modern-day wizards. We can think
of things and then, poof, they kind of come into existence. Yeah.
It's good.
And also while working on that Git course,
I had this pretty fun experience.
Like right while I was recording it.
And I'm just sitting there and then...
GitHub was down.
How often does GitHub itself go down?
But no, oh no, there's the Octocat falling with a 500 sign in its hands.
Which of course made me redo that section of the course.
I like the expression on your face for that.
It's like.
Yes, exactly.
People seem to really like that tweet.
I'll put it in the show notes.
People can check it out.
Anyway, dude, don't embarrass me in front of the wizards.
That's what I got for you.
Well, thanks. Thanks a lot again. It's a great show.
Yeah, sure was. Thanks.
Thanks, Brian. Thanks for everyone who came. Bye.