Python Bytes - #249 All of Linux as a Python API
Episode Date: September 9, 2021
Topics covered in this episode: Fickling; Python Project-Local Virtualenv Management; Testcontainers; jc; What is Python's Ellipsis Object?; PyTorch Forecasting; Extras; Joke
See the full show notes for this episode on the website at pythonbytes.fm/249
Transcript
Hey there, thanks for listening.
Before we jump into this episode,
I just want to remind you that this episode
is brought to you by us over at Talk Python Training
and Brian through his pytest book.
So if you want to get hands-on
and learn something with Python,
be sure to consider our courses over at Talk Python Training.
Visit them via pythonbytes.fm slash courses.
And if you're looking to do testing
and get better with pytest,
check out Brian's
book at pythonbytes.fm slash pytest. Enjoy the episode. Hello and welcome to Python Bytes, where
we deliver news and headlines directly to your earbuds. This is episode 249, recorded September
8th, 2021, and I am Brian Okken. Hey, I'm Michael Kennedy. And I am Eric Christiansen. Hey, Eric, thanks
for joining us today. Yeah, thank you
so much for having me. So tell us a little bit
about who you are. So first of all,
I'm a long-time listener to the show.
I just told Michael I'm listening since
episode one of this podcast, actually.
Wow. Also listening to
Michael's podcast, obviously. And then
once I got to know it,
I started listening to your podcast
as well.
So basically everything that's out there, I'm listening.
What I'm doing, I'm currently leading the competence center for AI and data science at Data Drivers,
which is a consultancy firm from Hamburg, Germany. Our focus is mainly on building big data platforms and applications, mostly using cloud-native services.
And we try to apply best DevOps and MLOps practices
to wherever we are.
That's super cool.
Do you have a favorite cloud?
In all honesty, probably Google Cloud.
Gotta say it.
Yeah.
Yeah, nice.
Well, Michael, why don't you kick us off
with our first item?
Yeah, this one's a little fickle.
Comes to us from Ollie.
He sent that in, so thank you, Ollie.
And sort of indirectly from Patrick Gray over at Risky Business, which is a cool security-focused podcast. It's Python-related security, and they talked about it over there. So with pickle, I can take this Python object graph and turn it into a binary blob that I can stash away
and then later get it back, right?
Sometimes it's real simple,
stash it in Redis
and other systems can pull it out real quick as a cache,
maybe save it to a file.
But where it's become really popular
as a means of data exchange
is actually in machine learning, okay?
So the people who built this thing
I'm gonna tell you about
really built it around the machine learning use case, because people are handing around these models, these pre-trained models.
And it's like, here's the model, load it up and roll. And "load it up and roll" may mean you have an amazing artificial intelligence that drives a car, or it may mean that you have a virus, because pickles can contain all sorts of bad things. All right, so this thing I'm gonna tell you about is called Fickling, like pickling.
It's a decompiler, a static analyzer,
and a bytecode rewriter
for Python pickle object serializations.
So you take these pickle files,
these object graphs of Python things,
and you can pull them apart and look at them.
You can ask questions like, is it a virus?
And you can even say things
like, let's put a virus in it. So all of these are possible with this tool. And it's made by a
security pen testing company called Trail of Bits for basically that purpose, right? So it's kind of
either side, the attacking pen testing side or the defensive side of the story. So it works on Python 3.6 and above.
And you can see it's super simple.
You basically do pickle stuff, and you say from fickling.pickle import Pickled.
And then, kind of as you would use the dis module to disassemble Python code,
you can do that with this Pickled class.
And it'll print out something that's kind of like an abstract syntax tree of the pickle.
And they've got a real simple example on the GitHub repo.
It's like a list of four numbers, one, two, three, four,
and then it just shows you,
look, we're assigning the results of creating a list
and setting these constants in it.
Another thing that is nice about this
is it's not specifically built for Python developers,
so it's also kind of something you can integrate into other tooling and say continuous integration
and stuff like that. So you can run it off the command line as well. You can just on the
terminal, just type fickling and give it the data, and then out comes some answer. The one that people
might want to use is --check-safety. And that will try to look and see if it's doing bad things like, for example, talking
to os.system or doing other malicious stuff like that.
So that's good.
But I wouldn't trust that entirely.
Like, how well is it checking?
Right.
If you, for example, were to encode Python code and then decode it, and then take that
decoded stuff and it did os something,
right? You feed that to eval or whatever. There's all sorts of layers here, right?
So it can check for obvious things, but it's not like an absolute guarantee. And then finally, you can inject arbitrary Python code that will run on unpickling into an existing pickle file
with --inject. Seems fine, right? Everything's fine.
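To make the risk concrete, here is a minimal sketch that uses only the standard library, not Fickling itself. The Payload class is hypothetical, and print stands in for something harmful such as os.system. It shows that loading a pickle can call an arbitrary function, and that pickletools can list the opcodes without running them, which is similar in spirit to a static look at a pickle.

```python
import pickle
import pickletools


class Payload:
    """Hypothetical object whose pickle calls a function when loaded."""

    def __reduce__(self):
        # pickle stores "call print(...)" instead of the object's state.
        # print is a harmless stand-in for something like os.system.
        return (print, ("code ran during unpickling",))


blob = pickle.dumps(Payload())

# Static inspection: walk the opcodes without executing anything.
opcodes = [opcode.name for opcode, arg, pos in pickletools.genops(blob)]
print(opcodes)

# Loading is the step that actually performs the call.
result = pickle.loads(blob)
print(result)
```

The REDUCE opcode in the listing is the "call this function" step, so seeing it next to an unexpected import is a reason to be suspicious before ever calling pickle.loads.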
That's the fun part.
Yeah.
So if there's no malicious code present, here you go.
Yeah, exactly.
So maybe I'm imagining something like a little thing that prints out in like flashing bright colors.
We told you you shouldn't unpickle untrusted data.
Don't do it.
Beginning hard drive format. It has
a loud beeping sound. It goes, three, two,
one, and just like...
Obviously not really do it, but that would
get your attention, right? That would be a mean trick.
Absolutely.
This is interesting.
I didn't really put it together with
the ML data exchange model exchange
story until I heard
the folks talking about it over on Risky Business.
So it seems like, especially in the ML story,
you want to have a look at these kinds of things.
Yeah.
So I've heard about the use case before, actually,
but I didn't know that somebody would solve it in this way.
So pretty nice.
Yeah.
I mean, Eric, this is sort of your world, right?
The machine learning stuff.
So how does this sit with you?
What do you think?
Yeah, so it comes up all the time
that you pick up some random model
that someone has built.
So as security issues become more prevalent,
this might be a thing.
Yeah.
Well, is there better ways to store it?
Like JSON or something else?
Models don't have to exist that way, do they?
Yeah, I mean, even if there was,
there are some projects that
focus on building some
reusable interface across
all these different frameworks and stuff.
But in reality, people just
use pickle. Really?
Yeah, they do. I just didn't know anybody
was really using it for much.
No, it's absolutely common.
So within, like, say, scikit-learn,
which is probably the most used library ever,
you just use pickle on the pivot, store your files.
Yeah.
All right, well, cool.
So this is a useful library from Trail of Bits.
People can check out.
And we're going to start with everything is fine, and we'll end with everything is fine as well, Brian.
But over to you.
Okay.
Well, this is something, it's a blast from the past a little bit, about a year ago.
Anyway, I want to talk about virtual environments and directories.
So there's an article from Hynek that's called Python Project-Local Virtualenv Management.
That's a mouthful.
But the idea, and we've talked about wanting this before, is to be able to...
I still want it.
Yeah.
So just to go, if I've got several projects going on, whenever I CD into a directory with
this project, I just want the virtual environment
to activate automatically.
And then when I leave it and go to another one,
it's just automatically switched.
Apparently that already works
and we've already covered it, but I missed it.
So actually in episode 185,
you brought up direnv, and part of it
is the ability to have per-project isolated
development environments. Yes. But I didn't pick that up yet. But Hynek just said, this is how
you do it. And how you do it really is just, you have to install direnv first, and then you put a .envrc file in a directory
and say layout python and then what Python version.
So like layout python python3.9,
and then that's it.
That's all you got to do.
And I'm like, that can't be that easy.
And it was.
I did it this morning and it's like,
man, this is great.
So on my Mac, it's all solved.
But it doesn't work on Windows.
So, oh, well.
You must use a Linux subsystem for Windows, or Windows Subsystem for Linux, WSL,
I guess it is.
Oh, okay.
Yeah.
I mean, that sort of semi-solves it.
Yeah.
Yeah.
So I probably have this need more within Windows than I have on my Mac,
but I have it in both places.
So I'm, I'm going to start using it.
It's great.
Plus, like you covered last time, you can also have a bonus.
You can put environment variables in there too, so that in the project you've got,
like, perhaps your secrets or just different environment settings you want to use.
Yeah, I think people will look in your dot rc files, your .bashrc, .zshrc, whatever,
for your secrets, but I suspect they're much less likely to go hunting through virtual environments
and looking for their activate scripts to see what's in them.
You know, people know, but fewer people know that stuff gets stashed in there.
So that's probably good.
Right.
So I guess mainly the story is I knew that you could do it, but I didn't realize how easy it was.
So this is super simple.
It just took a little bit.
And then my second thought was, it's not that hard to create virtual environments, though.
Is this saving any time?
I still got to create this file and put this stuff in it.
It actually is more typing a little bit more,
but it didn't take me long to realize that it's when you're switching between
different directories,
you save a ton of time.
Yeah.
It's the going back and forth between projects.
Right.
Yeah.
So that's it, really. Just
kind of neat. Yeah, Brett out in the live stream's got a comment for us: if you use
pyenv, you can run pyenv local with the env name in your project folder and get this behavior as well.
How do you do that? How do you get it to activate by just changing directory into it?
That, I'm not totally sure. Yeah, I think you can use the Python version that way, right?
But not the actual virtual environment.
Yeah, possibly if you've installed Python
through pyenv as well, yeah.
And then David has a comment back,
the first topic out there in the live stream.
Hey, David.
The irony of legacy object serialization
being used on cutting-edge machine learning.
Like that one?
Yeah, and then Teddy out in the live stream.
Hey, Teddy.
He says, does it work with an IDE?
Like, change the interpreter based on the folder you're in within a workspace in
VS Code, for example.
That I don't know, but I was going to add the personal comment that I don't need this
nearly as much as I felt like I used to because the way I jump between projects
is usually open them up in PyCharm
and jump between them there.
And that always activates.
If you go to the terminal in PyCharm,
it activates that environment for that project.
I don't know.
I'm on the command line all the time.
So definitely.
Yeah, if you're on the command line busting around a lot,
then both Brett and Alvaro have a follow-up: pyenv
adds a shim that intercepts the calls to
Python. So, yeah, very good. So it must be
that you have to install Python through pyenv,
but then it'll also do this. Very cool.
Good to know. I didn't know that.
Me too. Nice. All right, Eric,
first one is for you. Yeah,
so I brought
with me the Testcontainers Python library,
and let me quote this one from the description
because I think it's a pretty good summarization.
So Testcontainers Python is a port of Testcontainers Java
that allows using Docker containers for functional and integration testing.
It provides capabilities to spin up Docker containers
such as databases, Selenium
web browsers, and many other containers for testing.
So maybe not that many new things in here, but we use this in a project lately, and especially
we use this in integration pipelines using cloud-native services.
So there's a container for Google Cloud PubSub, for example,
which is pretty amazing, also for your Kafka.
This is originally a Java project,
so there's still a lot to do for the Python community
in order to catch up on a bunch of interfaces that
need to be implemented and stuff.
One example is here.
Let me just show you that one. So in the repo, you can find
an example of how to use this within your CI pipeline. So what's happening here is actually
that if you have like a standard CI pipeline for your integration tests, which consists of Docker
containers, it will use Docker-in-Docker to actually
run the integration tests. So
all your standard 2021
stuff in here, I guess.
Yeah, this is super cool. And the way you do it is
just create a context manager,
right? You just say something like
with MySqlContainer,
here's a connection string, and then you can just
do your normal database stuff
over to it. Yeah, yeah. So it
integrates perfectly fine with pytest. We did that a lot. And so, yeah, the syntax is pretty
cool. It's super easy to use. The integration with CI/CD works fine. So, yeah. Yeah, Brian, we could
use this with a test fixture and a little yield action, something like that. Yeah. Yeah. And I can't wait to try to play with something like this.
Yeah. We talked about this way long ago. I brought this up, I believe,
but I'm glad you brought it back, Eric,
because it's really useful and it's really neat.
And there's more stuff than is actually listed on the README, for some reason.
Exactly.
Like if you flip through the actual documentation,
you can see that there's other containers, right?
I believe there's a MongoDB one, for example, but that's not listed in the README.
And then the cloud emulators are probably neat for you for testing, right?
Yeah.
I mean, that's one of the things that I find off-putting about cloud-native type stuff: if you don't have access to the cloud, you're dead
in the water, right? And that can be a problem for continuous integration and for all sorts of
things. So things like this are pretty neat. It's definitely challenging, so stuff like this helps.
Yeah. You know, to me, it's an interesting trade-off, because on one hand, sure, you can mock
out your database and then just test against your test data. But then if your data model and the
database change, but you don't think to update the test data, well, then your code's gonna, like,
SQLAlchemy, for example, will freak out and crash if the schema is not a perfect match, whereas you
wouldn't find that in testing if you weren't letting it talk a little bit to the database.
And there's just interesting things like this. Brian, you even had an episode about not mocking out
your database, didn't you? Yeah. I think mock as little as you can, or I guess, let's do it the reverse:
as close as you can get to the real environment, the better. And when people are deploying
on containers, testing with containers makes total sense. Yeah, absolutely. Absolutely. All right.
Want to talk a little more infrastructure?
Yeah.
All right.
So I have the one,
it's got to be the shortest named thing
for a featured item.
JC, two letters, JC.
So JC comes to us from Garrett.
Thank you, Garrett, for sending that in.
And at first I was like,
I don't know if this is relevant to me
or if this is interesting.
But the more I looked at it,
I'm like, yeah, this is actually pretty awesome.
To me, let me, I'll read what jc describes itself as in a moment.
But to me, what this is, is it is basically what web scraping is to the web,
jc is to Linux.
So there's not a nice API for it, but I'd like to somehow wrap a little Python magic
around it and then have an
API for it. Okay. So its official story is it's a CLI tool and Python library that converts the
output of popular command line tools and file types to JSON. And it allows piping one thing
to the next, obviously, because it's Linux-like. So the idea is, you know, the example they have
on their site there is dig. So dig is a command that'll give you information about a domain.
So you could do something like dig example.com, pipe, jc, and then you tell jc what it's expecting
output from, which is just whatever dig prints to the terminal.
And it will parse that and turn it into a Python dictionary, right? So I could subprocess
run dig, but then I just get a huge blob of text, and I've got to basically go through it, try to
understand it, and so on. And this knows the exact format and turns it into structured data. So
think of all these different Linux commands you may run. You'll find a whole bunch of them. There's like a huge list down here. So
airport, arp, crontab, date, CSV, free, du, hash, history, hosts, ifconfig, netstat, all those
types of commands, systemctl. So for example, if you're automating daemons and stuff like that,
you can now do that from Python. And then instead of getting just a text blob and an exit code,
you get a dictionary back
that you can then check out and program against.
What do you think?
Oh, that's pretty cool.
Yeah.
Yeah, there's a bunch of built-ins.
Hopefully the thing you're looking for is one of these.
Yeah, exactly.
I suspect it's not extraordinarily hard to add another one. Yeah, yeah. But
you can also run it on the command line. You don't have to use it in Python, which is what I was
scrolling around looking for. So let's suppose I want to go and run dig and
I just want to go to the answers and get the data, which would be the IP address of some domain.
You can say, jc, run this thing, and then jq -r,
or there's like a way to just pass over a string.
And basically, the string you pass in is the object dereferencing,
the traversal of the dictionary.
So .[].answer[].data,
and it'll go and pull that all apart,
which is pretty neat.
So it's got a cool command line
terminal automation aspect,
just like Fickling.
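For readers who would rather do that traversal in Python than in jq, here is a small sketch. The JSON below is a hand-written sample shaped the way the discussion describes (a list of responses, each with an answer list whose entries have a data field); the values are illustrative and were not captured from a real dig run. In real use the text would come from the tool's output rather than a string literal.

```python
import json

# Hand-written sample in the assumed shape of the JSON for dig.
# The address is from a documentation-only range, not a real lookup.
sample = """
[
  {
    "status": "NOERROR",
    "answer": [
      {"name": "example.com.", "class": "IN", "type": "A", "ttl": 300, "data": "192.0.2.10"}
    ]
  }
]
"""

records = json.loads(sample)

# Python equivalent of the jq path .[].answer[].data
addresses = [
    answer["data"]
    for response in records
    for answer in response.get("answer", [])
]
print(addresses)
```

Using .get("answer", []) keeps the loop from failing on a response that has no answer section.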
This is a nice wizard effect
so that if you know how to do this well
and people come over and watch you do this,
they will be amazed.
Yeah.
Just make sure you spin up
your third or fourth terminal
while you do that.
Exactly. Eric, what do you think?
Yeah, so it sounds like
I found something that I can
put my usual Sunday
afternoon time into.
I'll play around with it.
Yeah, yeah, yeah.
Every now and then, I want
to do some subprocess thing, and it needs to call
some kind of Linux command.
I'm like, what am I going to do?
Am I just going to check the status code, the return code,
and hope it works and then just say it didn't work if it didn't work?
Or, you know, you could do so much more with this.
Sorry, Brian.
Well, there's some stuff that's less Unix-y that other people might need. Like, you can parse pip list and pip show and YAML and XML with this as well.
So that's nice.
Yeah.
Yeah.
Very cool.
All right.
How about some ellipses or I don't know how else to say it.
Dot, dot, dot.
The next thing.
Do say more.
So this was a surprise to me. Ellipsis. Ellipsis. Keep going. And it's an actual object within Python.
Who knew?
And then also you can just do dot, dot, dot.
And that's a valid thing, an identifier.
So it's a special value.
But you can use it for all sorts of stuff.
Like the, oh, by the way,
I'm referencing an article called What Is
Python's Ellipsis Object, from Florian Dahlitz. Thanks, Florian, for writing that. So
the Python definition really is: Ellipsis is the same
as the ellipsis literal, dot dot dot. It's a special value used mostly in conjunction with extended
slicing syntax for user-defined container data types. I don't know, what does that mean? I guess
pandas uses it maybe. But the article has some interesting things. You can use it in
place of pass, because it is a valid value. You can kind of do a dictionary
or a function definition, and instead of saying pass, just do three dots, and that's valid Python.
I'm kind of liking that. I'm sure people will be like, what are you doing? But at the same
time, it's like, that's really what you wanted to put down there, is like, I just don't want to put
anything, but Python won't work unless I kind of close this off, so here's a pass, right? Well, also
one of the things I was thinking about is, no, I would probably use pass all the time
in that case. But when writing documentation, and you really want to have a working code example,
but you want to just indicate there's going to be more code there, that's a cool thing to put in. Anyway, so there's that.
And then there's also using it in type information.
So with type information, for instance, apparently,
like let's say I've got a function that returns a tuple or tuple.
I've got these words today.
Anyway, a tuple with two integers,
you can just say a tuple with two ints.
But if you don't know how many integers are going to be there,
you can do the three dots.
And apparently that works with typing.
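A short sketch of the two uses just described: three dots as a placeholder body, and three dots inside a tuple type hint. The function names are made up for illustration.

```python
from typing import Tuple


def not_written_yet():
    ...  # placeholder body; behaves like `pass` here


def make_pair() -> Tuple[int, int]:
    # Exactly two ints.
    return (1, 2)


def make_many(count: int) -> Tuple[int, ...]:
    # Tuple[int, ...] means a tuple of ints of any length.
    return tuple(range(count))


print(... is Ellipsis)
print(type(...).__name__)
print(not_written_yet())
print(make_many(4))
```

The literal and the name Ellipsis refer to the same single object, which is why the identity check works.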
That's neat.
There's not a lot.
Apparently it's also used within
FastAPI and Typer.
But it's there, and if you want
to use it
to implement a certain feature where that might make sense, it is a thing that's available.
Maybe you could have an operator, a dot dot dot operator on your something.
I learned this just the other day from a tweet from Raymond Hettinger, where he was asking people, like, how would you do this? And he brought up
the exact same example using the documentation and the pass or the ellipsis instead. And I didn't
even know that this was a Python object. I knew it from the typing. But so the question is, can you
pass this object around? Can you return from a function value like
dot dot dot?
I imagine. I don't know.
It should work, right?
It should work. Yeah, it should work.
Yeah.
Nice.
I'll try it out while we go on to the next topic.
Yeah, that one surprised me. Well done, Florian.
Yeah, so the last one that I brought with me, actually, since I lead
the data science and AI team, I gotta bring something with me that has to do with it.
So I brought with me the PyTorch Forecasting library. So Michael, you just used this
analogy a couple of minutes ago, so I'm going to use an analogy now.
So for me, PyTorch Forecasting looks like
what fast.ai does for computer vision
and natural language processing,
it does for time series forecasting.
Because there was like a lack of deep learning
for time series forecasting.
And actually I think that PyTorch Forecasting
is gonna close this gap.
So it comes in with a bunch of important features actually.
So it's built on top of PyTorch Lightning,
which allows training on CPUs, single and multiple GPUs, basically out of the box.
So there's been a lot of software engineering involved for all the data scientists in the past.
And this library just makes it pretty simple.
So you have to work very hard in order to mess things up with this library, I guess. So what it also brings is an implementation of a model that is called
the Temporal Fusion Transformers.
So this is from Google Research.
And actually there's also a TensorFlow-based implementation.
I'm going to put the link to the paper in the show notes.
This is a very interesting model
that has performed pretty well
on a dozen prominent benchmarks very lately.
And it has a very huge benefit,
which is that it is pretty interpretable.
So it does actually calculate feature importance for you.
So this is in the real world applications, very important, because whenever you stick
your data into these models and something good comes out, people will always ask you,
so, okay, so what was the important part of the data?
So how does it influence the model and the outcome?
So Temporal Fusion Transformers, they do this for you.
Also, PyTorch Forecasting comes with Optuna,
which is a popular library for hyperparameter tuning,
which is also implemented in here.
Right, there might be.
So this does multivariate time series,
multivariate time series, multivariable time series.
Yeah, so the multi-horizon part of it
is pretty important, actually.
So go ahead, sorry.
I was going to say, so the hyperparameter tuning,
you might say this part actually doesn't make any difference
in the prediction, but this other part does.
So pay attention to that, right?
Yeah, absolutely.
Yeah, this looks really good.
So if you want to predict the future about sales, home prices, heart rate, whatever.
It comes up all the time.
It comes up all the time.
And I know from a couple of guys who work for the Google Clouds of this world and the AWS,
that within these software-as-a-service offerings or these APIs that they provide
for, like, say, a demand forecast,
they use these Temporal Fusion Transformers under the hood.
So yeah,
this looks great.
Just spin it up and use it.
Yeah.
Great recommendation.
A follow-up from the previous one,
Brian.
Will McGugan, hey, Will, in the live stream says the dot, dot, dot ellipsis is sometimes used as a sentinel value to mean no value when None is a valid value.
So, yeah.
Yeah, and also, yes, you can return it from a function.
Nice.
It's just fine.
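Both points from the live stream can be sketched in a few lines. The update_nickname function and the profile data are hypothetical; the idea is that None is a meaningful argument ("clear the nickname"), so the ellipsis marks "no argument given".

```python
def update_nickname(profile, nickname=...):
    """Return a copy of profile; passing None clears the nickname."""
    updated = dict(profile)
    if nickname is ...:
        # Argument omitted: leave the nickname untouched.
        return updated
    updated["nickname"] = nickname
    return updated


def give_me_dots():
    # The ellipsis is an ordinary object, so it can be returned.
    return ...


profile = {"name": "Sam", "nickname": "sammy"}
print(update_nickname(profile))
print(update_nickname(profile, None))
print(give_me_dots() is Ellipsis)
```

A dedicated sentinel object would work just as well; the ellipsis is simply one that already exists.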
And then, let's see.
Someone out in the live stream asked if it has methods.
Does it have methods or anything that you can do to it?
That was Teddy.
Yes, but only the built-ins, right?
I don't think it, from object,
I don't think it does anything interesting
besides just be dot, dot, dot.
And then Anderson, hey, Anderson,
says it's a pity the ecosystem
is moving towards PyTorch Lightning.
The separation of concerns there is not very nice.
In my opinion, PyTorch Ignite does a better job
in that aspect.
Eric, that's all you.
Yeah, fair enough.
Still, I mean, one thing
that you've got to keep in mind,
so speaking
of separation of concerns, right, there's so
many data scientists out there that if you throw
like separations of concerns at
them, they just answer like,
yeah, here's my model. So what is
separation of concerns in this sense,
right? So if this works, if people
use it, it's probably good.
Yeah, cool.
Brian, extras?
Extras.
Oh, I just wanted to bring up
that Python 3.10 RC2 is out.
So the release candidate,
the second release candidate for Python 3.10 is out
so people can play with it.
Apparently we're like maybe a month away
from getting 3.10.
So I'm excited about that.
Yeah, that's me.
Very excited.
Awesome.
All right.
I got a couple to throw out there.
Really?
What a surprise.
Can you imagine?
So remember we talked about several things.
I talked about how I turned off all of the tracking stuff and all those things on the
website, which I think
is good because so many people run ad blockers.
They were, it was like pretty inconsistent data anyway, inaccurate.
Then I mentioned goaccess.io.
I said, that'd be cool.
Maybe we should apply it.
I ended up writing a ton of automation to apply this to Python Bytes, Talk Python, Talk Python
Training, all the things.
And it's pretty cool.
I built some automation that will download all the NGINX log files, some of which are text,
some of which are gzipped,
and then run this thing across it
and it will build like one giant monthly log thing.
And then GoAccess can then turn into nice, beautiful reports.
So very excited to have GoAccess working well.
And instead of running it on the server,
I actually just download and then run it
on like a monthly report locally,
which I think is kind of cool.
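The merging step described here, combining plain and gzipped logs into one file, could look roughly like the sketch below. This is not the actual automation from the show; merge_logs and the file layout are assumptions, and the download step and the report generation are left out.

```python
import gzip
import shutil
from pathlib import Path


def merge_logs(log_dir, combined_path):
    """Concatenate plain and gzipped log files into one plain-text file."""
    log_dir = Path(log_dir)
    combined_path = Path(combined_path).resolve()
    # Sorted by file name, which is only chronological if the names are.
    sources = sorted(
        path
        for path in log_dir.iterdir()
        if path.is_file() and path.resolve() != combined_path
    )
    with open(combined_path, "wb") as out:
        for path in sources:
            # Gzipped files are decompressed on the fly; others are copied as-is.
            opener = gzip.open if path.suffix == ".gz" else open
            with opener(path, "rb") as src:
                shutil.copyfileobj(src, out)
    return combined_path
```

The combined file can then be handed to whatever report tool is in use, run locally rather than on the server.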
All right.
One, we had some feedback about Caffeinate.
Remember Caffeinate?
You can type Caffeinate on the macOS terminal
and it'll keep your system alive.
Nathan Henry said,
you mentioned the caffeinate tool over in macOS.
He says you can follow it with a long-running command to keep awake.
So you could say, like, caffeinate python -c, import time, time.sleep.
So you could say caffeinate python and some script you want to run.
So you could reverse it if that script doesn't use keep awake, or I think that's what it was.
Right.
So you could apply caffeinate to your Python code and just say, no, stay awake while you're
doing this.
Or you can even apply it to a running process using a PID.
So it just stays awake while that process is running then?
Yeah.
And then it'll go away.
Yeah.
Oh, okay.
Nice.
Yeah.
So it's like the reverse of what we talked about then.
Then Sean Tibor from Teaching Python said, isn't this what we were asking for? Remember we were talking
about the keyboards, and here's a Python one. This is the M60 mechanical keyboard,
the open source USB and BLE, Bluetooth Low Energy 5, hot-swappable 60% keyboard, powered by Python.
So this one comes with Python built in, which is pretty excellent.
So if people want to play that, they definitely can.
The next one I want to throw out there real quick comes to us from Mark Little, a friend
of mine here in Portland.
And basically the subtitle is that, this is an article from CNBC Finance News, that open
source is booming.
So the headline has to do with MongoDB,
but it's more broad.
So if people are interested in kind of following up on that,
it's kind of cool.
So MongoDB surged on Friday, which was last Friday.
It's now worth as much as IBM paid for Red Hat.
Databricks raised a private financing round
at a $30 billion valuation.
And just, you know, these are the mega open source companies, but it's pretty interesting
to just give you a sense.
Like I read this article, I got, it's pretty interesting.
These numbers kind of just like bounce off me.
But the one that made it stick for me was MongoDB was a private company for a while.
Then it IPO'd, right?
It had VC money, then IPO'd.
Do you have a sense? Either of you have a sense for how much it IPO'd for? It seemed crazy, right? Like
1.2, $1.4 billion. MongoDB is worth 30 billion now, right? So even after the crazy IPO,
you know, 1.2 billion to start and now over 30 billion. So that is an insane amount of growth
in these. And then
they talk about Confluent and JFrog and a bunch of others, Elastic. If you kind of want to dig into
the business side of open source, that's pretty interesting. All right, two more. I've been doing
a ton of video encoding lately. I use FFmpeg for some of the audio processing and other types of
things around both the podcast and the courses.
So attribution here.
This is from Jim Anderson.
Sent this over.
Thanks, Jim.
FFmpeg.wasm.
So here's FFmpeg, which is a very popular tool in that world, but as a WebAssembly thing,
which is pretty awesome.
And I'm trying to remember what the name of the library was.
But we did talk about it on Python Bytes, I think with Cecil Phillip one time. Maybe it was even him that brought it up.
But there's a Python library that will run WebAssemblies.
So not run WebAssembly in the browser or put Python in the browser, but reverse it.
Like I have a WebAssembly library that does cool stuff.
Put it in my Python code and run it here. So you could take ffmpeg.wasm and pure Python and have like a no dependency sort of
audio video processing tool in Python, which I think is pretty cool. Cool. All right. Last one.
I told you we'd start with everything is fine. I'm going to end with everything is fine.
Credit-card-stealing, backdoored packages found in Python's PyPI library hub. What? That's not good.
This is not good. When you hear people talk about remote code execution,
that typically is bad. Like, I'm on the internet, people send me bad stuff, now they have my computer
and I don't even necessarily know it. So apparently in addition to this, these were found and removed.
It was something, what was it?
It was something along the lines of Noblesse, N-O-B-L-E-S-S-E, and a couple of variations
on that spelling.
That was the problem.
So I'm happy to see I didn't install that, but this doesn't make me happy.
It looks like it's fixed.
So the PyPI team also just patched a remote code execution hole in their platform,
which potentially could have been exploited
to hijack the entirety of PyPI.
That one makes me way more nervous
than typosquatting or that weirdness.
And it was a vulnerability
in the way that they were doing GitHub Actions with PyPI,
which allowed a malicious pull request
to execute arbitrary code over there, which is not ideal.
Yeah, but I'm glad to hear that's fixed.
Anyway, everything's fine.
It doesn't feel fine.
No, not at all.
More like a nightmare, to be honest.
Yeah, to be honest.
Eric, anything else you want to share with us?
No, just
thank you guys again for having me on
the show. Pretty fun.
And make sure that you guys
follow me on Twitter.
Awesome.
We'll put a link in the show notes for your
Twitter. No, we aren't done, are we,
Brian? No, we need a joke.
One thing is missing. Yeah, it's important.
So this one is more of a,
not an ML one, it's more of a web API type thing. So, so often people will write web APIs
and just return some kind of message in a JavaScript dictionary that says things like
bad response or whatever, but you're supposed to use HTTP status codes, right? Like if there's a
bad request, you should return the status code 400.
If it's not found as an entity, you should return 404 or whatever.
So here's like two kids at school exchanging messages and it has server on one of them,
client on the other, and 200 on the message exchange here.
And then at the bottom, the one kid that got the message reads the JavaScript and says
status code 400, detail, bad request.
He's like, why?
Why did you do this to me?
This is good.
Yeah, this is like little Bobby Tables.
Let this be a lesson to you.
You don't pass messages like that.
Come on.
It's so true.
It's totally true.
Totally true.
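The point behind the joke can be shown in a few lines. This is a framework-free sketch, not any particular web library's API: the handlers just return a (status, body) pair, and the ITEMS data and function names are invented for illustration.

```python
import json
from http import HTTPStatus

ITEMS = {"1": "first item"}  # hypothetical data store


def lookup_like_the_joke(item_id):
    """Anti-pattern: always answers 200 and hides the real outcome in the body."""
    if not item_id:
        return HTTPStatus.OK, json.dumps({"status_code": 400, "detail": "bad request"})
    return HTTPStatus.OK, json.dumps({"item": ITEMS.get(item_id)})


def lookup(item_id):
    """Puts the outcome in the HTTP status code itself."""
    if not item_id:
        return HTTPStatus.BAD_REQUEST, json.dumps({"detail": "bad request"})
    if item_id not in ITEMS:
        return HTTPStatus.NOT_FOUND, json.dumps({"detail": "not found"})
    return HTTPStatus.OK, json.dumps({"item": ITEMS[item_id]})
```

With the second version a client can branch on the status code alone, without parsing the body to find out whether the call worked.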
All right.
Well, that's it for our jokes and everything, Brian.
Yeah.
Well, it was another fun Wednesday on Python Bytes.
Absolutely.
Thanks, Eric.
Thanks, Brian.
Yeah, thanks, Eric, for being here.
Thanks a lot, guys.
See you around.
Bye, all.
Bye.
Thanks for listening to Python Bytes.
Follow the show on Twitter via @pythonbytes.
That's Python Bytes as in B-Y-T-E-S. Get the full show notes over at pythonbytes.fm. If you have a news
item we should cover, just visit
pythonbytes.fm and click Submit in the
nav bar. We're always on the lookout for sharing
something cool. If you want to join us for the live
recording, just visit the website and
click Livestream to get notified
of when our next episode goes live.
That's usually happening at noon
Pacific on Wednesdays over at
YouTube. On behalf of myself and Brian Ocken, this is Michael Kennedy.
Thank you for listening and sharing this podcast with your friends and colleagues.
