Python Bytes - #249 All of Linux as a Python API

Episode Date: September 9, 2021

Topics covered in this episode: Fickling; Python Project-Local Virtualenv Management; Testcontainers; jc; What is Python's Ellipsis Object?; PyTorch Forecasting; Extras; Joke. See the full show notes for this episode on the website at pythonbytes.fm/249

Transcript
Starting point is 00:00:00 Hey there, thanks for listening. Before we jump into this episode, I just want to remind you that this episode is brought to you by us over at TalkPython Training and Brian through his PyTest book. So if you want to get hands-on and learn something with Python, be sure to consider our courses over at TalkPython Training.
Starting point is 00:00:17 Visit them via pythonbytes.fm slash courses. And if you're looking to do testing and get better with pytest, check out Brian's book at pythonbytes.fm slash pytest. Enjoy the episode. Hello and welcome to Python Bytes, where we deliver news and headlines directly to your earbuds. This is episode 249, recorded September 8th, 2021, and I am Brian Okken. Hey, I'm Michael Kennedy. And I am Erik Christiansen. Hey Erik, thanks for joining us today. Yeah, thank you
Starting point is 00:00:46 so much for having me. So tell us a little bit about who you are. So first of all, I'm a long-time listener to the show. I just told Michael I'm listening since episode one of this podcast, actually. Wow. Also listening to Michael's podcast, obviously. And then once I get to know it,
Starting point is 00:01:02 I started listening to your podcast as well. So basically, everything that's out there, I'm listening. What I'm doing, I'm currently leading the competence center for AI and data science at Data Drivers, which is a consultancy firm from Hamburg, Germany. Our focus is mainly on building big data platforms and applications, mostly using cloud-native services. And we try to apply best DevOps and MLOps practices to wherever we are. That's super cool.
Starting point is 00:01:30 Do you have a favorite cloud? In all honesty, probably Google Cloud. Gotta say it. Yeah. Yeah, nice. Well, Michael, why don't you kick us off with our first item? Yeah, this one's a little fickle.
Starting point is 00:01:41 Comes to us from Ollie. He sent that in, so thank you, Ollie. And sort of indirectly from Patrick Gray over at Risky Business, which is a cool security-focused podcast. Python supports security. They talk about it over there. this binary Python object graph and turn it into a blob that I can stash away and then later get it back, right? Sometimes it's real simple, stash it in Redis and other systems can pull it out real quick as a cache, maybe save it to a file.
Starting point is 00:02:14 But where it's become really popular as a means of data exchange is actually in machine learning, okay? So the people who built this thing I'm going to tell you about really built it around the machine learning use case, because people are handing around these models, these pre-trained models. And it's like, here's the model, load it up and roll. And load it up and roll may mean you have an amazing artificial intelligence that drives a car, or it may mean that you have a virus, because pickles can contain all sorts of bad things. All right, so this thing I'm going to tell you about is called Fickling, like pickling. It's a decompiler, a static analyzer,
Starting point is 00:02:50 and a bytecode rewriter for Python pickle object serializations. So you take these pickle files, these object graphs of Python things, and you can pull them apart and look at them. You can ask questions like, is it a virus? And you can even say things like, let's put a virus in it. So all of these are possible with this tool. And it's made by a
Starting point is 00:03:11 security pen testing company called Trail of Bits for basically that purpose, right? So it's kind of either side, the attacking pen testing side or the defensive side of the story. So it works on Python 3.6 and above. And you can see it's super simple. You basically do pickle stuff and you say from fickling.pickle import Pickled. And then, kind of as you would use the dis module to disassemble Python code, you can do that with this Pickled class. And it'll print out something that's kind of like an abstract syntax tree of the pickle. And they've got a real simple example on the GitHub repo.
Starting point is 00:03:50 It's like a list of four numbers, one, two, three, four, and then it just shows you, look, we're assigning the results of creating a list and setting these constants in it. Another thing that is nice about this is it's not specifically built for Python developers, so it's also kind of something you can integrate into other tooling and say continuous integration and stuff like that. So you can run it off the command line as well. You can just on the
Starting point is 00:04:14 terminal, just type fickling and give it the data and then out comes some answer. The one that people might want to do is --check-safety. And that will try to look and see if it's doing bad things like, for example, calling os.system or doing other malicious stuff like that. So that's good. But I wouldn't trust that entirely. Like, how well is it checking? Right. If you, for example, were to encode Python code and then decode it and then take that
Starting point is 00:04:42 decoded stuff and it did os something, right? You feed that to eval or whatever. There's all sorts of layers here, right? So it can check for obvious things, but it's not like an absolute guarantee. And then finally, you can inject arbitrary Python code that will run on unpickling into an existing pickle file with --inject. Seems fine, right? Everything's fine. That's the fun part. Yeah. So if there's no malicious code present, here you go. Yeah, exactly.
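To make the risk discussed above concrete, here is a minimal sketch using only the standard library (not Fickling itself). It shows how an object can ask pickle to call an arbitrary function on load, and how a crude static scan of the opcode stream can flag that without ever unpickling. The class name `Sneaky`, the helper `suspicious_opcodes`, and the choice of which opcodes count as risky are assumptions made for illustration; a real analyzer such as Fickling does far more than this.

```python
import pickle
import pickletools


class Sneaky:
    """Hypothetical payload: tells pickle to call a function when loaded."""

    def __reduce__(self):
        # print is a harmless stand-in here; an attacker would return
        # something like os.system with a shell command instead.
        return (print, ("this ran during unpickling",))


def suspicious_opcodes(data: bytes) -> list:
    """List opcodes that import names or call objects, without loading the pickle."""
    # Assumption: this set is a rough heuristic, not an exhaustive safety check.
    risky = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX"}
    return [op.name for op, _arg, _pos in pickletools.genops(data) if op.name in risky]


safe_blob = pickle.dumps([1, 2, 3, 4])
sneaky_blob = pickle.dumps(Sneaky())

print(suspicious_opcodes(safe_blob))
print(suspicious_opcodes(sneaky_blob))
```

Calling `pickle.loads(sneaky_blob)` would run the `print` call, which is exactly the behavior the scan is trying to spot ahead of time. As noted in the discussion, a scan like this only catches the obvious cases.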
Starting point is 00:05:18 So maybe I'm imagining something like a little thing that prints out in like flashing bright colors. We told you you shouldn't unpickle untrusted data. Don't do it. Beginning hard drive format. It has a loud beeping sound. It goes, three, two, one, and just like... Obviously not really do it, but that would get your attention, right? That would be a mean trick.
Starting point is 00:05:34 Absolutely. This is interesting. I didn't really put it together with the ML data exchange model exchange story until I heard the folks talking about it over on Risky Business. So it seems like, especially in the ML story, you want to have a look at these kinds of things.
Starting point is 00:05:51 Yeah. So I've heard about the use case before, actually, but I didn't know that somebody would solve it in this way. So pretty nice. Yeah. I mean, Eric, this is sort of your world, right? The machine learning stuff. So how does this sit with you?
Starting point is 00:06:06 What do you think? Yeah, so it comes up all the time that you pick up some random model that someone has built. So as security issues become more prevalent, this might be a thing. Yeah. Well, is there better ways to store it?
Starting point is 00:06:21 Like JSON or something else? Models don't have to exist that way, do they? Yeah, I mean, even if there was, there are some projects that focus on building some reusable interface across all these different frameworks and stuff. But in reality, people just
Starting point is 00:06:37 use Pickle. Really? Yeah, they do. I just didn't know anybody was really using it for much. No, it's absolutely common. So within, like, say, scikit-learn, which is probably the most used library ever, you just use pickle on the pivot, store your files. Yeah.
Starting point is 00:06:55 All right, well, cool. So this is a useful library from Trail of Bits. People can check it out. And we're going to start with everything is fine, and we'll end with everything is fine as well, Brian. But over to you. Okay. Well, this is something, it's a blast from the past a little bit, about a year ago. Anyway, I want to talk about virtual environments and directories.
Starting point is 00:07:17 So, and there's an article from Hynek that's called Python Project-Local Virtualenv Management. That's a mouthful. But the idea, and we've talked about wanting this before, is to be able to... I still want it. Yeah. So just to go, if I've got several projects going on, whenever I cd into a directory with this project, I just want the virtual environment to activate automatically.
Starting point is 00:07:47 And then when I leave it and go to another one, it's just automatically switched. Apparently that already works and we've already covered it, but I missed it. So actually in episode 185, you brought up direnv, and part of it is the ability to have per-project isolated development environments. Yes. But I didn't pick that up yet. But Hynek just said, this is how
Starting point is 00:08:15 you do it. And how you do it really is, you have to install direnv first, and then you put a .envrc file in a directory and say layout python and then what Python version. So like layout python python3.9, and then that's it. That's all you got to do. And I'm like, that can't be that easy. And it was. I did it this morning and it's like,
Starting point is 00:08:45 man, this is great. So on my Mac, it's all solved. But it doesn't work on Windows. So, oh,
Starting point is 00:08:54 well. Must use a Linux subsystem for Windows, or Windows Subsystem for Linux, WSL, I guess it is. Oh, okay. Yeah. I mean, that sort of semi-solves it. Yeah.
Starting point is 00:09:04 So I really, I probably have this need more within Windows than I have on my Mac, but I have it in both places. So I'm going to start using it. It's great. Plus, like you covered last time, you can also have a bonus. You can put environment variables in there too, so that in the project you've got, like, perhaps your secrets, or just different environment settings you want to use.
Starting point is 00:09:31 Yeah, I think people will look in your dot rc, whatever, your .bashrc, .zshrc, whatever files for your secrets, but I suspect they're much less likely to go hunting through virtual environments and looking for their activate scripts to see what's in them. You know, people know, but fewer people know that stuff gets stashed in there. So that's probably good. Right. So I guess mainly the story is I knew that you could do it, but I didn't realize how easy it was. So this is, it's super simple.
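One way to confirm that a per-directory setup like this is actually in effect is to ask the interpreter itself. The sketch below is an illustration only: it checks whether the running Python lives inside a virtual environment and reads a per-project setting from the environment. The helper names and the variable name `MY_PROJECT_SETTING` are hypothetical, not part of direnv.

```python
import os
import sys


def active_virtualenv():
    """Return the environment path if this interpreter runs in a venv, else None."""
    # In a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base installation.
    return sys.prefix if sys.prefix != sys.base_prefix else None


def project_setting(name, default=None):
    """Read a per-project setting that was exported into the environment."""
    return os.environ.get(name, default)


# MY_PROJECT_SETTING is a made-up name used only for this example.
print("virtualenv:", active_virtualenv())
print("setting:", project_setting("MY_PROJECT_SETTING", "not set"))
```

Running this from inside a project directory and again from outside it should show the environment and the setting appearing and disappearing, assuming the directory hook is working.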
Starting point is 00:10:00 It just took a little bit. And then my second thought was, it's not that hard to create virtual environments though. Is this saving any time? I still got to create this file and put this stuff in it. It actually is a little bit more typing, but it didn't take me long to realize that when you're switching between different directories, you save a ton of time.
Starting point is 00:10:21 Yeah. It's the going back and forth between projects. Right. Yeah. So that's it, really. Just kind of neat. Yeah, Brett out in the live stream's got a comment for us: if you use pyenv, you can run pyenv local env-name in your project folder and get this behavior as well. How do you do that? How do you get it to activate by just changing directory into it?
Starting point is 00:10:42 What? I'm not totally sure. Yeah, I think you can use the Python version that way, right? But not the actual virtual environment. Yeah, possibly if you've installed Python through pyenv as well, yeah. And then David has a comment back on the first topic, out there in the live stream. Hey, David. The irony of legacy object serialization
Starting point is 00:11:01 being used on cutting-edge machine learning. Like that one? Yeah, and then Teddy out in the live stream. Hey, Teddy. He says, does it work with an IDE? Like change the interpreter based on the folder you're in within a workspace in VS Code, for example.
Starting point is 00:11:15 That I don't know, but I was going to add the personal comment that I don't need this nearly as much as I felt like I used to because the way I jump between projects is usually open them up in PyCharm and jump between them there. And that always activates. If you go to the terminal in PyCharm, it activates that environment for that project. I don't know.
Starting point is 00:11:35 I'm on the command line all the time. So definitely. Yeah, if you're on the command line busting around a lot. Then both Brett and Alvaro have a follow-up: pyenv adds a shim that intercepts the calls to Python. So, yeah, very good. So it must be that you have to install Python through pyenv, but then it'll also do this. Very cool.
Starting point is 00:11:54 Good to know. I didn't know that. Me too. Nice. All right, Eric, first one is for you. Yeah, so I brought with me the testcontainers Python library, and let me quote this one from the description because I think it's a pretty good summarization.
Starting point is 00:12:12 So testcontainers-python is a port of testcontainers-java that allows using Docker containers for functional and integration testing. It provides capabilities to spin up Docker containers such as databases, Selenium web browsers, and many other containers for testing. So maybe not that many new things in here, but we used this in a project lately, and especially we use this in integration pipelines using cloud-native services. So there's a container for Google Cloud Pub/Sub, for example,
Starting point is 00:12:45 which is pretty amazing, also for Kafka. This is originally a Java project, so there's still a lot to do for the Python community in order to catch up on a bunch of interfaces that need to be implemented and stuff. One example, it is here. Let me just show you that one. So in the repo, you can find an example of how to use this within your CI pipeline. So what's happening here is actually
Starting point is 00:13:16 that if you have like a standard CI pipeline for your integration test, which consists of Docker containers that will use Docker in Docker to actually run the integration test. So all your standard 2021 stuff in here, I guess. Yeah, this is super cool. And the way you do it is just create a context manager, right? You just say something like
Starting point is 00:13:37 with MySQL container, here's a connection string, and then you can just do your normal database stuff over to it. Yeah, yeah. So it integrates perfectly fine with pytest. We did that a lot. And so, yeah, the syntax is pretty cool. It's super easy to use. The integration with CI/CD works fine. Yeah, Brian, we could use this with a test fixture and a little yield action, something like that. Yeah. Yeah. And I can't wait to try to play with something like this. Yeah. We talked about this way long ago. I brought this up, I believe,
Starting point is 00:14:11 but I'm glad you brought it back, Eric, because it's really useful and it's really neat. And there's more stuff than actually is listed on the README for some reason. Exactly. Like if you flip through the actual documentation, you can see that there's other containers, right? For example, I believe there's a MongoDB one, but that's not listed in the README. And then the cloud emulators are probably neat for you for testing, right?
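The "context manager plus a little yield action" idea mentioned here can be sketched without Docker. In the example below, an in-memory SQLite database stands in for the containerized database, so this is the shape of the pattern only, not the testcontainers API. The names `throwaway_database`, `count_users`, and `test_insert_user` are hypothetical; with pytest, the same generator body would typically sit inside a fixture.

```python
import sqlite3
from contextlib import closing, contextmanager


@contextmanager
def throwaway_database():
    """Stand-in for a container: set up a fresh database, hand it out, tear it down."""
    with closing(sqlite3.connect(":memory:")) as conn:
        conn.execute("CREATE TABLE users (name TEXT)")
        # Everything before the yield is setup; leaving the with block is teardown.
        yield conn


def count_users(conn):
    """Code under test: talks to a real database connection, no mocks involved."""
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


def test_insert_user():
    with throwaway_database() as conn:
        conn.execute("INSERT INTO users VALUES ('eric')")
        assert count_users(conn) == 1
```

Each test gets its own fresh database and the cleanup happens even if the test fails, which is the same guarantee the container-based version aims for.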
Starting point is 00:14:37 Yeah. I mean, that's one of the things that I find off-putting about cloud-native type stuff: if you don't have access to the cloud, you're dead in the water, right? And that can be a problem for continuous integration and for all sorts of things. So things like this are pretty neat. It's definitely challenging, so stuff like this helps. Yeah. You know, to me, it's an interesting trade-off, because on one hand, sure, you can mock out your database and then just test against your test data. But then if your data model and the database change but you don't think to update the test data, well, then your code, like SQLAlchemy, for example, will freak out and crash if the schema is not a perfect match, whereas you
Starting point is 00:15:16 wouldn't find that in testing if you weren't letting it talk a little bit to the database. And there's just interesting things like this. Brian, you even had an episode about not mocking out your database, didn't you? Yeah. I think mock as little as you can, I guess. Let's do it the reverse: as close as you can have to the real environment, the better. And when people are deploying on containers, testing with containers makes total sense. Yeah, absolutely. Absolutely. All right. Want to talk a little more infrastructure? Yeah. All right.
Starting point is 00:15:47 So I have the one, it's got to be the shortest named thing for a featured item. JC, two letters, JC. So JC comes to us from Garrett. Thank you, Garrett, for sending that in. And at first I was like, I don't know if this is relevant to me
Starting point is 00:16:02 or if this is interesting. But the more I looked at it, I'm like, yeah, this is actually pretty awesome. I'll read what jc describes itself as in a moment. But to me, what this is, is basically what web scraping is to the web, jc is to Linux. So there's not a nice API for it, but I'd like to somehow wrap a little Python magic around it and then have an
Starting point is 00:16:25 API for it. Okay. So its official story is it's a CLI tool and Python library that converts the output of popular command line tools and file types to JSON. And it allows piping one thing to the next, obviously, because it's Linux-like. So the idea is, you know, the example they have on their site there is dig. So dig is a command that'll give you information about a domain. So you could do something like dig example.com, pipe, jc, and then you tell jc what it's expecting output from, just whatever the printed output to the terminal is in dig. And it will parse that and turn it into a Python dictionary, right? So I could subprocess run dig, but then I just get a huge blob of text and I've got to basically go through it, try to
Starting point is 00:17:12 understand it and so on. And this knows the exact format and turns it into structured data. So think of all these different Linux commands you may run. You find a whole bunch of them. There's like a huge list down here. So airport, arp, crontab, date, csv, free, du, hash, history, hosts, ifconfig, netstat, all those types of commands, systemctl. So for example, if you're automating daemons and stuff like that, you can now do that from Python. And then instead of getting just a text blob and an exit code, you get a dictionary back that you can then check out and program against. What do you think?
Starting point is 00:17:50 Oh, that's pretty cool. Yeah. Yeah, there's a bunch of built-ins. Hopefully the thing you're looking for is one of these. Yeah, exactly. I suspect it's not extraordinarily hard to add another one. Yeah, yeah. But you can also run it on the command line. You don't have to use it in Python, which is what I was scrolling around looking for. So let's suppose I want to go and run dig and
Starting point is 00:18:19 I just want to go to the answers and get the data, which would be the IP address of some domain. You can say, jc, run this thing, and then pipe it to jq -r, and there's a way to just pass over a string. And basically, the string you pass in is the object dereferencing, the traversal of the dictionary. So .[].answer[].data, and it'll go and pull that all apart, which is pretty neat.
Starting point is 00:18:47 So it's got a cool command line terminal automation aspect, just like Fickling. This is a nice wizard effect so that if you know how to do this well and people come over and watch you do this, they will be amazed. Yeah.
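The traversal described above, a list of results, each with a list of answers, each with a data field, looks like this once the JSON is loaded in Python. The sample document below is hand-written for illustration and only mimics the general shape being discussed; it is not captured tool output, and the address is a reserved documentation address.

```python
import json

# Hand-written sample in the shape described above (a list of results, each
# carrying a list of answers). Real output would contain many more fields.
sample = """
[
  {
    "answer": [
      {"name": "example.com.", "type": "A", "data": "192.0.2.1"}
    ]
  }
]
"""

records = json.loads(sample)

# Equivalent of walking .[].answer[].data: every result, every answer, the data field.
addresses = [answer["data"] for record in records for answer in record.get("answer", [])]
print(addresses)
```

The point of structured output is that this is a two-line comprehension rather than a custom text parser for each command.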
Starting point is 00:19:01 Just make sure you spin up your third or fourth terminal while you do that. Exactly. Eric, what do you think? Yeah, so it sounds like I found something that I can put my usual Sunday afternoon time into.
Starting point is 00:19:16 I'll play around with it. Yeah, yeah, yeah. Every now and then, I want to do some subprocess thing, and it needs to call some kind of Linux command. I'm like, what am I going to do? Am I just going to check the status code, the return code, and hope it works and then just say it didn't work if it didn't work?
Starting point is 00:19:32 Or, you know, you could do so much more with this. Sorry, Brian. Well, there's some stuff that's less Unix-y that other people might need. Like you can parse pip list and pip show, and YAML and XML with this as well. So that's nice. Yeah. Yeah. Very cool. All right.
Starting point is 00:19:55 How about some ellipses, or I don't know how else to say it. Dot, dot, dot. The next thing. Do say more. So this was a surprise to me. Ellipsis. Ellipsis. Keep going. And it's an actual object within Python. Who knew? And then also you can just do dot, dot, dot. And that's a valid thing, an identifier.
Starting point is 00:20:36 So it's a special value. But you can use it for all sorts of stuff. Oh, by the way, I'm referencing an article called What is Python's Ellipsis Object? from Florian Dahlitz. Thanks, Florian, for writing that. So the definition really is, Ellipsis is the same as the literal dot dot dot. It's a special value used mostly in conjunction with extended slicing syntax for user-defined container data types. I don't know, what does that mean? I guess
Starting point is 00:21:13 pandas uses it, maybe. But the article has some interesting things. You can use it in place of pass, because it has a valid value. You can kind of do a dictionary or a function definition and instead of saying pass, just do three dots, and that's valid Python. I'm kind of liking that. I'm sure people will be like, what are you doing? But at the same time, it's like, that's really what you wanted to put down there, is like, I just don't want to put anything, but Python won't work unless I kind of close this off, so here's a pass, right? Well, also one of the things I was thinking about is, no, I would probably use pass all the time in that case. But when writing documentation and you really want to have a working code example
Starting point is 00:21:59 but you want to just indicate there's going to be more code there. That's a cool thing to put in. Anyway, so there's that. And then there's also using it in type information. So with type information, for instance, apparently, like let's say I've got a function that returns a tuple or tuple. I've got these words today. Anyway, a tuple with two integers, you can just say a tuple with two ints. But if you don't know how many integers are going to be there,
Starting point is 00:22:28 you can do the three dots. And apparently that works with typing. That's neat. There's not a lot. Apparently it's used also within FastAPI and Typer, but it's there, and if you want to use
Starting point is 00:22:43 to implement a certain feature where that might make sense, it is a thing that's available. Maybe you could have an operator, a dot dot dot operator on your something. I learned this just the other day from a tweet from Raymond Hettinger, where he was asking people like, how would you do this? And he brought up the exact same example using the documentation and the pass or the ellipsis instead. And I didn't even know that this was a Python object. I knew it from the typing. But so the question is, can you pass this object around? Can you return from a function a value like dot dot dot? I imagine. I don't know.
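The uses discussed so far fit in a few lines. This is a small sketch with made-up function names: the three dots as a placeholder body, as the "any number of items" marker in a tuple type hint, and as an ordinary object that a function can return.

```python
from typing import Tuple


def not_written_yet():
    ...  # placeholder body; the function still runs and returns None, like pass


def evens_up_to(limit: int) -> Tuple[int, ...]:
    """Tuple[int, ...] means a tuple of ints of any length."""
    return tuple(range(0, limit + 1, 2))


def give_me_dots():
    # The literal ... evaluates to the built-in Ellipsis object,
    # so it can be returned and passed around like any other value.
    return ...


print(give_me_dots() is Ellipsis)
print(evens_up_to(6))
```

So the answer to the question raised here is yes: the object can be passed around and returned like anything else.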
Starting point is 00:23:28 It should work, right? It should work. Yeah, it should work. Yeah. Nice. I'll try it out while we go on to the next topic. Yeah, that one surprised me. Well done, Florian. Yeah, so the last one that I brought with me, actually, since I lead the data science and AI team, I've got to bring something with me that has to do with it.
Starting point is 00:23:50 So I brought with me the PyTorch Forecasting library. So Michael, you just used this analogy a couple of minutes ago, so I'm going to use an analogy now. So for me, PyTorch Forecasting looks like what fast.ai does for computer vision and natural language processing, it does for time series forecasting. Because there was like a lack of deep learning for time series forecasting.
Starting point is 00:24:27 And actually I think that PyTorch Forecasting is going to close this gap. So it comes in with a bunch of important features, actually. So it's built on top of PyTorch Lightning, which allows training on CPUs, single and multiple GPUs, basically out of the box. So there's been a lot of software engineering involved for all the data scientists in the past. And this library just makes it pretty simple. So you have to work very hard in order to mess things up with this library, I guess. So, what it also brings is an implementation of a model that is called
Starting point is 00:25:14 the Temporal Fusion Transformers. So this is from Google Research. And actually there's also a TensorFlow-based implementation. I'm going to put the link to the paper in the show notes. This is a very interesting model that has performed pretty well on a dozen prominent benchmarks very lately. And it has a very huge benefit,
Starting point is 00:25:41 which is that it is pretty interpretable. So it does actually calculate feature importance for you. So this is, in real-world applications, very important, because whenever you stick your data into these models and something good comes out, people will always ask you, okay, so what was the important part of the data? So how does it influence the model and the outcome? So Temporal Fusion Transformers, they do this for you. Also, PyTorch Forecasting comes with Optuna,
Starting point is 00:26:13 which is a popular library for hyperparameter tuning, which is also implemented in here. Right, there might be. So this does multivariate time series, multivariate time series, multivariable time series. Yeah, so the multi-horizon part of it is pretty important, actually. So go ahead, sorry.
Starting point is 00:26:33 I was going to say, so the hyperparameter tuning, you might say this part actually doesn't make any difference in the prediction, but this other part does. So pay attention to that, right? Yeah, absolutely. Yeah, this looks really good. So if you want to predict the future about sales, home prices, heart rate, whatever. It comes up all the time.
Starting point is 00:26:51 It comes up all the time. And I know from a couple of guys who work for the Google Clouds of this world and AWS, that within these software-as-a-service offerings or these APIs that they provide when, like, say, doing a demand forecast, they use these Temporal Fusion Transformers under the hood. So yeah, this looks great. Just spin it up and use it.
Starting point is 00:27:14 Yeah. Great recommendation. A follow up from the previous one, Brian, Will McGugan. Hey, Will. In the live stream, he says the dot,
Starting point is 00:27:20 dot, dot ellipsis is sometimes used as a sentinel value to mean no value when None is a valid value. So, yeah. Yeah, and also, yes, you can return it from a function. Nice. It's just fine. And then, let's see. Someone out in the live stream asked if it has methods.
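The sentinel use described here can be sketched as follows. The function `lookup` and its parameters are hypothetical; the point is only that `...` lets a caller pass `None` as a real default while the function can still tell when no default was given at all.

```python
def lookup(settings: dict, key: str, default=...):
    """Return settings[key]; fall back to default only if one was supplied."""
    if key in settings:
        return settings[key]
    if default is ...:
        # No default supplied at all, so a missing key is an error.
        raise KeyError(key)
    # A default was supplied, and it may legitimately be None.
    return default


print(lookup({"timeout": None}, "timeout"))
print(lookup({}, "timeout", None))
```

A dedicated `object()` instance works as a sentinel too; using the three dots is simply a compact alternative that reads reasonably well in a signature.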
Starting point is 00:27:39 Does it have methods or anything that you can do to it? That was Teddy. Yes, but only the built-ins, right? I don't think it, from object, I don't think it does anything interesting besides just be dot, dot, dot. And then Anderson, hey, Anderson, says it's a pity the ecosystem
Starting point is 00:27:52 is moving towards PyTorch Lightning. The separation of concerns there is not very nice. In my opinion, PyTorch Ignite does a better job in that aspect. Eric, that's all you. Yeah, fair enough. Still, I mean, one thing that you've got to keep in mind,
Starting point is 00:28:08 so speaking of separation of concerns, right, there's so many data scientists out there that if you throw like separations of concerns at them, they just answer like, yeah, here's my model. So what is separation of concerns in this sense, right? So if this works, if people
Starting point is 00:28:23 use it, it's probably good. Yeah, cool. Brian, extras? Extras. Oh, I just wanted to bring up that Python 3.10 RC2 is out. So the release candidate, the second release candidate for Python 3.10 is out
Starting point is 00:28:37 so people can play with it. Apparently we're like maybe a month away from getting 3.10. So I'm excited about that. Yeah, that's me. Very excited. Awesome. All right.
Starting point is 00:28:47 I got a couple to throw out there. Really? What a surprise. Can you imagine? What a surprise. Can you imagine? So remember we talked about several things. I talked about how I turned off all of the tracking stuff and all those things on the
Starting point is 00:29:04 website, which I think is good because so many people run ad blockers. It was like pretty inconsistent data anyway, inaccurate. Then I mentioned goaccess.io. I said, that'd be cool. Maybe we should apply it. I ended up writing a ton of automation to apply this to Python Bytes, Talk Python, Talk Python Training, all the things.
Starting point is 00:29:21 And it's pretty cool. I built some automation that will download all the NGINX log files, some of which are text, some of which are gzipped, and then run this thing across it and it will build like one giant monthly log thing. And then GoAccess can then turn that into nice, beautiful reports. So very excited to have GoAccess working well. And instead of running it on the server,
Starting point is 00:29:41 I actually just download and then run it on like a monthly report locally, which I think is kind of cool. All right. One, we had some feedback about Caffeinate. Remember Caffeinate? You can type Caffeinate on the macOS terminal and it'll keep your system alive.
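The log-merging step Michael describes (some Nginx logs as plain text, some gzipped, combined into one monthly file) could look roughly like the sketch below. This is not his actual automation; the function name `merge_logs` and the directory layout are assumptions made for illustration.

```python
import gzip
import shutil
from pathlib import Path


def merge_logs(log_dir: Path, output: Path) -> int:
    """Concatenate plain-text and gzipped logs in log_dir into one file.

    Returns the number of source files merged.
    """
    merged = 0
    with output.open("wb") as out:
        # Sort by name so the result is deterministic; rotated logs
        # may need a smarter ordering than plain alphabetical.
        for path in sorted(log_dir.iterdir()):
            if not path.is_file() or path.resolve() == output.resolve():
                continue
            # Gzipped files are decompressed on the fly; others are copied as-is.
            opener = gzip.open if path.suffix == ".gz" else open
            with opener(path, "rb") as src:
                shutil.copyfileobj(src, out)
            merged += 1
    return merged
```

The merged file can then be handed to GoAccess locally; the exact flags depend on the log format, so check its documentation.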
Starting point is 00:30:00 Nathan Henry said, you mentioned over in macOS the caffeinate tool. It says you can follow it with a long-running command to keep awake. So you could say like caffeinate python -c import time, time.sleep. So you could say caffeinate python and some script you want to run. So you could reverse it if that script doesn't use keep awake or I think that's what it was. Right. So you could apply caffeinate to your Python code and just say, no, stay awake while you're
Starting point is 00:30:31 doing this. Or you can even apply it to a running process using a PID. So it just stays awake while that process is running then? Yeah. And then it'll go away. Yeah. Oh, okay. Nice.
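The two caffeinate modes mentioned here, wrapping a command or attaching to a running process by PID, can be sketched from Python as follows. The helper name `caffeinate_command` is hypothetical; it only builds the argument list. The `-w` flag is caffeinate's option for waiting on an existing process.

```python
def caffeinate_command(command=None, pid=None):
    """Build a caffeinate argument list for a command or an existing PID."""
    if (command is None) == (pid is None):
        raise ValueError("pass exactly one of command or pid")
    if pid is not None:
        # -w makes caffeinate wait for the given process to exit.
        return ["caffeinate", "-w", str(pid)]
    # With a trailing command, caffeinate keeps the system awake
    # for as long as that command runs.
    return ["caffeinate", *command]
```

On macOS the result can be passed to `subprocess.run`, for example `subprocess.run(caffeinate_command(["python", "long_job.py"]), check=True)`, where `long_job.py` stands in for whatever script you want to keep the machine awake for.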
Starting point is 00:30:42 Yeah. So it's like the reverse of what we talked about then. Then Sean Tibor from Teaching Python said, isn't this what we were asking for? Remember we were talking about the keyboards, and here's a Python one. This is the M60 mechanical keyboard, the open source USB and BLE (Bluetooth Low Energy 5) hot-swappable 60% keyboard, powered by Python. So this one comes with Python built in, which is pretty excellent. So if people want to play with that, they definitely can.
Starting point is 00:31:12 The next one I want to throw out there real quick comes to us from Mark Little, a friend of mine here in Portland. And basically the subtitle is that, this is an article from CNBC Finance News, that open source is booming. So the headline has to do with MongoDB, but it's more broad. So if people are interested in kind of following up on that, it's kind of cool.
Starting point is 00:31:32 So MongoDB surged on Friday, which was last Friday. It's now worth as much as IBM paid for Red Hat. Databricks raised a private financing round at a $30 billion valuation. And just, you know, these are the mega open source companies, but it's pretty interesting to just give you a sense. Like I read this article, I thought, it's pretty interesting. These numbers kind of just like bounce off me.
Starting point is 00:31:55 But the one that made it stick for me was MongoDB was a private company for a while. Then it IPO'd, right? It had VC money, then IPO'd. Do you have a sense? Either of you have a sense for how much it IPO'd for? It seemed crazy, right? Like a 1.2, $1.4 billion. MongoDB is worth 30 billion now, right? So even after like the crazy IPO, you know, 1.2 billion to start and now over 30 billion. So that is an insane amount of growth in these. And then
Starting point is 00:32:25 they talk about Confluent and JFrog and a bunch of others, Elastic. If you kind of want to dig into the business side of open source, that's pretty interesting. All right, two more. I've been doing a ton of video encoding lately. I use FFmpeg for some of the audio processing and other types of things around both the podcast and the courses. So attribution here. This is from Jim Anderson. Sent this over. Thanks, Jim.
Starting point is 00:32:50 FFmpeg.wasm. So here's FFmpeg, which is a very popular tool in that world, but as a WebAssembly thing, which is pretty awesome. And I'm trying to remember what the name of the library was. But we did talk about it on Python Bytes, I think with Cecil Philip one time, maybe it was even him that brought it up. But there's a Python library that will run WebAssemblies. So not run WebAssembly in the browser or put Python in the browser, but reverse it. Like I have a WebAssembly library that does cool stuff.
Starting point is 00:33:22 Put it in my Python code and run it here. So you could take ffmpeg.wasm and pure Python and have like a no dependency sort of audio video processing tool in Python, which I think is pretty cool. Cool. All right. Last one. I told you we'd start with everything is fine. I'm going to end with everything is fine. Credit card stealing backdoored packages found in PyPI, Python's PyPI library hub. What? That's not good. This is not good. When you hear people talk about remote code execution, that typically is bad. Like, I'm on the internet, people send me bad stuff, now they have my computer and I don't even necessarily know it. So apparently in addition to this, these were found and removed. It was something, what was it?
Starting point is 00:34:08 It was something around the line of Noblesse, N-O-B-L-E-S-S-E and a couple of variations on that spelling. That was the problem. So I'm happy to see I didn't install that, but this doesn't make me happy. It looks like it's fixed. So the PyPI team also just patched a remote code execution hole in their platform, which potentially could have been exploited to hijack the entirety of PyPI.
Starting point is 00:34:31 That one makes me way more nervous than typosquatting or that weirdness. And it was a vulnerability in the way that they were doing GitHub Actions with PyPI, which allowed a malicious pull request to execute arbitrary code over there, which is not ideal. Yeah, but I'm glad to hear that's fixed. Anyway, everything's fine.
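For listeners who want to do the same "did I install that?" check, here is a minimal sketch using the standard library's `importlib.metadata`. The substring match on "noblesse" is an assumed heuristic based only on the name spelled out in the episode; it is not an authoritative list of the removed packages, and the function names are made up for this example.

```python
from importlib import metadata


def find_suspect_packages(installed_names, needle="noblesse"):
    """Return installed names containing needle, compared case-insensitively."""
    return sorted(name for name in installed_names if needle in name.lower())


def installed_distribution_names():
    """List the names of distributions visible to the current interpreter."""
    names = []
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip broken installs that have no Name field
            names.append(name)
    return names


if __name__ == "__main__":
    hits = find_suspect_packages(installed_distribution_names())
    print(hits or "no matching packages found")
```

A clean result here only says nothing with that name is installed in the current environment; it says nothing about other malicious packages.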
Starting point is 00:34:52 It doesn't feel fine. No, not at all. More like a nightmare, to be honest. Yeah, to be honest. Eric, anything else you want to share with us? No, just thank you guys again for having me on the show. Pretty fun.
Starting point is 00:35:10 And make sure that you guys follow me on Twitter. Awesome. We'll put a link in the show notes for your Twitter. No, we aren't done, are we, Brian? No, we need a joke. One thing is missing. Yeah, it's important. So this one is more of a,
Starting point is 00:35:25 not an ML one, it's more of a web API type thing. So, so often people will write web APIs and just return some kind of message in a JavaScript dictionary that says things like bad response or whatever, but you're supposed to use HTTP status codes, right? Like if there's a bad request, you should return the status code 400. If it's not found as an entity, you should return 404 or whatever. So here's like two kids at school exchanging messages and it has server on one of them, client on the other, and 200 on the message exchange here. And then at the bottom, the one kid that got the message reads the JavaScript and says
Starting point is 00:36:04 status code 400, detail, bad request. He's like, why? Why did you do this to me? This is good. Yeah, this is like little Bobby Tables. Let this be a lesson to you. You don't pass messages like that. Come on.
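The point behind the joke, returning the real HTTP status code instead of a 200 with an error tucked into the body, can be sketched without any web framework. The `get_item` handler and the `ITEMS` data below are hypothetical; only `http.HTTPStatus` and `json` are real standard-library pieces.

```python
import json
from http import HTTPStatus

ITEMS = {1: {"id": 1, "name": "example"}}  # hypothetical in-memory data


def get_item(raw_id):
    """Return (status_code, json_body) for a lookup, using real HTTP status codes."""
    try:
        item_id = int(raw_id)
    except (TypeError, ValueError):
        # Malformed input is the client's fault: 400, not 200 with an error body.
        return HTTPStatus.BAD_REQUEST.value, json.dumps({"detail": "bad request"})
    item = ITEMS.get(item_id)
    if item is None:
        # A well-formed request for something that does not exist: 404.
        return HTTPStatus.NOT_FOUND.value, json.dumps({"detail": "not found"})
    return HTTPStatus.OK.value, json.dumps(item)
```

In a real framework the status would be set on the response object rather than returned in a tuple, but the rule is the same: the code on the envelope should match the message inside.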
Starting point is 00:36:18 It's so true. It's totally true. Totally true. All right. Well, that's it for our jokes and everything, Brian. Yeah. Well, it was another fun Wednesday on Python Bytes. Absolutely.
Starting point is 00:36:29 Thanks, Eric. Thanks, Brian. Yeah, thanks, Eric, for being here. Thanks a lot, guys. See you around. Bye, all. Bye. Thanks for listening to Python Bytes.
Starting point is 00:36:39 Follow the show on Twitter via @pythonbytes. That's Python Bytes as in B-Y-T-E-S. Get the full show notes over at pythonbytes.fm. If you have a news item we should cover, just visit pythonbytes.fm and click Submit in the nav bar. We're always on the lookout for sharing something cool. If you want to join us for the live recording, just visit the website and click Livestream to get notified
Starting point is 00:36:57 of when our next episode goes live. That's usually happening at noon Pacific on Wednesdays over at YouTube. On behalf of myself and Brian Ocken, this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and colleagues.
