Python Bytes - #236 Fuzzy wuzzy wazzy fuzzy was faster

Starting point is 00:00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 236, recorded June 2nd, 2021. I'm Michael Kennedy. And I'm Brian Okken. And I'm Anastasiia Tymoshchuk. Hey Anastasia, so great to have you here. Nice to have you on the show. Thank you for inviting.

Starting point is 00:00:18 Yeah, absolutely. Why don't you tell people a little bit about yourself before we get into the topics? So I'm joining from Germany, Berlin, remotely right now. And I have a little one, a baby dog joining as well. You might hear him on the stream. I am originally from Ukraine. I'm not German. I moved to Germany around five years ago, maybe five and a half.

Starting point is 00:00:39 And my passion is Python. I used to be a C++ developer, game developer, and so many more languages. But the best one, I think, for me is Python. So I decided to stick with it for around eight years now. Oh, how cool. I started out doing my professional programming in C++. And I know Brian still touches a little bit of C and C++ in his world. So that's cool.

Starting point is 00:01:02 Yeah, it's half my life. Nice. And what kind of games? Well, they were adapted first for iPad. They were like 2.5D games, and then later on it was mostly 3D games with Unreal Engine. Yeah. Oh, cool. Yeah, that's awesome. All right, well, once again, welcome, welcome. So glad to have you here. Brian, do I have the first item this time around? No, you do. Go for it. Okay. What do you got for us? I was excited to see there was a tweet recently by Matthew Feickert that said, I need to give some serious praise to a fellow Scikit-HEP dev, Hans Dembinski, on his excellent Monolens tool for

Starting point is 00:01:53 interactive simulations of color blindness. So I checked this out. So Monolens is a Python package and you can pip install it and, as Matthew said, you can pipx install it so you just always have it around, which is nice. And it just pops up this really cool window, and you just drag it around and it makes whatever the window is over, all over your desktop, black and white instead of color. So you can see what it looks like in grayscale. One of the things I really liked about this is the example showing it with Matplotlib and plots. Because plots are really where you're using color to distinguish between the two

Starting point is 00:02:46 different sets of data. So you really kind of want that data to look different, even if people don't see color. So that's an important thing. So that was neat. And then somebody replied to that and said, hey, I always try to use CMasher, smasher, I'm not sure, to make sure they're colorblind friendly. So I'm like, I've never heard of this. So I went and checked out CMasher, and what it is is a bunch of color maps, so you don't really have to think about it. So there's all these great named color maps,

Starting point is 00:03:26 and they're actually fairly attractive color changes. But it shows you what they look like in black and white also. So this is kind of a little demo at the top that we're looking at on the stream. But the code that you have to, it's just kind of built into Matplotlib already. It's also kind of an extension to Matplotlib and other things that use color maps. So when you're plotting, you can just specify a color map like rainforest or something, and it automatically is a colorblind-friendly color map, so you can do your plots and have it still look nice everywhere.
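To make the grayscale point concrete, here is a minimal standard-library sketch. It is not how Monolens or CMasher work internally; it assumes the common Rec. 601 luma weights, and the `luma` and `distinguishable` helpers and the `min_gap` threshold are hypothetical names and values chosen for illustration. It shows why two colors that look very different in color can collapse to nearly the same gray.

```python
def luma(rgb):
    """Approximate perceived brightness of an (R, G, B) color, each 0-255.

    Uses the Rec. 601 weights; a real tool may use a different conversion.
    """
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def distinguishable(color_a, color_b, min_gap=25.0):
    """Return True if two colors still differ clearly once reduced to gray.

    min_gap is an arbitrary illustrative threshold, not a standard value.
    """
    return abs(luma(color_a) - luma(color_b)) >= min_gap


red = (255, 0, 0)
green = (0, 130, 0)
dark_blue = (0, 0, 139)
yellow = (255, 255, 0)

# Red and this green are easy to tell apart in color,
# but their gray levels are almost identical.
print(distinguishable(red, green))
# Dark blue and yellow stay far apart in brightness.
print(distinguishable(dark_blue, yellow))
```

A check like this could be run over the colors of a plot before publishing it, which is roughly the problem a perceptually ordered color map is meant to solve for you.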

Starting point is 00:04:07 Oh, yeah. This is really cool. And Matthew, friend of the show, thanks for sending that in. I never really thought about this and I should have, you know, I mean, I feel like maybe I should go over my websites

Starting point is 00:04:18 and go, do they look terrible for people who have, you know, color vision impairments and whatnot? Yeah. So really cool. And it looks like it's this independent thing that will just go over, you just move your mouse around, it works on anything. It doesn't necessarily have to do with Jupyter or Matplotlib or something like that, right? Right. So Monolens is just something that works on anything. I drug it over even my desktop, my background, and it showed the picture in black and white. So it is cool. The other thing is, wait, there's color maps I can

Starting point is 00:04:51 just add to Matplotlib? That's cool, like rainbows and stuff. How neat. I didn't know you could just do that. So that's kind of a neat thing. And then, for instance, one of the examples that they have on the CMasher README is just sort of a simple plot. And Matplotlib kind of just picks colors for you unless you specify colors for different plot lines. But you can give it a color map instead of a specific list for each item. So, and that's just kind of nice. Why not do it?

Starting point is 00:05:30 Yeah, why not do it? Anastasia, what do you think? Oh, it looks amazing, really. And it's super helpful. Yeah. When you were doing the video games. I never thought of that, but that would be great to use it as well. For sure.

Starting point is 00:05:40 When you were doing games, did you have to think about this kind of stuff? No, actually we were not that far at that time. It was around seven years ago, eight. Yeah. On the Monolens site, one of the examples they show is having one of the plots use some sort of pattern underneath and not just color. And I'm not sure how to do that. So people that are great at Matplotlib probably know how to do that right away, but that's kind of a neat

Starting point is 00:06:09 idea also, to have like one of the graphs have hashes versus stars or slant lines or something like that. Oh yeah, to have like some sort of ASCII differentiator. Yeah, yeah. This is super helpful, and Matthew, again, thanks for sending it in. And Joy, yeah, welcome to the live stream. Thanks for being here for the recording. So the next one I want to talk about is something called RapidFuzz. RapidFuzz. Yeah, so last time I talked when we had Vincent on,

Starting point is 00:06:40 I saw the FuzzyWuzzy text matching for that chatbot that he was showing off. I thought, oh, FuzzyWuzzy is cool. So Mikael Honkala sent in RapidFuzz. And it's very much like FuzzyWuzzy, but it turns out to be a whole lot faster. And it uses some of the

Starting point is 00:06:59 same ideas, but, you know, coming back to some of the things we were talking about, it is basically written in C++ using the Levenshtein distance algorithm for word similarities, but obviously has a Python API that we all work with. And so, yeah, it's pretty neat. It's really easy to work with. You just, again, pip install it, and then you can come down here and do things like fuzz.ratio and you can give it two sentences, "this is a test" or "this is a test" exclamation mark, and it says that's 96.5% the same. Or you have "fuzzy was he was a bear", I guess these are, yeah, "fuzzy was he was a bear", I guess those are the same? No,

Starting point is 00:07:39 fuzzy, fuzzy, oh, was he fuzzy? Yeah, I gotta read better. "Was he fuzzy was a bear" versus "fuzzy was he was a bear". Oh my goodness. That's 90% the same. Given a bunch of phrases, you can sort them by similarity. You can use selection, like, you know, sort of call center type of automation: given three choices and given some text, you can say, find one, you know, like Atlanta Falcons, New York Jets, New York Giants, and so on. Somebody says, you know, lowercase New York Jets instead of uppercase, it'll say, well, here's the likelihood that that's a match, but here's another possible match, and it gives you the ratios of how good of a match it is. So if you've got a select set of choices and you're asking for input on it, you can just say, well, give me the closest match. And if it's anywhere close, you can just run with that. So yeah, pretty neat, right? That is pretty cool.
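As a rough illustration of the Levenshtein idea mentioned in this segment, here is a pure-Python sketch. It is not RapidFuzz's implementation (which the hosts say is C++), and its scores will generally not match `fuzz.ratio`, since the 96.5 quoted above implies a different normalization. The names `levenshtein`, `similarity`, and `best_match` are hypothetical helpers, not RapidFuzz API.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Scale the edit distance to a 0-100 score (one possible normalization)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein(a, b) / longest)


def best_match(query: str, choices: list[str]) -> str:
    """Pick the choice closest to the query, ignoring case."""
    return max(choices, key=lambda choice: similarity(query.lower(), choice.lower()))


teams = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
print(best_match("new york jets", teams))
print(round(similarity("this is a test", "this is a test!"), 1))
```

The quadratic loop here is the textbook dynamic-programming version; a library written for speed would be expected to use faster, more specialized algorithms.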

Starting point is 00:08:36 Yeah. And the other thing that's interesting is the performance. And before people tell me that all benchmarks are broken and they don't work, sometimes at least they give you a sense. So here's some of the things that they've got in terms of performance, say versus Fuzzy Wuzzy, and the numbers are like 10 or 20 times faster. Definitely broken. It's definitely broken. I think it's because it's written in C++ instead of Python at most of its core,

Starting point is 00:08:57 you know, probably. But anyway, if you're looking for Fuzzy text matching, Fuzzy Wuzzy is a good option. And apparently, thanks to Mikko, RapidFuzz is as well. So yeah, pretty neat. Yeah, we probably should do a segment on benchmarks at some point.

Starting point is 00:09:10 No, no. No. No, we should do it. But I've written blog posts and stuff on it. And it's just an endless battle of you're doing it wrong. Your situation is not my situation.

Starting point is 00:09:22 And in my situation, it's not as good, or it's worse, or it's better, or you're... Yeah, no, I hear you. It would be interesting, but at the same time, yeah. Okay, there we go, we just had a section on benchmarks. Yeah, I've already just explained like the emotional trauma that I'll go through from receiving all the feedback. Now, Anastasia, what do you think about this matching? Well, maybe next time we can organize a battle between them. That's right.

Starting point is 00:09:47 Yeah, we'll bring some in. Yeah, sure. Do you have any use for this fuzzy text matching, string matching stuff? Well, actually, yes, at work. We have lots of matching algorithms, but we are using different tools. And I'm not a data scientist person,

Starting point is 00:10:03 but I would love to try that, actually. Looks super cool. We use some C++ libraries. Cool. Robert out there in the live stream says, we would have to benchmark the episode if we had an episode about benchmarking. You see, it's like recursion.

Starting point is 00:10:18 Save that thought for the end of the show, by the way. All right, Anastasia, you're up next. Structured logging. Tell us about it. Well, a few years ago, I went to a meetup and I heard a talk from Markus Holtermann about structlog. That's the first time when I heard about this and I

Starting point is 00:10:36 decided to give it a try. And actually I fell in love with it and I'm using it since at least two and a half years, maybe two. It's an awesome way to bring a bit of structure to your logs to make them more visible and more usable because usually how we log, it's like just one huge sentence which is readable by humans, but it's not machine readable.

Starting point is 00:11:02 And the idea is here to bring more structure to build some dashboards based on different keys and then values and then see what's actually happening with the system without touching the logs, without scrolling through the whole log and then just reading a whole bunch of things. And I already used it in production. It looks pretty well. If you try using JSON format, just fantastic. Oh, how cool.

Starting point is 00:11:33 Yeah, you can pass it all these like processors and type stuff. So you can say render out the print, the stack info, the log level, timestamp, all those kinds of things. That's neat. We added a bunch of processors, like custom-made, which were specifically designed for our applications,

Starting point is 00:11:51 which made the life of our DevOps parsing the logs way easier because they didn't have to write them by hand. And if you use structured logs for all applications, not just one, but, for example, microservices, and you pass the key ID or trace ID or something that will identify the path which the log goes through, then you might see what happened before the bug happened. Because if you want to see how the system is working, you also need to be either one of the detectives of the system or use structlog.

Starting point is 00:12:34 It's interesting. When you log out stuff, it looks like you can just do keyword arguments and those will add to the log really nicely. So you don't have to create a message that you're going to send that embeds, you know, variable equals value, variable equals value. You just pass them to the log message and they become part of the message like that.

Starting point is 00:12:55 That's cool. Yeah. And you can also use the initial message, which is an event like greeted here as some kind of key, which would give more clues where this message is coming from and what type of event happened instead of a usual message. Yeah, nice. Very cool. The other thing it says is if you have Colorama installed, it will automatically render in nice colors.
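The key/value idea described here can be sketched with only the standard library. This is not structlog's API or implementation; it is a hand-rolled approximation, and `JsonFormatter`, `log_event`, the logger name, and the example field values are all hypothetical. It shows an event name plus keyword arguments being rendered as one machine-readable JSON line.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON object instead of a sentence."""

    def format(self, record):
        payload = {
            "event": record.getMessage(),
            "level": record.levelname.lower(),
        }
        # Extra key/value pairs attached by log_event() below.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


def log_event(logger, event, **fields):
    """Log an event name plus arbitrary key/value context."""
    logger.info(event, extra={"fields": fields})


# Write to an in-memory stream so the output is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("structured_demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

log_event(logger, "greeted", user="example-user", trace_id="abc123")
line = stream.getvalue().strip()
print(line)
```

Because every line is JSON with stable keys, a dashboard or log pipeline could filter on something like `trace_id` across services without parsing free-form sentences, which is the benefit described in the segment.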

Starting point is 00:13:19 That's very neat. I love Colorama, and I love having colors in the code that we look at. It really makes a nice difference. So yeah, you get things like the colored, whether it's an info message or an error and whatnot. Yeah, very neat. I like it. I keep meaning to use this more and I know I'm glad you brought it up because I definitely want to try this. Definitely try this.

Starting point is 00:13:42 Yeah. Yeah, this is a really good one. This is new to me, but quite neat. All right. Not new to me, but also quite neat is our sponsor for this episode. So this episode is brought to you by Sentry. So how would you like to remove

Starting point is 00:13:55 a little stress from your life? Do you worry that users may be having difficulties and encountering errors with your app right now? Would you even know until they send that support email? I mean, yes, maybe using structlog, but are you watching the structlog now? You don't know, right? So how much better would it be if you had that error or performance details immediately sent to you with the call stack and local variables and active user and all that stuff. And with Sentry, it's not just possible, it's easy. We use Sentry on all of our

Starting point is 00:14:22 web apps, pythonbytes.fm, Talk Python Training, all those kinds of things. And we know if there's some kind of problem. It's unfortunate if someone hits a problem, but it's better to know and be able to fix it right away. In fact, one time somebody ran into a problem over at Talk Python Training, getting a course, and got the message. I could see who was logged in when they had the problem,

Starting point is 00:14:40 and I actually fixed the bug and was about to push out the changes, and I got an email, hey, I'm having a problem with your site. I'm like, yeah, I know. I just fixed it. Try again, please. And they were quite surprised.

Starting point is 00:14:51 So surprise and delight your users today. Create your Sentry account at pythonbytes.fm slash Sentry. And please, when you're signing up, click the got a promo code redeem option and enter Python Bytes. It's not automatic. And they'll make sure that

Starting point is 00:15:04 you enter Python Bytes as the promo code. Otherwise, they won't know it's from us. You'll get a bunch of cool stuff. Two free months of the team plan with many more errors and events and other features as well. So check them out at pythonbytes.fm/sentry. That's pretty awesome. Brian, I guess you should probably also test your code maybe

Starting point is 00:15:19 before you end up with errors. What do you think? Definitely. And actually, before we go on, I think I mentioned this before, but the graphic on the Sentry page is so cool. I know. I really like it, too. I love the upset console terminal reading of paper. Yeah.

Starting point is 00:15:36 So this is kind of like Inside Baseball, maybe. But I don't know. Maybe three people might care about this. But anyway, I'm one of them. So xfail now works with pytest-subtests. So it's neat, but I got to explain it a little bit. So subtests are kind of this weird feature of unittest that came along in Python 3.4. And it's a context manager so that you can have possibly several places where your test might fail, but continue. It doesn't stop if it fails. And that was within unittest. PyTest had,

Starting point is 00:16:16 well, PyTest had pytest-check, the plugin that I wrote that allows something similar, a context manager. But then pytest-subtests came out, which was a plugin that started in about 2019 that allowed you to run the unittest subtests from within PyTest. But there's also a PyTest style of doing subtests also. They're a bit quirky. So I'm linking two resources, an article by Paul Ganssle and an episode of Test & Code where he and I talk about subtests.

Starting point is 00:16:54 And so before you jump in and use them right away, you should know some of the quirks about them. But they're still cool if they work for you. But one of the quirks that was around for a long time was that xfail didn't work. And xfail is a way to say, I know my test is going to fail. But, you know, and then you get to decide whether or not you want to mark it as an

Starting point is 00:17:14 xpass or mark it as a fail, if it fails. And anyway, xfail didn't work with subtests,

Starting point is 00:17:23 but it does now as of like the start of the month. So somebody named maybe Sibber on GitHub. Maybe. Maybe. Merged a fix or submitted a fix as a pull request and it got merged and it's now in version 0.5.0. So xfail, if you wanted to use subtests, xfail now works with them.

Starting point is 00:17:43 So that's the good news. Yeah, yeah, this looks really interesting. So the basic idea is I want to loop over a bunch of scenarios or whatever, maybe test them all and then have the test fail if any of them did, but actually just go through them all before. Yeah, so like on the subtests site,

Starting point is 00:18:01 there's a little example. So like, let's say you're looping through a range and you want to run all of them within, not a parameterized, just within the test, you're doing like several things and you can, yeah. And if something fails, you want to actually report all of the failures. And this is, you know, sort of helpful with loops, but you know, why not just use parameterization? But the one part where it does really help is if you really are checking four or five different things and you really want to know like let's say you're measuring something or you're checking

Starting point is 00:18:35 several dimensions of something. And having all of the failures together would help you determine what the real problem is. So when you have to have all the information, this is a good idea. Very cool. Anastasia, what's the testing story in your world? We use mostly parameterized testing because we don't have the subtest need.

Starting point is 00:18:59 We don't need to test it multiple times, maybe in the future. Yeah. It might be useful. Parameterized works, so I'd stick with it. So yeah, it's definitely good. All right, another thing that I think is really neat to talk about, but I feel like it's almost down to the benchmark type of situation, is what do you do with the secrets in your application? There's shhgit, which is always terrifying.

Starting point is 00:19:27 If you go here, you can see, oh, here's all the code that we found in this branch of this GitHub repository. For example, here's your, you know, database connection string with username and password right there, right? So you can see all kinds of issues. If you go over here,

Starting point is 00:19:43 like even a live stream, if it doesn't feel bad enough, you like watch the live stream of all the things that are coming in. Like right now, apparently there's some username and password and a URI and some kind of private key and whatnot. So you don't want that. So what do you do? Well, there's all kinds of things you can do. Do you encrypt those secrets and put them in source code? Well, then where do you store the encryption key?

Starting point is 00:20:05 There's some kind of certain types of vaults you can install on your server, kind of like one password, but for servers, you could do that kind of thing. There's just leave it in there and hoping for the best. There's putting environment variables. That's a very, very common one, right? But still, no matter what you pick, you kind of got to get that data back and deal with it. So I want to introduce you to Pydantic. Brian, you've heard of Pydantic, right? Yeah. In fact, I didn't know this had anything to do with secrets. Yeah. If you go to Pydantic right here at the top, I believe there might be some nice little comment here. Oh, yeah. I thought, I thought you were in here,

Starting point is 00:20:47 but apparently I'm in here right now. I think it toggles between us. Anyway, yeah, so we've known, the point is we've really talked about Pydantic a lot. It's a really cool way to create these classes that are kind of like data classes, point them at some data source, and then they validate it and adapt it, right? So if I've got like a JSON document

Starting point is 00:21:04 and it has a field in it, and that field is a list of something, I could say in my model, this thing has a list of integers. And if it happens to be quote a string or a number that has quotes on it, it'll just automatically do the int parse type of thing to get it fixed.

Starting point is 00:21:19 Or it'll tell us that it couldn't figure out what to do with the third value, something like that. It's really fantastic. But what I also didn't know was that it has built-in support for working with these user secrets. So Dennis Roy pointed this out to me. And there's all kinds of things. You can have the .env file. You can have Docker secrets. You can have environment variables. And all of these things as your secrets. And if you just derive from,

Starting point is 00:21:47 instead of base model, you derive from base settings, then this will automatically determine any of the fields that are not passed to it from the environment or from .env files. What do you think? Well, that's cool. Where do the .env files go? Not in GitHub.

Starting point is 00:22:03 Okay. You know, you store them somewhere else, right? Ideally, I think, you would store like an ENV template file that has, you know, put this value and then the real value here. This value and the real value there. And then you, of course, gitignore the other one, the real one, right? So you at least have a structure. But so the idea is you come down here and say, I've got these settings and we've got like an API key and auth key. We've got a Redis connection, all those kinds of things. And you can even say, I'm going to put a prefix on it. So

Starting point is 00:22:37 in your environment variables, it's fine if you've got one app and one server, but if you've got 10 apps running or 10 APIs running on your server, what does the API key refer to? What does the database connection string with the database name in it refer to? Which one of those 10 apps, right? So you can put a prefix. So you could have like login app API key or login app API key. And you put that in there and it automatically will just let you access it as if it's API key. So you can sort of configure an environment a little bit better. There's just lots of really neat things that you can do in here to make that work. You can say whether it's case sensitive. Let's see, let me pull up. I had to take notes, some other things I thought were super

Starting point is 00:23:19 cool. So it's a regular Pydantic model, which means it'll do all the conversions and the validation. So if something is missing that's required from your environment, it'll let you know exactly what's missing. It'll do those conversions. Yeah, all sorts of stuff. It has support for raw secrets files as well, which is like a slightly different way to do it. You can have differently named ENV files like a prod.env versus unad.env or whatever.

Starting point is 00:23:48 All sorts of settings. So I've always thought Pydantic is amazing, and I had no idea it had this built-in support for working with this. The other thing that's really cool about this is, if you go back to the top where it describes it, it says it will try to get these values from the environment if you don't pass them over. So if you're in, say, a testing environment and you want to actually pass values that would control it, you could just explicitly pass them along instead of having

Starting point is 00:24:14 them come from the environment. So it's really easy to test, you know, set the test values instead of trying to configure a test environment. Nice. We do use it, by the way, base settings. But we didn't use prefixes. Yes. Yeah. Which is a good idea. Yeah, the prefixes are cool if you have a bunch of apps. If you just have one, it doesn't really matter, right? Yeah, of course. Cool. You like this? It's working well for you?
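The behavior discussed in this segment (prefixed environment variables, type conversion, errors for missing values, and explicit overrides for tests) can be approximated with the standard library. This is a hand-rolled sketch of the idea, not Pydantic's settings API; `Settings`, `load_settings`, the field names, and the uppercase naming convention are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    api_key: str             # required, no default
    redis_port: int = 6379   # optional, with a default


# Field name -> converter applied to the raw environment string.
_CONVERTERS = {"api_key": str, "redis_port": int}


def load_settings(prefix="", environ=None, **overrides):
    """Build Settings from explicit overrides first, then prefixed env vars."""
    environ = os.environ if environ is None else environ
    values = dict(overrides)
    for name, convert in _CONVERTERS.items():
        env_name = (prefix + name).upper()
        if name not in values and env_name in environ:
            values[name] = convert(environ[env_name])
    try:
        return Settings(**values)
    except TypeError as exc:
        # Raised when a required field is missing or an unknown one is passed.
        raise ValueError(f"invalid settings: {exc}") from exc


# A fake environment keeps the example independent of the real one.
fake_env = {"LOGIN_APP_API_KEY": "dummy-key", "LOGIN_APP_REDIS_PORT": "6380"}

from_env = load_settings("login_app_", environ=fake_env)
for_tests = load_settings("login_app_", environ=fake_env, api_key="test-key")
print(from_env.redis_port, for_tests.api_key)
```

Passing `api_key="test-key"` directly mirrors the testing tip above: values handed in explicitly win, so a test does not need to set up a whole environment.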

Starting point is 00:24:36 Yeah, it's working perfectly well. And we are committing on the development version with some dummy keys just to have them around, of course. Oh, wow. How neat. Okay. Well, cool. Well, that's neat that you're using it.

Starting point is 00:24:48 Brian, you got the next one. Is that right? You've already done it. No, but I just wanted to mention the, oh, wait. Never mind. I had the wrong thing. Oh, here we go. The quote I think you were looking for was from FastAPI.

Starting point is 00:25:04 Oh, yes, yes. Of course, of course. Yeah, it is. I'm over the moon. Yeah. Super excited about it. Yeah.

Starting point is 00:25:10 Fast. Thanks. We use it. I love FastAPI as well. And to me, like, Pydantic and FastAPI, they go together because I learned about them at the same time. I know there are different people in different projects, but you know. It works like magic. Yeah, yeah, absolutely.

Starting point is 00:25:25 It really is, yeah. And if it's not magic, maybe you should document it. Or maybe it is magic, you should document it. Definitely, definitely. Actually, I'm the one who is usually bringing this topic to the team, how to write documentation. And first, the question is why to write documentation? Everyone knows that we need documentation, but it's hard. It's time consuming. It's annoying.

Starting point is 00:25:50 And how it usually happens, someone leaves the team and then the last days are about handing over everything. Oh my gosh, I remember I've had this experience twice at least. Writing? Where it's like, oh, you said, where you said you're going to, you've given me your two weeks.

Starting point is 00:26:09 So your next two weeks, your two weeks notice that you're going to leave. Your next two weeks will be to start writing documentation for everything you've ever worked on and anything that people might need to do. So your next two weeks are to begin writing documentation that you should have been doing the whole time. In Germany, we have notice period of three months. So like, it's a lot of documentation. Just kidding. But normally, even if you leave the team, like you, for example, move from one team to another, it doesn't mean that you have to leave the company. Still, you have to hand over everything that you worked for, let's say, in a year or even half of the year. And, for example, in my experience, when I started with Python, I didn't know any Python.

Starting point is 00:26:52 I had to learn it. And, of course, I didn't know about Sphinx or Read the Docs or any kind of documentation for Python. And what did I do? Nothing. I didn't write it. And half a year later, I was wondering who wrote this code. So I did git blame. And of course, it was me. And I was like, what a stupid person. So yeah. And I suggest to start writing documentation now, even if you're not leaving the team. The reason why I'm bringing

Starting point is 00:27:19 up Sphinx and Read the Docs is that it will allow to have continuous documentation. And with Sphinx, you can easily write just some doc strings, which will explain what the function does, what the class is doing, add some input output parameters, and then you will automatically generate it. So there's no need to write it somewhere on Confluence or any other source. Because if there are too many sources, that's where the documentation will die. Because no one will go and check it. And during the handover, usually it happens like that. You write documentation somewhere where nobody knows where and nobody reads it.
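As a small illustration of the kind of docstring being described, here is a sketch using the reStructuredText field-list style (`:param:`, `:returns:`, `:raises:`) that Sphinx's autodoc extension can render. The `normalize_score` function is hypothetical and exists only to carry the docstring; the Sphinx project setup itself is not shown.

```python
def normalize_score(raw: float, maximum: float) -> float:
    """Scale a raw score into the range 0 to 100.

    :param raw: The unscaled score.
    :param maximum: The largest possible raw score; must be positive.
    :returns: The score as a percentage of ``maximum``.
    :raises ValueError: If ``maximum`` is not positive.
    """
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    return 100.0 * raw / maximum


# The docstring lives next to the code, so it is available at runtime
# and to documentation generators alike.
print(normalize_score.__doc__.splitlines()[0])
```

Keeping the explanation in the docstring means the documentation moves with the code through reviews and refactors, which fits the point above about avoiding many scattered sources.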

Starting point is 00:28:01 Yeah, you pointed out that you've got it in Jira and you've got it in GitHub and you've got it in all different places. Google Docs, yes. Yeah. Especially Google Docs. Oh, yes. And then you share like 10 Google Docs with different people

Starting point is 00:28:16 and then they lose the links and people are leaving. It's nice when people are leaving the team, but it's not nice to the people who are leaving the team to another team because they are getting all the questions for a year. Where to find this? How can I get this function?

Starting point is 00:28:33 How to get this data? Yeah, yeah. Very good advice. You know, for a long time, Sphinx was like synonymous with reStructuredText, but now we've also got Markdown with the MyST parser there. So that's very cool as well. I'm a fan of Markdown instead. Yeah.

Starting point is 00:28:50 And also Sphinx itself supports different types of documentation. For example, you can write code reference, then you can go through all the code. And then you can also write extra documentation like Markdown. Even a README can be included into documentation, and you can also style it. Oh, nice. Yeah, very cool. Yeah, there's lots of great themes to it too now. It really looks attractive. Yeah, you did recently cover that, right, Brian? The Sphinx themes? Yeah, and actually, when the Markdown support came on, that's when I went back and started looking at Sphinx. So some of our documentation is done in Sphinx now

Starting point is 00:29:31 because it does Markdown. And you can even make it do, it's not built in, but you can make it read doc strings and interpret doc strings as Markdown. Yeah, very cool, very cool. Robert on the live stream has an interesting addition to continuous integration and continuous delivery.

Starting point is 00:29:50 So can we deploy yet? Only if the documentation is complete. Definitely. Very cool. All right. Well, that's it for our main topics. Brian, you got anything you want to share? Any extra stuff you want to throw out there? Mostly, I'm curious about PyTest uses.

Starting point is 00:30:06 So I'll drop a link in the show notes, but basically I've got a pinned tweet on my Twitter. And I'd like to have people tell me where they're using PyTest. So I've got some examples. And then I kind of went, my first question was people, projects that have switched. But I was looking at just the guts of how Python works.

Starting point is 00:30:32 And there's some amazing projects that use PyTest, like Wheel, pip, Setuptools, Warehouse. Those all use PyTest. That's pretty cool. Wow. How interesting. Yeah. And those are sort of almost inside of Python, which is interesting, because they're not using unittest, right? Yeah, so then I just

Starting point is 00:30:48 learned about recently, even if it's proprietary, that'd be interesting. I just learned that Stripe and Lyft went through a PyTest conversion recently, so that's kind of neat. Yeah, that's cool. Yeah, yeah, very cool. Anastasia, anything else you want to throw out there or let people know about while we're here? Yeah, maybe

Starting point is 00:31:04 using exceptions. Don't use base exceptions. Yeah, create custom ones for your app or have certain... Absolutely, I definitely second that idea. All right. Brian, this was in danger of almost being an extra, extra, extra, extra, extra.
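
One way to read the advice about custom exceptions is sketched below. All class and function names here are hypothetical, chosen only for illustration; the idea is that an application defines its own small exception hierarchy so callers can catch exactly what they expect instead of a bare `Exception`.

```python
class AppError(Exception):
    """Base class for errors raised by this (hypothetical) application."""


class RecordNotFound(AppError):
    """Raised when a requested record does not exist."""

    def __init__(self, record_id):
        super().__init__(f"record {record_id!r} not found")
        self.record_id = record_id


def load_record(store, record_id):
    """Look up a record, translating the low-level KeyError into an app error."""
    try:
        return store[record_id]
    except KeyError:
        raise RecordNotFound(record_id) from None


def describe(store, record_id):
    """Catch only the specific error we know how to handle."""
    try:
        return f"found: {load_record(store, record_id)}"
    except RecordNotFound as err:
        return f"missing: {err.record_id}"
```

Because every custom error derives from `AppError`, a top-level handler can still catch the whole family in one place without swallowing unrelated failures.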

Starting point is 00:31:19 Hear all about it, so I'll just go quick. So Matthew Feickert is getting a couple of shout-outs on this show. So he also pointed out that, whoa, super cool, pipx, which we've talked about on the show before. It lets you install Python tools kind of like Homebrew or apt. They're not part of a project, but you want to have them managed and installed in their own isolated environment. So you pipx install a thing instead of pip install, which is great. That is now officially part of PyPA, the Python Packaging Authority. So yeah, pretty cool.

Starting point is 00:31:48 So pipx is now sort of officially part of Python. Not Python, the distribution, but the group, you know. Next, I will be presenting-ish. It's recorded, but then there's like a live Q&A afterwards. Manning is having a conference on developer productivity. I don't honestly remember what my talk is going to be about. Oh, yes, here it is. It's 10 tips and tools you can adopt in 15 minutes or less to level up your developer productivity. So I'm going to be speaking on

Starting point is 00:32:15 that. All sorts of fun things. So if you want to check that out, it's free to register for. It's later this month, I guess. Here's just a thought I would throw out there for you. I don't expect an answer, but yikes, cloud bills can pile up. Alex Chan, who is teaching, I guess, I couldn't figure out exactly the context of this, but put out a tweet that said, I have a panicked student in my DMs who accidentally racked up an $8,000 AWS bill.

Starting point is 00:32:43 My suggestion of talk to support is no good. Apparently they won't issue a billing adjustment. Anyone got ideas out there? Help? Oh, no. Could you imagine as a student? I mean, as a professional, it's still a lot of money, but as a student, $8,000 is like a ton of money.

Starting point is 00:33:00 Yeah, it's like a term of bills. Yes, exactly. Yeah, like a semester. It depends on... I get an announcement like, your bill is now at $50. Your bill is at $100. Your bill is now at $500. Your bill is now at $1,000. And if it goes beyond that, I'm going to have to start paying a lot of attention to what's going on with my AWS account.
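
The stepped alerts described here are normally configured in the cloud provider's own billing settings. Purely to illustrate the logic, a small sketch follows; the function name and the polling idea are assumptions, and nothing below calls a real AWS API.

```python
def newly_crossed(thresholds, previous_total, current_total):
    """Return the thresholds passed between two billing readings, ascending."""
    return [t for t in sorted(thresholds) if previous_total < t <= current_total]


# Threshold amounts taken from the conversation, in dollars.
ALERT_THRESHOLDS = [50, 100, 500, 1000]

# Hypothetical readings: the bill went from $40 to $120 since the last check.
for limit in newly_crossed(ALERT_THRESHOLDS, 40, 120):
    print(f"Your bill is now at ${limit}.")
```

Each threshold fires once, when the running total first passes it, which is what keeps the notifications from repeating on every check.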

Starting point is 00:33:32 So just put these alerts on there. It's usually easy with whatever platform you're on. Anyway, don't be that poor student. All right, what's next? Brian Skinn shouted out, hey, this might not be a total new item, but maybe we can mention it. Maybe it's interesting.

Starting point is 00:33:47 Developed a Flake, mentioned a Flake, he didn't develop it, I don't believe, a Flake8 plugin for FastAPI. So if you're doing FastAPI, there's different ways to do things like routes and whatnot. And there's like the natural way

Starting point is 00:34:01 and there's sort of a clumsy way. And so here's a Flake8 thing to make sure you're using FastAPI. Nice. Yep. And I think, yeah, and I think this is my last one. It is my last one here. So Saul Shanabrook tweeted,

Starting point is 00:34:16 JupyterLab 3 will have localization. So localization means like the menus and the help text and the button hover tips and all that kind of stuff are localized for different languages. So JupyterLab 3 will have localization making it more approachable for people who don't want to work in an English UI. And they're crowdsourcing translations. So if you wanted to contribute to Jupyter and you were good at programming and in a language that's not English, but it's already done in English, you know, go check that out. That would be kind of cool.

Starting point is 00:34:52 I wonder if anybody just messes with people and like does wrong translations just for fun. I'm so afraid of that. Yeah. I think they do. I bet they do. And maybe not really obvious, maybe in real subtle ways. Yeah. Yeah. Yeah. Nevermind. Don't, don't, don't, don't have any ideas. Brian, don't give people ideas. This is not. That's a good one. All right. Well, that's all the extras as well. So how about a joke? Yeah. Okay. So imagine you're learning programming, you're learning Python, take one of these computer science courses where they talk about weird things like recursion. So recursion is the idea that the function calls itself with different parameters, right? Like a really common example would be factorial. So if I'm going to calculate a factorial, it's just n times n minus

Starting point is 00:35:35 one times n minus two. So that's just n times factorial of the smaller number. You can just like work your way back, right? But there should be an exit condition. Like if n equals one, return. Don't keep recursing. So here's a nice little graphic under the banner of only programmers would understand. And it's got the four squares. It's kind of like screen sharing.

Starting point is 00:35:56 We got that infinite view. So learn to program in one corner. Next corner, make recursive function. Third corner, no exit condition. And then it just repeats and repeats and repeats down to smaller and smaller and smaller. I love it. This is bad.

Starting point is 00:36:10 No, this is good. That's how you learn. That's right. No. Yeah, exactly. It's like when you share your screen in Zoom or maybe Google Meet, but you've still got the window up or something like that. But it's about recursion. It's beautiful.
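
The factorial example from the joke setup can be written out as a short sketch. The base case is the exit condition the graphic leaves out; the negative-input check is an addition for safety, not something mentioned in the episode.

```python
def factorial(n: int) -> int:
    """Compute n! recursively: n * (n - 1) * ... * 1."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n <= 1:
        # Exit condition: stop recursing once we reach the smallest case.
        return 1
    # Recursive step: n times the factorial of the smaller number.
    return n * factorial(n - 1)
```

Removing the `n <= 1` branch reproduces the joke: the function keeps calling itself until Python gives up with a `RecursionError`.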

Starting point is 00:36:24 And then you silence base exceptions and you cannot exit the program. Yes, that's right. Do you know if Python has a tail recursion optimization? I'm thinking no. So the whole point is here, Brian, that we would run out of a call stack space really quickly. And that's usually the error, stack overflow error, if you recurse too deep type of thing. Yeah. But with tail recursion, it basically becomes an infinite loop. So you run out of

Starting point is 00:36:49 time instead of memory. Okay. So that would be the advantage of tail recursion. I have no idea if it is there or not. Yeah. I mean, there's some languages that do the optimization so they don't generate a new call stack because there's nothing to save. So yeah.

Starting point is 00:37:05 Yeah. I don't know. I'm sure we will find out before next week. Yeah. One of the reasons why I like asking open-ended questions on the podcast. Yeah. That's awesome. Yep.
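
On the open question about tail recursion: CPython does not eliminate tail calls, so a function that is tail-recursive in form still adds a stack frame per call and eventually raises `RecursionError`. The sketch below, with hypothetical function names, contrasts that with the equivalent loop.

```python
import sys


def count_down(n: int) -> int:
    """Tail-recursive in form: the recursive call is the last thing done."""
    if n == 0:
        return 0
    return count_down(n - 1)


def count_down_loop(n: int) -> int:
    """The same computation as a loop, which needs no extra stack frames."""
    while n > 0:
        n -= 1
    return n


def recursion_overflows(depth: int) -> bool:
    """Report whether count_down(depth) exceeds the interpreter's limit."""
    try:
        count_down(depth)
    except RecursionError:
        return True
    return False


# Going past the configured limit fails even though the call is in tail position.
print(recursion_overflows(sys.getrecursionlimit() + 100))
```

The loop version handles a much larger `n` without trouble, which is the usual way to rewrite such functions in Python.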

Starting point is 00:37:16 Well, Brian, thank you as always. And Anastasia, thank you for being here. It was great to have you as a guest. Thank you for inviting. Thank you.

Starting point is 00:37:22 Yep. Bye.
