Python Bytes - #215 A Visual Introduction to NumPy

Starting point is 00:00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 215, recorded January 6th, one of my favorite dates, 2021. I'm Brian Okken. I'm Michael Kennedy. And we have Jason. Hello. Yeah, hey, Jason, nice to have you here. Jason McDonald. Yeah, it's good to be here. Thank you for having me.

Starting point is 00:00:18 Yeah, thanks for joining us. Oh, and Brian, I think he's going to cover something we haven't really covered much on the show, GUIs. Oh, good. Actually, to be honest, I know this is like a longstanding joke in the show for longtime listeners, but we actually haven't covered GUIs that much recently. But there was a long stretch where we did. Yeah.

Starting point is 00:00:35 Yeah. That was probably like a year ago. Yeah. Yeah. I like my programming projects and my brownies to be GUI. And fudge. Come on, fudge too.

Starting point is 00:00:44 And you like bad jokes, so you'll fit in nicely. Oh, absolutely. You know, if anyone likes puns, follow my Twitter. I post an original every Monday. Nice. I heard that there's going to be a lot of exciting news for space in 2021. So I kind of want to bring a little space and Python together. That's good. Yeah. Yeah. So the first topic that I want to talk about is this video done by a woman in the UK who is an astrophysicist. She goes by the name Dr. Becky, which is cool. She has a fantastic YouTube channel. She's also a Python developer and she works in cosmology, which is pretty cool. And she did this video that I'd just like to highlight for people who maybe are coming into Python, not from the, hey, I'm going to create a microservice set of

Starting point is 00:01:32 APIs talking to Docker, but more from the, hey, I do some kind of science or data science or something like that. And the video is called the five ways that I use Code as an Astrophysicist. Cool, huh? Yeah. Yeah. So she basically lays out the idea of as a modern day scientist, you can barely do your job if you're not doing some sort of programming. And of course, one of the best languages, technologies for programming these days is Python in the data science space, right? Surprise. Yeah. No big surprise there since 2012, I would say. And so she covers five different things with examples of each. So I thought that was just a nice way for people who are either getting into Python from a science side, or maybe they're

Starting point is 00:02:16 teachers and they want people ask, why should I not just use MATLAB or some other custom tool? Let me show you. So here's some really cool examples of real astronomy being done with Python, but it's also super accessible to even like middle schoolers, I would say. And number one is image processing of galaxies from telescopes. So you can do things like noise removal. So it turns out that when you're taking pictures of galaxies, even if there's no actual background light or disturbances, just the basic disturbance in the actual sensors themselves will put little marks and imperfections in the images. So using Python to go through and clean those up makes it much easier to get started. And the size of these pictures and the amount of data coming in from some of these new telescopes is stunningly large. Yeah, for sure. Another one is data analysis. So if you're trying to find the brightness of some part of an image, say maybe you're looking for a transit

Starting point is 00:03:16 of an exoplanet, right? You want to constantly monitor the brightness of a star or in her case, what she's studying, it just blows my mind. She's studying galaxies. Like when you see pictures of stars and you're zooming in, you're like, oh, that's not a star. That's a galaxy. It's just like, you know, like I still can't really get my mind around that. But she talks about one of her data sets that has 600,000 rows of like brightness of galaxies. So 600,000 galaxies, they all have information about that they're comparing. So that's pretty awesome, right? Model fitting. There's an example about theory that most galaxies have a supermassive black hole in the middle there's also this idea that possibly the size of the black hole and the size of the galaxy these things kind of grow in mass together so she has all this data she's like well let's do some

Starting point is 00:03:59 statistical fits of black hole size and galaxy size. Also, the color of galaxies can indicate the relative speed or rate of star formation. And the age. And the age, exactly. Yeah. All tied together. And so she's using Python for that. Finally, data visualization, you know, pretty straightforward, but drawing graphs and pictures. And the last part that was my favorite part is simulation. So there's two really cool examples. What happens if a star gets too close to a black hole and gets, she said, spaghetti-ified? That's cool. And the other one is examples of galaxies colliding,

Starting point is 00:04:35 which is just, again, mind-blowing. But really cool computational examples of all that. So I wanted to highlight this video because it's super accessible, but it's also really neat to show concrete examples of real science being done with Python. Yeah. I thought it was cool when she was talking to her, her colleague about building the simulations of the, of the, you know, you have a simulation of the universe, where do you start on that? It's like, you, we think we have project blocking, you know, it's like you start on a project. It's like, yeah,

Starting point is 00:05:02 I'm just going to build a tool. Where do I begin? It's like, I'm going to build a simulation of the entire universe. Where do I start? Exactly. Like I'm going to simulate gravity at a galactic scale. Let's just do that. Yeah. Awesome. So if people are out there and they're interested in this kind of stuff, um, yeah, this is all in one video. Robert says star or galaxy. It's big. Yeah, they're both huge, but obviously, I mean, it's just like I can't get my head around like galaxy size stuff. It's so, so. Isn't a star a primitive type in the universe?

Starting point is 00:05:35 And then a galaxy is a collection. That's what I just immediately go to right there. Yeah, exactly. Exactly. Yeah. So, Brian, it's like a 15 minute video. Half of it is the stuff that I talked about, and the other half is what Jason touched on. She actually interviews one of her colleagues who basically does more of the simulation side of programming.

Starting point is 00:05:55 That's pretty cool. Yeah, yeah, I'll have to check that out. Yeah, it's definitely worth it. Yeah, I enjoyed it. I don't do very much data science, actually, at all. And so seeing data science stuff is always interesting, because most of my work is in, like, application development. I don't usually work with a lot of data. So it's that side of it explained in this really cool, relevant way, instead of, like, well, the statistic is the number of people who buy cheese every weekend at the supermarket. That's not interesting. Galaxies! Exactly. Getting better click-through rates on your ads is not super compelling. But yeah, I think it's really valuable to see alternate perspectives, right? We all get into our own little world of, like, this is what programming is, this is

Starting point is 00:06:33 what Python is for. And then, you know, it's bigger. I want to talk about NumPy a little bit. All right, tell us about it. Well, I've actually used NumPy off and on a lot, and it's definitely a staple of scientific use, machine learning, and all sorts of stuff. But I'm starting to use it more, and I've realized that I had the wrong mental model. So I think of arrays kind of just like lists, but it's really different. And so I came across this article. It's a couple years old, but it's a visual intro to NumPy and data representation. And to me, it really helps a lot,

Starting point is 00:07:07 like to help me understand what you can do with it and just have a good mental picture of what the arrays are in NumPy. So it talks about arrays, matrices, and ndarrays, which are n-dimensional. But like, for instance, even just creating an array, I knew how to create an array. I mean, you just kind of initialize it with a list and you get an array. But I didn't know you could

Starting point is 00:07:30 do like, just say, I want an array of ones or an array of zeros, or just a random array pre-filled with random numbers. That's pretty, um. And then he talks about, you know, arithmetic you can do with them, and slicing and stuff. You know, Brian, when we talk about Pythonic code all the time, like, oh, you could write code in this way where you kind of hack a numerical for loop, but you should do it this other way and that would be more Pythonic. I suspect there's also, in a way, a NumPy-ic way, right? Sort of like filling up stuff. You're like, oh, you should just do ones on this one. And then, you know, there's a lot of cool other ways of sort of conceptualizing things. Right. Yeah.
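The array creation options mentioned here can be sketched in a few lines. This is a minimal illustration, not code from the article; the variable names and the seed are arbitrary choices.

```python
import numpy as np

# Initialize an array from a plain Python list.
from_list = np.array([1, 2, 3])

# Pre-filled arrays: all ones, all zeros (here 2 rows by 3 columns).
ones = np.ones(3)
zeros = np.zeros((2, 3))

# An array pre-filled with random floats in [0, 1).
rng = np.random.default_rng(seed=0)  # seed chosen arbitrarily for repeatability
randoms = rng.random(3)

# Arithmetic applies element by element, with no explicit loop.
doubled = from_list * 2      # [2, 4, 6]
summed = from_list + ones    # [2.0, 3.0, 4.0]

# Slicing works much like it does on lists.
tail = from_list[1:]         # [2, 3]

print(doubled, summed, tail, zeros.shape)
```

Unlike a list, `from_list * 2` multiplies every element rather than repeating the sequence, which is one concrete place where the "arrays are just lists" mental model breaks down.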

Starting point is 00:08:09 Well, and it's worth remembering. And I've said this quite a few times. Not here, obviously. But I regularly like to remind people abstractions are there to save us typing, never to save us thinking. It's like it helps to have that mental model, as you put it, Brian, you know, straight, because if your mental model is wrong, it can really wind up, well, you're prone to both cargo cult programming,

Starting point is 00:08:32 well, I do it this way because it's the way I was taught, or trying to, you know, ill fit a pattern that's familiar to, you know, the wrong sort of problem when you don't realize what it is you're really working with. So understanding what's happening under the hood, even if, you know, you don't know all the technical details of the implementation, still understanding how it's doing things is important to, you know, choosing the right idiomatic patterns always. Yeah.

Starting point is 00:08:54 Yeah. And you'll hear stuff like, oh, well, Python is slow. It's like, well, because you're doing it wrong. Don't do it that way. Yeah. For example, use something like NumPy, right? And like, for instance, one of the things I really loved about this article was the explanation of dot product, because i've heard this before i've

Starting point is 00:09:08 never had to use a dot product, but somebody described it to me several times and I'm like, yeah, okay, weird. But then, with the visual representation of it, I just stared at it and read it for like, you know, 30 seconds. Oh, that's easy now. And I'll have it forever now because that sunk in there. Pretty good. One of the reasons why I went to it is I had this problem where I get large arrays, like in the thousands, say, of numbers. And I need to make sure that one array compares a certain way to another. I know equal works, but I wanted to compare item by item to make sure every element is less than the corresponding element in the other array. Less than or equal. I didn't know how to do that.

Starting point is 00:09:55 And I'm like, I think NumPy would probably do that easily. Can you do one NumPy array less than the other? Yeah. So if you say less than, it compares it element by element and it gives you a list of true or false, and then you can do all. Yeah, doing all on it. Yeah, just say all of these two arrays less than or equal to each other, and I get exactly what I want. Very expressive, simple line of code. Yeah, it's that kind of stuff I was thinking of when I was talking about, like, the NumPy-ic, NumPy-onic way or whatever. Idiomatic NumPy. Thank you.
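The comparison described here might look like the following. This is a sketch with made-up data; the names `measured` and `limit` are hypothetical. The "list of true or false" is a NumPy boolean array, and it can also be used to pull out just the first few offending positions.

```python
import numpy as np

# Small case: compare two arrays element by element.
small = np.array([1, 5, 3, 7])
big = np.array([2, 5, 4, 9])
within = small <= big                 # boolean array, one entry per element
all_within = bool(np.all(within))     # True only if every comparison passed

# Larger case with a few failures (made-up data for illustration).
measured = np.arange(1000)            # 0, 1, ..., 999
limit = np.full(1000, 990)            # every element must be <= 990
ok = measured <= limit

# Indices where the check failed, limited to the first five.
first_bad = np.flatnonzero(~ok)[:5]

print(all_within, bool(ok.all()))
for i in first_bad:
    print(f"index {i}: {measured[i]} > {limit[i]}")
```

Keeping the intermediate boolean array around is what makes the short failure report possible without looping over all thousand elements by hand.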

Starting point is 00:10:26 Is like, that's like one or two lines and it's really fast. Whereas you could loop over each item individually and it not only is more code, but it's also slower. Yeah. Well, and also I like that there's the intermediate step, that it gives me a list of true and false too,

Starting point is 00:10:40 because also, on the debugging side, I need to be able to wrap this in something and pick, like, say, the first five elements that are not matching. I mean, if it's false, the whole statement's false. I don't want to just list all the thousands that are wrong, but I want to be able to list a few, to say at least these are not right. Yeah. Yeah, it's good. I'm gonna try out NumPy now. I now have a reason to try it out. Exactly. Like, why am I not using this in certain situations? Magnus, on the live stream, says two dimensions is okay, three is hard, but in that then my mind

Starting point is 00:11:17 blows yeah i actually did a bunch of math research and four-dimensional stuff two-dimensional but complex numbers so four-dimensional sort of and yeah yeah, it's just, it's just hard. Well, what one, one of my weird knacks as a programmer is I actually can think in six dimensions. It's, it's, I mentioned before the podcast, I had a head injury a few years ago. So I'm a minor traumatic savant. I can think in six dimensions and the best way I can explain it, if you're trying to do it without having a really bizarre brain like mine is think of, uh, think of the fourth dimension as a timeline. And for each timeline you have, you have space represented as a cube,

Starting point is 00:11:49 but then you have this row of cubes, which represents the timeline. It becomes a lot easier to think of four dimensional arrays when you think of it in that fashion. Yeah. And the way that we did it, we actually had animations of that three dimension thing and the animations were moving through that,

Starting point is 00:12:03 that bit. Yeah. But still it's, it's, it's no easy moving through that bit. But still, it's no easy thing. Yeah, it's easier when you're an animator to wrap your head around 4D than if you're just an ordinary run-of-the-mill programmer like most of us.
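The "row of cubes" picture maps onto a 4-D array shape fairly directly. A minimal sketch, assuming 5 time steps and a 3x3x3 cube per step (both sizes are arbitrary):

```python
import numpy as np

# Axis 0 is the timeline; the remaining three axes are the cube at each step.
timeline = np.zeros((5, 3, 3, 3))

# Indexing the first axis picks out one cube from the row of cubes.
cube_at_t2 = timeline[2]

# Writing through the 4-D index updates that cube, since the slice is a view.
timeline[2, 0, 1, 2] = 1.0

print(timeline.ndim, cube_at_t2.shape, cube_at_t2[0, 1, 2])
```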

Starting point is 00:12:14 Brian, would you say that that's a GUI type of solution? No. Maybe you could do something with Qt? Yeah, I don't know. Jason? Who knows?

Starting point is 00:12:26 It's possible. So that's our next topic. Take it. Grab it, Jason. Yeah. Well, OK. Well, I was really excited to discover that Qt 6 just released on December 8th. So, "cute"?

Starting point is 00:12:38 Yeah. It is officially pronounced "cute", although it's very debatable. People go, it's "Q-T". It's "cute". Come on. Yeah, exactly. Anyway, whatever you're gonna call it, it just released, and this includes the Python bindings. So PySide6 and Shiboken6. PySide2 was Qt 5, as if that made sense. PySide6 is Qt 6. "Cute sexy"... now I'm doing it. Anyway, so that just released. And you also have PyQt6 if you prefer Riverbank's version. But in whatever case, Qt 5 and prior had this hard dependency on OpenGL. And they've

Starting point is 00:13:26 actually put in what they call the rendering hardware interface, an abstraction layer, into Qt. So now it can natively support whatever the 3D graphics driver is on that device, whether it's DirectX, Vulkan, Metal, whatever you want it to work with. It uses the native one by default. You can tell it to use whatever you want. That is so cool. Yeah. And there's a bunch of other optimizations and fixes in here. I am really excited because I discovered, and this was actually introduced in 5.15, that they now support snake case. For those of us who are, like, PEP 8 addicts who really hate the fact that Qt kind of seemed to force you to use camel case, you can use snake case.

Starting point is 00:14:08 There is a setting for it. You can also use properties instead of getters and setters as of Qt 6. So you can just rely on properties. And that makes it a lot easier to write, you know, idiomatic Python code that is Qt, which is kind of fun. Well, it just feels wrong to write get width, set width, all those things. They also have this cool thing called property binding

Starting point is 00:14:31 where you can actually link those together now too. It's like you can link the width and the height. So when you change the width, the height automatically changes. Nice. Yeah, I really want to build some stuff with Qt. I've got a few app ideas in mind. What I don't have is time.

Starting point is 00:14:45 Sadly. Can you help me with that? Jason, can you help me? Just have more time in my life? I know I have a reputation as a time lord, but unfortunately, I can't control the stream of flow of time there. If I find my TARDIS, I'll pick you up and drop you off 10 years ago, and you can relive those 10 years and do additional things. Okay. Nice, nice. That would be good.

Starting point is 00:15:05 Yep. Let's see. Actually, a couple of questions from the live stream. Magnus asks, any news about Qt going mobile? I actually am ashamed to admit, I don't know. I don't know either. But I think the bigger, more interesting question would be,

Starting point is 00:15:21 could PyQt stuff, like, would you be able to write a Python Qt application and make it mobile? Right. I think that's where it gets really interesting, because if you pick another language like C++, there's other options you might be able to choose. And then maybe, you know, this one you're going to ask: are there any well-known Python apps built with Qt? Oh yeah, yeah, there are. On the spot, I'm trying to think of one. Mine? It's not well known, but I built Timecard in Qt. If you look up,

Starting point is 00:15:48 if you open up TimeCard, it's just a time tracking app that I built, but actually there's quite a lot built with Qt. I think with a K in front of it, if you're into KDE, the entire KDE stack is built on top of Qt,

Starting point is 00:16:04 and there's actually quite a bit of it that's done in Python. So names are escaping me off the top of my head here. But yeah, anything in the KDE universe is Qt, and so you're either going to get C++ or Python. Python is certainly a lot faster to write. Oh, FileZilla apparently is built. You know one that I know that's written in it for sure that I, it's like one of my favorite apps actually,

Starting point is 00:16:27 is RoboMongo or Robo 3T. It got renamed too. I believe it's just C++. It's not Python Qt, but that one's a really nice one as well. Actually, there's a huge long list. I'll put it in the show notes over here of a bunch of apps written as well. It's definitely a lot easier to write something in. I've used a lot of different UI toolkits and Qt's definitely one of the easiest. Yeah, the thing

Starting point is 00:16:51 that I like about it is it looks like it belongs, because with so many apps you build with these sort of cross-platform things, it's just like, okay, well, that's not how the file dialog is supposed to look. You just know it's alien. But here you're like, no, no, this looks like it belongs here. Well, and packaging's the other half of it, because I tried to build something with Kivy, and I love Kivy from a development standpoint. It's really cool. From a packaging standpoint, it's like beating yourself to death with a wet trout. And actually, if you're gonna do cross-platform, GTK is horrible too, because it's really hard to get it to package on Windows a lot of times.

Starting point is 00:17:25 Qt just works. It just packages everywhere. Yeah, that's great. That's nice. All right. Brian, I think this episode is brought to everyone by us. Wonderful. We're good people.

Starting point is 00:17:35 Yeah, so we are. We're doing a lot of work out there, as everyone probably knows. If you're into testing, check out Brian's pytest book. If you're looking to take a Python course, we are just about to pass 200 hours of Python courses over at Talk Python Training. I'm working on a new course, how to build web apps, not web APIs, but web apps, with FastAPI. Super neat stuff. So that should be out in a week or two. So anyway. Yeah, I also wanted to bring up that there was kind of a spike in pytest book sales in the last quarter of 2020. And I'm hoping that's, like, some schools using it, testing or other teaching.

Starting point is 00:18:13 So, yeah, that'd be super cool. Yeah, it's nice to see more, more, more, more stuff about stuff other than unit test. I mean, unit test has its place. But when I wrote the I've got a book coming out in May. And when I wrote the chapter on testing and what I'm at is like, thank you for not forcing me to edit yet one more unit test chapter. Nice. What's your book on?

Starting point is 00:18:31 Oh, my book's called Dead Simple Python. It just it it introduces the language of Python, the idiomatic practices of Python to people who are coming from another language. So if you don't want to have to sit through yet one more explanation of what a variable or a function is or a class is, you can pick this up and it dives straight into the fine details of why idiomatic patterns are what they are in Python.

Starting point is 00:18:54 Nice. Yeah, that's a good idea. The courses or books that say, we're going to pretend you know nothing about the world and we're going to force you to go through everything from scratch every time, that drives me crazy. You know what else drives me crazy, Brian, is when my Python GC is doing stuff when I know that it doesn't need to do stuff. Yeah. I like to not have to think

Starting point is 00:19:13 about the garbage collect. And you generally don't, right? Like one of the things that genuinely surprises me is the fact that we don't really talk about memory very much in Python. It's like, oh, okay. I think it cleans itself up. That's good. Now what? Let's go about stuff, right? But if you dig into it, it's pretty interesting. There's a lot of stuff around allocation we've covered before, but it's quite unique. But Python's also somewhat unique in the sense that it has like two modes.

Starting point is 00:19:37 So it has reference counting, which I would say 98% of all like memory management cleanup stuff is in the reference counting side. This is totally made up these numbers, but there's a little, there's, I would say maybe even more like 99.5, unless you're building some kind of a certain kind of app, like with interesting algorithms, most apps don't create cycles. And the only reason we have garbage collection in addition to the reference counting is to catch those cycles, right?
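A minimal sketch of the difference between the two mechanisms, assuming CPython's reference counting behavior. The `Customer` and `Order` classes are hypothetical stand-ins, not code from the show.

```python
import gc
import weakref


class Customer:
    def __init__(self):
        self.orders = []


class Order:
    def __init__(self, customer):
        self.customer = customer  # link back to the customer


gc.disable()  # keep automatic collections from interfering with the demo

# Case 1: no cycle. Reference counting frees the object as soon as
# the last reference goes away.
lone = Customer()
lone_ref = weakref.ref(lone)
del lone
freed_without_gc = lone_ref() is None

# Case 2: customer -> orders list -> order -> customer is a cycle.
# The reference counts never reach zero on their own.
customer = Customer()
customer.orders.append(Order(customer))
cycle_ref = weakref.ref(customer)
del customer
alive_before_collect = cycle_ref() is not None

gc.collect()  # the cycle detector finds and frees the unreachable cycle
alive_after_collect = cycle_ref() is not None

gc.enable()
print(freed_without_gc, alive_before_collect, alive_after_collect)
```

Only the second case needs the garbage collector at all, which is the point being made about most apps rarely creating cycles.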

Starting point is 00:20:09 You know, I've got a customer object. I've got it out of a SQLAlchemy database. It has a relationship over to the orders. I go to the orders. The orders have a link back to the customer. Maybe traversing that lazy loaded list has created a cycle. And now I need the GC to save me. So the rule for when the garbage collector runs is something you can ask it. You import the gc module and you say gc.get_threshold, or thresholds. I can't remember if it's singular or plural. On my screen, if I switch over, it's singular: get_threshold. It returns three numbers. They're not the same units, which makes them really hard to understand. The first number is how many allocations of collection objects. So classes, dictionaries, lists, tuples, things that could

Starting point is 00:20:46 contain other stuff. So things that could potentially be participants in a cycle, like numbers and strings are not even considered by the GC. But how many allocations of collection types are there that exceed the reference counting deallocation? So if I had a list and I put a thousand classes, class objects in it by allocating and filling it up, then I would hold on to a thousand and none of them would have become garbage. So the first number that comes back is, well, how big is that number before we just run a GC no matter what? And the default is 700. So my example there, if I create a list of a thousand objects, that's a GC that's going to run. It doesn't matter if there's cycles, there's no cycles. It just doesn't matter.
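Inspecting the thresholds takes only the standard library. A small sketch; the values printed depend on the Python version (700 is the first number quoted in the episode, and newer releases may report something different).

```python
import gc

# Three numbers: the first is the allocation surplus that triggers a
# collection of the youngest generation; the other two control how often
# the older generations are examined.
gen0, gen1, gen2 = gc.get_threshold()
print("thresholds:", gen0, gen1, gen2)

# Current counters that the collector compares against those thresholds.
counts = gc.get_count()
print("current counts:", counts)
```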

Starting point is 00:21:23 Like I've made a thousand of them. That's over 700. So we're going to run a GC. And then the rest are like, how much do you run? Like a whole memory GC versus a local, a small, like recent object GC. And what occurred to me is, you know, my website, there's a lot of pages that pull back thousands of items and any website that uses the database and an ORM that pulls stuff back and hangs on to it and not just like streams over the items, but puts them maybe in a list or something temporarily. Anytime you do that more with that thousand, you're going to have the GC run, right? They're just looking for anything to throw away basically. Yeah. But you know, you're still in the process of building the list of them. I got to get 10,000. Well, guess what? That means you're going to have 14 GCs and you're just in the

Starting point is 00:22:03 process of building the list. I'm like, that's kind of weird. That seems excessive to me. And then I went and looked at the site map on Talk Python Training, where we're pulling back like thousands of transcripts and all sorts of stuff to generate all the pages on there. 77. There's 77 GCs to render the site map. There's no cycles. There's not one. So I'm like, that's not good. Well, let me think about that for a second. So what I ended up doing was I said, well, what if I made the threshold 10,000? Actually, I ended up on 50,000. So only run the GC if you get more than 50,000 allocations without deallocation. What was really interesting is doing that made my unit tests, which were including many,

Starting point is 00:22:37 many integration tests on Talk Python Training, run 10 to 12% faster. Just setting that one line. And it basically does not use more memory in my case. Is that crazy? Well, it makes sense. Most issues of performance just come down to memory and how memory allocation and deallocation is done. I spend almost all my time in C++, more time in C++ than I do in Python. And we don't have a garbage collector over there. So you have to do all this manually and do it right. You know how much work it is, right?
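A hedged sketch of the kind of change described: raise the first threshold to 50,000 and count how many collections start while building a large list. The `Row` class and helper names are hypothetical, and leaving the second and third thresholds unchanged is an assumption; the episode only mentions the first number. Exact counts vary by Python version.

```python
import gc


class Row:
    """Hypothetical stand-in for an object loaded by an ORM."""

    def __init__(self, n):
        self.n = n


def build_rows():
    # Hold on to 10,000 objects at once; none of them form cycles.
    return [Row(i) for i in range(10_000)]


def count_collections(build):
    """Run build() and count how many GC passes start in the meantime."""
    started = 0

    def on_gc(phase, info):
        nonlocal started
        if phase == "start":
            started += 1

    gc.callbacks.append(on_gc)
    try:
        build()
    finally:
        gc.callbacks.remove(on_gc)
    return started


original = gc.get_threshold()
with_default = count_collections(build_rows)

# Raise only the first threshold (assumption: keep the other two as they were).
gc.set_threshold(50_000, original[1], original[2])
with_raised = count_collections(build_rows)

gc.set_threshold(*original)  # restore the interpreter's settings
print(f"collections with default threshold: {with_default}")
print(f"collections with 50,000 threshold:  {with_raised}")
```

Whether this helps in a given application depends on how many cycles it creates and how much memory it can afford to hold between collections, so it is worth measuring rather than copying the number.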

Starting point is 00:23:06 Yeah. Exactly. It's like doing it wrong is why stuff's slow. People are like, well, Python's slower than C++. Well, it has the potential. C++ has the potential to be faster than Python. But it really depends on how you write that code because well-written code is always going to run faster

Starting point is 00:23:21 than poorly written code. It doesn't matter what the two languages. Yeah. Yeah. I realized that in my world, in my type of application, I almost never create cycles, but I often get back more than a 700 class objects, which also have dictionaries potentially in the mix as they're like allocating the converting serializing into classes. Like there's gotta be a lot of places where that's happened. So I just set this number to say, you know what, let's waste a little bit of memory. And if there are cycles, we'll come

Starting point is 00:23:47 back and get them later. And because there's almost no cycles, there's almost no memory growth. For example, so the server is running like eight worker processes, one of them. And I made this change. And I think after running for a week without restarting any of the processes, it went from 1.89 gigs of memory usage to 1.91. So like 20 megs, I think it was 20 megs more memory usage, and yet like a 10% speed up by just changing like one call at startup.

Starting point is 00:24:14 It was insane. Well, and think about what Dr. Becky's code is. You know, like, you know, go back to the astrophysicist thing here, you know, with the sizes of data structures that she's doing or any data scientist who's listening, you know, they're usually dealing with 10,000, 100,000 million items. You know, you combine this with all the stuff that we talked about with NumPy and with data processing and, you know, we talk about how long it takes to do some of these data regressions.

Starting point is 00:24:37 How much would this be? Yeah, exactly. So if that data is being done in Python and it's not just purely being pushed down into the C data science layer, then yeah, that's really interesting, I think. Although I would caution at the same time that there's no such thing as a magic bullet. So you have to understand why this is going to speed things up. Well, I have to just copy and paste that line that my colleague has that he got from Michael Kennedy because it'll make the code faster. No, you have to know why it's going to make the code faster.

Starting point is 00:25:07 It's an easy test. Some cases it makes sense. People can check it out. I thought it was really, it just so surprised me. I was walking along and I'm like, wait a minute, that must mean something weird is going on. And then I put it on just on one of my pages. Like, why would I do 77 GCs on a single page load? That's crazy.

Starting point is 00:25:21 And so I just started exploring this and here we are. So did you, whatever you're linking to, does it talk about how you can test how many garbage collections? Let me see. I'm linking to a Twitter thread and way deep down. No, but there is a way to do it. If you go to the GC, you can say, I think it's set debug stats or something.

Starting point is 00:25:45 I'll look it up real quick while we're talking. I'll throw it in at the end here. But yeah, there's a way to do it. Actually, I got it right here. Hold on. Give me just a sec. The way you do it is you say gc.set_debug, and then you pass an enumeration, and the value is gc.DEBUG_STATS.

Starting point is 00:26:01 So that thing was just lighting up my, you know, when I turned that on it would just light up. It would just completely fill the terminal with the debug, you know, GC, GC, GC, GC, over and over and over when I hit that one page. And then changing it, guess what, made it better. Yeah. Now, we should probably be PC about the GC and call the garbage collector the programmatic sanitation engineer. That's right. Well, it doesn't take offense. It's just there to help us out. Brian, it's probably a pretty awesome library, honestly, the GC library. Probably, but it's built in.
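Turning on the collection statistics looks roughly like this. A minimal sketch using only the standard library; the statistics go to stderr, and the throwaway list is just there to give the collector something to report on.

```python
import gc

# Ask the collector to print statistics to stderr every time it runs.
gc.set_debug(gc.DEBUG_STATS)

# Allocate a batch of container objects, then force one full collection
# so at least one report is printed.
junk = [[i] for i in range(5_000)]
gc.collect()

# Switch the debug output back off.
gc.set_debug(0)
```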

Starting point is 00:26:38 Of course, I'm susceptible on a listicle. Who isn't? Come on. Right. But we don't cover them very much. But I really like this. So this article is top 10 Python libraries of 2020. But their criteria was interesting. The criteria was it has to be a library that was launched or popular. It has to be well-maintained, have maintenance changes since their launch date.

Starting point is 00:27:03 And it has to be just outright cool that you should check it out. So I'm going to go through a handful of these. They listed 10. I don't know if all of them, since I'm, there's like four of them that are machine learning focused that I. I think cool is relative. Yeah. But the first one, I, the first one was typer and I can't, I'm like,

Starting point is 00:27:23 I'm really a fan of typer now was it really just 2020 and i went back and look like it was released like yeah in december of 2019 so sebastian ramirez is killing it for sure and then i looked in i'm like well fast api when that come out that was the previous december so uh the end of 2018 released fast api and then type her a year later. He's just crushing it. Yeah. So yeah, nice. Um,

Starting point is 00:27:47 uh, both a huge fan of both of those, a big fan of rich also. So rich, uh, actually just showed up this in last year in 2020. Um, and rich is a beautiful,

Starting point is 00:27:57 beautiful formatting in the terminal. And yes, it's beautiful. Oh, it's really great. That's glorious. I'm even using it in applications where I just need the tables. So if you need to print out a table on the command line, tables are kind of hard.

Starting point is 00:28:14 And there were other specialized table libraries. But this one is great in that it just works. You don't have to specify the width. It comes up with the width on its own. And then if you shrink the terminal to really narrow or wide, it'll word wrap correctly and stuff. And that's kind of incredible, even just for tables. Yeah, which is awesome.

Starting point is 00:28:41 The third one is Dear PyGui. I think we covered this. Maybe. I don't remember. I mean, we did go on our GUI rant, so it feels like it should be. Yeah, so it's a GUI project. Nice pictures, though, at least. Yeah, I've been drooling over Dear PyGui for a while. I haven't had an opportunity to use it yet, but I've been looking at it. Yeah. So the last few I want to highlight: PrettyErrors. Looks neat.

Starting point is 00:29:08 I haven't tried that yet, but it's a way to... That is glorious as well. Better tracebacks. I mean, ideally, you don't show errors to people, but if you're going to, let's make them at least readable. This is great. And let's train ourselves, too. You know, we're going to have to read them. We're going to spend at least half our life reading error messages.

Starting point is 00:29:27 So let's at least make it readable. Another quarter crying about what we just couldn't figure out. And then the last two that I want to highlight are Diagrams and Scalene. Diagrams is a library. Look at that picture. It's intended for cloud architecture drawings, but you write these diagrams in Python. And so because they're text, you can check them in with version control. That's cool. Which is nice. I'd like to

Starting point is 00:29:59 see these sorts of diagrams more. It would be great for not just network diagrams, but other diagrams. Flowcharts would be great. I still flowchart. Yeah. So the last one is Scalene, which is a CPU and memory profiler for Python that handles multithreading well

Starting point is 00:30:18 and distinguishes between Python versus native code usage. That's pretty cool. I definitely need to try this out. I also like that you don't have to modify your code to use it. Yeah, that's really cool. Absolutely. Yeah, those are cool. There's a bunch of great ideas there and

Starting point is 00:30:35 I really need to find a use for Rich. A solution in search of a problem again, but hey. Well, I write a lot of little terminal apps and stuff, and I'm just like, maybe I'll put a little color in here or something. I just need to take the time and go, no, this is a UI that I should pay more attention to, not just some random thing with text. Yeah, you find this cool stuff and it's like, I feel the need to use this somewhere. Well, I had a little application where it's just, like I said, with the

Starting point is 00:31:03 tables. And I'm like, I don't think it needs colors, I'm just showing a table. But the default for Rich is to show colors, and you don't have to pick them. So the heading and the lines between were different colors if you're on a color terminal. And if you're not on a color terminal, it works anyway. It just figures that out for you. Lovely. Love it. Yeah, that's awesome. It's very awesome. Speaking of awesome: so PEP 518 rolled out a while back. It introduced this thing called pyproject.toml. I guess that's "toml" or whatever; I'll say pyproject TOML. So the idea behind this was that it was going to be this configuration

Starting point is 00:31:45 file, you know, one configuration file to rule them all. And of course, in Python we like things to be simple. Well, ironically, this turned into a really political thing, which I'm still trying to wrap my head around. So basically the nice thing about this repository is it's keeping track of all the projects that have adopted pyproject.toml, either optionally or mandatorily, for configuration. So instead of having to have a dozen configuration files in your project for all these different tools, you can just use this one.

Starting point is 00:32:14 And so it's got this big list. What I find interesting is this part down here at the bottom. If you go down to... Yeah, just scroll just slightly here. Just slightly. Just a little bit up. That's going to sound weird on the podcast. Anyway, so these are projects that are quote unquote discussing the use of pyproject.toml. But if you actually look at these, it's kind of odd, you know, the big sticking points,

Starting point is 00:32:38 because these are the projects that are stopping people from really just going all in on pyproject.toml. And there's even some talk about circular dependencies. Some are like, well, I'll do it when they do it, and they're like, well, I will do it when they do it. Which makes you wonder if it's a huge... So mypy is the weirdest. Guido van Rossum himself said, well, it doesn't solve anything. You know, someone said, can we just add this? Please just add it. It's easy. Here's the PR. Somebody did the PR. He's like, nah, it doesn't solve anything, and he closed it. It's like, it does solve something. It's one less file I have to deal with. That is a solution. Flake8, they have a couple of concrete objections. One is the fact that we don't have a standard TOML parser in the Python standard library, so that could be

Starting point is 00:33:21 you know, that could be a problem. You're adding another dependency just to support having this format. Exactly, yeah. But then again, it's a common dependency with a bunch of other tools that are already in use, and it almost doesn't matter. Pip: someone said, and I don't understand this, "pip to change its behavior so mere presence of the file doesn't change functionality." I can't wrap my head around what he's referring to there. But the stupid thing is someone already did Flake9, which is an exact fork of Flake8 that just adds pyproject.toml. So it's like, it's done. They just have to merge it. Yeah, and actually the same thing happened with Bandit. Someone actually implemented it in 2019. The PR has been sitting there untouched since 2019. So over a year's gone by, it's there, and Bandit is not picking it up.

Starting point is 00:34:09 They're just silent. Read the Docs is saying it's too much work. Like, it's a lot of work for us to have the multiple. PyOxidizer shockingly hasn't even said anything since 2019. They're like the new trendy, trend-setting packaging thing, and they haven't been saying anything about this. So I'm trying to figure out why it is that this is so controversial, because it seems so obvious. You have one file to store all of the settings for all the different tools, and yet everybody seems to want to do their own thing with this. Well, I know that, you know, Pipenv and Poetry and Flit and some of these other tools that suggest a workflow.

Starting point is 00:34:52 I feel like I hear this file format being used along with those, and, you know, telling people we're going to have a different way for you to work with your projects and manage dependencies and stuff. And I think that's part of the source of this, and I don't know if it's just necessarily all mixed together. Brian, what do you think? You know more about this than I do. I think a lot of projects are on the side of, like, for instance, Coverage was, I don't know where they are on the list. Did they adopt? OK, yeah. Well, Coverage had this thing, and other tools were talking about, you know, there's no TOML parser, and they didn't have any other dependencies, so they didn't want to add a third-party dependency just for this, if they're just using it for packaging or settings or something. So I do think we will see a lot... I don't think it's a reasonable argument, because there's reasons why, you know, the same reason why Requests

Starting point is 00:35:51 is... because they're making changes. But I do think that the format of TOML, the basic format, enough to get a pyproject, isn't going to change much. So I think enough of a TOML parser to handle pyproject, I think we need something like that built into Python. Yeah, especially since we have PEP 518, so we have some standard already. Yeah, so I think we'll see a big...

Starting point is 00:36:21 I would like to see at least, even if it isn't the mainstream one, if most projects that are OK with a third party use something else for a TOML parser, but there's some built-in, stripped-down version in the standard library, I think that's great. Yeah, I see you could solve that problem by just vendoring it. Just like, here's the two files that make up the parser, we're just going to make it part of this package, so now we're good to go. I don't know. Sounds good. Well, I think that's it for all of our items. Brian, you got anything extra you want to share with folks? Yeah, it's my birthday. Yay, happy birthday, man. So I'm looking good for... I was gonna

Starting point is 00:36:59 say you're looking good for 28, brother. So I'm 51, and I heard today that that's just one shy of a full deck. Well, yeah, I've never been accused of playing with a full deck myself. But I will say, don't let anyone tell you that you're old, because it says in the first chapter of Genesis, and then God said man's years shall be limited to 120. Half of 120 is 60. So it is biblical that 60 is middle age. You're not even middle-aged. You've got a way to go. It's the Bible. I keep telling everybody that I don't look a day over 73.

Starting point is 00:37:38 You're good, man. A couple of happy birthdays. And also you're going to ask if you're still a fan of Flit. Yeah, I love Flit. Especially since they adopted the source directory. Yeah, that's right. That's awesome. Yeah, that saved my life sometimes.

Starting point is 00:37:52 Jason, anything extra that you want to throw out there? I mean, maybe people have a place they could get notified about your upcoming book or something like that. Yeah, you know, following me on Twitter is probably the best way to do that. I'm CodeMouse92 on Twitter. And then, actually, follow No Starch Press, too. I mean, No Starch Press is awesome to begin with. That's where you're doing the book? Yeah, exactly, they're my publisher. No Starch, I don't think they ever put out a bad book. I love that publisher. You can actually ask my

Starting point is 00:38:21 mother: when my book contract got accepted, I actually screamed, very high pitched. That's awesome. Yeah, follow No Starch Press for updates on that and all their other awesome... they've got some other incredible books coming up too. And so I'll go ahead and ask her. So what's your mom's Twitter handle? My mom's Twitter handle? Oh, she doesn't have a Twitter handle, actually, so I'll have to put you in touch directly, I think, unfortunately. Awesome. Well, cool, thanks for being here again. So I have a couple of items to throw out here. Actually, Brian, this almost could have been an extra, extra, extra, extra, extra, extra, hear all about it, but they're real short, so I didn't do that. Django 3.1.5 is released. Django 3? Didn't we just go to Django 2 or something? I mean, that's good. That's really good to hear. So, awesome that. Python 3.10 Alpha 4 is available for testing.

Starting point is 00:39:12 The new parser is going to be in that one. Oh, that's the PEG parser that Guido's been working on? Yeah, that's going to revolutionize the language eventually. Yeah, it'll definitely make it possible to do more. And in releases, SciPy 1.6.0 was released. I learned about a cool project. So we talked about avoiding Excel for the Python data science stack, right? Like, just stop doing Excel. There's all these weird errors. Like, the organization that defines or governs how you can name genes has come up with rules for names you can't use. And the reason they can't be used is they'll be parsed incorrectly into other data types by Excel,

Starting point is 00:39:51 for example. So there's a lot of issues you might run into with Excel, and that's all good. But there's this project called PyXLL, and this is actually a paid product. They're not sponsoring the show. I just think it's kind of neat, so I'm spreading the word. But anyway, if it's interesting for you, it's a plug-in for Excel that will embed Jupyter into Excel and allow you to write functions and macros for Excel in Python. So it basically adds Python, the programming language, to Excel, which is good. Yeah, it's better than VBA. Let's see. No, I started in VBA. Tell me

Starting point is 00:40:26 about it. Anything's better than VBA. So someone on Twitter asked if PyCharm works OK on my Apple Mac mini M1. And PyCharm, and JetBrains in general, just released a whole bunch of their tooling with different installs for the Apple Silicon native versions. And so I've got a cool little video that I'm going to link to in the show notes. It's just like a five-second video of, here I opened up PyCharm, and basically from the time you click on open project till the project's open, if you've opened a project before, so that caveat, but at that point, if you click on it, you perceive the click, and by the time you're letting up the mouse, the whole project is loaded and ready to work on.

Starting point is 00:41:09 It's like, it's insane. I will consider picking up PyCharm again when they add live share into it. They're working on it. There's something called Code With Me. Yeah. So I have not tried it. I have no one to code with.

Starting point is 00:41:20 I'm sorry, but. Email me later. We'll set something up. Yeah, exactly. We'll code together. So also, since I got my M1, like three, four weeks ago, whatever, I've only used this for all my Python work, and apparently it's still going strong. I even had to send in my MacBook Pro because the battery was so bad it would shut down at

Starting point is 00:41:42 75%. Like, you know, when it gets too low it'll shut down, and as the battery gets bad, maybe it shuts down at 10 instead of zero. If I'm doing video work, it'll actually shut down at 75 until I plug it back in. So it's all M1 until that comes back. Well, I'm still on my System76 Linux. I can't speak to Apple. I do love my System76. That's cool. I just think this whole new ARM architecture stuff that they're doing, it's going to be interesting. You know, I think Microsoft's following suit, or trying in parallel with them. It just felt to me like Intel and AMD, that's just the way it was going to be forever, and it's not necessarily the case. I don't have

Starting point is 00:42:20 a problem with competition, but I have a problem with the software companies making their own architecture and it only works on their architecture. That's what you move towards and then you wind up with a totally fragmented industry. I think that's... Yeah, that's not going to be great. Don't do it, Microsoft.

Starting point is 00:42:34 It's not worth it. Awesome. All right, well, that's my extra, extra, extra, extra, extra, extra, Brian. Nice. I want to get an M1. I'd like to get a Mini. Yeah, the mini is fantastic

Starting point is 00:42:45 I really, really like it. It's not even funny. It's not even a joke. I'm being serious. But we do need a joke. Yes. Oh, I have a joke. All right, yeah, you got the joke this week. I actually do have the joke this week. Yeah. So why did the programmer always refuse to check his code into the repository? Why? He was afraid to commit. Yeah. That is one of my originals. If you want a regular dose of my absolutely horrific puns,

Starting point is 00:43:14 you can follow me on Twitter, at your own peril. I post one every Monday. I've got a new one. Awesome. Nice. Thanks for being on the show. Yeah, it was fun. Yeah, thanks.

Starting point is 00:43:21 See y'all. Thanks, everyone out there on the live stream, and thanks to everyone who listened. See y'all.
