Python Bytes - #265 Get asizeof pympler and muppy

Starting point is 00:00:00 Hey there, thanks for listening. Before we jump into this episode, I just want to remind you that this episode is brought to you by us over at Talk Python Training and Brian through his pytest book. So if you want to get hands-on and learn something with Python, be sure to consider our courses over at Talk Python Training.

Starting point is 00:00:17 Visit them via pythonbytes.fm slash courses. And if you're looking to do testing and get better with pytest, check out Brian's book at pythonbytes.fm slash pytest. Enjoy the episode. Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 265, recorded January 5th, 2022. I'm Brian Okken. I'm Michael Kennedy. And I'm Matt Kramer.

Starting point is 00:00:43 Matt, welcome to the show. Thanks. Happy to be here. Yeah, welcome, Matt. Who are you? Oh, so, a huge fan. I've listened to every episode. I'm actually one of these folks that started their career outside of software.

Starting point is 00:00:56 I've heard a similar parallel story a bunch of times in the past. So I have my degree actually in naval architecture and marine engineering, which is design of ships and offshore structures. In grad school, I started with MATLAB, picked up Python, thanks to a professor. And then over time, that's just grown and grown. Spent eight years in the oil and gas industry, using Python mostly for doing engineering analysis, a lot of digital type stuff, IoT type monitoring work. And about three months ago, I joined Anaconda as a software engineer, and I'm working on our Nucleus cloud platform as a backend software engineer. Very cool. Awesome. Yeah. Well, congrats on the new job as well. That's a

Starting point is 00:01:36 big change from oil and gas. A couple of years. I mean, it is in Texas and all, but it's still on the tech side. Yeah. No, it's related, but obviously a different focus. I wanted to make writing code my job rather than the thing I did to get my job done. Fantastic. I'm sure you're having a good time. Yeah. Well, Michael, we had some questions for people last week. We did.

Starting point is 00:01:59 I want to make our first topic a meta topic. And by that, I mean a topic about Python Bytes. So you're right. We discussed whether the format, which has sort of, I wouldn't say changed. I would rather categorize it as drifted over time. It's sort of drifted to adding this little thing and doing that different thing. And we just said, hey, everyone, do you still like this format? It's not exactly what we started with, but it's where we are.

Starting point is 00:02:24 So we asked some questions. The first question I asked, which I have an interesting follow-up at the end here, by the way, is, is Python bytes too long at 45 minutes? That's roughly the time that we're going these days, probably about 45 minutes. And so I would say, got to do the quick math here. I would say 70, 65%. Let's say 65% are like, no, it's good. With a third of that being like, are you kidding me?

Starting point is 00:02:48 It could go way longer. I'm not sure we want to go way longer, but there are definitely a couple of people that think, yeah, it's getting a little bit long. So I would say probably 12% of people said it's too long. So I feel like it's actually kind of a decent length. And one of the things I thought, it's like, as we've changed this format, we've added things on, right? We added the joke that we started always doing at the end. We added our extra, extra, extra stuff. But the original format was the six items. You covered three, I covered three. Now it's two, two, and we got Matt here to help out with that.

Starting point is 00:03:18 So what is the length of that? And it turns out that that's pretty much the same length still. So the last episodes, 39 minutes, 32 minutes, 35 minutes, 33 minutes, that's how long our main segments are. So for people who feel it's too long, I want to just sort of say, feel free to just delete it. Like you hear the six items, delete it at that point. If you don't want to hear us ramble about other things that are not pure Python, you don't want to hear us talk about the joke or tell jokes, no problem. Just stop. It's at the end for a reason. So if you're kind of like, all right, I'm kind of done, then be done. That's totally good. We'll put the important stuff up first.

Starting point is 00:03:55 The other one was, do you like us having a third co-host like Matt or Shell or whoever it is we've had on recently? And most people love that format. Or, you know, it's okay. So that's like, I think that that's pretty good. I do want to read out just a couple of comments as well. There's stuff that you always get that like, you just can't balance it. A couple of people are saying like, you just got to drop the joke.

Starting point is 00:04:18 Like, don't do that. The other people are like, the joke is the best. Who doesn't want to stay for that? So, you know, like, well, again, it's at the end. So you can do that. But I also just wanted to say thank you, everybody. They wrote a ton of nice comments to you and me at the end of that Google Form.

Starting point is 00:04:34 So one is, I can't tell what counts as an extra or normal, but it's fine. I love it. Python Bytes is such an excellent show. Fun way to keep current. Brian is awesome. Oh, good. I asked my daughter to submit that. She did good. I think your third guest, having a third guest is great. Like I

Starting point is 00:04:52 said, drop the jokes, keep the jokes for sure. Ideal. So anyway, there's a bunch of nice comments. I think the other thing that I would like to just speak to real quick and get your thoughts on, and maybe you as well, Matt, because you've been on the receiving end of this a lot, is us having the live audience, right? I think having a live audience is really interesting. I also want to just acknowledge, like we knew that that would be a slight drift of format,

Starting point is 00:05:18 right? So if you're listening in the car and there's a live audience comment, it's kind of like, well, but I'm not listening to it live. That's kind of different, but I think it's really valuable. One time we had four, maybe four Python core developers

Starting point is 00:05:30 commenting on the stuff we were covering. Like that's a huge value to have people coming and sort of feeding that in. So for me personally, I feel like it's, yeah, it's a little bit

Starting point is 00:05:40 of a blend of formats, but I think having the feedback from the audience, especially when people are involved in what we're talking about, I think that's worth it. Brian, what do you think? Well, we, we, we try not to, uh, to let it interrupt the flow too much, but there's some great stuff. Like if somebody, uh, if we say something that's just wrong, somebody will correct us. And that's, that's nice. Um, the other thing is, uh, sometimes somebody has a great question on a topic that we should

Starting point is 00:06:07 have talked about, but we didn't. We didn't, right? We don't know everything. We certainly don't. So I do want to add one more thing. There was a comment like, hey, we as hosts should let the guests speak. We should be better interviewers. This is not an interview format.

Starting point is 00:06:28 Talk Python is a great interview format. That's where the guest is featured. Test & Code is a great interview format where the guest is featured. This is sort of just three people chatting. It's not really an interview format. And we always tell the guests to interrupt us and they just, they don't much. Yeah. Yeah. So Matt, what do you think of this live audience aspect? Like, do you feel like that distracts, or is it good? Well, yeah, first of all, I think I'm glad that people generally like having a guest. Otherwise this would have been very awkward. But no, I do like it.

Starting point is 00:06:56 I think. Where'd Matt go? Oh, he must've disconnected. There was one. Occasionally there is a kind of a, a little bit of a disruption, but I think in general, it's been great. Yeah. I've definitely been listening when times when, you know, a bunch of people are chiming in

Starting point is 00:07:09 because there's always, as you know, that you mentioned a GUI library and then there's about 12 other options that you may not have covered. Instead of waiting 12 weeks, you could just get them right out. So I think that's great. And I'm generally an audio listener. I listen when I'm walking my dogs, but I love having the video because when I'm interested in something, I can go hop to it right away and see what you're showing, which I really like. So.

Starting point is 00:07:33 Yeah. Awesome. Thank you. Two other things that came to mind. Someone said it would be great if there's a way where we could submit ideas and stuff like that for guests and whatnot. Yeah. Right here at the top in our menu it says Submit. So please reach out to us on Twitter, send us an email, or submit it there. The other one was if we could have time links. Like if you go to listen and at a certain time a thing that's mentioned is interesting, it'd be cool if you could link to that time. If you look in your podcast player, it has chapters, and each chapter has both a link and a

Starting point is 00:08:10 time. So, like the thing that Brian's going to talk about next, interpreters, if you want to hear about that, during that section in your podcast player you can click the chapter title and it will literally navigate you to there. So it's already built in. Just make sure you can see it in your device. Yeah. All right. I think that's it for that one. But yeah, thank you for everybody who had comments and took the time.

Starting point is 00:08:33 Really appreciate it. Yeah, and just the comment, if you want to be a guest, just email on that form and you might be able to do it. That's right. That's right. Yeah. Great to have you here.

Starting point is 00:08:44 Actually, I didn't want to talk about interpreters. No, that's me. Oh, wait, you're right. Well, you're talking about it now because I've changed. No, let's talk about attrs. Sorry, I saw the wrong screen. Go for it. Apparently we're not professional here, but no, it's okay. I wanted to talk about attrs. We haven't really talked about it much for a while, for lots of reasons, but attrs is a great library, and it just came out with release 21.3.0, which is why we're talking about it now.

Starting point is 00:09:16 There are some changes, including some documentation changes, and an article I wanted to cover. So one of the things you'll see right off the bat if you look at the overview page of the attrs site is it's highlighting the define decorator.

Starting point is 00:09:35 If you used attrs years ago, this is a little different. There was a different API that was added in one of the previous releases, and now that's the preferred way. So this is what we're calling modern attrs. But along with this, I wanted to talk about an article that Hynek wrote about attrs. It's a little bit of a history, and I really love this discussion. So I'll try to quickly go through the history. Early on, we didn't have data classes. Obviously we had,

Starting point is 00:10:18 we could handcraft classes, but there were problems with it. And there was a library called characteristic, which I didn't know about. This was before I started looking into things. And then Glyph and Hynek in 2015 were discussing ways to change it, and that begat the old original attrs interface. There were things like attr.s and attr.ib that came partly out of the fact that the old way, characteristic's attributes, was a lot of typing. So they wanted something a little shorter. And then it kind of took off. attrs was pretty popular for a long time, especially fueled by a 2016 article by Glyph called The One Python Library Everyone Needs, which was a great

Starting point is 00:11:06 This is kind of how I learned about it. And then there was, you know, a different kind of API that we were used to for attrs, and it was good and everything was great. And then in 2017, Guido and Hynek and Eric Smith talked at PyCon 2017 about how to make something like that in the standard library. Out of that came PEP 557 and data classes, and data classes showed up in Python 3.7. And then a dark period happened, which was people were like, why do we need attrs anymore if we have data classes? Well, that's one of the things I like about this article. And then there's an attached article that is called Why Not?

Starting point is 00:11:56 Why not data classes? Data classes have always been a limited subset of attrs. attrs is a superset of the functionality, and there's a lot of stuff missing in data classes, like equality customization and validators. Validators and converters are very important if you're using a lot of these. And then also people were like, well, data classes is kind of a nicer interface, right? Well, not anymore. The @define is really nice. This is a really easy interface now to work with. So anyway. Yeah, and it has typing. And I'm glad he wrote this because I kind of was one of those people of like, am I doing something wrong if I'm using data classes?

Starting point is 00:12:53 Why should I look at attrs? There's a whole bunch of reasons. One of the things that I really like is attrs has slots. With define, the slots are on by default. So you kind of define your class once instead of keeping it growing, whereas the default Python way in data classes is to allow classes to grow at runtime and have more attributes. But that's not really how a lot of people use classes. So if you came from another language where you have to define the class once and not at runtime, attrs might be a closer fit for you.

Starting point is 00:13:24 I like it. And whether you say @define or @dataclass, it's pretty similar. Yeah. Yeah, attrs is really cool. I personally haven't used it, but I've always wanted to try it. We're using FastAPI and Pydantic, so I've really come to like that library.

Starting point is 00:13:42 But attrs is something that looks really full-featured and nice. Definitely something I want to pick up. Yeah, that's cool. And Pydantic also seems very inspired by data classes, which, I'm now learning, were actually inspired by attrs, and they kind of leapfrog each other in this same trend, which is interesting. Yep. So yeah, cool. Good one, Brian. Matt, I thought Brian was going to talk about this, but you can talk about it. This would be me, yeah. So this one's not strictly Python related, but I think it's very relevant to Python. So I mentioned earlier, I came from a non-CS background and I've just been going down the rabbit hole for about 10 years now, trying to understand everything and pick it up and really connect the dots between how do these very flexible

Starting point is 00:14:29 objects that you're working with every day, how do those actually get implemented? And so the first thing I did, if you've heard of this guy, Anthony Shaw, I think he's been mentioned once or twice. He wrote a great book. Shout out, CPython Internals. Really like that book. Anthony's out in the audience. He even says happy New Year's.

Starting point is 00:14:45 Hey, happy New Year's. So this book is great if you want to learn how CPython's implemented. But because I don't have a traditional CS background, I've always wanted, you know, I felt like

Starting point is 00:14:55 I wanted to get a little bit more to the fundamentals. And I don't remember where I found out about this book, but Crafting Interpreters, I got the paperback here too. I highly recommend it.

Starting point is 00:15:09 It's an implementation of a language from start to finish. Every line of code is in the book. It's a dynamic interpreted language, much like Python. But I really like how the book is structured. So it was written over, I think, five years in the open. I think the paperback may have just come out last year, but you walk through every step from tokenization, scanning, building a syntax tree, and all the way through the end.
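
The pipeline described above (scanning into tokens, building a syntax tree, then evaluating it) can be seen in miniature with Python's own standard library. This is only a sketch of the idea using Python's `tokenize` and `ast` modules on a Python expression; it is not the book's code or its language, and the small `evaluate` function is a hypothetical helper written for this example.

```python
import ast
import io
import operator
import tokenize

source = "total = 1 + 2 * 3"

# Step 1: scanning turns characters into tokens.
tokens = [
    (tokenize.tok_name[tok.type], tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.string.strip()  # skip the end-of-line and end-of-file tokens
]
print(tokens)

# Step 2: parsing turns the tokens into a syntax tree.
tree = ast.parse(source)
print(ast.dump(tree.body[0], indent=2))

# Step 3: a tiny tree-walking evaluator for + , - and * on constants.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def evaluate(node):
    """Evaluate a constant or binary-operation node by walking the tree."""
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp):
        return OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
    raise NotImplementedError(type(node).__name__)


expr = tree.body[0].value  # the right-hand side of the assignment
print(evaluate(expr))
```

Walking the tree directly like this is simple to write, which is one reason it makes a good first interpreter.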

Starting point is 00:15:32 But what I really like about it is you actually develop two separate interpreters for the same language. So the first one is written in Java. It's a direct evaluation of the abstract syntax tree. So that was really how I got a lot of these bits in my head about what is an abstract syntax tree? How do you start from there? How do you represent these types? But the second part is where I think it becomes really relevant for Python, because the second part is written in C. It's a bytecode virtual machine with garbage collection. So it's not exactly the same as Python. But if you want

Starting point is 00:16:05 to dig down into how you would actually implement this with the types that you have available to you in C, but get something flexible, much like Python, I really recommend this. So again, it's not directly Python related, but there are some good side notes in here where he compares different implementations between different languages like Python, JavaScript, Ruby, etc. But I really like this book. I devoured it during my time between jobs, and yeah, I keep telling everyone about it, so I thought it would be good for the community to hear. Yeah, nice. Yeah, I didn't study this stuff in college either. I mostly studied math and things like that. And so understanding how virtual machines work and all that, just how code executes, I think it's really important. You know, it's,

Starting point is 00:16:50 it's not the kind of thing that you actually need to know in order to get anything done. But sometimes your intuition matters. Like, if I ask the program to work this way and it doesn't work as you expect, then understanding the internals is like, oh, it's because it's really doing this. And, oh, everything's all scattered out on the heap, and I thought numbers would be fast, why are numbers so slow? But okay, I understand now. Yeah, I really like it. I mean, it answered a lot of questions for me, like how does a hash map work, right? That's a dictionary in Python. What is a stack? Why would you use it? When you do a disassemble and you see bytecode, what does that actually mean, right? I really enjoyed it, and he's got

Starting point is 00:17:30 a really great, the book's open source, it's got a really great build system. If you're interested in writing a book, it's very cool how the adding of lines of code and things like that are all embedded in there. And he's got tests written for every part where you add a new bit to the code. There's tests written, and there's ways where he uses macros and things to block them out. It's pretty interesting. Nice. Testing books. That's pretty excellent. Yeah. So Matt, now being at Anaconda, that world, the Python world over in the data science stack especially, has so much of like, here's a bunch of C and here's a bunch of Python, and they kind of go together.
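
The questions mentioned above about disassembling code and what a stack is for can be made concrete with a short sketch. The first part uses the standard `dis` module to list CPython's real bytecode (the exact opcode names vary between Python versions). The second part is a toy stack machine with a made-up three-instruction set; it is an illustration of the idea only, not CPython's design and not the book's C implementation.

```python
import dis


def add_and_double(a, b):
    return (a + b) * 2


# Real CPython bytecode for the function above.
opnames = [ins.opname for ins in dis.get_instructions(add_and_double)]
print(opnames)


def run(program):
    """Execute a list of (opcode, argument) pairs on a value stack."""
    stack = []
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()


# (3 + 4) * 2 expressed as hypothetical stack instructions.
program = [
    ("PUSH", 3),
    ("PUSH", 4),
    ("ADD", None),
    ("PUSH", 2),
    ("MUL", None),
]
print(run(program))
```

Operands are pushed first and each operator pops what it needs, which is why a stack is such a natural fit for evaluating nested expressions.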

Starting point is 00:18:09 Does this give you a deeper understanding of what's happening? Yeah, for sure. I think CPython Internals gave me a really good understanding, a bit more about the C API and why that's important. I'm sure you know, and the listeners may know, binary compatibility is really important between the two, and dealing with locking and the global interpreter lock and everything like that. So it's definitely given me a better conceptual view of how these things are working. As you mentioned, you don't need to know it necessarily on a day-to-day basis, but I've just found that it's given me a much better mental model. Having an intuition is valuable. Yeah. A quick audience feedback. Sam out in the live audience

Starting point is 00:18:50 says, I started reading this book over Christmas day and it's an absolute joy. So yeah, very cool. One more vote of confidence for you there. Cool. Brian, we ready for my next one? Yes, definitely. A little Yamale. Yeah, I'm hungry. So this one is cool. It's called Yamale. I'm not 100% sure how to pronounce it, but it was suggested by Andrew Simon. Thank you, Andrew, for sending this in. And the idea of this is we work with YAML files that are often used for configuration and whatnot. But if you want to verify your YAML,

Starting point is 00:19:27 right, it's just text. Maybe you want to have some YAML that has a number for a value, or you want to have a string, or maybe you want to have true/false, or you want to have some nested thing, right? Like you could say, I'm going to have a person in my YAML. And then that person has to have fields or values set on it, like a name and an age. With this library, you can actually create a schema that talks about what the shape and types of these are, much like data classes. And then you can use Yamale to say, given a YAML file, does it validate?

Starting point is 00:20:02 Think kind of like what Pydantic is for JSON. This is for YAML, except it doesn't actually parse the result out, it just tells you whether or not it's correct. Isn't that cool? I think it looks neat. Yeah. So it's pretty easy to work with. Obviously requires modern Python. It has a CLI version, right? So you can just say yamale, give it a schema, give it a file, and it'll go through and check it. It has a strict and a non-strict mode. It also has an API. So then to use it, you just say yamale.validate with the schema and the data, either in code or on the CLI. And in terms of schemas, like I said, it looks like data classes. You just have a file like name: str(), age: int(),

Starting point is 00:20:43 and then you can even add additional limitations, like with max, the integer value has to be 200 or less, which is pretty cool. Then also, like I said, you can have more complex structures. So for example, they have what they call a person, and you can actually nest them. So part of your YAML could have a person in it,

Starting point is 00:21:03 and then your person schema could validate that person. So very much like Pydantic, but for YAML files. Like here, you can see, scroll down, there's an example of, I think recursion is how they refer to it. But you can have nested versions of these things and so on. So if you're working with YAML and you want to validate it through unit tests or some data ingestion pipeline or whatever, and you just want to make sure you're loading the files correctly, then you might as well hit it with some Yamale. I'm guessing one of the things I like about stuff like this is that with things like YAML files, sometimes people just sort of edit them in the Git repo instead of making sure it works first. And then having a CI stage that says, hey, making sure the YAML's valid syntax is pretty nice

Starting point is 00:21:53 so that you know it before it blows up somewhere else with some weird error message. Yeah, exactly. Yeah, this is really cool. Validation of these types of input files, especially YAML files, is really tough, I've found, just because it's indentation-based. And whitespace is not a bad thing, obviously,

Starting point is 00:22:11 but for YAML, it's tough. I can't tell you how many hours I've banged my head against the wall in a past life trying to get Ansible scripts to run and things like that. So this is really neat. Anytime I see something like this, I just wish that there was one way to describe those types somewhere,

Starting point is 00:22:27 like preferably in Python, just because I like that more. But this is really cool. Yeah, I wouldn't be surprised if there's some kind of pydantic mapping to YAML instead of to JSON, and you can just kind of run it through there. But yeah, I think this is more of a challenge

Starting point is 00:22:41 than it is, say, for JSON, because with JSON, there's a validity to the file regardless of what the schema is, where with YAML, less so, right? Like, well, if you didn't indent that, well, it just means it belongs somewhere else, I guess. You know, it's a little more free form. So I guess that's why it's popular, but it's also nice to have this validation. So yeah, thank you to Andrew for sending that in. Yeah. So next I wanted to talk about Pympler, which is a great name. And I honestly can't remember where I saw this. I think it was a post or something by Bob Belderbos or something he wrote on PyBites.

Starting point is 00:23:18 I'm not sure. Anyway, so I'll give him credit. Maybe it was somebody else. So if it was somebody else, I apologize. But anyway, what is Pympler? Pympler is a little tiny library which has a few tools in it. It does a few things: it measures, monitors, and analyzes memory behavior in Python objects. But it's the memory size thing that was interesting to me. So you've got, like, for instance,

Starting point is 00:23:52 it has three tools built into it: asizeof, and muppy, which is a great name, and ClassTracker. So asizeof provides basic size information for one or a set of objects. And muppy is for monitoring. I didn't play with this.

Starting point is 00:24:11 I didn't play with the ClassTracker either. ClassTracker provides offline analysis of lifetimes of Python objects. Maybe if you've got a memory leak, you can see, like, there's hundreds of thousands of this type, and I thought I only had three of them. Yeah.

Starting point is 00:24:26 And so one of the things that I really liked with asizeof is, I mean, we already have sys.getsizeof in Python, but that just kind of tells you the size of the object itself, not of the things it refers to. So asizeof will tell you not just what the size of the object is; it goes recursively and looks at the size of all the contents of it. Right. And if people haven't looked at this, you know, they should check out Anthony's book, right? But if you've got a list and say the list has a hundred items in it and you say, what is the size of the list? The list will be roughly 900 bytes because it's 100 8-byte pointers plus a little bit of overhead.

Starting point is 00:25:12 Those pointers could point at megabytes of memory. You could have 100 megabytes of stuff loaded in your list, and sys.getsizeof would still say, no, that's 900 bytes, not 800 megabytes or whatever. Right. So if you actually care about real whole memory size, you've got to use something like asizeof. It's cool that this is built in. I had to write this myself and it was not as fun.

Starting point is 00:25:32 Yeah, this is awesome. I also, I hit this sometime in grad school, I remember when I was at a deadline or something. And just, I hit the same thing about the number of bytes in a list being so small and just writing something that was hacky to try to do the same thing. But to have it so nice and available is great. And the name is awesome. I love silly names.
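
For a sense of what the hand-written versions mentioned above tend to look like, here is a rough standard-library sketch. The `deep_sizeof` function is a hypothetical helper written for this example; it only follows lists, tuples, sets, and dicts, and it is not how Pympler works internally.

```python
import sys


def deep_sizeof(obj, _seen=None):
    """Rough recursive size in bytes, counting each object only once."""
    seen = set() if _seen is None else _seen
    if id(obj) in seen:
        return 0  # already counted, e.g. a shared or repeated reference
    seen.add(id(obj))

    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(
            deep_sizeof(key, seen) + deep_sizeof(value, seen)
            for key, value in obj.items()
        )
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_sizeof(item, seen) for item in obj)
    return size


data = ["some text", 42, [1, 2, 3], (4, 5)]
print("shallow:", sys.getsizeof(data))
print("deep:   ", deep_sizeof(data))
```

A version like this misses object attributes, custom containers, and other edge cases, which is a large part of why a maintained tool is more pleasant to use.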

Starting point is 00:25:55 Yeah, for sure. One of the examples, and I was confused. The example we're showing on the screen is just, you've got a list of a few items. Some of it's text, some of them are integers, and some are lists of integers or tuples of integers, and being able to go down and do the size of everything. But then you can also get more detailed. You can call asized with a detail number. I'd have to look at the API to figure out what all this means, but the example shows each element, not just the total, but for each element, what the size of the

Starting point is 00:26:32 different components are, which is kind of cool. But it lists a flat size, and I'm like, what's the flat thing? So I had to look that up. flatsize returns the flat size of a Python object in bytes, determined as the basic size. So like in these examples, the tuple itself is 32 bytes, but the tuple and its contents is 64. I see. So flat is like sys.getsizeof and size is asizeof of that bit. I think that's what it is, but yeah, not sure, but that's

Starting point is 00:27:08 what I'm thinking. For people who listen and don't see this, you should check out the docs page, right, like the usage example, because if you have a list containing a bunch of stuff, you can just say, basically, print this out, and it shows line by line: this part of the list was this much, and then it pointed at these things, each of those things is this big, and it has constituents, and so on. My theory is that detail equals one is recurse one level down, but don't keep traversing to show the size of numbers and stuff. Yeah, probably. Yeah, cool. Yeah, I love it. This is great. Yeah. All right, I think it's over. Okay. So I'm going to talk about hvPlot and hvPlot's .interactive specifically. So this is something I actually wasn't very aware of until I joined Anaconda. But one of my colleagues, Philipp Rudiger, who I know was on Talk Python at one point, is the developer working on this.

Starting point is 00:28:01 And basically, when you're working in the PyData ecosystem, there's pandas and xarray and Dask, there's all these different data frame type interfaces, and there's a lot of plotting interfaces. And there's a project called HoloViews, or hvPlot, which is a consistent plotting API that you can use. And the really cool part about this is you can swap the backend. So for example, the pandas default is you call .plot and it'll make a Matplotlib chart. But if you want to use something more interactive like Bokeh or HoloViews, you can just change the backend and you can use the same commands to do that.

Starting point is 00:28:37 So that's really neat. Oh, that's cool. And you set it on the data frame. Yeah, yeah, exactly. So what you do is you import hvplot.pandas, and then on the data frame, if you change the backend, you just do dataframe.plot. And there's a bunch of kind of, you know, rational defaults built in for how it would show the different columns in your data frame versus the index. And then you can,

Starting point is 00:28:59 I like that because you could swap out the plots by writing one line, even if you've got hundreds of lines of plotting stuff, right? And it just picks it up. Exactly. Yeah. And the common workflow for a data scientist is you're reading in a lot of input data, right? Then you want to transform that data.

Starting point is 00:29:15 So you're doing generally a lot of method chaining. It's a common pattern where you want to do things like filter and select a time and maybe drop a column and do all kinds of things, right? At the end, you either want to show that data or write it somewhere or plot it, which is very common. Now this interactive part, Philipp demoed this, or he gave a talk at PyData Global about two months ago, I think. It kind of extends on that, and this blew my mind when I saw it. So if you had a data-frame-like thing and you put .interactive after it, then you can put your method chaining

Starting point is 00:29:50 after that. So this is an example where you say I want to select a discrete time, and then I want to plot it. And this particular example doesn't have a kernel running in the backend, so it's not going to switch. But if you were running this in an actual live notebook, it would be changing the time on this chart. And again, this is built to work with a lot of the big data type APIs that match the pandas API. Nice. So for people listening, if you say .interactive

Starting point is 00:30:18 and then you give the parameter that's meant to be interactive, that just puts one of those IPython widget things into your notebook right there, right? That's cool. Yeah. So a related library is called Panel, which is for building dashboards directly from your notebooks. So if you had a Jupyter notebook, you could say panel serve and pass in the notebook file and it'll make a dashboard. That's the thing I want to show in a second here. But the way the interactive works is really neat. So wherever you would put a number, you can put one of these widgets. And so you can have time selectors, you can have things like sliders and you can have input boxes and things like that. And all you do

Starting point is 00:31:03 is you would change the place where you put your input number, put one of those widgets in, and then it sort of, I actually don't know how it works exactly under the hood, but from what I understand, you put this interactive in, and then it's capturing all the different methods

Starting point is 00:31:17 that you're adding onto it. And anytime one of those widgets changes, it will change everything from that point on. And so the demo here was from another Panel contributor, Marc Skov Madsen. And I'm just going to play this and try to explain it. So we have a data pipeline on the right where we've chained methods together.

Starting point is 00:31:35 And what he's done here is he's just placed a widget as a parameter to these different methods on your data frame. And then this is actually a panel dashboard that's been served up in the browser. And you can see this is all generated from the little bit of code on the right. So if you want to do interactive data analysis or exploratory data analysis, you can really do this very easily with this interactive function. And when I saw this, I kind of hit myself in the head because normally my pattern here was I had a cell at the top with a whole bunch of constants defined and you know I would manually go through and okay

Starting point is 00:32:10 change the start time from this time to this time, or change this parameter to this, and run it again, over and over. You've got to remember to run all the cells that are affected. So the fact that you can kind of do this interactively while you're working, I could see how this would just, you know, you don't break your flow while you're trying to work. And the method chaining itself I really like too, because you can comment out each stage of that as you're going and debugging what you're working on. So, yeah, this is really neat. And I put a link in the show notes to the actual talk as well as this gist that Marc Skov Madsen put on GitHub. And yeah, it blew my mind. It would have made my life a lot easier had I known about this earlier. Yeah. And one of the important things, I think,

Starting point is 00:32:56 about plotting and interactive stuff is, even if your end result isn't a panel or an interactive thing, sometimes getting to see the plots, seeing the data in a visual form, helps you understand what you need to do with it. Yeah, no, exactly. I mean, I did a lot of work in the past with time series data, and time series data, especially if this was sensor data, you had a lot of dropouts, you might have spikes, and you're always looking at it and trying to make some judgment about your filter parameters. And being able to have that feedback loop between changing some of those and seeing what the result is is a huge game changer. So, yeah. Yeah, and you can hand it off to someone else who's not writing the code and say, here, you play with it. You know, give it to a scientist or somebody. That's exactly right. That's what Panel's all about. The biggest challenge that I always had

Starting point is 00:33:50 and many data scientists have is you do all your analysis in a notebook, but then you got to show your manager or you got to show your teammates. And going through that trajectory can be very challenging. These new tools are amazing to do that. But that's how I turned myself into a software engineer, because that's what I wanted to do. But I went down the rabbit hole and learned Flask and Dash and how to deploy web apps and all this stuff. Well, I'm glad you did. Yeah. Maybe I wouldn't be here if I hadn't done that. But yeah, this is really cool. And I

Starting point is 00:34:22 definitely recommend people look at this. There was also another talk, sorry, this is an extra, but there was another talk at PyData Global hosted by Jim, James Bednar, who's our head of consulting, but he leads PyViz, which is a community for visualization tools. And it was a comparison of four different dashboarding apps. So it was Panel, Dash, Voila, and Streamlit. And they just had main contributors from the four libraries

Starting point is 00:34:50 talking about the benefits and pros and cons of all of them. So if anyone wants to go look at those, I definitely recommend that too. That sounds amazing. All those libraries are great. Nice. Thanks.

Starting point is 00:35:00 Oh, speaking of those extra parts of the podcast that make the podcast longer, we should do some extras. We should. We should do some extras. Got any? I don't have anything extra. Matt, how about you? Yeah, two things. So first, you can show my screen. Last year, Anaconda hired the Pyston developers. Pyston is a faster implementation fork of CPython. I think it was at Instagram first. I can't recall. But anyway, right before the holidays, they released pre-compiled packages for a couple hundred of the most popular Python packages. So if you're interested in trying Pyston,

Starting point is 00:35:38 I put a link to their blog post in here. They're using Conda right now. They were able to leverage a lot of the conda-forge recipes for building these. This is that binary compatibility challenge that we talked about earlier. So I know the team's looking for feedback on that. If you want to try that, feel free to go there. And it mentions in the blog that they're working on pip. That's a little harder too, just because of how the build stages for all the packages aren't centralized with pip. So it's a little more challenging for them to do that. And then just the last thing is, I don't want to be too much of a salesman here, but we are hiring.

Starting point is 00:36:16 It's an amazing place to work. And I definitely recommend anyone to go check it out if they're interested. Fantastic. Yeah. And you put a link in the show notes if people want to. Yeah, it's anaconda.com slash careers. And we're doing a lot of cool stuff and growing. So if anyone's looking for work

Starting point is 00:36:32 in data science or just software and building out some of the things we're doing to try to help the open source community and bridge that gap. I spelled it wrong. Bridge that gap between enterprise and open source and data science in particular. Yeah, it definitely seems like a fun place to work.

Starting point is 00:36:48 So cool. People looking for a change or for a fun Python job. Yeah, we're remote first too. Yeah, cool. People do reach out to Brian and me and say, hey, I really want to get a Python job. I'm doing other stuff, but how do I get a Python job? Help us out.

Starting point is 00:37:02 So we don't know, but we can recommend places like Anaconda for sure. Yeah. It looks like there's about 40 jobs right now. And so pick it out. Fantastic. Oh, wow. That's awesome. All right. Well, Brian, would it surprise you if I had some extra things? It would surprise me if you didn't. All right. First of all, I want to say congratulations to Will McGugan. We have gone the entire show without mentioning Rich or Textual. Can you imagine? But no, only because I knew you were going to talk about this. Otherwise I would have thrown it in. Yeah. So Will, last year, a while ago,

Starting point is 00:37:38 I don't know the exact number of months back, but he's planning to take a year off of work and just focus on Rich and Textual. It was getting so much traction. He's like, I'm just going to live off my savings and a small amount of money from the GitHub sponsorships and really see what I can do trying that. Well, it turns out he has plans to build some really cool stuff, and it's actually all based around Rich and Textual in particular. And he has raised a first round of funding and started a company called Textualize.io. How cool is that?

Starting point is 00:38:12 Well, we don't know because we don't know what it's going to do. All you do is if you go there, it's like a command prompt. You just enter your email address. I guess you hit enter. If something happens, let's find out what happens. Yes, I'm confirmed. Basically, you just get notified about when Textualize comes out of stealth mode. But congrats to Will. That's fantastic. Another one, we've spoken about tenacity. Remember that,

Starting point is 00:38:32 Brian? Yeah. So tenacity is cool. You can say, here's a function that may run into trouble. If you just put @tenacity.retry on it and it crashes, it'll just try it again until it succeeds. That's probably a bad idea in production. So you might want to put something like stop after this or do a little delay between them or do both. I was having a race condition. We're trying to track when people are attempting to hack Talk Python, the training site, the Python Bytes site and all that. And it turns out when they're trying to attack your site, they're not even nice about it. They hit you with a botnet of all sorts of stuff. And like lots of stuff happens at once. And there was this race condition that was causing trouble. So I put retry, tenacity.retry,

Starting point is 00:39:13 boom, solved it perfectly. So I just wanted to say, I finally got a chance to use this to solve some problems, which was pretty cool. That's really cool. The other one that's similar to this, which I've used, and I think, I don't know if you've used, Brian, but it's called PyTest Flaky. And it's awesome because I was working with this time series data historian. I had a bunch of integration tests in my last job, but, you know, network stuff, it would drop out occasionally. And so you can do very similar type things and wrap your test in an @flaky decorator and do similar type stuff, and, you know, give it three tries or something before you make it fail. Yeah, exactly. That's cool. I think mine does three tries and it's like randomly a couple second delay or something. Remember that part, Brian, where we

Starting point is 00:39:57 talked about it's really cool if people are in the audience while we talk about stuff and then get a little feedback? So Will McGugan says, hey, thanks guys, can't wait to tell you about it. Yeah, congrats, Will. That's awesome. Glad to see you out there. All right, a couple of other things. Did you know that GitHub has a whole new project experience that's pretty awesome? Have you seen this? I haven't. So you know how it's like this kanban board where you have like columns, you can move your issues between them? So just last week they came out with a thing called beta projects, where it still can be that, or it can be like an Excel sort of view where you have little drop-down combo boxes. Like I want to move this one to this column by going through that mode, or as a board,

Starting point is 00:40:36 or you can categorize based on some specification, like show me all the stuff that's in progress and then give me that as an Excel sheet and all these different views you have for automation. And then like there's APIs and all sorts of neat stuff in there. So if you've been using GitHub projects to do stuff, you know, you can check this out. It looks like you could move a lot more work towards that on the project management side of software than you used to. This is really neat. Yeah. In my previous job, I was using Azure DevOps. I was always wondering when some of those features might move to GitHub. I don't know if that's what happened here, but being able to have this type of project management in there for this type of thing, it's really, really great. Yeah. Super cool. Yeah. One of the things I love

Starting point is 00:41:19 about stuff like this is because, I mean, yes, a lot of companies do their project management or projects in GitHub or places like that, but also open source projects often have the same needs of project management as private commercial projects. So, yeah. Yeah, I personally only have a few open source small projects that are kind of personal, and no one would probably want to use them, but even just keeping notes about to-dos and future stuff, it would be really nice. Yeah, just for future you, if nothing else, right? Yeah. Awesome. Okay, so this is cool. Now the last thing I want to talk about is Markdown. So Roger Turrell turned me on to this. There's this new Markdown editor.

Starting point is 00:42:11 It's cross-platform. Yes, cross-platform, called Typora. And we all spend so much time in Markdown that just, wow, this thing is incredible. It's not super expensive. And it looks like a standard Markdown editor. So you write Markdown and it gives you a WYSIWYG, you know, what-you-see-is-what-you-get style of programming, which is not totally unexpected, right? But what is super cool is the way in which you interact with it. And actually, I am going to show you real quick so you can

Starting point is 00:42:41 see it, and then you can tell people, like, what do you think about this? Here, I think that's it. I'm back. Okay, yeah. So here's a Markdown file for my course, just the practices and whatever. You can say, you know what, I would like to view that in code style, right? Well, that's kind of cool. If we want to edit this, you click here and it becomes Markdown. But this is a boring file, so let's see. It has a whole file system that navigates through your other Markdown stuff hierarchically. So like here, chapter eight's a good one. So we go over to chapter eight on this and now you can see some more stuff.

Starting point is 00:43:15 Like you can go to set these headings and whatnot, but if you go to images, you can set a caption, and then you could even change the image right here. If it were a PNG, it's not, so put it back as JPEG and then it comes back. You can come down and write a code fence, use the right symbol, and you can say def a, whatever, and then you pick a language. Isn't that dope? Oh, this is so good. So if you end up writing a lot of Markdown and you need to get back, you just go back and switch back to raw Markdown and then go back to this fancy style. I think this

Starting point is 00:43:49 is really a cool way to work on Markdown. I'm actually working on a book with Roger and it's got tons of Markdown, and it's been a real joy to actually use this thing on it. So, yeah. Does it have vi mode? Probably not. I don't know about that, but it has themes. I can do like a night mode or I can do like a newspaper mode or, you know, take your pick. It's pretty cool.

Starting point is 00:44:15 The weirdo grad student in me is upset that this isn't LaTeX. It has built-in LaTeX. You can do inline LaTeX and there's a bunch of settings you can set for the LaTeX. It's got a whole math section in there. Oh, that's sweet. Okay. Yeah, let's see. So am I the only person that went all the way through college pronouncing it latex? I did too, but I just learned the cool way of saying that. Yeah, yeah, it's

Starting point is 00:44:43 French. No, I don't know. Yeah, it has support for like chemistry settings, inline LaTeX and math and all sorts of good stuff. So yeah, I'm telling you, this thing's pretty slick. All right, well, I've got to do my screen share back so you all can see the joke, because the joke is very good and we're going to cover it. But it's at the end, so if people don't want to listen to the joke, they don't have to. Yeah, Brian, I blew it. You did. I blew it. Before we move off the Markdown thing, though, Anthony Shaw says Editorial for iPhone and iPad is really nice too. Cool. But let's do the joke. So I blew it because I was saving this all year. I saw this like last March and I'm like,

Starting point is 00:45:25 this is going to be so good for Christmas. Yeah. And then we kind of like had already recorded the episode. We're not going to do it. We'll just take a break over. So we didn't have a chance to do it. So let's do it now. People are going to have to go back just a little tiny bit for this one.

Starting point is 00:45:37 Are you ready? Yes. Matt, you ready? Yeah. So this is sort of a database developer type thing here, and I don't know why it's on a printout. Anyway, it's called SQL Clause, as in SQL clause. So he's making a database, he's sorting it twice, select star from contacts where behavior equals nice. SQL Clause is coming to town. Nice. It would have been so good for

Starting point is 00:46:07 Christmas, but we can't keep it another year. I've got to get it out. You've got to sing it: SQL Clause is coming to town. Yep, exactly. Okay, I want to share a joke that I don't have a picture for. All right, do it. But my daughter made this up last week. I think she made it up, but it's just been cracking me up, and I've been telling it to everybody. So it's a short one. Imagine you walk into a room and there's a line of people all lined up on one side. That's it. That's the punchline.

Starting point is 00:46:37 I love it. Nice. We've got it. We had my, uh, we had my, my cookie candle last time. Nice. My, uh, I can't know these cookies. We've got a dad joke of the day channel in our Slack at work and it makes me oof every time. Nice, nice. Okay. All right. Nice to see everybody. Thanks, Matt, for joining the show. Thank you for having me. Good to see you, Michael, again, as always. Yeah, good to see you.

Starting point is 00:47:07 Thank you. Thank you. Thanks for listening to Python Bytes. Follow the show on Twitter via at Python Bytes. That's Python Bytes as in B-Y-T-E-S. Get the full show notes over at PythonBytes.fm. If you have a news item we should cover, just visit PythonBytes.fm and click submit in the nav bar.

Starting point is 00:47:24 We're always on the lookout for sharing something cool. If you want to join us for the live recording, just visit the website and click live stream to get notified of when our next episode goes live. That's usually happening at noon Pacific on Wednesdays over at YouTube. On behalf of myself and Brian Ocken, this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and colleagues.


Topics covered in this episode: * Survey results* Modern attrs API Yamele - A schema and validator for YAML pympler Extras Joke See the full show notes for this episode on the website at pythonb...ytes.fm/265
