Python Bytes - #266 Python has a glossary?

Episode Date: January 13, 2022

Topics covered in this episode: Python glossary and FAQ; AnyIO; Vaex: a high performance Python library for lazy Out-of-Core DataFrames; Django Community Survey Results; Extra, Extra, Extra, Extra:...; Extras; Joke. See the full show notes for this episode on the website at pythonbytes.fm/266

Transcript
Starting point is 00:00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 266, recorded January 12th, 2022. I'm Michael Kennedy. And I'm Brian Okken. So great to be here again. And we had this whole survey about having guests, Brian. And this week, we don't have a guest. It's just you and me, which I think is cool. That's all right. It's good.
Starting point is 00:00:24 People out there listening, if they really want to be a guest, they can shoot us a message. For now, we've got so many cool things to speak about, we're gonna need like a glossary or an FAQ or, I mean, something. Yes. Well, actually, I don't know how I missed this for so long, but there was a tweet by, who was it? Trey Hunner had a tweet that mentioned and actually referred to the glossary. And I'm like, what? We have a glossary? I never checked it out before. So on the python.org website, there at docs.python.org, there's a glossary, and it's actually pretty cool. There's a whole bunch of stuff. Like if you forget what abstract base classes are, it's there. And so there's
Starting point is 00:01:11 Python stuff, there's programming stuff, and it even defines what the three arrows mean. Yeah, like the three arrows, that's the first one, the default Python prompt. But also the dot dot dot, what is that? The ellipsis. And 2to3. See, this threw me off once when I first started. I was like, what's this 2to3 thing? Is this a third party package? And it wasn't obvious to me that it was built in.
Starting point is 00:01:35 So that's kind of neat. But it shouldn't be an issue anymore because everybody's on Python 3 now, right? So anyway, so the glossary, just a shout out that this is here. It's fun, so check it out. The other thing is that it refers to the other documentation in Python a lot. And one of the things it refers to sometimes is the FAQ, and I also didn't know that was there. And we have an FAQ? Yeah. And it's split into a whole bunch of stuff like general Python and programming and history and design and stuff. And I ran across it because one of the things I looked up when I followed from the glossary was this question of what's the difference between arguments and parameters? And it's something that I've always messed up.
Starting point is 00:02:23 And now I think I have it. Parameters are the names of things that appear in the function definition, and the arguments are the values that get passed in. Neat. I don't know why, sometimes people use them interchangeably, but they kind of talk about different ways of working with that data. Yeah. But like, let's say you're new, either new to Python or new to programming. Some of these are great things to peruse. Why did my changing list Y also change list X?
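The parameter versus argument distinction described above can be sketched in a few lines. This is only an illustration: the function `greet` and its parameter names are made up for the example and do not come from the glossary or the FAQ.

```python
import inspect


def greet(name, punctuation="!"):
    # `name` and `punctuation` are parameters: names in the definition.
    return f"Hello, {name}{punctuation}"


# "Brian" and "?" are arguments: the values supplied at call time.
message = greet("Brian", punctuation="?")

# The parameter names belong to the function itself, independent of any call.
parameter_names = list(inspect.signature(greet).parameters)

print(message)          # Hello, Brian?
print(parameter_names)  # ['name', 'punctuation']
```

The same function can be called with many different arguments, but its parameters stay fixed, which is one way to keep the two terms apart.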
Starting point is 00:02:54 Well, this will help you understand why there's the naming system in Python and stuff like that. So it's pretty great. Yeah, it talks about references and all sorts of stuff. Yeah, quite cool. I didn't know that we had it, but yeah, that's, that's cool. You did know it was there? No, I did not know it was there. That's great.
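The FAQ question about list Y changing list X comes down to names and references, as mentioned above. A minimal sketch, using the variable names `x`, `y`, and `z` purely for illustration:

```python
x = []
y = x            # y is a second name for the same list object, not a copy
y.append(10)     # mutating through y is visible through x as well

z = list(x)      # list(x) builds a new, independent list
z.append(20)     # this change does not touch x

print(x)         # [10]
print(y is x)    # True
print(z)         # [10, 20]
```

Assignment never copies in Python. When an independent list is wanted, an explicit copy such as `list(x)` or `x.copy()` is needed.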
Starting point is 00:03:12 Yeah. I mean, I didn't know anything about it. So I want to talk about something else. I want to talk about AnyIO. As I'm sure you and a lot of listeners know, I'm a big fan of asyncio and async and await. I think it really unlocks a lot of potential when you're waiting on things. There's been a lot of analysis saying, oh, I did this computational thing and it didn't make it any faster. It made it slower. It's like, yeah, because it only scales waiting and you're not waiting. So when you're talking about waiting, it usually has to do with IO with external systems, right? I'm waiting on the file system. I'm waiting on the database.
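The remark that async "only scales waiting" can be seen in a small standard-library sketch. The function `wait_on_io` and the three named waits are hypothetical stand-ins for real I/O, with `asyncio.sleep` playing the part of the file system or database.

```python
import asyncio
import time


async def wait_on_io(name: str, seconds: float) -> str:
    # Stand-in for real I/O: the task is idle here, so others can run.
    await asyncio.sleep(seconds)
    return name


async def main() -> list:
    # Three waits run concurrently, so the total is about one wait, not three.
    return await asyncio.gather(
        wait_on_io("file system", 0.2),
        wait_on_io("database", 0.2),
        wait_on_io("web API", 0.2),
    )


start = time.perf_counter()
names = asyncio.run(main())
elapsed = time.perf_counter() - start

print(names)
print(f"{elapsed:.2f}s elapsed")
```

If the body of `wait_on_io` were a CPU-bound loop instead of a wait, the three calls would run one after another and nothing would be gained, which matches the point made above.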
Starting point is 00:03:45 I'm waiting on whatever. So there's this cool library called AnyIO. So I indirectly learned about this from Sebastián Ramírez from FastAPI because he talked about this thing called Asyncer, which extends a few things that are ultimately probably going to make it back to AnyIO. So AnyIO is an asynchronous networking and concurrency library that works on top of either asyncio, which is the one we all know and love, or Trio, which is similar to asyncio, but it has more of an understanding of dependencies between tasks and things, how you can say, I'm going to create a set of work that
Starting point is 00:04:25 is made up of these tasks. And this task is actually a child of that other task. So if I cancel the top level one, cancel its children. It's a little bit more complicated, but it solves this structured concurrency story that people sometimes need. So you can use this to get some libraries that will do nice things with stuff you might wait on, right? So some of the features include: there's task groups. That's the thing I was describing with the parent-child relationship type of things with Trio. It has high-level networking, TCP, UDP, an API for byte streams and object streams, inter-task synchronization and communication, like locks and conditions and events and semaphores, worker threads, subprocesses, all kinds of stuff.
Starting point is 00:05:09 So you go over and you can sort of see some real simple ways for it to run. So one of the things that's sometimes not entirely obvious is how do you run something on asyncio? Because you've got to make sure you've got an asyncio event loop running. And if there's already one, you should call get loop. But if there's not one, you should create one.
Starting point is 00:05:29 And so this is just, you know, I have an async method, which can be a task, and just say, you know, trio.run. Or you can run and just say the backend is trio, which is pretty cool. So all sorts of cool stuff like that. And it just sort of simplifies working with these different things. If we go and look at the sockets example, you can just say async with await connect_tcp. And that allows you to do like await receive, await send, and so on. So some nice libraries that come out of AnyIO for doing TCP, UDP, all that kind of stuff.
Starting point is 00:06:03 You know, the things you would wait on. Yeah. So if you know you're going to use asyncio, would this buy you anything? I think that it has those additional higher level libraries for like talking to TCP and byte streams and stuff like that. And also the subprocess thing.
Starting point is 00:06:23 So I think it does have like some utility stuff on top of it. But it's pretty cool, you can say like await run subprocess, which is pretty cool. That's actually, that's really cool. Yeah, I've not seen this one before, and that one kind of makes me excited now. Yeah, that's cool. Yeah, nice. Cool. So not a whole lot more to say about it than that. But if those are the types of things you're doing, then, you know, come check it out. It's a cool library. Do you know what else is cool? I do not. Tell me about it.
Starting point is 00:06:49 Oh, I thought we were doing something else. Wait. Oh, yes. I've got one more thing to talk about before we move on because we have a different number of things. I'm not sure what we're slating. I'll slot it in here. So what else is cool is that this episode is brought to you by Datadog.
Starting point is 00:07:03 Thank you, Datadog, for supporting the show. They've been big supporters of Python Bytes for a really long time. So that's fantastic. Plus really great t-shirts. Exactly. They've got cool t-shirts. I mean, I definitely want to get one of those.
Starting point is 00:07:16 So Datadog does a lot of things. One of their things they're focusing on now is real-time monitoring. So they have a real-time monitoring platform that unifies metrics, traces, logs into one tightly integrated platform. Their APM empowers developers to identify anomalies, resolve issues, and improve application performance. We just finished the Talk Python episode talking about running production and everyone there on the panel was like, you need to make sure you're monitoring in production for things that change
Starting point is 00:07:45 in your performance profile, because you get too much data as your infrastructure changes, as the way your app is being used changes. It could hit these scenarios and run into problems that you would just never see in testing. So if you had Datadog APM, you would have caught it. So you can begin collecting stack traces, visualize them as flame graphs, organize them into profile types, such as these are the CPU metrics, these are IO and so on. Teams can search for specific profiles, correlate them with distributed traces if you're doing microservices and identify slower underperforming code for analysis and optimization. Plus with Datadog's APM live search, you can perform searches across the full stream of
Starting point is 00:08:25 ingested traces generated by your app over the last 15 minutes. So try Datadog APM for free with a 14 day trial. And if you do, you get that t-shirt that Brian mentioned. So just go to pythonbytes.fm slash Datadog or click the link in your podcast player show notes or in this chapter. Remember we talked about chapters and links. I'll have this have a chapter as well. So thank you, Datadog, for supporting the show. Now let's talk about your next item, Brian. Yeah, I think it's Vaex. Vaex? Vax? I don't know. Oh, people are gaining traction for the idea of putting a pronunciation on a GitHub repo for projects that are not obvious. I saw this on Twitter.
Starting point is 00:09:06 Let's do it. Let's make it happen. So this was suggested by Glenn Ferguson. This is a library that's a high-performance Python library for lazy, out-of-core data frames. Hmm. I don't know what out-of-core is. So I looked it up in a glossary.
Starting point is 00:09:26 After the FAQ. Yeah. Out of core typically refers to processing data that is too large to fit in the computer's memory. So, yeah, that's what this is. So for data processing, often you're trying to do some analysis, do some statistics, maybe explore the data a little bit, but you don't want to read it because they're huge data sets and you've got like maybe a limited computer. And so that's what this is set up to do. The main features of it, so you've got like big data sets, it has statistics like mean and sum and count and standard deviation, etc. But it also has some visualizations that are sped up from how they've sped things up and not kept things in memory.
Starting point is 00:10:11 And they're using memory mapping and some tricks inside to try to avoid any memory copies and try to do computation as lazily as possible. And this is actually pretty impressive. I was watching some of the demos. So there's a SciPy 2019 video where the person that started this library, which is now a company also, does a demo of this. And it's really impressive how fast things are. He's pulling things up. Because of the memory mapping, you can even have multiple, you know, multiple Jupyter notebooks.
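The memory mapping idea mentioned above can be illustrated with the standard library alone. To be clear, this is not Vaex's code or API, just a rough sketch of the general technique: the file is mapped rather than read, and the operating system pages data in only as each chunk is touched. The function `mapped_mean`, the file name `column.bin`, and the chunk size are all made up for the example.

```python
import mmap
import os
import tempfile
from array import array


def mapped_mean(path, chunk_items=10_000):
    """Average the float64 values in a binary file, one chunk at a time."""
    total = 0.0
    count = 0
    with open(path, "rb") as f:
        # Mapping the file does not read it; pages load when first accessed.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            with memoryview(mapped) as raw, raw.cast("d") as values:
                for start in range(0, len(values), chunk_items):
                    # Slicing a memoryview creates a view, not a copy.
                    with values[start:start + chunk_items] as chunk:
                        total += sum(chunk)
                        count += len(chunk)
    return total / count


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "column.bin")
    with open(path, "wb") as f:
        # Write the values 0.0 .. 99999.0 as native float64.
        array("d", range(100_000)).tofile(f)
    mean = mapped_mean(path)

print(mean)  # 49999.5
```

Because several processes can map the same file read-only, a scheme like this also hints at why multiple notebooks can look at one large data set without each holding a private copy.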
Starting point is 00:10:53 Yeah, that's it, multiple Jupyter notebooks looking at the same huge data set, and it doesn't slow things down even when things are working on it. It's pretty neat. So I definitely think this is worth checking out. One of the things on the readme that I like is the key features. So there's instant opening of huge data files, because it's memory mapping the data file. It actually doesn't do any reads when you open it, but when you pull some data out, it does lazy reads, jumps ahead, and it's pretty impressive. So this also has an expression system, so that it's kind of, there is a
Starting point is 00:11:28 little bit of a, so you can do lazy transforms of data. So that's neat. Out-of-core data frames, like we said, fast group-by and aggregations, a whole bunch. The fast and efficient joins are interesting. I was looking at another comparison of pandas and Dask and other things versus Vaex, and the joins of huge tables are pretty fast and seamless here, and those will blow up some projects. So, yeah, yes, it is similar to Dask. Somebody asked, lazy like Dask? Yes, but.
Starting point is 00:12:07 That's a good thing. Yeah. Oops. But it, yeah, a bunch of fun things. It's good to have, it isn't the same as Dask, so it's worth checking out to see if maybe this one might be a good fit for you. Yeah, it's cool.
Starting point is 00:12:23 It's the lazy that makes the magic, right? You don't have to load it all from disk. You can distribute it. There's all kinds of interesting things. And a billion samples, or rows, per second, that sounds pretty good. Yeah. Watching the demo, it's incredible how fast he's
Starting point is 00:12:41 popping up things and loading, even to be able to visualize things by pulling out samples out of the set. Wait a minute, Brian. I heard people told me that Python was slow, so it didn't make sense to do this kind of stuff with it. What's going on here? No, no, no. Python's fast. I know. Pick the right libraries.
Starting point is 00:13:02 All right. One of the things that is definitely well known in the Python world is Django. I've even had people tell me, I came to become a Django developer, and so I had to learn Python, which is a really interesting perspective. So I want to talk about the Django developer survey results for the 2021 survey because that just recently happened. So I'll highlight a couple of things that are interesting over here. One of the questions was, what is the main reason you use Django? Is it for work, personal, or for both? Only 15% said just for work. Does that seem like a lower number than you expect? Yeah. Yeah. I thought more people would just like, they'd go
Starting point is 00:13:37 to work and do Django and they'd go home and they'd, I don't know, watch Game of Thrones or something. But Django developers love it. And they use it a lot for all sorts of things. So by far, the biggest group here, 66% is for using it for both. So that stood out to me. Another one that's interesting is how many people are on the latest version. So web apps often sort of get stuck in the past because once you get them up and running, people don't want to touch it. But 75% of the people are using the 3.2, which at the time of asking, I believe was the latest version.
Starting point is 00:14:09 Okay. I'm like, I thought we were up to four now. What's going on there? Four is in beta. I'm not 100% sure. I don't think it's totally released, but yeah, remember this is from 2021. And then also Django has this concept of the latest stable release and then a long-term support release. So if you go to just the latest stable release and it's not LTS, you may have to upgrade sooner if you want security fixes and so on. And yet 71% of the people use the latest stable release because they're upgrading frequently,
Starting point is 00:14:38 I'm guessing. And then 27% are on the latest LTS and 2% are just like, how do I upgrade this again? I don't know, but that's pretty interesting. And then the next question was how often. So 44% of the people upgrade every stable release, other people less so, and it kind of breaks down. 5%: I use an unsupported version of Django. I'm okay with that. For databases, people doing Django have a very strong bias to use a relational database because much of the magic of Django depends upon the Django models, right? Like the admin section
Starting point is 00:15:11 is driven by that and so many things, and those are all relational. So with that in mind, the most common database, 77% of the time, is PostgreSQL, which is cool. And then does number two there surprise you, Brian? No, not really. SQLite. Yeah, if you've got very simple deployment stories, you're just going to put it on one server, not much data, you just need something relational, SQLite. Well, a lot of internal tools and stuff too. Exactly. I mean, you wouldn't run like a major tech company on SQLite and get away with it without scars and tears, but you know, for simple internal apps, that might just be what you need. You're gonna make some
Starting point is 00:15:50 SQLite enemies by saying that. But yeah, but if you had a hundred thousand users concurrently using SQLite, that might be bad. Oh, somebody else said, possibly because SQLite is the default setting. Yeah, certainly that's a big push. The other one is, do you do caching? So caching is another layer between the database and your web app where you get the database stuff back and then you stash it in memory somewhere so that you don't have to do queries again. So they said, do you do that? And if so, what do you use? 47% Redis, 43% I don't do that, and then the only other really notable thing is Memcached. So interesting there. And I guess people, if they're really interested, can come through and look. There's a lot of, I don't want
Starting point is 00:16:39 to go through it because there's so many details, but it's like, what are your favorite components? Like models or admins or auth. Or what contrib apps do you find most useful? Like humanize or whatever. So pretty interesting. No surprise, people are using Django templates, not Jinja, as their main templates. And then look, it's a race between pytest and unittest as the top two most common frameworks. With pytest above unittest, that's pretty cool,
Starting point is 00:17:07 especially since unittest is the default. Yeah, yeah, absolutely. Let's see, I'll just wrap it up with some front-end stuff. What JavaScript front-end frameworks do you most use? jQuery, number one. And I don't mean that in a negative way. Like sometimes you've just got some simple problems and you don't need a whole CLI to build a SPA to, like, you know, focus the text box. All right, React is tied at 37
Starting point is 00:17:30 as well, and then Vue, and then Angular, and then, wow, htmx made the list. Look at this. That's pretty cool. Actually, that's brand new shininess getting in there. That's pretty cool. But still, yeah. And then CSS, we got Bootstrap way out front and then Tailwind and then pure CSS. All right. So that's the survey results. Pretty interesting. Nice. All right.
Starting point is 00:17:51 What do you got next for us? Next, we've got more extras. We've got extras. Okay. Yes. Extra, extra, extra, extra, extra, extra. So I've got so many extras, I decided to make it one of my topics. Brian, got anything else before I go on another rant?
Starting point is 00:18:09 No, I'm just ready to listen to all these extras. All right, I got a bunch of good stuff. So don't let the bad guys into your web apps. Django just had security releases for 4.0.1, 3.2.11, and 2.2.26. Oh, does that mean 4.0 is out? Yeah. It does look like 4.0.
Starting point is 00:18:28 Nobody's using it. Yeah, well, they didn't use it in the past when it wasn't out. Paul Everitt and I teamed up to create a course over at Talk Python called Static Sites with Sphinx and Markdown. So this course is free. Everyone can go take it. All you got to do is have an account and go here and it teaches you how to do Markdown and Sphinx and generate static sites.
Starting point is 00:18:48 There's a cool little demo app that we build over here that you can go and do search and look around and see how you document your code and do all kinds of stuff. It's nothing too complicated, but sort of neat to see how to use Markdown with Sphinx because typically Sphinx is about reStructuredText. So check out the course over there. I'll put that in the show notes. I'm going to definitely check that out because I've got a project that I wanted to use Sphinx for, but I was a little
Starting point is 00:19:13 intimidated. So cool. Yeah. Paul does a great job with it. So, and it's only an hour and 25 minutes or something. So it's, it's not a huge investment in time. Something that's bothered me basically ever since USB-C, what is this, four years or something, is I need more ports on my computer and I want them to be USB-C ports because I have USB things these days because I want them to go into the ports that I already have. Until Thunderbolt 4, you've not been able to get a dock that has more than one USB-C or Thunderbolt port, which is super weird to me. But recently they've come out with Thunderbolt 4 and I just got this thing called the CalDigit Thunderbolt 4 USB 4 Element Hub.
Starting point is 00:19:55 Oh man, this thing is fantastic. Brian, I'm talking to you on my computer here and I have my 4K monitor, my 1080p camera, my microphone, my stream deck, the lights, keyboard, mouse, track, like seven different things, including the monitor plugged in with one cable through this thing. That's really pretty cool. And so sweet.
Starting point is 00:20:13 So basically it has on the front, it has three USB-C Thunderbolt 4 and a power in. And then on the side, it has the Thunderbolt that goes to the computer and then also four USB, high-speed USB-A, but the good ones. So really, really cool if you need to expand out your new-ish computer. What are you using to plug into the monitor then? I have a Thunderbolt to DisplayPort adapter.
Starting point is 00:20:43 And so that way, if I come with my new MacBook, I can just unplug one thing from my mini, click it over, and then boom, I'm ready to go. Everything's configured. I'm going to get one of these then. Yeah, they're not super cheap. They've been out for about four or five months, but they've been sold out. Supply chains.
Starting point is 00:20:58 You know, what time of, what's going on with supply chains, everything. But they finally came out, they're on Amazon. So I linked to it over on Amazon. I also linked to this video that by Doc Rock talking about like, what the heck is this thing? And why is it different? All right.
Starting point is 00:21:12 I also tweeted about how we use the stream deck to do our live stream, which was fun. So I shared a bunch of pictures of that, like how we like put the website. So it says how it's streaming, how we tweet automatically, how we do this sharing and all that kind of stuff. I'm now going to be working on how to use that thing for software development. Like how do you use it for Jupyter notebooks? So every button on
Starting point is 00:21:35 the stream deck, which is 14 free buttons, basically like how, what are the 14 Jupyter operations you'd like to have? Like run all cells, give you a button or, you know, format with black could be a button, all sorts of stuff. So very cool. Oh, you just have a black button with no logo. Yes. Yes.
Starting point is 00:21:52 That should absolutely be black as well. So anyway, if people are interested in that, that's there. I did a talk at PyBay quite a while ago. Now the talk is out. Carson was kind enough to retweet that and pointed out that, hey, the talk is actually out. So I'm linking to my PyBay talk, which was an in-person talk at a conference. Imagine that. Wow. In San Francisco. That was really fun. People can check that out. Speaking of conferences, we are
Starting point is 00:22:19 a media sponsor of Python Web Conference, and so you can definitely check that out. This is, honestly, becoming one of the bigger online conferences. It's five days, all day. You know, a lot of these online conferences are like, oh, half day, a little thing here. So a lot of tracks, a lot of things going on with the Python Web Conference.
Starting point is 00:22:35 I'm also speaking there as well. Are you speaking there? No. I ought to get you pytesting something up there. Well, I should probably do some web stuff. Yeah, absolutely. And so there's a code that you can use.
Starting point is 00:22:48 It's in the show notes, Python Bytes at pwc2022, and I'll give you 15% off. Also, in our neighborhood, sort of, because it's virtual, does that still have meaning? We have PyCascades coming up, and PyCascades is February 5th and 6th. So that's going to be remote. So it's not really local until things settle down.
Starting point is 00:23:10 So people wherever can take it. Well, it's in our time zone. So it's good. That's true. Time zone still matters. It absolutely still matters. All right. That's it for all of my extras.
Starting point is 00:23:20 And we have Patrick out in the audience pointing out PyCon Italy is also happening in June, so that's fantastic. Yeah, awesome. So before we get to the joke, I wanted to like ask you this brain teaser that, like, my daughter brought, on the spot. So she,
Starting point is 00:23:38 yeah, I'm totally putting you on the spot. So she came home, I don't know, she's in junior high. She came home and she said, we have this cool brain teaser. I just want to ask you, just tell me what you think.
Starting point is 00:23:50 So it's a math problem. So I'm going to go out, I'm going to buy a baseball bat and a baseball. The total for both the baseball bat and the baseball is a dollar ten. It's pretty cheap. Yeah.
Starting point is 00:24:08 The difference is, the baseball bat is a dollar more than the baseball. So how much is the baseball? So you don't have to answer it right now, but it tripped me up for a little bit. I'm like, why is this difficult? It turns out it's like five cents, because five cents plus a dollar five is a dollar ten. But my brain went, it's a dollar. It's $1.10. That's what mine said, yeah. Yeah, but that's a 90 cent difference. I don't know why this is difficult,
Starting point is 00:24:30 but it's a fun brain teaser to ask people. Indeed. Ha, funny. Very cool. Well, you know what else is funny? That feeling of joy that we get as software developers, but is mixed in with, I kind of remember myself
Starting point is 00:24:46 screaming at my computer yesterday, like out loud, because something was so frustrating. I was just like, how is it possible that this is not working? Like what is going on? It wasn't actually about programming. It was with some app or something.
Starting point is 00:24:58 It was, it's always something else. Yeah. Yes. Sometimes it's my fault, but anyway, so the joke is expressing that feeling and the sticker says, I hate programming. I hate programming. I hate programming. It works. I love programming. This is amazing. It's like childbirth. Like you forget all the horror and pain. Like, oh, look at my amazing app. Like, do you remember that you cried for two days because like you couldn't get it to query the database right in production? Yeah. And you love, right, but you love it because now it works. So I love this. I was there this morning. I was like fighting Jenkins, trying to create a Jenkins job with four repos and different branches, and I just, like, I hate Jenkins. But if it works, or when it works, when it works,
Starting point is 00:25:46 I'm like, sweet. I am the smartest person in the world. I'm ready to do this all the time. Fantastic. All right. Well, it never ends. It never ends. We've been doing this a long time and we still have these feelings, don't we?
Starting point is 00:25:56 Yeah. So, uh, well, yeah. So thanks a lot, Michael, for, uh, another great show. Yeah. You as well. It's always fun. Thank you everyone for listening. Catch you all later.
Starting point is 00:26:07 Thanks for listening to Python Bytes. Follow the show on Twitter via at Python Bytes. That's Python Bytes as in B-Y-T-E-S. Get the full show notes over at PythonBytes.fm. If you have a news item we should cover, just visit PythonBytes.fm
Starting point is 00:26:22 and click submit in the nav bar. We're always on the lookout for sharing something cool. If you want to join us for the live recording, just visit the website and click live stream to get notified of when our next episode goes live. That's usually happening at noon Pacific on Wednesdays over at YouTube. On behalf of myself and Brian Okken, this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and colleagues.
