Python Bytes - #174 Happy developers use Python 3

Episode Date: March 26, 2020

Topics covered in this episode:
* Quick chat about COVID 19.
* Documentation as a way to build Community
* The Django Speed Handbook: making a Django app faster
* dacite: simplifies creation of data cla...sses from dictionaries
* How we retired Python 2 and improved developer happiness
* The Troublesome Active Record Pattern
* Types at the edges in Python
* Extras
* Joke

See the full show notes for this episode on the website at pythonbytes.fm/174

Transcript
Starting point is 00:00:00 Hello and welcome to Python Bytes, where we deliver news and headlines directly to your earbuds. This is episode 174, recorded March 18th, 2020. I'm Brian Okken. And I'm Michael Kennedy. And this week, this episode is brought to you by Talk Python courses and the pytest book. Yeah, yay, it's brought to you by us. Yeah, us. More about that later, huh? Yeah. So we're doing something a little different.
Starting point is 00:00:29 We're recording in two different locations because of, actually, we always record in two different locations. But the locations are sometimes not the location, especially the location you're in. Yeah. I often record somewhere else, but today I'm at home because a lot of people are at home working remotely in home offices now because of, I don't even know how to pronounce it. I read it, COVID-19. Yeah. It is an insane time on so many levels.
Starting point is 00:00:54 But I would say certainly there's a lot of tech people out there who may be working from home for the first time. I know there's a lot of large companies that feel like you need to be in the office to do the work. And yet a lot of the tools that we use as developers are very suited to the situation that many of us around the world find ourselves in, working from home, working asynchronously and whatnot, right? GitHub, Slack, email, Zoom, whatever it is. It's interesting to see the rest of the world scale up to, you know, what we've been doing for a long time. We were lucky that our office was recently moved, in July, and during the move we tried to set everybody up to be able to work remotely because some people had longer commutes than before. And it happened to be, I mean,
Starting point is 00:01:46 it's just fortunate that we set that up before this happened. And I'm also very fortunate that I'm a software worker. There's a lot of people that, I mean, our work can continue for the most part with little interruption, but it's a harder environment. But a lot of people that are not technical workers can't do that. Yeah. It's such a bummer. You know, my daughter, she just got a new job and she was supposed to start, actually, she was supposed to start today and they sent her a message, you know what,
Starting point is 00:02:14 we're closed, we're closed indefinitely and there's no reason for you to come and get trained to work here because who knows what it's going to look like in a month or two. I mean, that's the reality for a lot of people. It's rough. One of the reasons why we started talking about this this morning is just to reach out to everybody and say, yeah, I hope everybody's doing okay. And let us know some stories if you want to share. Yeah, maybe some interesting tech angles, right? Like problems you run into or things you found that really worked or whatever. But yeah, everyone out there, be safe. It's not always fun,
Starting point is 00:02:49 but just find a place to hole up and just wait this thing out and be safe. Yeah, that's a good idea for some extra things related to that. I'll add one of these, on our add-ons at the end, I'll add one to that. All right, super.
Starting point is 00:03:03 Well, I want to start out by talking about community. I was partly thinking about this because of the coronavirus stuff, and a lot of people possibly have maybe two extra hours in the day because they're not commuting, maybe. Or, you know, I'm sorry if you have a two-hour commute or an hour commute on each end, but you might have some extra time. So one of the things you might want to share and spend some time doing
Starting point is 00:03:26 is beefing up documentation on open source projects. There actually was a great article called Documentation as a Way to Build Community by Melissa Mendoza. I think Mendoza, sorry, Melissa. But it talks about how educational materials can have a huge impact and effectively bring people into a community. And beefing up the documentation story on open source projects can actually help bring more people to use it. I mean, it seems obvious, but it isn't really.
Starting point is 00:03:58 And people aren't doing it. There's a lot of projects that lack really good documentation. And there's a lot of reasons for that. And talking about the reasons, I think it's interesting. Development is decentralized, and a lot of projects start with just somebody scratching their own itch, and they don't need documentation for that, but it grows into other people getting involved. And for a lot of people, it's more glamorous to add new features or fix a nasty bug than to add more documentation. Nobody really kind of knows how to do that.
Starting point is 00:04:30 I think it's important and spending some more focus. One of the directions of this article says it was targeting a specific project, but I think it really can be really more than just this one is splitting up the documentation into organizing it as like in four different areas, tutorials, how to's, reference guides and explanations. These four areas and subsections of those can be targeted towards different people targeted towards beginners or advanced people or somebody just looking something up. One of the great things about that is it makes it easier for somebody to jump in and say, oh, there's like one little piece of things, how to do something. I can contribute to that. I might not know why it works, but I can contribute a how to and some tutorials. Whereas maybe some of the more expert people in the project can do some of the explanations of
Starting point is 00:05:20 how things are working. And also a lot of teams kind of shift, or some projects have the new people come in and say, hey, you want to help out? Want to write documentation? And I think that's a great thing. But then you've got documentation that's just filled with content from beginners and might not have input from some of the experienced people. And so I think there's some good information here,
Starting point is 00:05:43 and I think focusing on documentation might be a good thing. I like the article. I like the idea of it, right? That you can build a community. Certainly you can contribute to these projects quite easily in this way. Breaking it up into these categories is really clever because then you can definitely just sit down and think, oh, I'm going to write some docs for this thing. Well, that's pretty wide open, right? But I'm going to write a short tutorial, which I had to learn because I had to use this thing. And now I know how to do that.
Starting point is 00:06:10 Why don't I generalize it and make a tutorial? That seems like a really easy way to get yourself on the contributor list, beef up your resume, say I contributed to this project, et cetera. I think it's good. One of the things I'd like to reach out to people, some of the beginner stuff,
Starting point is 00:06:23 a great thing to do is while you're learning a project isn't writing new content. But while you're reading documentation on a project, if there's typos, if there's just grammar errors, it may have been written by somebody that isn't native English. So you can help out by just fixing some of those things. And then also while you're going through things, if you stumble on something and it's difficult to follow the instructions, it might be that the instructions need to be modified. And why not just do like a pull request of modifying those instructions to be the way it really works. And I think that'd be cool. Yeah, that'd be great. You know, another area that might be interesting is to write tests. Yeah, definitely. A lot of projects lack tests or you know
Starting point is 00:07:05 they're just marginally tested and you're like, well, okay, I'm going to create this tutorial and I want to make sure the things I'm saying work. So let me add some tests to verify that what I believe to be true is true, and go ahead and commit that back to the project. Yeah, and modifying tests: if the tests are not readable, they should be, and maybe you can make them more readable. Yeah, I guess I kind of started thinking about that because tests feel a little bit like a form of documentation. Yeah, definitely. Yeah. Well, cool. Well, I'm pretty passionate about fast websites. As you probably know,
Starting point is 00:07:37 I talk about trying to make websites fast all the time. Our website's pretty fast. Speed is important; slow websites push people away. They do. I think it was Amazon or somebody did a study saying, like, every 100 milliseconds of perceived latency to the user has a very tangible, like whole-number percentage, drop in actual sales. Yikes. Yeah. Sales are not the most important thing necessarily. Maybe if you're Amazon they are, but it just gives you a sense of, like, well, a hundred milliseconds, you can barely perceive that as a person. And yet as those things add up, right, it starts to really make a difference in behavior. Yeah. So I want to talk about this article,
Starting point is 00:08:17 sort of riff on some topics covered in the article, more or less. It's called The Django Speed Handbook: Making a Django App Faster, by Shibel Mansour. Now, the title has Django, and some of the examples are really about Django, but this actually applies to most websites and Python websites and whatnot. So if you do Flask, I think this will still be super relevant. The first thing, though, that I want to point out is actually a Django thing, and it does appear at least in Pyramid as well. So in Django, there's a thing called the Django Debug Toolbar, and it lets you explore the different requests and see how long they're taking. You can even get in there and look at the ORM calls and what's happening. So that's pretty awesome. Pyramid has this as well. You can actually
Starting point is 00:09:06 see the SQLAlchemy calls going to the database and the timing and how many database queries there even are on a given page. It's pretty ridiculous to be able to use that to analyze what you're doing. It's almost like you've attached a little debugger profiler all the time and it's just right
Starting point is 00:09:22 there. That's cool. Do you have to turn it off then? Well, when you go into production, you don't include it in the settings, like the run settings for production, obviously. Right. Got it. That would be bad. But some of those settings, even in debug mode, you have to turn them on. I'm not sure about the Django one, but for the Pyramid one, the profiler is definitely not on by default because that'll slow it down a little bit. But you can click a box and then go do the request again. Okay. All right. So that's a real quick and easy way just to see what your app is up to. Then one of the things you really want to pay attention to, and this is going to be a bit of a theme on today's show, is talking to databases. So when you're working with an ORM or just talking
Starting point is 00:09:58 to the database, specifically here the Django ORM, but this is super relevant for SQLAlchemy as well, you want to be really careful of the so-called N+1 problem, which happens when you navigate relationships. So for example, let's say I'm going to show categories of books, and the category has a books relationship, or maybe there's some other thing like that. I get all the categories back and I want to tell you how many books are in each one or something. As you go through the things that come back, you end up doing one query for each property that you access on each instance of that object. So if you do a query that returns 20 things, you might end up talking to the database 21 times. It's a common problem in ORMs, but it also has an easy fix, which is why that debug toolbar is cool, because
Starting point is 00:10:51 you could turn it on and say, oh look, why are there 24 queries on this page? Right? I feel like I did one. Well, sort of. So you can use select_related and prefetch_related, and it'll basically join or pre-query those related objects together in one massive query, so you don't actually go back to the database N+1 times. Okay, nice. Yeah, and that's a big deal. And, you know, SQLAlchemy has joinedload and subqueryload, with which you can basically accomplish the same thing. So he's got a cool example of not a huge database, but using these two methods in the Django ORM and going 24 times faster.
Starting point is 00:11:28 Oh, wow. Yeah. Right? I mean, it's basically not changing the code at all, except saying, you know, I'm going to use this related property. So just query that as part of the query instead of doing however many queries you're going back for. Really, really nice.
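To make the N+1 discussion concrete, here is a minimal sketch using only the standard library's sqlite3 module rather than the Django ORM. The `category` and `book` tables, the `run` helper, and the `stats` counter are all hypothetical names invented for this illustration; the point is only to show 21 queries collapsing into one, which is roughly what a join-based fix in an ORM aims for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        title TEXT,
        category_id INTEGER REFERENCES category(id)
    );
""")
# 20 categories with 5 books each (hypothetical sample data).
conn.executemany(
    "INSERT INTO category (id, name) VALUES (?, ?)",
    [(i, f"category-{i}") for i in range(1, 21)],
)
conn.executemany(
    "INSERT INTO book (title, category_id) VALUES (?, ?)",
    [(f"book-{i}", i % 20 + 1) for i in range(100)],
)
conn.commit()

stats = {"queries": 0}  # counts round trips made through run()


def run(sql, params=()):
    """Execute one query and record that we talked to the database."""
    stats["queries"] += 1
    return conn.execute(sql, params).fetchall()


def counts_n_plus_one():
    """One query for the categories, then one more per category."""
    result = {}
    for cat_id, name in run("SELECT id, name FROM category"):
        rows = run("SELECT COUNT(*) FROM book WHERE category_id = ?", (cat_id,))
        result[name] = rows[0][0]
    return result


def counts_joined():
    """The same answer from a single joined, grouped query."""
    rows = run("""
        SELECT c.name, COUNT(b.id)
        FROM category AS c
        LEFT JOIN book AS b ON b.category_id = c.id
        GROUP BY c.id, c.name
    """)
    return dict(rows)
```

Calling `counts_n_plus_one()` goes through `run` 21 times for 20 categories, while `counts_joined()` goes through it once and returns the same dictionary.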
Starting point is 00:11:41 Related to that is indexes. So if you're not thinking about and using indexes, you should be. I mean, it's easily a thousand times faster to do a query against a lot of data with an index versus without. And then if you've got these joins, it's even better, you know. So super important. But do be aware that indexes make writes slower. Most websites don't write data like crazy, although some APIs do, so it's usually not as big of a problem. But just be aware that writes are slower with indexes, but queries are much, much faster. Another thing they talk about, which is really helpful, is using pagination, where instead of saying, here's
Starting point is 00:12:24 a thousand items, here's 50, and you can ask for the next 50 and the next 50 and so on. That's super easy to do with the Django ORM or SQLAlchemy or anything like that. So that's a really good one. So does that often line up with, like, if your page only shows 50 things, only fetch 50 things? Yeah, exactly. And it's super easy to put in the query string, like page equals five, right? And then you just do a skip and a limit, or whatever the ORM you're using has for the skip-and-take type of thing. Right. So it's super easy. You can compute it yourself, but it makes a big difference. Right. Also, if you have long-running tasks, long-running things to do, make them background tasks in other processes, with Celery or something, or, if the person making the call has to wait on it, be sure to use async, right? So you're not blocking up everything.
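The "skip and limit" arithmetic mentioned here can be sketched in a few lines. This is an illustration with the standard library's sqlite3 module, not code from the article; the `item` table, `page_bounds`, `fetch_page`, and the page size of 50 are assumptions chosen to match the numbers in the conversation.

```python
import sqlite3

PAGE_SIZE = 50  # assumed page size, matching the "here's 50" example


def page_bounds(page, page_size=PAGE_SIZE):
    """Translate a 1-based page number (e.g. ?page=5) into (limit, offset)."""
    page = max(page, 1)  # treat page=0 or negative values as the first page
    return page_size, (page - 1) * page_size


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO item (name) VALUES (?)",
    [(f"item-{i}",) for i in range(1, 1001)],  # a thousand hypothetical items
)
conn.commit()


def fetch_page(page):
    """Fetch only the rows that the requested page will actually display."""
    limit, offset = page_bounds(page)
    return conn.execute(
        "SELECT id, name FROM item ORDER BY id LIMIT ? OFFSET ?",
        (limit, offset),
    ).fetchall()
```

A stable `ORDER BY` matters here: without it, the database is free to return rows in a different order on each request, and pages could overlap or skip items.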
Starting point is 00:13:17 Yeah. Another super easy way to make things fast, and many of these things we're doing at PythonBytes.fm and the other websites, is to turn on GZIP. So you can just go to, like, Nginx or whatever your web server is and say GZIP the response. He's got a really simple example here where the response size of the page and the CSS and whatnot is nine times smaller by just adding the GZIP middleware to Django. Oh, wow. Yeah. I wouldn't actually add it to Django if this was me. I would add it to Nginx because that's the outer shell web server. Just let it do it. You're probably not talking directly to the server running Django. But anyway, somewhere along the way, GZIP your content, because that'll be big. Similarly, minify your static files and bundle
Starting point is 00:14:07 them and cache them and all of those good things, right? Okay. There's some cool libraries that he talked about in there. I think it was called White Space. I'm pretty sure it's called White Space, which they're using in Django to minify and bundle the files. So we don't use White Space, and we don't use Django. We use webassets, cssmin, and jsmin, which are three awesome Python libraries, to bundle that. So if you go and look at Python Bytes or Talk Python or any of those sites, you can see that there's a packed CSS and a packed JavaScript that has probably 20 CSS files
Starting point is 00:14:41 that's smushed into one with those things and minified and whatnot. So that's pretty cool. There's two ways to measure page performance. One is like how fast is the server responding, right? But that's not the most important thing to the user. The most important thing is how does it feel to them. So Google has this thing called PageSpeed, which they're even using for measuring like your SEO ranking.
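Going back to the GZIP point from a moment ago, a rough feel for why compressing responses helps can be had with the standard library's gzip module. This is only a sketch: the repeated HTML fragment below is made up, and real pages will compress by different ratios than this deliberately repetitive sample (the nine-times figure in the conversation comes from the article's own example, not from this code).

```python
import gzip

# A deliberately repetitive, made-up HTML fragment standing in for a page.
html = (
    "<li class='episode'><a href='/episodes/show/1'>Episode</a></li>\n" * 500
).encode("utf-8")

compressed = gzip.compress(html)

# How many times smaller the compressed payload is for this sample.
ratio = len(html) / len(compressed)

# Compression is lossless: the client gets the identical bytes back.
restored = gzip.decompress(compressed)
```

In practice the web server or a middleware layer does this per response, so application code rarely calls `gzip` directly.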
Starting point is 00:15:04 So put your website into there. I have a link for Talk Python Training's ranking. I spent three days straight getting it from like 40 out of 100 to 99 or 100 out of 100. But it was quite the journey. So that took a while. You can measure it for both mobile and desktop, and it has slightly different rankings. Also shrink your images with ImageOptim,
Starting point is 00:15:29 which works for Mac OS and Linux. It doesn't work on Windows, but there's some really great options there. And it'll basically do completely lossless compression of your images. So they might be like 40 or 50% smaller. And visually you could not, you literally couldn't distinguish them. Interesting. Yeah. Yeah. And then the last recommendation is lazy load your images. This is not something I've
Starting point is 00:15:52 really explored, but apparently images in Google Chrome now support a lazy loading attribute. Oh, nice. Yeah. And then for things that don't support it, there's a lazy load JavaScript library. Basically, for your images, you say, as it scrolls into view, download it, but if it's off the page and you never scroll, then it'll never load. That's great. Yeah, pretty clever. So this is just some of the things covered in that article. So if you're out there and you're like, I need to get my site to go faster, it cannot be three seconds per page load, that's ridiculous, start looking through some of these things. It'll really help, especially if you're using Django, but even if you're using some other Python framework, I think it'll still be quite relevant. Yeah. Most of these are relevant to any,
Starting point is 00:16:31 any web stuff. Yeah. Yeah. They're super, super general. Like some of the libraries they talk about plug into Django. So it's kind of a little extra boost if you're doing Django, but yeah, this is relevant to everyone. Yeah. All right. What do you got next? Well, this actually came in to us as a listener suggestion from the author of the library. So this is like JIT podcasting, right? Yeah, it just came in this morning, and I love it. It's from Konrad Hałas, I think. It's called dacite, D-A-C-I-T-E. I don't know how to say it.
Starting point is 00:16:59 But it's cool. It simplifies the creation of data classes from dictionaries. So when I first heard about it, I was thinking, I'm using data classes all the time now because I really like them. There's a lot of cool aspects of them. You can have default values. I really like that I can easily exclude some of the fields; you can take them out of the comparison,
Starting point is 00:17:20 so some objects can be equal even if they're not completely equal, sort of thing. And I love that aspect. And there's a whole bunch of other cool stuff about them. So I'm using them more and more, but the data we get from databases and whatever often gets converted to dictionaries and not to data classes. So this is a little library that basically has one function, called from_dict, that converts dictionaries to data classes. And my first reaction was, I can already do that, if you do the star-star or the double splat. The dictionary-to-keyword-argument type of thing.
Starting point is 00:17:58 Yeah. I mean, you can do that for simple data classes and simple dictionaries. That works just fine. But I looked into this more, and this from_dict from dacite allows you to do nested structures. So you can have a data class with another data class field, and lists or tuples of data classes as some of the types. You can do unions in there, collections, nested structures. It even has this thing called type hooks, which allows you to have a custom converter for certain types of data that come in. So his example is, like, for all the strings, lowercase them or something like that. But you can definitely have that for certain types.
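The limit of the double-splat approach with nested data can be seen with the standard library alone. The sketch below does not use dacite; the `Address` and `User` classes and the sample dictionaries are hypothetical. It shows that `**` works for a flat dictionary but leaves a nested dictionary unconverted, which is the gap a call like dacite's `from_dict(data_class=User, data=nested)` is meant to close.

```python
from dataclasses import dataclass


@dataclass
class Address:
    city: str


@dataclass
class User:
    name: str
    address: Address


# Flat case: the double splat is all you need.
flat = {"city": "Portland"}
address = Address(**flat)

# Nested case: construction succeeds, but nothing converts the inner dict,
# because dataclasses do not check or coerce field types at runtime.
nested = {"name": "Brian", "address": {"city": "Portland"}}
user = User(**nested)
inner_is_still_dict = isinstance(user.address, dict)

# Doing it by hand means repeating this for every level of nesting.
user_fixed = User(name=nested["name"], address=Address(**nested["address"]))
```

The manual version is fine for one level, but it has to be rewritten whenever the shape of the data changes.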
Starting point is 00:18:41 It's pretty neat. Oh, that's cool. Or if you got like some kind of string that's a date time, you parse it out of an ISO string or whatever. Yeah, that's a good example, actually. That's cool. So one of the things that messes you up on my example of just taking a dictionary
Starting point is 00:18:56 and expanding it as arguments to a data class constructor is that it doesn't really work if all the names don't match up. But this one allows you to have, if your data class only has a few fields, but your dictionary has like tons of stuff in there, by default, it just ignores the stuff that doesn't match up. And so if you've got like a name and an ID, and there's names and IDs coming from the dictionary, but there's also like a whole bunch of other things like URL and stuff like that.
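The extra-keys problem is also easy to reproduce with plain dataclasses. In this sketch, `Podcast`, the payload, and `from_dict_ignoring_extras` are all hypothetical; the helper only imitates the "ignore what doesn't match" default described here for flat data and is not how dacite itself is implemented.

```python
from dataclasses import dataclass, fields


@dataclass
class Podcast:
    name: str
    id: int


payload = {"name": "Python Bytes", "id": 174, "url": "https://pythonbytes.fm/174"}

# The double splat fails as soon as the dictionary has a key the class lacks.
try:
    Podcast(**payload)
    splat_error = None
except TypeError as exc:
    splat_error = exc


def from_dict_ignoring_extras(cls, data):
    """Hypothetical helper: keep only keys that are fields of the dataclass."""
    wanted = {f.name for f in fields(cls)}
    return cls(**{key: value for key, value in data.items() if key in wanted})


podcast = from_dict_ignoring_extras(Podcast, payload)
```

A strict variant would instead raise when `data` contains keys outside `wanted`, which is the behavior the strict mode in the conversation describes.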
Starting point is 00:19:25 It just ignores that. That's the default, but you can also turn on strict mode that says, no, I expect it to match up directly and I want a warning. And then there's a whole bunch of exceptions that get raised if something goes wrong in the conversion. And I'm just excited to use this because it's a really cool tool to convert data to data classes. It's nice. Yeah, this looks super nice it's one of those things that seems to automate like the crummy part of programming right like
Starting point is 00:19:52 I'm getting this data submitted to me from an API or from somebody calling my API, and who knows what they're sending me, but as long as this thing lines up, right? Right. I tell it these fields are not optional or this type has to be such and such. If that works, then we're good. Otherwise, you know, tell them 400, that didn't work, or the file couldn't be loaded or whatever it is. And there's definitely, so Konrad made a point in the documentation to say that it is not a schema validation library. That's not the intent of it. It is really just intended for the conversion. So especially with external APIs,
Starting point is 00:20:25 I think combining this with a schema validation is a good idea. But you could definitely go from schema validation to this and have data classes in the end. It'd be great. Yeah, it's a cool project, and I love how it leverages the brand new Python stuff, the data classes. Anyway, we should plug ourselves as sponsors. Well, we should definitely let people know about what we're doing, right? So
Starting point is 00:20:50 you've got this book on testing or something? I actually kind of love that I had some feedback early on when the book came out. Python Testing with PyTest is the book that I'm talking about. And it did come out in 2017, the end of 2017. And I got some really great feedback from people saying they really loved following the book on this podcast. And I apologize for the lawnmower in the background. If it goes through, I wanted to point out that I had a couple of people ask me, it came out in 2017. Is it still valid? And I want to take the time to say yes, it is. The intent of the book was never to be a thorough, complete inventory of everything you can do with PyTest. It was a quick, what are
Starting point is 00:21:32 the like 80% of PyTest that you're going to use all the time? And that is the core of PyTest and how to think about it. There is new goodies that have been added since 2017. And it's good to check those out. But you could run with what's in this book and still be very productive. Nice. It's definitely made me more productive and better with PyTest. So that's great.
Starting point is 00:21:52 Thank you. Yeah, you bet. And I also want to tell people about the courses that we have over at Talk Python Training. We've got a bunch of new ones we've been releasing. I do try to let you know when the new ones are out, but we've got like 120 hours of Python content over there, with a bunch of projects that you can do. The 100 Days of Code courses all have, like,
Starting point is 00:22:09 projects for every single day for 100 days. And, yeah, so just check them out. We're going to release a couple new courses soon, and I'll be sure to let you know. But yeah, support us by checking out our work, right? Yeah, I want to tell people one of the things I love about the Talk Python courses is there's a lot of content there. And I'm a busy person, and sometimes it's overwhelming to me to look at a course to say it's like 12 hours of content on a course or something like that, six hours or something even. The way that you've got it set up with bookmarks into separate videos and different topics, the outline of the courses are so incredible that if you really need to just jump to the right place to learn something, you can do that. And even though you can just watch them in series and just watch the whole thing, you can do that, of course.
Starting point is 00:22:59 But being able to jump around and go back and use it as a reference is a great thing. So thanks. Yeah, thanks. Yeah, we definitely work hard on making that a possibility. So I appreciate that. Now, do you know what the Python clock reads right now? Oh, I haven't checked. What does it read?
Starting point is 00:23:15 It reads 0, 0, 0, 0, 0, 0. The Python clock bell has tolled for the folks who have to convert. This next thing I want to share with everyone comes from LinkedIn and Barry Warsaw. Barry's been part of Python for a very long time, doing a lot of cool stuff there, and he was on the team that helped LinkedIn move from legacy Python to modern Python. Okay. Yeah, so it's called How We Retired Python 2 and Improved Developer Happiness. So a couple years ago, 2018, LinkedIn started working on this multi-quarter effort
Starting point is 00:23:55 to transition to Python 3. So maybe some of the lessons from here will help people out there who haven't actually migrated all the way to Python 3. That'd be good, right? Yeah. So basically they said they did an inventory and they found they have 550 code repositories they had to migrate. That's a lot of different projects, and some of them depend on the others. So they said, look, Python is not the thing powering our main web app. I think it's Java.
Starting point is 00:24:27 I'm not 100% sure, but anyway, it's not their main thing. Instead, there's a bunch of independent microservices and tools and data science projects that are all using this. So their first pass at getting all those different things migrated was to say, we're going to have a bilingual philosophy for Python, meaning it'll run on two and three at the same time. Okay. Legacy Python: so I depend on a library that requires Python 2, therefore everything that I build that depends on that library must also be Python 2, right? So this bilingual thing that they did was to prevent that blockade, right? So anyone who wants to build new stuff on Python 3 could still use the libraries and do so. That was the plan. They actually had a whole team that oversaw this effort across projects, across like thousands of engineers, called the
Starting point is 00:25:31 Horizontal Initiatives program. So that was to, like, kind of address that across all these different projects. And then in phase one, first quarter 2019, they went and found the most important repositories, the ones that were at the bottom if you put them into a dependency graph, and they said, we're going to port those to Python 3 first because they're blocking everything else. And then they kind of finished it off in the second half of 2019. So they basically said, all right, now we got the foundation done, we can start upgrading the libraries that depend on all these lower-level bits. And then they said, looking back, you'll like this part, Brian. They said our primary indicator for knowing that the migration was done, that we were all right,
Starting point is 00:26:14 was that our builds passed and our tests ran and everything was okay. And then eventually they went through and said, all right, we're going to turn off the ability to run Python 2 type of tests in continuous integration. Now let's see what keeps working. Oh, yeah. Okay. Yeah. So one of the things you can imagine is important is having tests, right? Because if you don't have tests, CI/CD doesn't tell you a lot. It just does the CD part, for better or worse. Yeah. So they said, look, here's some guidelines for other organizations who are on similar paths but earlier on. They said plan early and engage your organization's Python experts. Find and leverage champions in the affected teams and help them promote the benefits of Python 3 to everyone.
Starting point is 00:26:59 Adopt this bilingual approach so people can at least begin if they want to go to Python 3. Invest in tests and test coverage, code coverage, because these will be your best metrics of success. And then finally, ensure your data models explicitly deal with what used to be one thing, bytes and strings in Python 2, and now is, of course, two totally separate things. Like, they said that was really the biggest challenge that they ran into, is making that distinction correctly. Yeah, those are a hurdle.
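The bytes-versus-strings distinction that made this migration hard can be shown in a few lines of Python 3. This is a generic illustration, not code from the LinkedIn write-up; the sample text is arbitrary.

```python
# Data arriving from a socket or a file opened in binary mode is bytes.
raw = "café".encode("utf-8")

# Text is str; converting between the two always names an encoding.
text = raw.decode("utf-8")

# The two lengths differ: bytes counts encoded bytes, str counts characters.
byte_length = len(raw)
char_length = len(text)

# Mixing the two types is an error in Python 3 instead of a silent coercion.
try:
    "name: " + raw
    mixing_error = None
except TypeError as exc:
    mixing_error = exc
```

Being explicit about where decoding happens, usually at the edges of the program, is what "ensure your data models explicitly deal with this" amounts to in practice.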
Starting point is 00:27:32 Are you guys all upgraded? Yeah, it was a library that we were using that didn't support Python 3 yet. The reasoning was the library talks to a DLL that has C++ strings or C strings. And old Python strings converted just fine, but they don't now. Unicode fancy ones, yeah. Not so easy.
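The DLL case described here typically shows up through ctypes, where a C-style `char *` accepts bytes but not Python 3 text. The sketch below uses only `ctypes.c_char_p` and does not load any real library; the variable names and the sample value are made up for illustration.

```python
import ctypes

name = "pytest"

# A C string parameter needs bytes, so the text is encoded explicitly first.
c_name = ctypes.c_char_p(name.encode("utf-8"))
round_tripped = c_name.value.decode("utf-8")

# Handing over a str directly is rejected in Python 3.
try:
    ctypes.c_char_p(name)
    str_error = None
except TypeError as exc:
    str_error = exc
```

Under Python 2 the plain string type was already a byte string, which is why the same call used to work without an explicit encode step.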
Starting point is 00:27:54 Yeah. Cool. So to wrap this up, they said the benefits they have from this whole process is they no longer have to worry about supporting Python 2, and they've seen their support loads decrease, and decrease in a good way. You don't have to support the old crummy stuff. You can depend on the latest open-source libraries.
Starting point is 00:28:11 A lot of libraries these days only work with Python 3, and they opportunistically and enthusiastically adopted type hinting and mypy to improve overall quality, which is pretty cool. Yeah, that is good. Yeah, I'm looking forward to this next one you got. This actually ties in nicely because you brought up the Django speedups, and I probably should have talked about this right afterwards.
Starting point is 00:28:33 But anyway, here we go. There was an article that I'm not saying I agree or disagree with, because I don't know enough about it, but the article was called The Troublesome Active Record Pattern. And I guess in, you know, like Ruby and stuff, they talk about Active Record more, I think. But in the Python world, it's the object relational mappers, ORMs, like the Django ORM, and SQLAlchemy is also an ORM. And those are essentially the same as Active Record. I think that's the same pattern, right? Well, certainly the Django ORM follows that pattern.
Starting point is 00:29:09 SQLAlchemy, it has a lot of similarities, but its design pattern is technically called a unit of work. Okay. The main variation is, like, on Django or things like that, you go to the object and you call save. So that happens on the individual objects, whereas in SQLAlchemy you make a bunch of changes and then there's this unit of work thing and you call save and it submits all the changes in one giant batch. But here's the interesting
Starting point is 00:29:37 thing is, like, this whole article is The Troublesome Active Record Pattern, but my reading of it really was the troublesome ORM pattern. And so for the most part it's kind of an immaterial distinction, although technically, design pattern wise, they're not exactly the same. Okay. Okay. Well, yeah, so the idea being, like you just brought up, that when you're referencing a bunch of objects and you have object save and things like that, there's a whole bunch of issues with that. One of the issues is if you want to query things about the data, not necessarily all the data, but things like if you've got a bunch of books, for example,
Starting point is 00:30:18 and you just want to count the number of books, well, you might have to just retrieve them all. Or if you want to count all of the software testing books written by Oregon authors, you'd have to just ask me, or you'd have to grab like all of them and grab all the data and then, in Python, look for stuff in a for loop or something. The other problem was around transactions, because if I have a book item and then change something about it and then save it back in, there's nothing stopping some other process. You know, the read, modify, write doesn't work that well if you've got multiple readers and writers. And I was looking this up.
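The counting problem described here can be sketched with nothing but the standard library's sqlite3 module. This is a minimal illustration, not code from the article: the books table, its columns, and the two function names are assumptions made up for the example. It contrasts pulling every row back into Python with letting the database do the filtering and counting.

```python
import sqlite3

# Hypothetical schema and data, invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books ("
    "id INTEGER PRIMARY KEY, title TEXT, topic TEXT, author_state TEXT)"
)
conn.executemany(
    "INSERT INTO books (title, topic, author_state) VALUES (?, ?, ?)",
    [
        ("Testing Book A", "software testing", "Oregon"),
        ("Testing Book B", "software testing", "Ohio"),
        ("Cooking Book", "cooking", "Oregon"),
    ],
)


def count_in_python(conn):
    """Fetch every row, then filter in a for loop (the pattern being criticized)."""
    count = 0
    rows = conn.execute("SELECT title, topic, author_state FROM books")
    for _title, topic, state in rows:
        if topic == "software testing" and state == "Oregon":
            count += 1
    return count


def count_in_sql(conn):
    """Let the database filter and count; only a single number comes back."""
    row = conn.execute(
        "SELECT COUNT(*) FROM books WHERE topic = ? AND author_state = ?",
        ("software testing", "Oregon"),
    ).fetchone()
    return row[0]
```

Both functions return the same answer. The difference is how much data has to be read and deserialized to get it, which is the cost the article is pointing at.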
Starting point is 00:30:59 SQLAlchemy has sessions, or you said there's a unit of work thing. Don't know if those are atomic. Yeah, they're the same. Yeah. Okay. Django has an atomic setting, but I don't know if that's by default or if you have to specifically say work with transactions. I did notice that some of the Django documentation does say that transactions slow things down. So you don't want to do transactions if you're just reading, for instance. And then the author of the article, Cal Paterson, mentions that REST APIs often have the same problems and some microservice architectures have a similar sort of issue.
Starting point is 00:31:37 It's just around REST APIs instead of the object model. You're reading tons of data when you don't need to. He brought up some solutions, at least: you can just directly use SQL or use some properties that do queries that are more like SQL. Doing transactions helps too. But basically he was recommending avoiding the Active Record style access patterns around the REST APIs. He brought up that GraphQL and RPC style APIs are some solutions to the same problem in REST APIs. As somebody that's moving towards learning more about web development and working with ORMs, I really did want to bring this up and find out what you thought of all of this. Sure.
Starting point is 00:32:20 It's interesting. There are a lot of good, valid points that Cal's making here. I feel like the focus should almost be, instead of the troublesome Active Record pattern, you're using your ORM wrong, learn how to use it right. So let me give you some examples. So one of the challenges here that we see is if you're going to create a record and you want to get it back, you have to get it back by the primary key. Maybe if you're working exactly on just the straight ORM record, but you can just do a query and do it like a, give me the first or one item or something like that. There's a part where he's looping over stuff saying, here we're looping back to just get the ISBN off these things,
Starting point is 00:33:00 right? You're pulling all the properties, like you're doing basically a select star from table ultimately, and a serialization of that result, just to get, like, the ISBN. Well, in SQLAlchemy, I don't know the Django ORM well enough, but in SQLAlchemy you can say only return these columns. I want just the ID and the title, or I just want the ID and the ISBN. Don't return the other results, right? So that's an option. The N plus one thing we already discussed, right? You just use the subquery or the b filter select or whatever it is for Django, and you can avoid those, right? So as you kind of go through these, you're like, okay, well, most of the time these problems are actually solved with some aspect of, like, a proper ORM. Now, the transaction one is really, I think, super
Starting point is 00:33:46 interesting because it sort of often gets to the heart of this debate about ORMs. And you're saying, well, okay, here's this active record thing where it's not really leveraging transactions. We know transactions are good. And so this is bad because it doesn't do it. But in practice, it's not so clean as that. So for example, suppose I'm working on a web app and I have a grid, like a grid that was, maybe could be loaded off of a REST endpoint, bring that into it, right?
Starting point is 00:34:14 And I've got this grid and I can type in it and there's a button that says save. There's no way that it makes sense to do a transaction around that, right? I'm not going to transactionally begin loading the grid and wait for me to press save, right? That's going to lock up the database for every user. Any scenario like that, like REST endpoints, right?
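One common way to make a save button like this safe without holding a transaction open while the user types is a version check at write time. The sketch below uses only the standard library's sqlite3 module. The books table, the version column, and the load and save helpers are all hypothetical names chosen for illustration, not an API from Django or SQLAlchemy.

```python
import sqlite3

# Hypothetical table with a version column, invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books ("
    "id INTEGER PRIMARY KEY, title TEXT, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO books (id, title, version) VALUES (1, 'Old title', 1)")
conn.commit()


def load(conn, book_id):
    """Read the row and remember the version seen; no transaction stays open."""
    title, version = conn.execute(
        "SELECT title, version FROM books WHERE id = ?", (book_id,)
    ).fetchone()
    return title, version


def save(conn, book_id, new_title, seen_version):
    """Update only if the row still has the version that was loaded.

    Returns True if the update applied, False if someone else saved first.
    """
    cur = conn.execute(
        "UPDATE books SET title = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_title, book_id, seen_version),
    )
    conn.commit()
    return cur.rowcount == 1


# Two users load the same grid row, then both press save.
_, version_a = load(conn, 1)
_, version_b = load(conn, 1)

first_save = save(conn, 1, "Title from user A", version_a)
second_save = save(conn, 1, "Title from user B", version_b)  # stale version
```

The second save matches no row because the version has moved on, so the application can tell that user their copy is out of date instead of silently overwriting the first change.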
Starting point is 00:34:33 If I've got a phone and I've got my mobile app and it hits the REST endpoint, pulls down the data, and I type on it and I hit save, you can't do that transactionally. Like, you would just lock up the site right away, right? So it doesn't make any sense. So there's just other patterns, like optimistic concurrency is a super common pattern in ORMs
Starting point is 00:34:52 that would work with Active Record or SQLAlchemy's unit of work beautifully. And the idea is I'm going to make some kind of version in that record. And when I pull it back, it's going to come with the version that I got. And when you hit save, you say, update this record where the version is the version I have. So if someone else has updated it, it increments that version and it says, no, no, there's no record. You can't update this. Right. So you, you basically say, ah, it looks like someone
Starting point is 00:35:19 changed this behind you, like your grid and their grid, they hit save before you. So you got to deal with like syncing this back up, right? So there's a lot of times where it's, it would feel great to like have a transaction, but that transaction actually can't be used anyway. And ORMs have like nice built-in ways
Starting point is 00:35:35 where you can easily slot in like optimistic concurrency and stuff. So that's my thought. I think this is an interesting article. It's definitely interesting to think about all the points brought up, but I often think that the tools have like clever, non-obvious ways to solve most of these problems. issue that all of our beginning tutorials on how to use Django or how to use SQLAlchemy or how to
Starting point is 00:36:05 use other ORMs are just ignoring that stuff because it's more advanced. But people often just read the beginning tutorial and then go do a startup or something. Yeah, sure. And then you end up with your page loading in, like, six seconds and you don't know why. Yeah. Which is not great. Maybe we could teach people the right way to do it from the beginning. I do wish that some of these patterns were more built in. Like, I wish optimistic concurrency was there by default in the ORMs, and you've kind of got to, like, roll that yourself and whatnot. So anyway, it's a really interesting article to think about, and I think it dovetails nicely with my sort of performance one as well, because they're kind of two sides of the
Starting point is 00:36:44 same coin a bit there. Yeah. Okay. All right. Well, I have the second side to your coin, that is, the dacite, whatever that one was called. Dacite, yeah. So this is a cool thing by Steve Brazier called Types at the Edges in Python. And so Steve apparently creates a bunch of APIs, and I think, yeah, he was using FastAPI at the time when he was talking about all these ideas, but it's kind of generally valid for all of them. He says, look, when I create a new API these days, I start with three things. I start with Pydantic, mypy, and some kind of error tracking like Rollbar or Sentry or
Starting point is 00:37:23 something like that. Okay, that's pretty interesting, right? So Pydantic is a data translation and validation library, much like dacite. Right? They're not the same, but they kind of play in the same realm. They transform JSON with validation and type checking
Starting point is 00:37:39 over there. And then there's mypy, which, it looks like you can use Pydantic to help specify some of the types on your classes and then use mypy to verify that you're not missing some kind of check. So he says, look, the most common error you're going to run into as a Python developer in general is AttributeError: 'NoneType' object has no attribute 'x', where x is whatever you're trying to do, right? Yeah. And I mean, that just means you got None instead of a value, and you're trying to continue to work with that class in some way.
Starting point is 00:38:11 It's a void dereference in C. Yes, exactly. So wouldn't it be nice if it said none is not an allowed value for this, or you have none and you can no longer operate on it or something like that? So Pydantic will actually give you those types of errors. It'll convert things like attribute errors and mismatch type errors to explain what was wrong, right? So that's pretty awesome. And so you can use Pydantic to actually specify what your understanding of the interface, like if you're calling an API,
Starting point is 00:38:40 the stuff that you expect to get back: okay, I think this is going to be a date, I think this is an optional string, and whatnot. He says, then when you launch your code into production, your assumptions are tested against reality. That's pretty cool. And he says, if you're lucky, they turn out to be correct. But if not, you're going to run into some of these NoneType errors, and Pydantic can help with that. But then you can also, once you put the typing into your code, then mypy will go on helping. So for example, if you're taking an argument that's, you know, first, and you think it's a string, so you say colon str for the type, then you go work with it, and that means it cannot be None, right? Like
Starting point is 00:39:16 None-ability is explicitly set in the type thing in Python, in the type space. So if you find out that it could be None, then you're going to go and say this is a typing.Optional of string, right? Like, that's what it's got to be if it could be None or a string. You'd find that out and specify that in Pydantic, and then if you run mypy against it and you start working with an optional string and you don't check for it to be None first, mypy will actually give you an error saying that you're not checking for None, basically. So it'll even tell you the missed if statements or other conditional code to verify that,
Starting point is 00:39:52 no, it's not the optional none, it's actually the value. Oh, okay. That's pretty cool, right? Yeah, that's tripped me up before, yeah. Yeah, for sure. I mean, normally it's just not present. And it's not because Python is a dynamic language.
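The str versus typing.Optional distinction described above can be sketched with the standard library alone. The three function names here are hypothetical, and Pydantic is deliberately left out so the snippet has no third-party dependencies. The comments describe what a type checker such as mypy is expected to report; the code itself only shows the runtime behavior.

```python
from typing import Optional


def shout(first: str) -> str:
    # Annotated as plain str, so None is not an allowed value here.
    return first.upper()


def shout_unchecked(first: Optional[str]) -> str:
    # A type checker such as mypy is expected to flag this line, because
    # first may be None. At runtime, passing None raises AttributeError.
    return first.upper()


def shout_checked(first: Optional[str]) -> str:
    # The explicit None check narrows the type to str for the code below,
    # which is the "missed if statement" the checker is asking for.
    if first is None:
        return ""
    return first.upper()
```

Running mypy over a file like this should complain about shout_unchecked and accept shout_checked, while the interpreter on its own only notices the problem when None actually arrives.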
Starting point is 00:40:04 C++ would have the same problem, right? If you take a pointer and you just start to work with it in C or C++, the compiler's not going to say you didn't check that for, you know, equal to null first. It just doesn't do that, right? So this is a really awesome, like, addition for safety in your code. So he was talking about how FastAPI automatically integrates with Pydantic out of the box, which is pretty cool. And then also at the end, he has a kata, a mini kata
Starting point is 00:40:31 that like works you through these ideas. So a kata is like a practice to like play with these typing ideas. Yeah. And a nice picture of how these all fit in. Yeah, yeah. Yeah, there's some cool diagrams. So anyway, if you're building APIs
Starting point is 00:40:44 and you're taking data, especially from sources where they might give you junk when you expected something valuable, or you're not really sure, you're like, the docs say this, but I remember getting something different some other time, this is a really cool way to formalize that and then have your code automatically check it. Yeah, this is cool. I like it. Yeah, awesome. That's all of our six items. Do you have any extra little things to share? Well, I kind of went overboard on the extras this week, but I'll keep them all quick because there's a bunch of cool stuff out there
Starting point is 00:41:11 that people send in. First, Jack McKew did a really cool thing. So Jack McKew created a blog post or a page on a site called Python Bytes Awesome Package List. Have you seen this? Yeah, and he listened to 171 episodes in 174 days or something like that of Python Bytes. I mean, this is awesome because as I flipped through this,
Starting point is 00:41:36 there's a couple of things I've forgotten. I'm like, oh, that's cool. Oh, we must've talked about that, but I don't even remember. It's got beautiful pictures. I mean, it's kind of an awesome list, it's for a podcast. So that is super cool. Jack, thank you, thank you. I'll be sure to link to it at the end, and I hope you keep adding to it. That would be great, but no pressure. Yeah. I want to talk about VB.NET for a second. That's kind of weird, right? Yeah, because
Starting point is 00:42:01 I kind of appreciated VB back in the early days when it was like a drag-and-drop VB6 and whatnot. And then Microsoft came out with a thing called Visual Basic .NET, and it was complete crap. Didn't like it. But here's what's interesting: they have just announced that they are no longer maintaining it. They'll keep that thing running, but they will no longer work on it. I just thought it was interesting. Like, here's a fairly major language, not super top five or something, but it's kind of a major language that's, like, declared dead. And I just thought it was kind of interesting to point out, like, man, languages, they can go dead. It's weird. Yeah, I think this one should have been shot a long time ago, but you know, it's also worth thinking about this. I agree by the way,
Starting point is 00:42:47 it should have never existed, but anyway, that's a different story. It's also an interesting take on like, here's a language controlled by a single company and they can just decide they don't like it anymore. Right? Like this wouldn't really happen to Python because there's not a single person or organization that goes, ah, we're done. Yeah. Well, that's actually one of the fears I have for, I mean, even Java. Java is not controlled by one company, but it kind of is.
Starting point is 00:43:12 Yeah. Yeah. Well, and there's also that Supreme Court case, or the legal case, of like, are you allowed to copy the Java API? I don't think that's resolved yet. I can't remember. It's still working its way through the courts. I want to reiterate, people that actually have a job in Visual Basic or love it, I'm not dissing you. I just had a personally bad experience with Visual Basic
Starting point is 00:43:33 and didn't enjoy it. So I had a good experience with Visual Basic 5, but that was in like 1993 or something. Okay. So also, we talked about COVID-19, all the crazy stuff going on. As tragic as much of it is, there's some really interesting data science that can be done and some dashboards that can be built and whatnot. So someone on Twitter, let me pull up their name, just pointed to a whole bunch of COVID-19 data sets. BeeKeep, I'm going to call that BeeKeep. I'll put that on Twitter, so check that out. Like the Johns Hopkins CSSE data set and some other dashboards and
Starting point is 00:44:11 some things on Kaggle. So if you're in data science and you want to explore it, here's some data sets that are probably interesting. And finally, working on a new course, adding a CMS to your data-driven web app. That'll be a lot of fun. I'll talk more about that later. But I'm just super excited to be creating more courses, as we kind of talked about earlier.
Starting point is 00:44:27 Yeah. One of the things we talked about is people working from home and getting around technical problems with that. That happened to me just this morning. So this morning I tried to hook up. I realized that I had an external keyboard that's working fine-ish. I wanted to use
Starting point is 00:44:44 a real mouse, so I plugged in an external mouse with a little click wheel thing on it and realized that on Apple, the click wheel behavior just goes the wrong direction for scrolling. And it confused me. And you can reverse it, but I didn't want my trackpad to be reversed. The trackpad's fine. So they're tied together for some reason. Weird. So Dave Fornack, Forjack? Sorry, Dave.
Starting point is 00:45:16 He suggested I use something called Scroll Reverser. That is a little tiny app that allows you to untie those and have trackpad scrolling and mouse scrolling be different. And thank you, Dave, that's awesome. That's super cool. I guess my work-from-home thing that I've been playing with is, with Zoom, you can have virtual backgrounds. You don't even have to have a green screen. You can have, like, alternate backgrounds just by uploading an image, and it'll put you in, you know, an office space instead of a messy bedroom or whatever it is. Oh, nice.
Starting point is 00:45:48 Yeah. So you can block out the kids behind you and stuff like that. Yeah, exactly. You don't have to see the kids being crazy, home from school and whatnot. Anyway, yeah, a lot of stuff we're learning around those types of things. And I think the joke that I chose for us this week is going to be perfect for the opening topic, documentation as a way to build community, that you brought up. Okay. This is before that person gets inspired from listening to you
Starting point is 00:46:12 and actually makes things better. All right, so let me set the stage here. There's three people, two of them clearly more senior, and a very excited new person sitting at a laptop, like, beaming with enthusiasm, ready to get going on the whole project. And one of the senior people says to the other, this is Jim, our new developer. The other one says, great, does he already know something about our system? The new person turns around: I read the whole documentation. Blank looks between the senior people. No. Yeah, yeah. That's good, right? Yeah, definitely. I started a job once in my career where I had read the documentation because it was an internal job transfer. I read the documentation before getting there, and the people there didn't know they had documentation. So it was so out of date.
Starting point is 00:47:06 Nobody knew it existed. It may be a little out of date if they don't even know it exists. Yeah. All right. Well, awesome. Well, thanks a lot. You bet. Great to be here with you as always.
Starting point is 00:47:19 See you later. Bye. Thank you for listening to Python Bytes. Follow the show on Twitter at Python Bytes. That's Python Bytes as in B-Y-T-E-S. And get the full show notes at PythonBytes.fm. If you have a news item you want featured, just visit PythonBytes.fm and send it our way. We're always on the lookout for sharing something cool. This is Brian Okken, and on behalf of myself and Michael Kennedy, thank you for listening and sharing this podcast with your friends and colleagues.
