Python Bytes - #162 Retrofitting async and await into Django
Episode Date: January 3, 2020. See the full show notes for this episode on the website at pythonbytes.fm/162...
Transcript
Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.
This is episode 162, recorded December 19, 2019.
I'm Brian Okken, and today I've got guest host Aly Sivji.
He's one of the organizers of the Chicago Python Meetup, and we met last week when I was there briefly.
We already started recording this, but I forgot to press record, so we're starting over. So thanks Aly for hanging in there with me. Thanks Brian, glad to be here.
This episode is brought to you by Datadog. We'll talk more about them later but first
Ali's going to tell us some exciting things about Django. Yeah so Django 3.0 just came out in early
December and so I really wanted to find out what was going on.
I found a talk from DjangoCon 2019 by Andrew Godwin titled
Just Add Await:
Retrofitting Async into Django.
And Andrew's been the one
leading the implementation
of asynchronous support into Django.
And he discussed what the 3.0 release signifies,
also went over the implementation roadmap
for upcoming releases.
Okay.
Wow, that sounds fascinating.
Yeah.
He started by giving an overview of the async landscape, talked about how synchronous and
asynchronous code interact.
A key point that he made was that when we have asynchronous functions, they're very
different than synchronous functions.
This makes it really difficult to design proper APIs.
And one of the difficulties in adding asynchronous support to
Django is that this project, it's been
around for a really long time.
There are a lot of people familiar with it.
And for these new async implementations,
they wanted to have that
same familiar feeling. And so they
made a plan to implement async capabilities
in three phases. And so phase
one, that was completed with the Django
3.0 release.
And this phase,
it's really meant to lay
a lot of the groundwork
for future changes.
And so Django now supports
Asynchronous Server Gateway Interface
or ASGI.
And the ORM,
it's also async aware.
So if you actually call ORM code
from async code,
it's going to raise an exception
to make sure that you're not calling
the synchronous function
from your async code.
You don't want to have unexpected behavior there.
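To make that concrete, here's a minimal sketch of how such a guard can work: check for a running event loop and refuse to execute blocking code inside one. The names are borrowed from Django (the real versions live in django.core.exceptions and django.utils.asyncio), but the implementation below is a simplified illustration, not Django's actual code.

```python
import asyncio


class SynchronousOnlyOperation(RuntimeError):
    """Stand-in for django.core.exceptions.SynchronousOnlyOperation."""


def async_unsafe(func):
    """Refuse to run blocking code while an event loop is running."""
    def wrapper(*args, **kwargs):
        try:
            asyncio.get_running_loop()
        except RuntimeError:
            # No event loop: plain synchronous call, safe to block.
            return func(*args, **kwargs)
        raise SynchronousOnlyOperation(
            f"You cannot call {func.__name__}() from an async context."
        )
    return wrapper


@async_unsafe
def fetch_rows():
    """Pretend blocking ORM query."""
    return ["row1", "row2"]


print(fetch_rows())  # fine from synchronous code


async def main():
    try:
        fetch_rows()  # raises instead of silently blocking the event loop
    except SynchronousOnlyOperation as exc:
        print("blocked:", exc)


asyncio.run(main())
```

The point of raising loudly rather than just running is exactly what's described above: a blocking call inside an event loop would stall every other coroutine, so it's better to fail with an explicit error.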
Phase two is going to be tracking with Django 3.1.
And they're going to be adding async capabilities to the core parts of the request path.
This means we're going to have async views, async handlers, and async middleware.
And there's already a branch in the Django repository where this is mostly working.
They just need to add a couple of tests
to make sure things are passing with all those cases.
It sounds really well thought out.
They put a lot of work into it,
and there's a really detailed Django enhancement proposal,
I think it's called a DEP, all about it.
And then finally, with phase three,
that's going to make the Django ORM all async.
And Andrew called this the largest, most difficult, and most unbounded part of the project.
So if we just start thinking about ORM queries, sometimes they can result in a lot of implicit
database lookups if we're not really careful.
And this is where a lot of those complexities are going to come from.
I really enjoyed this talk.
I think it went into the right amount of depth while being really accessible.
And I just want to call out the fantastic diagrams.
It really helped improve my understanding of Django internals. And if you check the show notes,
I'm going to include a link to the async project wiki. Andrew needs some help putting these things
into Django. He's got a lot of areas there. If you have some time, you have the expertise,
please go ahead and help him out. Yeah, awesome. I like doing calls to action on the show to try
to get people involved. And I've been just starting Django for like several years now, just dipping my toes
in and out every once in a while. And I was kind of waiting for this new version to come out. So
I'll probably check this out first. I like that idea of having, since they're gradually rolling
it out, doing error messages for if you try to do things that aren't implemented yet.
So that's cool.
Yeah, it's always best to be explicit.
So would you say you've been playing around with Django?
Yeah.
Okay.
Nice segue there.
So the next topic, speaking of playing. I don't know how you got into programming, but a lot of people get into it with games, and I actually feel bad that I'm in that camp also, because, I don't know, it's just kind of cliché. But yeah, sure enough, Games by Example
is a GitHub repo by Al Sweigart, and other contributors as well, or I think there's others, but
he wants more. So the idea is these are standard IO games.
And he's got a collection of games with the source code to use for example programming lessons.
They're all written in Python 3,
and you can just clone the repo or just view it and copy and paste
even for people that aren't used to Git stuff.
I think it's neat because before I even learned any of the concepts of programming, I was programming games by copying them out of magazines and typing them into my old TRS-80, way back in the Stone Ages.
That's awesome.
I didn't know what it was doing, but I could modify things and sort of interpret it, because you can kind of read BASIC. So there's some cool features of these games that I actually would have enjoyed at the time.
One of the neat things is that they're short.
They're limited to 256 lines of code, just an arbitrary number.
But that's a pretty nice small code size.
They're all single file, single source code files with no installers.
So you just run them with Python.
They use just the standard library.
They're only using print and input for interacting with the user.
There's some downsides to that, but there's some upsides too, because they're fairly simple.
He's tried to make them very well commented and very readable.
He mentions that elegant and efficient is nice sometimes,
but understandable and readable is better for education purposes.
That's what his focus is.
Also, input validation on everything so that you can't crash Python from typing in something
weird.
This is kind of cool: he made sure that every function has a docstring to describe what it does, because it really is meant for teaching people.
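A tiny game in that spirit might look like the following. This is my own illustrative snippet, not code from the repo, but it follows the conventions described: standard library only, print and input for all I/O, a docstring on the function, and input validation so typing something weird can't crash Python (the ask/say parameters are hypothetical additions so the example can run non-interactively).

```python
import random


def guess_the_number(low=1, high=10, answer=None, ask=input, say=print):
    """Play one round of a number-guessing game.

    Standard library only, text I/O only, and every response is
    validated before it's converted, so bad input can't crash Python.
    """
    if answer is None:
        answer = random.randint(low, high)
    say(f"I'm thinking of a number from {low} to {high}.")
    while True:
        response = ask("Your guess: ")
        if not response.strip().isdigit():  # validate before int() can blow up
            say("Please enter a whole number.")
            continue
        guess = int(response)
        if guess < answer:
            say("Too low.")
        elif guess > answer:
            say("Too high.")
        else:
            say("You got it!")
            return guess


# Demo round with scripted input so the example runs without a keyboard:
scripted = iter(["oops", "3", "7"])
guess_the_number(answer=7, ask=lambda prompt: next(scripted))
```

Swap the scripted input for the real input function and it plays interactively, which is the whole repo's style: run it, then modify it and break it.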
And I think this is pretty cool.
Yeah, Al does a lot of good things about structuring things in a way that everybody can understand. I really love all the stuff he does. This would be great for people to use in, I think, helping your kid out. I'm going to try to get my kids running with some of this stuff, to just say, hey, watch, this is how you run it. Now play with it and break it.
He also included a to-do list of things he wants to do with the project,
but hasn't done yet.
And I love that because it gives people direction
if they want to help out with the project.
One of the areas is testing.
He wants more tests.
So I'd love for people to get involved with that.
Always got to make the call out for testing.
Yeah, definitely.
So before we move on, I want to give a shout out to our sponsor. This episode is sponsored by Datadog, a cloud-scale monitoring platform that unifies metrics, logs, and distributed traces. Plot the flow of traffic across multi-cloud environments with network
performance monitoring. Plus, Datadog integrates with over 350 technologies like Postgres, Redis,
and Hadoop, and their tracing client auto-instruments common frameworks and libraries
like Django, Tornado, Flask, and AsyncIO. That's cool. Get started today with a 14-day free trial at pythonbytes.fm slash datadog.
Now, speaking of testing, Bulwark has some...
Ah, I stole your thunder.
But the next topic has some test-related stuff, right?
Yeah, so Bulwark is an open-source library that allows users to easily property test their Pandas dataframes.
It makes it easy for data
analysts and data scientists to write tests.
So let's just take a step back.
We all know that tests are important, but tests for data, they're just a little bit
different.
These tests, they're not as deterministic and they require us to think about testing
just a little bit differently.
So with property tests, what we can do is we're able to check that an object has a certain property.
And so property tests for data frames include things like
validating the shape of a data frame,
checking to see if a column is within a given range,
verifying that a data frame has no NaNs,
and things like that.
So with Bulwark, you're able to implement
these property tests as checks
and each check takes a data frame
and some optional arguments.
This check will make an assertion
about some data frame property.
If things are good,
the check's going to return the original data frame.
If the check fails, an assertion error is raised
and you have a little bit more insight
into what went wrong.
And this is really helpful
when you have like a large data pipeline.
That's cool.
Yeah, it's really, really cool.
And so one of the ways that Bulwark lets you implement property tests
is also through decorators.
And so when you're creating data pipelines,
it's really useful to do them as functions.
You have some input data, you perform some sort of action,
it returns an output.
And with Bulwark, what you can do is you can add decorators
to your
pipeline functions and validate that the properties of your input data frames meet the conditions that
you really want it to have. So Bulwark has a lot of pre-built checks already in there. There's one
for has certain data types. Is this column monotonic? Is this within a certain number of
standard deviations? And it seems pretty straightforward to add new checks.
And instead of stacking decorators for multiple checks,
they have a special decorator, a multi-check decorator,
which won't fail on just the first check.
It's actually going to run through them all
and tell you all the ways your data either passed or did not pass.
Oh, that's great. Wow. Neat.
Yeah, you can use Bulwark for implementing unit tests.
You can use them in the ETL pipeline,
especially on the extract and the load steps.
And Bulwark can be used during development.
And they also have some options to turn assertion errors
into warnings for production.
I'm not really too sure how I feel about that,
but that functionality is there if you want to use it.
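The check-and-decorator pattern described above can be sketched in plain Python. Bulwark's real API lives in bulwark.checks and bulwark.decorators and operates on pandas DataFrames; to keep this sketch dependency-free it mimics the same shape on plain lists of dicts: a check asserts a property and returns the data unchanged, and a multi-check decorator runs every check on a pipeline step's output and reports all failures. Every name below is hypothetical, not Bulwark's.

```python
def has_no_nones(records):
    """Check: assert no missing values; return the data unchanged."""
    assert all(v is not None for row in records for v in row.values()), \
        "found missing values"
    return records


def in_range(col, low, high):
    """Build a check asserting that a column stays within [low, high]."""
    def check(records):
        assert all(low <= row[col] <= high for row in records), \
            f"{col} outside [{low}, {high}]"
        return records
    return check


def multi_check(*checks):
    """Decorator: run every check on a step's output, report ALL failures."""
    def decorate(step):
        def wrapper(*args, **kwargs):
            result = step(*args, **kwargs)
            failures = []
            for chk in checks:
                try:
                    chk(result)
                except AssertionError as exc:
                    failures.append(str(exc))
            if failures:
                raise AssertionError("; ".join(failures))
            return result
        return wrapper
    return decorate


@multi_check(has_no_nones, in_range("age", 0, 120))
def load_users():
    """A pipeline step: input in, transformed output out."""
    return [{"name": "Ada", "age": 36}, {"name": "Alan", "age": 41}]


print(load_users())
```

If a step's output violates any property, the wrapped function raises with every failed assertion listed, which is exactly the insight you want when a large pipeline goes wrong.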
This is cool.
This is actually coming out of Chicago Python community member,
Zax Rosenberg.
He created this and gave a talk about it at
ChiPy. So just wanted to share it with the rest of our
community. I think it's great. And I think we're seeing more and more of this. I did a recent episode of Test & Code where I talked to one of the people on the Great Expectations project, which is also around testing with data. And you know, some frameworks are attacking the problem differently, and some people just like one style over another. This is a little bit different style than Great Expectations. I think it's definitely worth checking out.
Gosh, more and more of our life is controlled by data pipelines. Not necessarily controlled, but influenced by results based on these data pipelines. So having these checks in place makes sure that our data is just as solid as our source code. This is awesome. Yeah, it's really good to see all these data scientists and data
engineers thinking about testing in a different way. Oh, gosh, there's some interesting contributors
to this. This is great.
Cool.
We got off on the test side.
I could go down that rabbit hole all day long.
But let's pop out and talk about packaging a little bit. Packaging?
Have you talked about it before?
Yeah.
Yeah, actually.
So packaging is such a...
We do cover almost every packaging-related story, because it is a stumbling block for a lot of people. At some point you get to be an intermediate Python developer, and how to share your code, and dealing with virtual environments and stuff, is one of those things you have to run into. So I think it's kind of cool. I am not a Poetry user, but I might try this new one out. So Poetry just announced Poetry 1.0.0. They're no longer zero-versioning; they've got to 1.0.
Awesome.
The announcement is by Sébastien Eustace, and it highlights some of the changes. A good caution right off the bat is that they are breaking backwards compatibility with this version, because ZeroVer is often when you're trying things out.
And I think it's completely fair to break.
It's always fair to break with major versions.
But I think this is reasonable.
I want to highlight a few things.
With this version, we can really take Poetry seriously.
I think they're planning on sticking around,
and for people that like it, there's a lot of good changes.
They've added different ways to manage environments. So within a project, you can have
poetry help you coordinate multiple versions of Python within the same project. That's pretty
cool. I don't know if it had that before.
That's neat.
One of the things that I want to bring up next is private indices. If you're working with projects within a company, it's very important to be able to have your own private something like PyPI. One way you can do that is to have a private version of something like PyPI, but you can also just throw a whole bunch of wheels in a directory and use that as an index as well. And so there's some improved support: you can even specify a different index for each dependency if you want, but you can also set up a primary and secondary, or default and secondary, index and have them be something other than PyPI if you want. So this is cool.
For people using Poetry and other tools like this within continuous integration, sometimes it's hard to pass things around. Environment variables are sometimes ugly, but they're useful within CI, and so there's new environment variable support. And then, since there's all this new support for different sorts of dependencies,
the add command has been expanded to allow you to just add dependencies
and put them in the right place with the right source and everything.
The other thing I want to highlight is they've improved some things around publishing
to allow you to use API tokens in place of plain-text usernames
and passwords.
So very cool changes from Poetry.
I applaud their progress.
Yeah, I haven't checked it out, but I'm going to go check it out.
Looks like there's been a lot of changes made.
Pretty excited about the one point release.
This is neat.
You know, people's preferences are all over the place.
There's some people that absolutely love Poetry.
There's some people using Pipenv. I'm straight up just using the built-in venv and all the tools around that.
But yeah, whatever floats your boat, I guess. Yeah, for sure. What do you got for us next?
So Kubernetes has been a huge part of the DevOps ecosystem in the last two, three years.
With the rise of containers, Kubernetes is sort of the de facto platform for running and coordinating applications across multiple machines. It's not really something I
know a lot about, but it's something I want to get to know more about. And I just found this
awesome link from DigitalOcean. They put together a Kubernetes for full stack developers curriculum.
And what it does is, yeah, it follows all the steps that a new user would have to learn in order to deploy applications to Kubernetes.
So you start by learning Kubernetes core concepts like what's a pod, what's a deployment.
Next, you'll start building modern 12-factor web applications.
And then you're going to start packaging these applications inside of containers.
Next, you're going to deploy your containerized applications on Kubernetes.
And then finally, you're going to learn
how to manage
all that cluster orchestration
and the cluster operations.
I went through the first link,
which was like an introduction
to Kubernetes.
I found the material
really easy to understand.
It's pretty much what you expect
from DigitalOcean.
Like, I learned how to do
a lot of my operations work
from DigitalOcean.
And I'm really excited
that they have a lot more resources
for the community. And I'm going to throw a link in the show notes to a talk I gave about introduction
to Docker. If you're ever learning how to do Docker, I think this is a really good place to
get started. Yeah. And so just to be clear, Docker and Kubernetes are closely tied, right?
Docker is the container where you package your applications in. And Kubernetes is the platform that manages
all your Dockerized containers across multiple machines. So this way, with Docker,
you can build your application. With Kubernetes, you can deploy it and scale it pretty much
infinitely. Cool. And it wasn't until I, gosh, I'm forgetting the guy's name, but I saw a talk
a couple of years ago about Kubernetes. And I didn't realize that this isn't just, I mean, yes, it's intended for cloud stuff,
but you can test the entire Kubernetes setup locally on a laptop even.
So that's pretty neat.
Yeah, for sure.
I'm not lying there, am I? Is that true?
It's like a kubectl.
You can have like a kubelet machine and pretty much do whatever you need to on the cloud locally.
Okay, cool.
Definitely good for playing around.
You don't have to pay for anything.
Exactly.
Well now, so the last topic, I want to apologize to people.
I'm perfectly okay with making mistakes on the show.
So on episode 159, we were going through a whole bunch of pytest plugins.
And one of the things we covered was pytest-picked.
And I incorrectly assumed that it would run.
There's a quote on the pytest-picked site that says it runs the tests related to unstaged files or the current branch, according to Git.
To me, that sounded like it runs the tests related to code that's changed.
But I was wrong.
So Michael was right.
I was wrong. It just
uses git to find out which files have changed, and if any of those are test files,
it runs the tests related to those test files. That makes sense, and that might be,
if you're developing tests, that might be exactly what you want. But if you're developing code,
you might want something different. The tool I was thinking about was pytest-testmon. That's a plugin for pytest also, and I'm just going to read their blurb:
pytest-testmon is a pytest plugin which selects and executes only the tests you need to run. It does
this by collecting dependencies between tests and all executed code, internally using coverage data, and comparing those dependencies against changes. It updates its database on
each test execution, so it works independently of version control. So that's the thing I was
thinking about. It uses coverage to find out which tests are affected by different parts
of your code base. Very cool. So
if you're making changes. This sounds like black magic to me, and I'm glad somebody else wrote it,
but it does look pretty neat. And I think I am definitely ready to try this again. I tried it a
while ago, and for some reason, I don't even remember why, I don't think I had any beef
with it, I just didn't think I needed it. But this looks exciting enough that I'm going to try it again.
For large test suites, I think this would save you so much time instead of rerunning everything.
Yeah.
I mean, there's other ways to, if you know specifically what you want to rerun.
Yeah.
If you've got a whole bunch of different tests and they all run pretty fast, but you're not
quite sure which ones you should run based on your code changes.
This is kind of neat.
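The selection idea behind that blurb can be sketched with a toy model: record a fingerprint of the code each test touched on its last run (the real plugin gets this from Coverage.py), then select only the tests whose recorded fingerprints no longer match the current source. Everything below is my own illustrative code, not pytest-testmon's internals.

```python
import hashlib


def fingerprint(src: str) -> str:
    """Hash a unit of source code so changes are cheap to detect."""
    return hashlib.sha1(src.encode()).hexdigest()


class TestSelector:
    def __init__(self):
        # test name -> {unit name: fingerprint at last run}
        self.deps = {}

    def record_run(self, test, touched_units):
        """touched_units: {unit name: source}; coverage would supply this."""
        self.deps[test] = {u: fingerprint(src)
                           for u, src in touched_units.items()}

    def tests_to_run(self, current_units):
        """Select tests whose recorded dependencies have changed."""
        selected = []
        for test, snapshot in self.deps.items():
            if any(fingerprint(current_units.get(u, "")) != fp
                   for u, fp in snapshot.items()):
                selected.append(test)
        return selected


sel = TestSelector()
sel.record_run("test_add", {"add": "def add(a, b): return a + b"})
sel.record_run("test_mul", {"mul": "def mul(a, b): return a * b"})

# Change only mul's source; only test_mul is selected for rerun.
units = {"add": "def add(a, b): return a + b",
         "mul": "def mul(a, b): return a  *  b"}
print(sel.tests_to_run(units))  # → ['test_mul']
```

This also shows why the approach works independently of version control: the selection database is updated on every test run, so the comparison is against the last execution, not against a git branch.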
Awesome.
So that's the end of our normal topics. We did get to know you a little bit,
but do you have anything extra you want to tell us, about what you're up to?
Sure. So one of my favorite things about Python is the community. And as the organizer of Chicago
Python, it's no surprise that building hyper local communities is like one of my main passions.
And in Chicago, we have a lot of events for our members.
We have talk nights, project nights, open source sprints.
We recently started one to help people
with whiteboarding coding interviews.
But there's great user groups all over the world.
And I just want to include a link in the show notes
so your listeners can go out
and find a local community they can be a part of.
And I also want to give a special shout out
to all my fellow organizers who are listening to this podcast.
Thank you for all you do for the community.
Yeah, I mean, I didn't know
that the Chicago community was so huge.
You guys are, did you say 11 organizers?
Yeah, we have 11 organizers.
We've been around since like 2003.
So it's like one of the OG Python user groups.
And Matplotlib actually debuted at ChiPy. So a lot of historical things happening.
Yeah, it was cool. So I was in Chicago last week, and I don't know, I posted something on Twitter
like, Chicago's cold. Some people made comments, but Aly reached out and said, hey, if you're in
town and available for a drink or something, we should meet up. And so, totally impromptu, we met and had dinner together. It was a blast. And yeah,
talk about community, that's one of the things I love about the community. Wherever I'm
traveling, I can try to hit up people in that area and say, hey, I'm in town, I only have like a
three-hour window, but is anybody available? And usually I can get
somebody to come by and we can BS or something. And so it's great. We have Python friends all
around the world. You have a note here on PyTennessee. Yeah. So PyTennessee is going to
be happening on March 7th and 8th. I'm going to be giving a talk on design patterns and tickets
are available at PyTennessee.org. It's going to be in Nashville, my third year in a row going,
can't wait. And
Brian, I think I saw your name on the talk list as well. Oh, yeah. I am going to that.
Awesome. I'm like, I didn't think... I thought I was going to Nashville. Oh, yeah, that is in Tennessee.
Yes, I failed geography. So cool. I've never been there. So it should be fun. Yeah, we'll grab some
dinner or a drink or something. It'll be fun.
So speaking of community and this podcast, I just want to announce that the next Python PDX West meetup, the one in Hillsborough, just west of Portland, is happening January 7th.
And I'm bringing it up because Michael and I thought it'd be fun to just do a live recording of Python Bytes with everybody hanging out at the meetup. There's also going to be one to two other talks there and we'll have
food. So if anybody's in the Portland area, the second week in January, swing by.
Sounds like a lot of fun.
A couple jokes for us. So the first one is sent in from Tyler Madison. It's just a short joke.
So two co-routines walk into a bar.
Of course, it's a runtime error because bar was never awaited.
Async jokes.
I got one for you.
So how many developers on a message board does it take to screw in a light bulb?
I don't know.
So the answer is, why are you trying to do that?
For all those people that are just trying to make sure that you're doing things their way.
Yeah, I hate that.
You know, somebody asks a question, and it might be a simple answer, but before anybody answers, somebody will say, yeah, you shouldn't be doing that.
Don't do that.
Yeah.
This was fun.
Thank you so much for filling in for Michael and co-hosting today.
Thanks so much for having me, Brian.
Thank you for listening to Python Bytes.
Follow the show on Twitter at Python Bytes.
That's Python Bytes as in B-Y-T-E-S.
And get the full show notes at pythonbytes.fm.
If you have a news item you want featured,
just visit pythonbytes.fm and send it our way.
We're always on the lookout for sharing something cool.
This is Brian Ocken,
and on behalf of myself and Michael Kennedy,
thank you for listening and sharing this podcast
with your friends and colleagues.