Python Bytes - #162 Retrofitting async and await into Django
Episode Date: January 3, 2020. See the full show notes for this episode on the website at pythonbytes.fm/162...
Transcript
Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.
This is episode 162, recorded December 19, 2019.
I'm Brian Okken, and today I've got guest host Aly Sivji.
He's one of the organizers of the Chicago Python Meetup, and we met last week when I was there briefly.
We already started recording this, but I forgot to press record, so we're starting over. So thanks Aly for hanging in there with me. Thanks Brian, glad to be here.
This episode is brought to you by Datadog. We'll talk more about them later but first
Ali's going to tell us some exciting things about Django. Yeah so Django 3.0 just came out in early
December and so I really wanted to find out what was going on.
I found a talk from DjangoCon 2019 by Andrew Godwin titled
Just Add Await:
Retrofitting Async into Django.
And Andrew's been the one
leading the implementation
of asynchronous support into Django.
And he discussed what the 3.0 release signifies,
also went over the implementation roadmap
for upcoming releases.
Okay.
Wow, that sounds fascinating.
Yeah.
He started by giving an overview of the async landscape, talked about how synchronous and
asynchronous code interact.
A key point that he made was that when we have asynchronous functions, they're very
different than synchronous functions.
This makes it really difficult to design proper APIs.
And one of the difficulties in adding asynchronous support to
Django is that this project, it's been
around for a really long time.
There are a lot of people familiar with it.
And for these new async implementations,
they wanted to have that
same familiar feeling. And so they
made a plan to implement async capabilities
in three phases. And so phase
one, that was completed with the Django
3.0 release.
And this phase,
it's really meant to lay
a lot of the groundwork
for future changes.
And so Django now supports
Asynchronous Server Gateway Interface
or ASGI.
And the ORM,
it's also async aware.
So if you actually call ORM code
from async code,
it's going to raise an exception
to make sure that you're not calling
the synchronous function
from your async code.
You don't want to have unexpected behavior there.
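To make that concrete, here's a minimal sketch of how such a guard can work: check for a running event loop and refuse to execute blocking code inside one. The names are borrowed from Django (the real versions live in django.core.exceptions and django.utils.asyncio), but the implementation below is a simplified illustration, not Django's actual code.

```python
import asyncio


class SynchronousOnlyOperation(RuntimeError):
    """Stand-in for django.core.exceptions.SynchronousOnlyOperation."""


def async_unsafe(func):
    """Refuse to run blocking code while an event loop is running."""
    def wrapper(*args, **kwargs):
        try:
            asyncio.get_running_loop()
        except RuntimeError:
            # No event loop: plain synchronous call, safe to block.
            return func(*args, **kwargs)
        raise SynchronousOnlyOperation(
            f"You cannot call {func.__name__}() from an async context."
        )
    return wrapper


@async_unsafe
def fetch_rows():
    """Pretend blocking ORM query."""
    return ["row1", "row2"]


print(fetch_rows())  # fine from synchronous code


async def main():
    try:
        fetch_rows()  # raises instead of silently blocking the event loop
    except SynchronousOnlyOperation as exc:
        print("blocked:", exc)


asyncio.run(main())
```

The point of raising loudly rather than just running is exactly what's described above: a blocking call inside an event loop would stall every other coroutine, so it's better to fail with an explicit error.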
Phase two is going to be tracking with Django 3.1.
And they're going to be adding async capabilities to the core parts of the request path.
This means we're going to have async views, async handlers, and async middleware.
And there's already a branch in the Django repository where this is mostly working.
They just need to add a couple of tests
to make sure things are passing with all those cases.
It sounds really well thought out.
They put a lot of work into it,
and there's a really detailed Django enhancement proposal,
I think it's called a DEP, all about it.
And then finally, with phase three,
that's going to make the Django ORM all async.
And Andrew called this the largest, most difficult, and most unbounded part of the project.
So if we just start thinking about ORM queries, sometimes they can result in a lot of implicit
database lookups if we're not really careful.
And this is where a lot of those complexities are going to come from.
I really enjoyed this talk.
I think it went into the right amount of depth while being really accessible.
And I just want to call out the fantastic diagrams.
It really helped improve my understanding of Django internals. And if you check the show notes,
I'm going to include a link to the async project wiki. Andrew needs some help putting these things
into Django. He's got a lot of areas there. If you have some time, you have the expertise,
please go ahead and help him out. Yeah, awesome. I like doing calls to action on the show to try
to get people involved. And I've been just starting Django for like several years now, just dipping my toes
in and out every once in a while. And I was kind of waiting for this new version to come out. So
I'll probably check this out first. I like that idea of having, since they're gradually rolling
it out, doing error messages for if you try to do things that aren't implemented yet.
So that's cool.
Yeah, it's always best to be explicit.
So would you say you've been playing around with Django?
Yeah.
Okay.
Nice segue there.
So the next topic, speaking of playing. I don't know how you got into programming, but a lot of people get into it with games, and I actually feel bad that I'm in that camp also, because, I don't know, it's just kind of cliché. But yeah, sure enough, Games by Example
is a GitHub repo by Al Sweigart, and other contributors as well, or I think there's others, but
he wants more. So the idea is these are standard IO games.
And he's got a collection of games with the source code to use for example programming lessons.
They're all written in Python 3,
and you can just clone the repo or just view it and copy and paste
even for people that aren't used to Git stuff.
I think it's neat because before I even learned any of the concepts of programming, I was programming games by copying them out of magazines and typing them into my old TRS-80, way back in the Stone Ages.
That's awesome.
I didn't know what it was doing, but I could modify things and sort of interpret it, because you can kind of read BASIC. So there's some cool features of these games that I actually would have enjoyed at the time.
One of the neat things is that they're short.
They're limited to 256 lines of code, just an arbitrary number.
But that's a pretty nice small code size.
They're all single file, single source code files with no installers.
So you just run them with Python.
They use just the standard library.
They're only using print and input for interacting with the user.
There's some downsides to that, but there's some upsides too, because they're fairly simple.
He's tried to make them very well commented and very readable.
He mentions that elegant and efficient is nice sometimes,
but understandable and readable is better for education purposes.
That's what his focus is.
Also, input validation on everything so that you can't crash Python from typing in something
weird.
This is kind of cool: he made sure that every function has a docstring to describe what it does, because it really is meant for teaching people.
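A tiny game in that spirit might look like the following. This is my own illustrative snippet, not code from the repo, but it follows the conventions described: standard library only, print and input for all I/O, a docstring on the function, and input validation so typing something weird can't crash Python (the ask/say parameters are hypothetical additions so the example can run non-interactively).

```python
import random


def guess_the_number(low=1, high=10, answer=None, ask=input, say=print):
    """Play one round of a number-guessing game.

    Standard library only, text I/O only, and every response is
    validated before it's converted, so bad input can't crash Python.
    """
    if answer is None:
        answer = random.randint(low, high)
    say(f"I'm thinking of a number from {low} to {high}.")
    while True:
        response = ask("Your guess: ")
        if not response.strip().isdigit():  # validate before int() can blow up
            say("Please enter a whole number.")
            continue
        guess = int(response)
        if guess < answer:
            say("Too low.")
        elif guess > answer:
            say("Too high.")
        else:
            say("You got it!")
            return guess


# Demo round with scripted input so the example runs without a keyboard:
scripted = iter(["oops", "3", "7"])
guess_the_number(answer=7, ask=lambda prompt: next(scripted))
```

Swap the scripted input for the real input function and it plays interactively, which is the whole repo's style: run it, then modify it and break it.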
And I think this is pretty cool.
Yeah, Al does a lot of good things about structuring things in a way that everybody can understand. I really love all the stuff he does. This would be great for people to use in, I think, helping your kid out. I'm going to try to get my kids running with some of this stuff, to just say, hey, watch, this is how you run it. Now play with it and break it.
He also included a to-do list of things he wants to do with the project,
but hasn't done yet.
And I love that because it gives people direction
if they want to help out with the project.
One of the areas is testing.
He wants more tests.
So I'd love for people to get involved with that.
Always got to make the call out for testing.
Yeah, definitely.
So before we move on, I want to give a shout out to our sponsor. This episode is sponsored by Datadog, a cloud-scale monitoring platform that unifies metrics, logs, and distributed traces. Plot the flow of traffic across multi-cloud environments with network
performance monitoring. Plus, Datadog integrates with over 350 technologies like Postgres, Redis,
and Hadoop, and their tracing client auto-instruments common frameworks and libraries
like Django, Tornado, Flask, and AsyncIO. That's cool. Get started today with a 14-day free trial at pythonbytes.fm slash datadog.
Now, speaking of testing, Bulwark has some...
Ah, I stole your thunder.
But the next topic has some test-related stuff, right?
Yeah, so Bulwark is an open-source library that allows users to easily property test their Pandas dataframes.
It makes it easy for data
analysts and data scientists to write tests.
So let's just take a step back.
We all know that tests are important, but tests for data, they're just a little bit
different.
These tests, they're not as deterministic and they require us to think about testing
just a little bit differently.
So with property tests, what we can do is we're able to check that an object has a certain property.
And so property tests for data frames include things like
validating the shape of a data frame,
checking to see if a column is within a given range,
verifying that a data frame has no NaNs,
and things like that.
So with Bulwark, you're able to implement
these property tests as checks
and each check takes a data frame
and some optional arguments.
This check will make an assertion
about some data frame property.
If things are good,
the check's going to return the original data frame.
If the check fails, an assertion error is raised
and you have a little bit more insight
into what went wrong.
And this is really helpful
when you have like a large data pipeline.
That's cool.
Yeah, it's really, really cool.
And so one of the ways that Bulwark lets you implement property tests
is also through decorators.
And so when you're creating data pipelines,
it's really useful to do them as functions.
You have some input data, you perform some sort of action,
it returns an output.
And with Bulwark, what you can do is you can add decorators
to your
pipeline functions and validate that the properties of your input data frames meet the conditions that
you really want it to have. So Bulwark has a lot of pre-built checks already in there. There's one
for has certain data types. Is this column monotonic? Is this within a certain number of
standard deviations? And it seems pretty straightforward to add new checks.
And instead of stacking decorators for multiple checks,
they have a special decorator, a multi-check decorator,
which won't fail on just the first check.
It's actually going to run through them all
and tell you all the ways your data either passed or did not pass.
Oh, that's great. Wow. Neat.
Yeah, you can use Bulwark for implementing unit tests.
You can use them in the ETL pipeline,
especially on the extract and the load steps.
And Bulwark can be used during development.
And they also have some options to turn assertion errors
into warnings for production.
I'm not really too sure how I feel about that,
but that functionality is there if you want to use it.
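The check-and-decorator pattern described above can be sketched in plain Python. Bulwark's real API lives in bulwark.checks and bulwark.decorators and operates on pandas DataFrames; to keep this sketch dependency-free it mimics the same shape on plain lists of dicts: a check asserts a property and returns the data unchanged, and a multi-check decorator runs every check on a pipeline step's output and reports all failures. Every name below is hypothetical, not Bulwark's.

```python
def has_no_nones(records):
    """Check: assert no missing values; return the data unchanged."""
    assert all(v is not None for row in records for v in row.values()), \
        "found missing values"
    return records


def in_range(col, low, high):
    """Build a check asserting that a column stays within [low, high]."""
    def check(records):
        assert all(low <= row[col] <= high for row in records), \
            f"{col} outside [{low}, {high}]"
        return records
    return check


def multi_check(*checks):
    """Decorator: run every check on a step's output, report ALL failures."""
    def decorate(step):
        def wrapper(*args, **kwargs):
            result = step(*args, **kwargs)
            failures = []
            for chk in checks:
                try:
                    chk(result)
                except AssertionError as exc:
                    failures.append(str(exc))
            if failures:
                raise AssertionError("; ".join(failures))
            return result
        return wrapper
    return decorate


@multi_check(has_no_nones, in_range("age", 0, 120))
def load_users():
    """A pipeline step: input in, transformed output out."""
    return [{"name": "Ada", "age": 36}, {"name": "Alan", "age": 41}]


print(load_users())
```

If a step's output violates any property, the wrapped function raises with every failed assertion listed, which is exactly the insight you want when a large pipeline goes wrong.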
This is cool.
This is actually coming out of Chicago Python community member,
Zax Rosenberg.
He created this and gave a talk about it at
ChiPy. So just wanted to share it with the rest of our
community. I think it's great. And I think we're seeing more and more of this. I did a recent episode of Test & Code where I talked to one of the people on the Great Expectations project, which is also around testing with data. And you know, some frameworks are attacking the problem differently, and some people just like one style over another. This is a little bit different style than Great Expectations. I think it's definitely worth checking out.
Gosh, more and more of our life is controlled by data pipelines. Not necessarily controlled, but influenced by results based on these data pipelines. So having these checks in place makes sure that our data is just as solid as our source code. This is awesome. Yeah, it's really good to see all these data scientists and data
engineers thinking about testing in a different way. Oh, gosh, there's some interesting contributors
to this. This is great.
Cool.
We got off on the test side.
I could go down that rabbit hole all day long.
But let's pop out and talk about packaging a little bit. Packaging?
Have you talked about it before?
Yeah.
Yeah, actually.
So packaging is such a...
We do cover almost every packaging-related story, because it is a stumbling block for a lot of people. At some point you get to be an intermediate Python developer, and how to share your code, and dealing with virtual environments and stuff, is one of those things you have to run into. So I think it's kind of cool. I am not a Poetry user, but I might try this new one out. So Poetry just announced Poetry 1.0.0. They're no longer zero-versioning; they've got to 1.0.
Awesome.
The announcement is by Sébastien Eustace, and it highlights some of the changes. A good caution right off the bat is that they are breaking backwards compatibility with this version, because ZeroVer is often when you're trying things out.
And I think it's completely fair to break.
It's always fair to break with major versions.
But I think this is reasonable.
I want to highlight a few things.
With this version, we can really take Poetry seriously.
I think they're planning on sticking around,
and for people that like it, there's a lot of good changes.
They've added different ways to manage environments. So within a project, you can have
poetry help you coordinate multiple versions of Python within the same project. That's pretty
cool. I don't know if it had that before.
That's neat.
One of the things that I want to bring up next is private indices. If you're working with projects within a company, it's very important to be able to have your own private something like PyPI. One way you can do that is to have a private version of something like PyPI, but you can also just throw a whole bunch of wheels in a directory and use that as an index as well. And so there's some improved support: you can even specify a different index for each dependency if you want, but you can also set up a primary and secondary, or default and secondary, index and have them be something other than PyPI if you want. So this is cool.
For people using Poetry and other tools like this within continuous integration, sometimes it's hard to pass things around. Environment variables are sometimes ugly, but they're useful within CI, and so there's new environment variable support. And then, since there's all this new support for different sorts of dependencies,
the add command has been expanded to allow you to just add dependencies
and put them in the right place with the right source and everything.
The other thing I want to highlight is they've improved some things around publishing
to allow you to use API tokens in place of plain-text usernames
and passwords.
So very cool changes from Poetry.
I applaud their progress.
Yeah, I haven't checked it out, but I'm going to go check it out.
Looks like there's been a lot of changes made.
Pretty excited about the one point release.
This is neat.
You know, people's preferences are all over the place.
There's some people that absolutely love Poetry.
There's some people using Pipenv. I'm straight up just using the built-in venv and all the tools around that.
But yeah, whatever floats your boat, I guess. Yeah, for sure. What do you got for us next?
So Kubernetes has been a huge part of the DevOps ecosystem in the last two, three years.
With the rise of containers, Kubernetes is sort of the de facto platform for running and coordinating applications across multiple machines. It's not really something I
know a lot about, but it's something I want to get to know more about. And I just found this
awesome link from DigitalOcean. They put together a Kubernetes for full stack developers curriculum.
And what it does is, yeah, it follows all the steps that a new user would have to learn in order to deploy applications to Kubernetes.
So you start by learning Kubernetes core concepts like what's a pod, what's a deployment.
Next, you'll start building modern 12-factor web applications.
And then you're going to start packaging these applications inside of containers.
Next, you're going to deploy your containerized applications on Kubernetes.
And then finally, you're going to learn
how to manage
all that cluster orchestration
and the cluster operations.
I went through the first link,
which was like an introduction
to Kubernetes.
I found the material
really easy to understand.
It's pretty much what you expect
from DigitalOcean.
Like, I learned how to do
a lot of my operations work
from DigitalOcean.
And I'm really excited
that they have a lot more resources
for the community. And I'm going to throw a link in the show notes to a talk I gave about introduction
to Docker. If you're ever learning how to do Docker, I think this is a really good place to
get started. Yeah. And so just to be clear, Docker and Kubernetes are closely tied, right?
Docker is the container where you package your applications in. And Kubernetes is the platform that manages
all your Dockerized containers across multiple machines. So this way, with Docker,
you can build your application. With Kubernetes, you can deploy it and scale it pretty much
infinitely. Cool. And it wasn't until I, gosh, I'm forgetting the guy's name, but I saw a talk
a couple of years ago about Kubernetes. And I didn't realize that this isn't just, I mean, yes, it's intended for cloud stuff,
but you can test the entire Kubernetes setup locally on a laptop even.
So that's pretty neat.
Yeah, for sure.
I'm not lying there, am I? Is that true?
It's like a kubectl.
You can have like a kubelet machine and pretty much do whatever you need to on the cloud locally.
Okay, cool.
Definitely good for playing around.
You don't have to pay for anything.
Exactly.
Well now, so the last topic, I want to apologize to people.
I'm perfectly okay with making mistakes on the show.
So on episode 159, we were going through a whole bunch of pytest plugins.
And one of the things we covered was pytest-picked.
And I incorrectly assumed that it would run.
There's a quote on the pytest-picked site that says it runs the tests related to unstaged files or the current branch, according to Git.
To me, that sounded like it runs the tests related to code that's changed.
But I was wrong.
So Michael was right.
I was wrong. It just
uses git to find out which files have changed, and if any of those are test files,
it runs the tests related to those test files. That makes sense, and that might be,
if you're developing tests, that might be exactly what you want. But if you're developing code,
you might want something different. The tool I was thinking about was pytest-testmon. That's a plugin for pytest also, and I'm just going to read their blurb:
pytest-testmon is a pytest plugin which selects and executes only the tests you need to run. It does
this by collecting dependencies between tests and all executed code, internally using coverage data, and comparing those dependencies against changes. It updates its database on
each test execution, so it works independently of version control. So that's the thing I was
thinking about. It uses coverage to find out which tests are affected by different parts
of your code base. Very cool. So
if you're making changes. This sounds like black magic to me, and I'm glad somebody else wrote it,
but it does look pretty neat. And I think I am definitely ready to try this again. I tried it a
while ago, and for some reason, I don't even remember why, I don't think I had any beef
with it, I just didn't think I needed it. But this looks exciting enough that I'm going to try it again.
For large test suites, I think this would save you so much time instead of rerunning everything.
Yeah.
I mean, there's other ways to, if you know specifically what you want to rerun.
Yeah.
If you've got a whole bunch of different tests and they all run pretty fast, but you're not
quite sure which ones you should run based on your code changes.
This is kind of neat.
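The selection idea behind that blurb can be sketched with a toy model: record a fingerprint of the code each test touched on its last run (the real plugin gets this from Coverage.py), then select only the tests whose recorded fingerprints no longer match the current source. Everything below is my own illustrative code, not pytest-testmon's internals.

```python
import hashlib


def fingerprint(src: str) -> str:
    """Hash a unit of source code so changes are cheap to detect."""
    return hashlib.sha1(src.encode()).hexdigest()


class TestSelector:
    def __init__(self):
        # test name -> {unit name: fingerprint at last run}
        self.deps = {}

    def record_run(self, test, touched_units):
        """touched_units: {unit name: source}; coverage would supply this."""
        self.deps[test] = {u: fingerprint(src)
                           for u, src in touched_units.items()}

    def tests_to_run(self, current_units):
        """Select tests whose recorded dependencies have changed."""
        selected = []
        for test, snapshot in self.deps.items():
            if any(fingerprint(current_units.get(u, "")) != fp
                   for u, fp in snapshot.items()):
                selected.append(test)
        return selected


sel = TestSelector()
sel.record_run("test_add", {"add": "def add(a, b): return a + b"})
sel.record_run("test_mul", {"mul": "def mul(a, b): return a * b"})

# Change only mul's source; only test_mul is selected for rerun.
units = {"add": "def add(a, b): return a + b",
         "mul": "def mul(a, b): return a  *  b"}
print(sel.tests_to_run(units))  # → ['test_mul']
```

This also shows why the approach works independently of version control: the selection database is updated on every test run, so the comparison is against the last execution, not against a git branch.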
Awesome.
So that's the end of our normal topics. We did get to know you a little bit,
but do you have anything extra you want to tell us, about what you're up to?
Sure. So one of my favorite things about Python is the community. And as the organizer of Chicago
Python, it's no surprise that building hyper local communities is like one of my main passions.
And in Chicago, we have a lot of events for our members.
We have talk nights, project nights, open source sprints.
We recently started one to help people
with whiteboarding coding interviews.
But there's great user groups all over the world.
And I just want to include a link in the show notes
so your listeners can go out
and find a local community they can be a part of.
And I also want to give a special shout out
to all my fellow organizers who are listening to this podcast.
Thank you for all you do for the community.
Yeah, I mean, I didn't know
that the Chicago community was so huge.
You guys are, did you say 11 organizers?
Yeah, we have 11 organizers.
We've been around since like 2003.
So it's like one of the OG Python user groups.
And Matplotlib actually debuted at ChiPy. So a lot of historical things happening.
Yeah, it was cool. So I was in Chicago last week, and I don't know, I posted something on Twitter
like, Chicago's cold. Some people made comments, but Aly reached out and said, hey, if you're in
town and available for a drink or something, we should meet up. And so, totally impromptu, we met and had dinner together. It was a blast. And yeah,
talk about community, that's one of the things I love about the community. Wherever I'm
traveling, I can try to hit up people in that area and say, hey, I'm in town, I only have like a
three-hour window, but is anybody available? And usually I can get
somebody to come by and we can BS or something. And so it's great. We have Python friends all
around the world. You have a note here on PyTennessee. Yeah. So PyTennessee is going to
be happening on March 7th and 8th. I'm going to be giving a talk on design patterns and tickets
are available at PyTennessee.org. It's going to be in Nashville, my third year in a row going,
can't wait. And
Brian, I think I saw your name on the talk list as well. Oh, yeah. I am going to that.
Awesome. I'm like, I didn't think... I thought I was going to Nashville. Oh, yeah, that is in Tennessee.
Yes, I failed geography. So cool. I've never been there. So it should be fun. Yeah, we'll grab some
dinner or a drink or something. It'll be fun.
So speaking of community and this podcast, I just want to announce that the next Python PDX West meetup, the one in Hillsborough, just west of Portland, is happening January 7th.
And I'm bringing it up because Michael and I thought it'd be fun to just do a live recording of Python Bytes with everybody hanging out at the meetup. There's also going to be one to two other talks there and we'll have
food. So if anybody's in the Portland area, the second week in January, swing by.
Sounds like a lot of fun.
A couple jokes for us. So the first one is sent in from Tyler Madison. It's just a short joke.
So two co-routines walk into a bar.
Of course, it's a runtime error because bar was never awaited.
Async jokes.
I got one for you.
So how many developers on a message board does it take to screw in a light bulb?
I don't know.
So the answer is, why are you trying to do that?
For all those people that are just trying to make sure that you're doing things their way.
Yeah, I hate that.
You know, somebody asks a question, and it might be a simple answer, but before anybody answers, somebody will say, yeah, you shouldn't be doing that.
Don't do that.
Yeah.
This was fun.
Thank you so much for filling in for Michael and co-hosting today.
Thanks so much for having me, Brian.
Thank you for listening to Python Bytes.
Follow the show on Twitter at Python Bytes.
That's Python Bytes as in B-Y-T-E-S.
And get the full show notes at pythonbytes.fm.
If you have a news item you want featured,
just visit pythonbytes.fm and send it our way.
We're always on the lookout for sharing something cool.
This is Brian Ocken,
and on behalf of myself and Michael Kennedy,
thank you for listening and sharing this podcast
with your friends and colleagues.