Python Bytes - #90 A Django Async Roadmap

Starting point is 00:00:00 Hello and welcome to Python Bytes, where we deliver Python news directly to your earbuds. This is episode 90, recorded August 2nd, 2018. I'm Michael Kennedy. And I'm Brian Okken. Hey, Brian. Good to be with you again. It's good to talk to you again. Yeah. And it's also good to have DigitalOcean sponsoring this episode. So thank you to DigitalOcean. We're both customers and they're sponsors of ours. So it's kind of this weird mix of everybody loving it. So pythonbytes.fm slash digitalocean

Starting point is 00:00:26 will get you $100 credit for your servers if you're a new customer. So check that out. In the meantime, we should probably talk about some data analysis. We should. And who better to talk about data analysis than Jake VanderPlas?

Starting point is 00:00:39 I honestly don't know. He does awesome work over there at the eScience Institute. And I love listening to his talks. So there's a set of videos that he has on. This is actually from last year, but I didn't count. There's 11 videos, and it's called Reproducible Data Analysis in Jupyter. But each of the videos is like five or six minutes, so they go pretty fast.

Starting point is 00:01:02 This is a really cool thing. I think everybody should check these out anyway, because he goes through a data set, the bike crossings at a particular bridge in Seattle, I think, but it really doesn't matter how the data gets in there. He's doing all this stuff live. He's working in a Jupyter notebook and pulling data in. And sometimes the table that he ends up with doesn't look quite right, so he uses a different column as the index. Or, for instance, in the first video, he puts a graph up and all the data is sort of packed together, so he changes the sample rate to just weekly data. And all of that stuff, I didn't even know you could do those things.
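
The weekly resampling step described above can be sketched without any third-party packages. This is only an illustration of the idea, not the code from the videos: the crossing counts are made up, and `weekly_totals` is a hypothetical helper name. In pandas, the same grouping is usually done with a `resample` call on a datetime-indexed frame.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_totals(daily_counts):
    """Sum daily counts into weekly buckets keyed by the Monday of each week."""
    totals = defaultdict(int)
    for day, count in daily_counts.items():
        week_start = day - timedelta(days=day.weekday())  # Monday of that week
        totals[week_start] += count
    return dict(sorted(totals.items()))


# Made-up crossing counts for ten consecutive days (not the real bridge data).
start = date(2018, 7, 2)  # a Monday
daily = {start + timedelta(days=i): 100 + i for i in range(10)}

weekly = weekly_totals(daily)
for week_start, total in weekly.items():
    print(week_start, total)
```

Ten noisy daily points collapse into two weekly totals, which is the "packed together" graph problem in miniature.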

Starting point is 00:01:51 So it's kind of a full pass through all the tools you can use to do exploratory data analysis with Jupyter, and doing it live. And watching a pro do it, it's a thing of beauty. And like I said, for each particular tool that he uses, it's not an in-depth study on exactly how to use it to its completeness, but you just get a glimpse at all the power that you have with all these things. Yeah, I really like it, and I think this sort of view into exploring data is super interesting. It really shows the power of Jupyter notebooks. Like when I first saw them, I thought, oh, well, this is like a simplified programming environment for people that aren't like real programmers

Starting point is 00:02:30 and don't want to work with different files and stuff like that. But the more I saw people using it and interacting with it, I realized it's just for people solving problems entirely differently than the type of problems I solve. And it's really great for that. Yeah, and working with data, when you're throwing up a graph or a plot of the data, sometimes you might be plotting it wrong.

Starting point is 00:02:53 You're like, well, maybe I could see something if I plot it this way. And it's not interesting. It's just a bunch of points everywhere. But if you plot it a little different with a different axis maybe, or a different type of plot, it might show you interesting information. And this is actually fascinating to me, this notion of people using large data sets and then trying to figure out, like, in real time how to use it well.

Starting point is 00:03:18 And then once you've already figured that out, if you want to put some program together using some of those tools to monitor those things, that's a great idea. But the ability to just use a notebook to explore stuff is pretty fascinating. Yeah. And you know, there's some really interesting use cases. Like, suppose you want to go hit a bunch of APIs and download some data and then graph it. If you did that in a script, every time you want to view it slightly differently, you're rerunning that download, you're getting or computing the data, however you do that, right? Whereas in notebooks you can just rerun one cell at a time, change that cell, run it again, and you don't have to recompute or reacquire the data. So there's a lot of interesting aspects to it. Yeah, and that's another thing: he starts out with showing how to, within the notebook, grab a URL of data and put it in a CSV file or some local file

Starting point is 00:04:07 so you don't have to hit the network all the time. Yeah, nice. But anyway. Yeah, that's a good one. I really like the series, and I'm glad you brought it up. Another thing I want to bring up, something we haven't covered very much. Have we talked about GUIs in Python yet? You know, I think they're important.

Starting point is 00:04:22 We probably should start talking about it. Yes. So because we were on our kick, of course, people said, you know, I said, oh, have you heard about this? Have you tried that? And here's another, here's another one. This one comes from Mike Barnett. He sent me a couple emails, sort of charting the progress of this project that he built. The name pretty much says it all, PySimpleGUI. So it's for simple Python GUIs. And it's just another really simple way to take what would have been like a command line program and make it a lot better. So, you know, bonus points to Mike, because this not only has a screenshot, this has many screenshots and many examples. So as all the Python GUI libraries out there should have,

Starting point is 00:05:06 if you want people to use them, screenshots. So check it out. It's pretty cool. What you do is you work more or less in a 100% Python language API, and you don't work down at the GUI toolkit layer, which is cool, I think. So you define the UI and it has this sort of auto layout mechanism, and it has things like slide bars and text boxes and buttons, and it works on all the things. So that's pretty nice. One of the examples says, do you have a Raspberry Pi with a touchscreen? Well, you know, why don't you just write this one screen full of GUI code and you've got your touchscreen GUI working. Isn't that cool? How neat.

Starting point is 00:05:45 Yeah. So for better or worse, it's based on Tkinter, which is good because it comes with Python. There's no dependencies. So literally you just pip install PySimpleGUI, or if you want, there's a single Python file you can include with your code. So there's not even the pip install package stuff. You can literally just go, here's the file, PySimpleGUI.py, put next to my program, and that's all it needs. So that's cool,

Starting point is 00:06:10 but honestly, I think Tkinter looks a little dated, right? Like it looks like it belongs a little more in WarGames than it does in 2018. But anyway, it's still pretty nice as it is. And one of the upcoming things that he lists on sort of the upcoming goals, or if anybody wants to help out, is to port this to other graphics engines. Because you don't work in the toolkit API directly, but this is like a translation layer, you could hook this up to, say, wxPython or Qt for Python, things like that.

Starting point is 00:06:45 So you could get one of these modern-looking UIs if somebody's willing to do that translation. That's a neat idea. I'd like to see people working on that. Yeah, wouldn't that be sweet? Then it could actually detect if you even have them, and it would even go, oh, do you have wx? Okay, great, we'll go with that. Oh, you have Qt? Let's just run that version.

Starting point is 00:07:02 Don't have to get a dependency installed. That'd be best. Yeah, and you're right that the Tk stuff looks dated, but you know, there's a lot of use cases where you can use a GUI that doesn't have to be pretty. Yeah, well, you know what else can look dated to a lot of people is the command line. So, you know, maybe this is a lot better interaction for non-technical people, even if it does have a little bit funky shading on the buttons or something. Yeah, I mean, I don't get it, but I fight that battle every once in a while.

Starting point is 00:07:30 I'll tell people this, just run this command line thing. What do you mean run the command line thing? Oh dear. They're like, I studied accounting. I don't know where the terminal is on my Mac. Okay, okay then. Click this. Yeah, one other thing to call out for, I think

Starting point is 00:07:46 that's interesting. This is Python 3 only. So no Python 2 for PySimpleGUI. And more and more, again, we're seeing things where it used to be, well, I can't switch to Python 3 because this isn't supported. Now it's like, if you don't switch to Python 3, you don't get these cool libraries.

Starting point is 00:08:02 And here's just one more example. Yeah. Great. Well, we've got a team that is migrating to Git. And so this isn't directly Python related, but I ran across this article called Useful Tricks You Might Not Know About Git Stash. And Git Stash is the stash command. It's something that actually took me a while to run

Starting point is 00:08:27 across, because it's not something you have to use. But let's say you've got a repository where you've cloned it, now you've made some changes on it, and you're not ready to do anything with your changes, but you need to maybe pull down a new version, or you branched at the wrong point, or whatever. Stash is a way to save off all of your changes, all the dirty stuff in your directory, save it away and just hide it somewhere, and then you can reapply those changes later after you, you know, update or pull or something. And I'm still working through how to integrate stash into a good workflow, but I wanted to highlight this article, Useful Tricks You Might Not Know

Starting point is 00:09:12 About Git Stash, because I'm learning it and I wanted other people to know about it also. Yeah, that's cool. One of the ideas that seems like it might be relevant here is, suppose you've got some branch checked out and you're doing some work, you're like halfway through it, and somebody comes along and says, hey, I'm on the same branch as you and we have this bug, could you just fix this really quick or make this quick change so that I can carry on? And you're like, oh, but this work I have here is half done, it won't be done till tomorrow. So you could stash that away, get the latest, do some work, push that in, and then reapply that stash to get your work back without actually committing it and messing up that whole branch.

Starting point is 00:09:51 Yeah, definitely. And then the use case we often use is like the test team is working on different tests around, but we're sharing utility libraries and fixtures and stuff. And somebody updates a crucial fixture or a utility, communication utility. And so you want to use that, but you're in the middle of writing your test or changing something new.

Starting point is 00:10:19 These aren't merge conflicts at all, but Git doesn't let you pull in the new stuff on top of your old stuff. I mean, you can do a merge, right? But you still have to commit it to your local repo before you can do a pull, which you might not want to do yet. And if you're just starting out or whatever, you might not want to do that if you just want to look at stuff. So that's the case where we're playing with this workflow: just stash away your changes, then do a pull. And again, people can correct me if I'm using the term pull wrong, because I'm still learning the right times to do pulls and fetches and merges and all that stuff. So, yeah. Nice. Yeah, this is really cool. Two things

Starting point is 00:10:58 that you call out here that I thought were pretty cool, well, I guess three. One is you can label your stashes so you know what they are. They're not just hashes. That's good. Also, I didn't know you can do a dash u to include untracked files. That's pretty cool. That's pretty cool. I didn't know that either before I read the article. And the other one, the last one that I totally didn't know you could do is once you have your stash saved, you could say, well, I probably shouldn't have done it as a stash. I probably should have just put that on a branch.
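
The stash tricks called out in this segment (a labeled stash, including untracked files, and turning a stash into a branch) map onto a handful of Git subcommands. The sketch below only builds the command lines so they can be inspected before running; the helper names `stash_push`, `stash_pop`, `stash_branch`, and `run` are made up for illustration, and the message and branch name are placeholders.

```python
import subprocess


def stash_push(message=None, include_untracked=False):
    """Build the argv for `git stash push`, optionally labeled and with untracked files."""
    cmd = ["git", "stash", "push"]
    if include_untracked:
        cmd.append("-u")  # also stash files Git is not tracking yet
    if message:
        cmd += ["-m", message]  # label the stash so it is more than a hash
    return cmd


def stash_pop():
    """Build the argv for reapplying the most recent stash and dropping it."""
    return ["git", "stash", "pop"]


def stash_branch(branch_name, stash_ref="stash@{0}"):
    """Build the argv for turning a stash into a new branch."""
    return ["git", "stash", "branch", branch_name, stash_ref]


def run(cmd, repo_dir="."):
    """Run a built command inside repo_dir; raises if Git reports an error."""
    return subprocess.run(cmd, cwd=repo_dir, check=True)


# Print the commands rather than running them, so this is safe outside a repository.
print(" ".join(stash_push("half-done fixture work", include_untracked=True)))
print(" ".join(stash_pop()))
print(" ".join(stash_branch("fixture-rework")))
```

A typical sequence for the "somebody updated a shared fixture" case would be `run(stash_push(...))`, then a pull, then `run(stash_pop())`.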

Starting point is 00:11:29 There's a way to say git stash branch and then a name, and then you can specify which stash. And it just takes all those files, all those changes, and creates a branch. That's cool. Yeah, I really like that one. Like, oh, I stashed it, and actually what I want to do is more work, sort of in parallel, like split off my work without committing it, to sort of convert the stash to a branch. That's cool. Yeah, very nice. I like it. So let me tell you about this new thing that DigitalOcean has. So they have virtual machines and floating IPs, and they have spaces and load balancers and all these sorts of things, even

Starting point is 00:12:04 domains and DNS. And if you have a lot of stuff going on at DigitalOcean, well, you might have like 20 virtual machines, and some of them are for some project, another one is for another project. Like, how do you know which one is for which? And is that one safe to delete? I think we're done with it, but I'm not sure. I don't really know what it belongs to. So they came up with this new feature called projects, where you can group droplets, load balancers, domains, IP addresses, all that kind of stuff into different projects. So you can say this one is, say, for the training site, these three parts all fit together there. This one is for the Python Bytes podcast, and these two servers and spaces all fit together over there. So pretty cool. Check that out. It's just one more

Starting point is 00:12:45 way to make your hosting life easier. Yeah. And be sure to visit pythonbytes.fm slash digitalocean. If you're a new user, you get a hundred dollars credit. So that makes it even nicer. So one of the things I'd like to see, Brian, is more async stuff. And I think the place where it's most beneficial is around the web, actually. Well, yes, last week I said, because we can have a bunch of different worker processes, it's not really necessary, right? If you've got an eight-core server, you could have, say, 16 little uWSGI worker processes, and each one can sort of computationally chew up its stuff on, like, one core, and it sort of gets shared by the OS. But really, there's some limit where you don't want to create more because you

Starting point is 00:13:25 run out of memory, right? Like I think the 16 on the training site probably take like two gigs of RAM. So you can't have many of them or you'll run out of RAM unless you have a lot of room there. At some point you maybe are waiting on a database call, right? You get a request, and the request says, well, in order to process the request, I need to hit this database. And actually, this is a query that takes 500 milliseconds to return. That thread is really doing nothing but just waiting on a socket

Starting point is 00:13:55 to return something from, say, Postgres or MongoDB. And just as well could be doing other work if it could let go, but, you know, maybe it doesn't, right? So if we can build this so that we could build it with async and await, any time our code is waiting, it immediately gives up its thread, and it will go on to do more processing. And so, for example, this is one of the ways, this is basically the fundamental process concept of how Node.js can do

Starting point is 00:14:21 hundreds of thousands of requests on one server, concurrent, because most of those are waiting on a database or something else. However, the problem is many of the popular Python frameworks don't support this concept of async. We have new frameworks, Sanic, Japronto, others that do support it, but those are not the old frameworks, right? So there are these new ones that are exciting and fast, and there are the old ones that everybody knows how to work with and has deployed, but bridging that gap is a challenge. So Andrew Godwin, the guy who works on Django Channels, came up with a Django async roadmap, and it's pretty interesting and pretty thorough, and it talks about, like, the

Starting point is 00:15:05 time frame and how they might make Django support this sort of world where you can have async methods. Yeah, that's actually really cool, because, I mean, Django and Flask have to get there, especially Django, or something else will take over. Right, exactly. I mean, this is one of the times you hear people say, I'm switching to Go because it does better concurrency than Python. Well, if Django and the other frameworks just had it baked in, that whole argument would largely go away. So anyway, this is really cool, and I really like how he's put it together.
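
The "waiting on a database" argument from the last few segments can be seen in a few lines of plain asyncio. This is a minimal sketch, not Django code and not anything from the roadmap: `fake_query` is a made-up stand-in where a sleep plays the role of the slow query, and it assumes Python 3.7 or later for `asyncio.run`.

```python
import asyncio
import time


async def fake_query(name, seconds):
    """Stand-in for a database call: the sleep represents waiting on a socket."""
    await asyncio.sleep(seconds)  # control returns to the event loop while waiting
    return f"{name} done"


async def handle_requests():
    # Three "requests", each waiting on its own slow query, run concurrently.
    return await asyncio.gather(
        fake_query("orders", 0.2),
        fake_query("users", 0.2),
        fake_query("reports", 0.2),
    )


started = time.perf_counter()
results = asyncio.run(handle_requests())
elapsed = time.perf_counter() - started
print(results, f"{elapsed:.2f}s")
```

The three waits overlap, so the total time is close to the longest single wait rather than the sum of all three, which is the behavior an async-capable framework is after.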

Starting point is 00:15:44 He said he thinks it's time, the time has come to start seriously talking about bringing async functionality to Django. And he's shared it previously with some people internally, but this is him kind of coming out and saying I'm opening it up for public feedback. So he has some interesting goals. He says the goal is to make Django a world-class example of what async can enable for HTTP requests. And that means various things

Starting point is 00:16:04 at different parts of the stack. So doing ORM requests in parallel, right, this waiting on the database, instead of waiting the 500 milliseconds, you just continue doing processing, and then get back to it when the ORM responds. Allowing views to query external APIs without blocking. So, you know, we talked about the retry stuff, if you're calling like a credit card or other sort of external API, right, then, you know, that would go away faster. You could do like slow response, long polling, super easy. It's like a sort of WebSocket stand-in. All sorts of performance improvements. So it's imperative that they keep Django backwards compatible and make sure that when people come to the project,

Starting point is 00:16:45 this is an option they can turn on, not something they have to learn. So part of the beauty of these frameworks is they've been really easy to get started with. Let's not throw this at people as the very first thing they ever do. Yeah, yeah. There's a place for it,

Starting point is 00:17:00 and sometimes not a place for it. Yeah, yeah, exactly. All right, so I said, why now? Well, Django 2.1 will be the first release to support only Python 3.5 and above, and not the previous ones. And Python 3.5 and above, this is where async and await,

Starting point is 00:17:16 the language syntax, truly has become properly supported. So that's why I think now is the time to start working on this. Yeah, and then the timeline is broken out into different Django releases and what sort of goals there might be for each one. That's nice. Yeah, it's pretty cool. Like, it doesn't make any sense to parallelize the web methods necessarily without parallelizing the data access, so maybe start with the ORM, actually,

Starting point is 00:17:41 things like that. Yeah, I love this. I think it's also good for more than individual developers: companies that have applications might want to think about concurrency, but they don't necessarily need to do it right now. But they know that they're going to eventually do it, so it's good for them to see this roadmap. And maybe that'll help people. In the article also, it talks about funding. So if a lot of companies are relying on this or looking forward to it in the future,

Starting point is 00:18:12 maybe kicking in some dollars to help it go faster is a good thing. Yeah, it blows my mind how many huge companies are basically built on Python infrastructure but contribute zero to it. Yeah, and that's something that we're, as a society, not just the Python community, but the web using community, we've got to tackle that. But part of this is, I guess, just also things like this to say, hey, this is where we're

Starting point is 00:18:37 going. And some of these problems need people focused on it, not just volunteer time. So some direct money to hire somebody for six months or a year would be a good idea. Yeah. A lot can get done with just like a few months of focused time. Yeah, definitely. Yeah. I mean, that's how the new PyPI got launched. So this is all the way up to, it looks like, a mostly async Django by Django 3.2. Yeah, that's awesome. That's a ways off. It's pretty conservative. It's not too wild. And I really think it's a well thought out plan. So I'm happy to see Andrew put it out there.

Starting point is 00:19:08 Nice. Yeah, nice. So, you've got some music you're going to play for us, and it makes me think that maybe a little bit of audio processing within Python might make sense. Pydub, the tagline is manipulate audio with a simple and easy high-level interface. But it's really actually pretty cool with just a single line. Like, for instance, with from_mp3, you can pull an MP3 file into a variable. And then once you've got it there, you can do things like use the bracket operators to get the first 10 seconds. It's crazy. You can use the slice on it and you can use indexing operators. It's crazy. And then adding or subtracting integers changes the volume by that number of decibels. The use of operators is pretty cool.
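
To show how slicing and adding integers can mean "first 10 seconds" and "louder by N decibels", here is a toy class built on the standard library only. It is not Pydub and does not reflect its internals: `ToyClip` is a made-up name, it stores one sample per millisecond for simplicity, and it assumes the usual amplitude convention for decibels, ratio = 10 ** (dB / 20).

```python
class ToyClip:
    """Toy audio clip: one sample per millisecond, stored as floats."""

    def __init__(self, samples):
        self.samples = list(samples)

    def __len__(self):
        return len(self.samples)  # length in milliseconds

    def __getitem__(self, key):
        # Slicing by milliseconds returns a new clip.
        if isinstance(key, slice):
            return ToyClip(self.samples[key])
        return ToyClip([self.samples[key]])

    def __add__(self, db):
        # Adding a number applies gain in decibels: ratio = 10 ** (dB / 20).
        ratio = 10 ** (db / 20)
        return ToyClip(s * ratio for s in self.samples)

    def __sub__(self, db):
        # Subtracting a number lowers the volume by that many decibels.
        return self + (-db)


clip = ToyClip([0.5] * 30_000)  # 30 seconds of a constant signal
first_ten = clip[:10_000]       # first 10 seconds
louder = first_ten + 20         # +20 dB is a 10x amplitude ratio
print(len(first_ten), louder.samples[0])
```

The point is the design choice: overloading `__getitem__`, `__add__`, and `__sub__` lets audio edits read like ordinary Python expressions.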

Starting point is 00:20:07 So slicing and chopping, but you can do crossfade and repeat and fade. I'm not quite sure what the difference between crossfade and fade is. But anyway, changing formats from, say, WAV to MP3 or something. Adding meta tags, that's pretty cool. That's the one that got my attention. I'm like, oh, this might help on some production stuff I'm doing. Making sure of specific bit rates, or MP3 has a quality level, you can pass all that stuff in for saving. Anyway, we'll include a code snippet of a few things you can do, but it's pretty easy to maintain code once

Starting point is 00:20:45 you've got it in place, I think. Yeah, this looks really interesting. If you do anything with audio, people should check this out. I did talk about this trade-off and how Django solving its async problem within itself would be great, but I still think there's room for exploration on the web in the Python world. And so this next one is pretty much that. It actually describes itself as an experimental framework, but it's called Molten, a modern API framework. Have you heard of this, Brian? No. So it's a minimal, fast web framework specifically for building APIs with Python. So I don't even know if it has like a template language for HTML. It's all about just building APIs, but it looks pretty awesome actually. Yeah, and pretty terse and small.

Starting point is 00:21:30 Yeah, one of the things that I like that it does is it uses type annotations for a whole bunch of cool things. So the other framework I saw do this was APIStar, but I don't think it quite used it as much. So for example, you can have an API function that has a name, which is a string, and an age, which is an integer, and it will automatically pass that data over to you as you call it, which is pretty awesome. It also does request validation. So you can create a class, which looks very much like a data annotation class. And you give it a decorator and say, this is a schema. And what happens is if you say my API function takes this class as an argument, so their example has a to-do class.

Starting point is 00:22:19 So if you say the input is colon to-do, right, you annotate it as a to-do, then it will actually parse all the things like the ID and the description, all the various pieces, out of the input and verify that, you know, the ID is an integer, the description is a string, all that kind of stuff, just by using type annotations. That's pretty cool. Yeah, yeah, it's pretty sweet. Here, I'll throw out the next one, see what you think about this. They also support dependency injection for allowing you to pass like different data access layers and stuff like that.

Starting point is 00:22:58 So if you want to test it, you could pass in like a mocked out data layer, whereas by default, you just sort of register it at app startup, and it'll create all the different pieces of infrastructure and pass them to the methods automatically. Okay, some people like that. Yeah, you know, I don't see that very often in the Python space. I've mixed emotions. Sometimes it's nice, sometimes it's not. But anyway, it supports that. You don't have to use it, right? But I do think the validation and the schema and the auto mapping of your sort of JSON documents to and from just strong classes with Python-based declarative requirements and stuff is really cool. Yeah, I think the extra thing that they're adding is this idea of using annotations as a schema.

Starting point is 00:23:37 It's pretty cool. That's neat. Yeah, I really like it too. And the other one that I looked at, sorry if I get this a little bit wrong, but there's some other framework that also used annotations that I thought was really cool, but it used them in a way that Python itself didn't make a lot of sense of. So like you could say, like I'm getting an API key passed to me and you would say colon header to say this API key is coming out of the header. But when you actually work with it, it's not actually a header, it's a string. It just came from the header.

Starting point is 00:24:07 And so things like PyCharm and stuff would freak and go, that doesn't have this method. You're like, no, I know it's a string, even though I just actually said it's a header. Like this is cool because the thing you say it is actually is what it is. The framework is consistent sort of with the programming model. I like that a lot.
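
The "annotations as a schema" idea from this discussion can be sketched with the standard library alone. This is not Molten's API: the `load` helper is hypothetical, the `Todo` fields simply mirror the ID and description mentioned above, and the check only handles plain types such as `int` and `str`. It assumes Python 3.7 or later for `dataclasses`.

```python
from dataclasses import dataclass
from typing import get_type_hints


@dataclass
class Todo:
    id: int
    description: str


def load(schema, payload):
    """Build schema(**payload) after checking each field against its annotation."""
    hints = get_type_hints(schema)
    errors = {}
    for field, expected in hints.items():
        if field not in payload:
            errors[field] = "missing"
        elif not isinstance(payload[field], expected):
            errors[field] = f"expected {expected.__name__}"
    if errors:
        raise ValueError(errors)
    return schema(**{field: payload[field] for field in hints})


todo = load(Todo, {"id": 1, "description": "write show notes"})
print(todo)

try:
    load(Todo, {"id": "one", "description": "bad id"})
except ValueError as exc:
    print(exc)
```

Because the annotation is the real type of the field, editors and type checkers see the same thing the validator enforces, which is the consistency praised in the conversation.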

Starting point is 00:24:22 Yep. Anyway, pretty cool. And people can check that out if they're building APIs. Remember, it's in the experimental stage, but you know, you can play with it, see if it fits your needs or make it better. Definitely. Nice. Yeah, pretty cool. All right. Anything else you want to share with us, Brian? No, I can't believe we're already done. I know. Same for me. I covered it all last week. So just always fun to share this stuff with you.

Starting point is 00:24:44 Thanks for being here. Definitely fun. And everybody, keep on sending us things that we should check out. I love getting tips from people. Absolutely. Same here. See you later. Bye.

Starting point is 00:24:52 Thank you for listening to Python Bytes. Follow the show on Twitter via at Python Bytes. That's Python Bytes as in B-Y-T-E-S. And get the full show notes at PythonBytes.fm. If you have a news item you want featured, just visit pythonbytes.fm and send it our way. We're always on the lookout for sharing something cool. On behalf of myself and Brian Okken,

Starting point is 00:25:13 this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and colleagues.
