Python Bytes - #90 A Django Async Roadmap
Episode Date: August 7, 2018
Topics covered in this episode: Reproducible Data Analysis in Jupyter; PySimpleGUI - For simple Python GUIs; Useful tricks you might not know about Git stash; A Django Async Roadmap; pydub; Molten: Mode...rn API framework; Extras; Joke
See the full show notes for this episode on the website at pythonbytes.fm/90
Transcript
Hello and welcome to Python Bytes, where we deliver Python news directly to your earbuds.
This is episode 90, recorded August 2nd, 2018. I'm Michael Kennedy.
And I'm Brian Okken.
Hey, Brian. Good to be with you again.
It's good to talk to you again.
Yeah. And it's also good to have DigitalOcean sponsoring this episode. So thank you to
DigitalOcean. We're both customers and they're sponsors of ours. So it's kind of this weird
mix of everybody loving it. So pythonbytes.fm/digitalocean
will get you $100 credit for your servers
if you're a new customer.
So check that out.
In the meantime,
we should probably talk about some data analysis.
We should.
And who better to talk about data analysis
than Jake VanderPlas?
I honestly don't know.
He does awesome work over there
at the eScience Institute.
And I love listening to his talks.
So there's a set of videos that he has on.
This is actually from last year, but I didn't count.
There's 11 videos, and it's called Reproducible Data Analysis in Jupyter.
But each of the videos is like five or six minutes, so they go pretty fast.
This is a really cool thing.
I think everybody should check these out anyway, because he goes through a problem, or really just a data set: the bike crossings at a particular bridge in Seattle, I think. But it really doesn't
matter how the data gets in there. He's doing all this stuff live. He's using a Jupyter notebook
and pulling data in. And sometimes the table that he ends up with doesn't
look quite right, so he uses a different column as the index. Or, for instance,
in the first video, he puts a graph up and all the data is sort of packed together, so he changes the sample rate to just weekly data.
And all of that stuff, I didn't even know you could do those things.
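The two fixes described above, picking a different column as the index and resampling packed-together data down to weekly values, can be sketched in a few lines. The transcript does not name the library, so this assumes pandas, and the column names and counts are invented for illustration rather than taken from the real bridge data.

```python
import io

import pandas as pd

# Hypothetical stand-in for the bike-counter data: one row per day.
# The column names and counts are made up for this sketch.
raw_csv = io.StringIO(
    "Date,Crossings\n"
    + "\n".join(f"2018-01-{day:02d},{100 + day}" for day in range(1, 15))
)

# Use the date column as the index instead of the default row numbers.
data = pd.read_csv(raw_csv, index_col="Date", parse_dates=True)

# Daily points can look packed together on a plot; resampling to weekly
# totals leaves one value per week, which is much easier to read.
weekly = data.resample("W").sum()
print(weekly)
```

In a notebook, calling `weekly.plot()` in the next cell would draw the smoother weekly curve, assuming matplotlib is installed.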
So it's not necessarily complete, but it's kind of a full pass through all the tools you can use to do exploratory data analysis with Jupyter, and doing it live.
And watching a pro do it, it's a thing of beauty. Like I said, for each particular tool that he uses, it's not an in-depth study on exactly how
to use it to its completeness, but you get a glimpse of all the power that you have with
all these things.
Yeah, I really like it. And I think this sort of view into exploring data is super
interesting. It really shows the power of Jupyter notebooks.
Like when I first saw them, I thought,
oh, well, there's like a simplified programming environment for people that aren't like real programmers
and don't want to work with different files and stuff like that.
But the more I saw people using it and interacting with it,
I realized it's just for people solving problems entirely differently
than the type of problems I solve.
And it's really great for that.
Yeah, and working with data,
when you're throwing up a graph or a plot of the data,
sometimes you might be plotting it wrong.
You're like, well, maybe I could see something
if I plot it this way.
And it's not interesting.
It's just a bunch of points everywhere.
But if you plot it a little different
with a different axis maybe,
or a different type of plot, it might show you interesting information.
And this is actually fascinating to me, this notion of people using large data sets and then trying to figure out, like, in real time how to use them well.
And then once you've figured that out, if you want to put some program together using some of those tools to monitor those things, that's a great idea. But the ability to just use a notebook to explore
stuff is pretty fascinating.
Yeah. And you know, there's some really interesting use cases. Like,
suppose you want to go hit a bunch of APIs and download some data and then graph it. If you did
that in a script, every time you want to view it slightly differently, you're rerunning that download,
or recomputing the data, however you do that, right? Whereas in notebooks you can just
rerun one cell at a time: change that cell, run it again, and you don't have to recompute or reacquire the
data. So there's a lot of interesting aspects to it.
Yeah. And that's another
thing: he starts out by showing how to, within the notebook, grab a URL of data and put it in a CSV file or some local file,
so you don't have to hit the network all the time.
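A minimal sketch of that download-once idea, using only the standard library. The function name `fetch_once` and its arguments are hypothetical, not taken from the videos; the point is simply to skip the network when a local copy already exists.

```python
import os
import urllib.request


def fetch_once(url, filename):
    """Download `url` to `filename` only if the file is not already there."""
    if not os.path.exists(filename):
        urllib.request.urlretrieve(url, filename)
    return filename
```

Calling `fetch_once(url, "data.csv")` at the top of a notebook means that rerunning the cell reuses the local file instead of downloading again. Deleting the file forces a fresh download.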
Yeah, nice.
But anyway.
Yeah, that's a good one.
I really like the series, and I'm glad you brought it up.
Another thing I want to bring up, something we haven't covered very much.
Have we talked about GUIs in Python yet?
You know, I think they're important.
We probably should start talking about it.
Yes. So because we were on our kick, of course, people said, you know,
I said, oh, have you heard about this? Have you tried that? And here's another, here's another
one. This one comes from Mike Barnett. He sent me a couple emails, sort of charting the progress of
this project that he built. The name pretty much says it all, PySimpleGUI. So it's for simple Python GUIs.
And it's just another really simple way to take what would have been like a command line
program and make it a lot better. So, you know, bonus points to Mike, because this not only has
a screenshot, this has many screenshots and many examples. So as all the Python GUI libraries out there should have,
if you want people to use them, screenshots. So check it out. It's pretty cool. What you do is
you work more or less in a 100% Python language API, and you don't work down at the GUI toolkit
layer, which is cool, I think. So you define the UI and it has this sort of auto layout mechanism
and it has things like slide bars and text boxes and buttons and it works in all the things. So
that's pretty nice. One of the examples asks: do you have a Raspberry
Pi with a touchscreen? Well, you know, why don't you just write this one screenful of GUI code
and you've got your touchscreen GUI working. Isn't that cool?
How neat.
Yeah. So for better or worse, it's based on Tkinter, which is good because it comes with
Python. There's no dependencies. So literally you just pip install PySimpleGUI, or if you want,
there's a single Python file you can include with your code. So there's not even the pip
install package stuff. You can literally just go, here's the file,
PySimpleGUI.py,
put next to my program,
and that's all it needs.
So that's cool,
but honestly,
I think Tkinter looks a little dated, right?
Like it looks like it belongs a little more
in, like, WarGames than it does in 2018.
But anyway,
it's still pretty nice as it is.
And one of the upcoming things that he lists, on sort of the upcoming goals, or if anybody wants to help out, is to port this to other graphics engines. Because you don't work in the toolkit API directly, and this is like a translation layer, you could hook this up to, say, wxPython or Qt for Python, things like that.
So you could get one of these modern-looking UIs
if somebody's willing to do that translation.
That's a neat idea. I'd like to see people working on that.
Yeah, wouldn't that be sweet?
Then it could actually detect if you even have them,
and it wouldn't even go, oh, do you have WX?
Okay, great, we'll go with that.
Oh, you have Qt? Let's just run that version.
Don't have to get a dependency installed. That'd be best.
Yeah, and you're right that the Tk stuff looks dated, but you know, there's a lot of use cases where
you can use a GUI that doesn't have to be pretty.
Yeah. Well, you know what else can look dated to a lot of people is the
command line. So, you know, maybe this is a lot better interaction for non-technical
people, even if it does have a little bit of funky shading on the buttons or something.
Yeah, I mean, I don't get it,
but I do have, I fight that battle every once in a while.
I'll tell people this, just run this command line thing.
What do you mean run the command line thing?
Oh dear.
They're like, I studied accounting.
I don't know where the terminal is on my Mac.
Okay, okay then.
Click this.
Yeah, one other thing to call out that I think
is interesting: this is
Python 3 only. So
no Python 2 for PySimpleGUI.
And more and more, again,
we're seeing things where it used to be,
well, I can't switch to Python 3 because this isn't
supported. Now it's like, if you don't switch to
Python 3, you don't get these cool libraries.
And here's just one more example.
Yeah.
Great.
Well, I, we've got a team that is migrating to Git.
And so this isn't directly Python related,
but I ran across this article called
Useful Tricks You Might Not Know About Git Stash.
And stash is a Git command.
It's something that actually took me a while to run
across, because it's not something you have to use. So
let's say you've got a repository that you've cloned, and now you've made some changes in
it, and you're not ready to do anything with your changes, but you need to, like, maybe pull down a new version,
or you branched at the wrong point, or whatever. Stash is a way to save off all of your changes,
all the dirty stuff in your directory, save it away and just, like, hide it somewhere. And then
you can reapply those changes later, after you, you know, update or pull or something. And I'm still
working through how to integrate stash into a good
workflow, but I wanted to highlight this article, Useful Tricks You Might Not Know
About Git Stash, because I'm learning it and I wanted other people to know about it also.
Yeah, that's cool. One of the ideas that seems like it might be relevant here is,
suppose you've got some branch checked out and you're doing some work. You're, like, halfway through it, and somebody comes along and says, hey, I'm on the same branch
as you and we have this bug. Could you just, like, fix this really quick, or make this quick change
so that I can carry on? And you're like, oh, but this work I have here is, like, half done. It won't be
done till tomorrow. So you could stash that away, get the latest, do some work, push that in,
and then reapply that stash to get your work back
without actually committing it and messing up that whole branch.
Yeah, definitely.
And then the use case we often have is, like, the test team is working on different tests,
but we're sharing utility libraries and fixtures and stuff.
And somebody updates a crucial fixture
or a utility, a communication utility.
And so you want to use that,
but you're in the middle of writing your test
or changing something new.
These aren't merge conflicts at all,
but Git doesn't let you pull in the new stuff on top of your old
stuff. I mean, you can do a merge, right, but you still have to commit it to your local repo before
you can do a pull, which you might not want to do yet. And if you're just starting out or whatever,
you might not want to do that if you just want to look at stuff. So that's the case where
we're playing with this workflow: just stash away your changes and do a pull. And again, people can correct
me if I'm using the term pull wrong, because I'm still learning the right times to do
pulls and fetches and merges and all that stuff.
Yeah, nice. This is really cool. Two things
that are called out here that I thought were pretty cool, well, I guess three. One is you can label your
stashes, so, like, you know what they are. They're not just hashes.
That's good.
Also, I didn't know you can do a -u to include untracked files.
That's pretty cool.
That's pretty cool.
I didn't know that either before I read the article. And the other one, the last one that I totally didn't know you could do is once you have your stash saved, you could say, well, I probably shouldn't have done it as a stash.
I probably should have just put that on a branch.
There's a way to say git stash branch and then a name, and then you can specify which stash.
And it just takes all those files, all those changes, and creates a branch.
That's cool.
Yeah, I really like that one.
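The three tricks mentioned here (a labeled stash, `-u` for untracked files, and `git stash branch`) can be tried safely in a throwaway repository. The sketch below drives the real git commands from Python so the whole thing runs end to end. The helper names, file names, label, and branch name are all hypothetical, and it assumes a reasonably recent git is on the PATH.

```python
import os
import subprocess
import tempfile


def git(*args, cwd):
    """Run one git command in `cwd` and return its stdout as text."""
    # A fixed identity so commits and stashes work on an unconfigured machine.
    env = dict(
        os.environ,
        GIT_AUTHOR_NAME="Demo", GIT_AUTHOR_EMAIL="demo@example.com",
        GIT_COMMITTER_NAME="Demo", GIT_COMMITTER_EMAIL="demo@example.com",
    )
    result = subprocess.run(
        ["git", *args], cwd=cwd, env=env,
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def stash_demo():
    repo = tempfile.mkdtemp()
    notes = os.path.join(repo, "notes.txt")
    git("init", cwd=repo)
    with open(notes, "w") as f:
        f.write("committed\n")
    git("add", "notes.txt", cwd=repo)
    git("commit", "-m", "initial commit", cwd=repo)

    # Half-finished work: one modified tracked file, one untracked file.
    with open(notes, "a") as f:
        f.write("half-done edit\n")
    with open(os.path.join(repo, "scratch.txt"), "w") as f:
        f.write("untracked\n")

    # -u includes untracked files; -m attaches a readable label.
    git("stash", "push", "-u", "-m", "half-done fixture work", cwd=repo)
    clean_status = git("status", "--porcelain", cwd=repo)
    stash_list = git("stash", "list", cwd=repo)

    # Turn the most recent stash into a branch and reapply it there.
    git("stash", "branch", "fixture-work", "stash@{0}", cwd=repo)
    branch = git("rev-parse", "--abbrev-ref", "HEAD", cwd=repo).strip()
    with open(notes) as f:
        restored = f.read()
    return clean_status, stash_list, branch, restored
```

Calling `stash_demo()` returns the status right after stashing, the stash list with its label, the branch that `git stash branch` checked out, and the restored file contents. For the simpler case of bringing the changes back onto the same branch, `git stash pop` is the usual command.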
Like, oh, I stashed it, and actually what I want to do is more work, sort of in parallel, like split off my work
without committing it, to sort of convert the stash to a branch. That's cool.
Yeah, very nice. I like it.
So let me tell you about this new thing that DigitalOcean has. So they have virtual machines
and floating IPs, and they have Spaces and load balancers and all these sorts of things, even
domains and DNS. And if you have a lot of stuff going on at DigitalOcean, well, you might have,
like, 20 virtual machines, and some of them are for one project, another one is for another project.
Like, how do you know which one is for which? And is that one safe to delete? I think we're done
with it, but I'm not sure. I don't really know what it belongs to. So they came up
with this new feature called Projects, where you can group droplets, load balancers, domains, IP addresses, all that kind of
stuff into different projects. So you can say this one is, say, for the training site; these
three parts all fit together there. This one is for the Python Bytes podcast, and these two servers
and Spaces all fit together over there. So pretty cool. Check that out. It's just one more
way to make your hosting life easier.
Yeah. And be sure to visit pythonbytes.fm/digitalocean.
If you're a new user, you get a hundred dollars credit. So that makes it even nicer.
So one of the things I'd like to see, Brian, is more async stuff. And I think the place where it's most
beneficial is around the web, actually.
Well, yes. Last week I said, because we can have, like, a
bunch of different worker processes, it's not really necessary, right? If you've
got an eight-core server, you could have, say, 16 little uWSGI worker processes, and each
one can sort of computationally chew up its stuff on, like, one core, and it sort of gets shared
by the OS. But really there's some limit where you don't want to create more because you
run out of memory, right?
Like, I think the 16 on the training site probably take, like, two gigs of RAM.
So you can't have many of them or you'll run out of RAM, unless you have a lot
of room there. At some point you maybe are waiting on a database call, right?
You do a request, and the request says, well, in order to process the request,
I need to, like, hit this database.
And actually, this is a query that takes 500 milliseconds to return.
That thread is really doing nothing but just waiting on a socket
to return something from, say, Postgres or MongoDB.
And it just as well could be doing other work if it could let go,
but, you know, maybe it doesn't, right?
So if we could build it with async and await,
any time our code is waiting, it immediately gives up its thread,
and it will go on to do more processing.
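The effect described here can be seen with nothing but the standard library. In this sketch, `asyncio.sleep` stands in for a 500 millisecond database query; the function names are hypothetical and no real database or web framework is involved.

```python
import asyncio
import time


async def fake_query(name, seconds):
    """Stand-in for a database call that spends its time waiting on a socket."""
    await asyncio.sleep(seconds)  # while waiting, the event loop runs other tasks
    return f"{name} done"


async def handle_requests():
    # Three "requests", each waiting on a 0.5 second query, run concurrently.
    return await asyncio.gather(
        fake_query("request-1", 0.5),
        fake_query("request-2", 0.5),
        fake_query("request-3", 0.5),
    )


start = time.perf_counter()
results = asyncio.run(handle_requests())
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")
```

Because each coroutine gives up control while it waits, the three waits overlap, so the total time should be close to half a second rather than the one and a half seconds that three back-to-back blocking calls would take.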
And so, for example,
this is basically the fundamental concept of how Node.js can do
hundreds of thousands of concurrent requests on one server,
because most of those are waiting on a database or something else. However, the problem is many of the popular
Python frameworks don't support this concept of async. We have new frameworks, Sanic, Japronto,
and others that do support it, but those are not the old frameworks, right? So there's
these new ones that are exciting and fast, and there's the old ones that everybody knows how
to work with and has deployed. But bridging that gap is a challenge. So Andrew Godwin, the guy who
works on Django Channels, came up with a Django async roadmap, and it's
pretty interesting and pretty thorough. It talks about, like, the
time frame and how they might make Django support this sort of world where you can have async
methods.
Yeah, that's actually really cool, because, I mean, Django and Flask have to get there,
especially Django, or something else will take over.
Right, exactly.
I mean, this is one of the times you hear people say,
I'm switching to Go because it does better concurrency than Python.
Well, if Django and the other frameworks just had it baked in,
that whole argument would largely go away.
So anyway, this is really cool, and I really like how he's put it together.
He said he thinks it's time, the time has come to start seriously talking about bringing async functionality to Django.
And he's shared it previously with some people internally,
but this is him kind of coming out
and saying I'm opening it up for public feedback.
So he has some interesting goals.
He says the goal is to make Django a world-class example
of what async can enable for HTTP requests.
And that means various things
at different parts of the
stack. So doing ORM requests in parallel, right, this waiting on the database: instead of waiting
the 500 milliseconds, you just continue doing processing, and then get back to it when the ORM
responds. Allowing views to query external APIs without blocking. So, you know, we talked
about the retry stuff; if you're calling, like, a credit card or other sort of external API, right, then, you know, that would go way faster.
You could do, like, slow-response long polling super easily.
It's like a sort of WebSocket stand-in. All sorts of performance improvements.
So it's imperative that they keep Django backwards compatible and to make sure that when people come to the project,
this is an option they can turn on,
not something they have to learn.
So part of the beauty of these frameworks
is they've been really easy to get started with.
Let's not throw this at people
at the very first thing they ever do.
Yeah, yeah.
There's a place for it,
and sometimes not a place for it.
Yeah, yeah, exactly.
All right, so I said, why now?
Well, Django 2.1 will be the first release
to support only Python 3.5 and above,
and not the previous ones.
And Python 3.5 and above,
this is where async and await,
the language syntax,
has truly become properly supported.
So that's why I think now is the time
to start working on this.
Yeah, and then the
timeline is broken out into different Django releases, and what sort of goals
there might be for each one. That's nice.
Yeah, it's pretty cool. Like, it doesn't make any sense to parallelize
the web methods necessarily without parallelizing the data access, so maybe start with the ORM, actually.
Things like that.
Yeah, I love this. I think it's also good for more than individual developers. Companies that have applications
might want to think about concurrency, but they don't necessarily need to do it right now.
But they know that they're going to eventually do it, so it helps to see this roadmap.
And maybe that'll help people.
In the article also, it talks about funding.
So if a lot of companies are relying on this
or looking forward to it in the future,
maybe kicking in some dollars to help it go faster
is a good thing.
Yeah, it blows my mind how many huge companies
are basically built on Python infrastructure
but contribute zero to it.
Yeah, and that's something that we're, as a society, not just the Python community, but the web
using community, we've got to tackle that.
But part of this is, I guess, just also things like this to say, hey, this is where we're
going.
And some of these problems need people focused on it, not just volunteer time.
So some direct money to hire somebody for six months or
a year would be a good idea. Yeah. A lot can get done with just like a few months of focused time.
Yeah, definitely. Yeah. I mean, that's how the new PyPI got launched.
So this is all the way up to, it looks like a mostly async Django by Django 3.2.
Yeah, that's awesome. That's a ways, it's pretty conservative. It's not too wild. And I really
think it's a well thought out plan. So I'm happy to see Andrew put it out there.
Nice.
Yeah, nice. So you've got some music you're going to play for us?
It makes me think that maybe a little bit of audio processing within Python might make sense.
Pydub, the tagline is manipulate audio with a simple and easy high-level interface.
But it's really actually pretty cool with just a single line.
Like, for instance, with from_mp3, you can pull an MP3 file into a variable. And then once you've
got it there, you can do things like use the bracket operators to get the first 10 seconds.
It's crazy. You can use the slice on it and you can use indexing operators. It's crazy.
And then adding or subtracting integers changes the volume by that number of decibels. The use of operators is pretty cool.
So slicing and chopping, but you can do crossfade and repeat and fade.
I'm not quite sure what the difference between crossfade and fade is.
But anyway, changing formats from, say, WAV to MP3 or something.
Adding meta tags, that's pretty cool.
That's the one that got my attention. I'm like,
oh, this might help on some production stuff I'm doing. Making sure of a specific bit rate, or
MP3 has a quality level; you can pass all that stuff in for saving. Anyway, we'll
include a code snippet of a few things you can do, but it's pretty easy to maintain code once
you've got it in place, I think.
Yeah, this looks really interesting. If you do anything with audio,
you should check this out. I did talk about this trade-off and how Django solving its async
problem within itself would be great, but I still think there's room for exploration on the web in
the Python world. And so this next one is pretty much that. It actually describes itself as an experimental framework, but it's called Molten, a modern API framework.
Have you heard of this, Brian?
No.
So it's a minimal, fast web framework specifically for
building APIs with Python. So I don't even know if it has like a template language for HTML.
It's all about just building APIs, but it looks pretty awesome actually.
Yeah, and pretty terse and small.
Yeah, one of the things that I like is that it uses type annotations for a whole bunch of cool things.
So the other framework I saw do this was API Star, but I don't think it used them quite as much. So for example, you can have an
API function that has a name, which is a string, and an age, which is an integer, and it will
automatically pass that data over to you as it's called, which is pretty awesome. It also does
request validation. So you can create a class which looks very much like a data annotation class.
And you give it a decorator and say, this is a schema.
And what happens is, if you say my API function takes this class as an argument (their example has a Todo class),
so if you annotate the input as a Todo, then it will actually parse all
the things like the ID and the description, all the various pieces, out of the input and verify
that, you know, the ID is an integer and the description is a string,
all that kind of stuff, just by using type annotations. That's pretty cool.
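To make the idea concrete, here is a standard-library sketch of annotations doing both jobs: converting handler parameters and validating a schema class. This is not Molten's API or implementation. The `Todo` dataclass, `load`, `call_handler`, and `hello` are hypothetical stand-ins that only illustrate the technique.

```python
import inspect
from dataclasses import dataclass, fields


@dataclass
class Todo:
    id: int
    description: str


def load(schema, payload):
    """Build `schema` from a dict, checking each value against its annotation."""
    values = {}
    for f in fields(schema):
        if f.name not in payload:
            raise ValueError(f"missing field: {f.name}")
        value = payload[f.name]
        if not isinstance(value, f.type):
            raise ValueError(
                f"{f.name} must be {f.type.__name__}, got {type(value).__name__}"
            )
        values[f.name] = value
    return schema(**values)


def call_handler(handler, raw_params):
    """Convert raw string parameters to each parameter's annotated type."""
    converted = {
        name: param.annotation(raw_params[name])
        for name, param in inspect.signature(handler).parameters.items()
    }
    return handler(**converted)


def hello(name: str, age: int) -> str:
    return f"{name} is {age}"


greeting = call_handler(hello, {"name": "Ada", "age": "36"})
todo = load(Todo, {"id": 1, "description": "record the next episode"})
```

Passing `{"id": "one", "description": "..."}` to `load` raises a `ValueError`, which is the kind of check a framework can turn into an error response before the handler ever runs.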
Yeah, yeah, it's pretty sweet.
Here, I'll throw out the next one, see what you think about this.
They also support dependency injection for allowing you to pass
like different data access layers and stuff like that.
So if you want to test it, you could pass in like a mocked out data layer,
whereas by default, you just sort of register it at app startup,
and it'll create all the different pieces of infrastructure and pass them to the methods
automatically.
Okay, some people like that.
Yeah, you know, I don't see that very often in the
Python space. I've mixed emotions. Sometimes it's nice, sometimes it's not. But anyway,
it supports that you don't have to use it, right. But I do think the validation and the schema and
the auto mapping of your sort of JSON documents to and from just strong classes with Python based declarative requirements and stuff is really cool.
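The dependency injection idea mentioned a moment ago can also be sketched with the standard library. This assumes injection is matched on parameter annotations, which the episode does not spell out, and every name here (`Injector`, `Database`, `FakeDatabase`, `get_todo`) is hypothetical rather than part of Molten.

```python
import inspect


class Database:
    """The real data access layer (a stub in this sketch)."""

    def get_title(self, todo_id):
        return f"todo {todo_id} from the database"


class FakeDatabase(Database):
    """A mocked-out data layer for tests."""

    def get_title(self, todo_id):
        return f"fake todo {todo_id}"


class Injector:
    """Maps an annotated type to the component registered for it."""

    def __init__(self):
        self._components = {}

    def register(self, kind, component):
        self._components[kind] = component

    def call(self, handler, **kwargs):
        # Fill in any parameter whose annotation matches a registered type.
        for name, param in inspect.signature(handler).parameters.items():
            if name not in kwargs and param.annotation in self._components:
                kwargs[name] = self._components[param.annotation]
        return handler(**kwargs)


def get_todo(todo_id: int, db: Database) -> str:
    return db.get_title(todo_id)


app = Injector()
app.register(Database, Database())  # registered once at startup
production_result = app.call(get_todo, todo_id=1)

tests = Injector()
tests.register(Database, FakeDatabase())  # swapped in for a test
test_result = tests.call(get_todo, todo_id=1)
```

The handler never constructs its own data layer, so a test can hand it a fake one without changing the handler at all.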
Yeah, I think the extra thing that they're adding this idea of using annotations as a schema.
It's pretty cool.
That's neat.
Yeah, I really like it too.
And the other one that I looked at, sorry if I get this a little bit wrong, but there's some other framework that also used annotations that I thought was really cool, but it used them in a way
that Python itself didn't make a lot of sense of. So like you could say, like I'm getting an API key
passed to me and you would say colon header to say this API key is coming out of the header.
But when you actually work with it, it's not actually a header, it's a string.
It just came from the header.
And so things like PyCharm and stuff would freak and go,
that doesn't have this method.
You're like, no, I know it's a string,
even though I just actually said it's a header.
Like this is cool because the thing you say it is
actually is what it is.
The framework is consistent sort of with the programming model.
I like that a lot.
Yep.
Anyway, pretty cool.
And people can check that out if they're building APIs. Remember, it's in the experimental stage,
but you know, you can play with it, see if it fits your needs or make it better.
Definitely. Nice.
Yeah, pretty cool. All right. Anything else you want to share with us, Brian?
No, I can't believe we're already done.
I know. Same for me. I covered it all last week. So just always fun to share this stuff with you.
Thanks for being here.
Definitely fun.
And everybody, keep on sending us things that we should check out.
I love getting tips from people.
Absolutely.
Same here.
See you later.
Bye.
Thank you for listening to Python Bytes.
Follow the show on Twitter via @pythonbytes.
That's Python Bytes as in B-Y-T-E-S.
And get the full show notes at PythonBytes.fm.
If you have a news item you want featured,
just visit pythonbytes.fm and send it our way.
We're always on the lookout for sharing something cool.
On behalf of myself and Brian Okken,
this is Michael Kennedy.
Thank you for listening and sharing this podcast
with your friends and colleagues.