Python Bytes - #31 You should have a change log
Episode Date: June 21, 2017
Topics covered in this episode: TinyMongo, A dead simple Python data validation library, PuDB, Analyzing Django requirement files on GitHub, Changelogs, Understanding Asynchronous Programming in ...Python, Extras, Joke
See the full show notes for this episode on the website at pythonbytes.fm/31
Transcript
Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.
This time it's Python Bytes episode 31, recorded on Tuesday, June 20th, 2017.
I'm Michael Kennedy.
And I'm Brian Okken.
And we have a bunch of cool things to talk about.
Some of them are huge and some of them are kind of tiny.
Let's start small, huh?
Yeah, let's start small.
One of the reasons why I like following Twitter for Python news is that's where I found TinyMongo.
So I saw somebody talking about it last week.
That's awesome.
I'm a fan of MongoDB and TinyDB.
And if they could come together, that'd be even better.
Right.
So this is essentially an attempt to put a MongoDB-like interface on top of TinyDB. It's not the exact same interface, but so far he's been really happy with TinyDB as the backend for TinyMongo. It uses TinyDB as the database part but exposes an interface that's very close to Mongo.
Yeah, that's super cool. So basically, if you have code that talks to MongoDB through the PyMongo API, you could more or less adapt that really quickly to TinyMongo. And TinyDB, the backing store for this thing, more or less says: let's create a simple document database that's really just some JSON files living on your disk. It's not a full-on production database, but if you're doing really simple things, this is actually pretty sweet. There's no server, right?
Right, there's no server. I would say the other direction probably works best. If your end goal was to use Mongo, then TinyMongo might be a good way to start, because it isn't the full set of functionality. I don't have a complete list of what's missing. I just have the personal experience that I tried to take a Mongo application and just swap this in, and I ran across a few errors, and I haven't finished debugging those yet. I'm just really excited about it because there's more than one document database that I can use in small applications.
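To make the "document database that's really just some JSON files on disk" idea concrete, here is a minimal standard-library sketch. It is not TinyDB's or TinyMongo's implementation; the `JsonCollection` class and its `insert_one`, `find`, and `find_one` methods are hypothetical names chosen only to resemble the Mongo-style calls discussed above.

```python
import json
import tempfile
from pathlib import Path


class JsonCollection:
    """Hypothetical toy collection: one JSON file holding a list of documents."""

    def __init__(self, path):
        self.path = Path(path)
        if not self.path.exists():
            self._save([])

    def _load(self):
        return json.loads(self.path.read_text(encoding="utf-8"))

    def _save(self, docs):
        self.path.write_text(json.dumps(docs, indent=2), encoding="utf-8")

    def insert_one(self, doc):
        # Read the whole file, append, and write it back: simple, not fast.
        docs = self._load()
        docs.append(dict(doc))
        self._save(docs)

    def find(self, query=None):
        """Return documents whose fields equal every key/value in `query`."""
        query = query or {}
        return [
            doc for doc in self._load()
            if all(doc.get(key) == value for key, value in query.items())
        ]

    def find_one(self, query=None):
        matches = self.find(query)
        return matches[0] if matches else None


def demo():
    # No server involved: the "database" is a file in a temporary folder.
    with tempfile.TemporaryDirectory() as folder:
        users = JsonCollection(Path(folder) / "users.json")
        users.insert_one({"name": "brian", "role": "host"})
        users.insert_one({"name": "michael", "role": "host"})
        return users.find_one({"name": "michael"}), len(users.find({"role": "host"}))
```

Rewriting the whole file on every insert is fine for the small-application and Raspberry Pi cases mentioned in the episode, and it is also the reason this style of store is not a production database.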
Yeah, that's cool.
And then also, one of the applications for this,
when I was talking with the maintainer of it,
is that he's using it on Raspberry Pis even.
So having a Mongo-like...
That is really cool,
because you don't want to start up a whole separate server on something like a Raspberry Pi, but having a couple of little JSON files lying around that you have a database interface over top of, that's cool.
Yeah, definitely.
So I was excited about this and I'm going to start using it right away.
That's sweet.
Yeah, if people are interested in TinyDB, back on episode 80 of Talk Python, many moons ago, I interviewed the guy who created TinyDB and talked about some of the use cases. And I think there are some extensions you can get, like indexing add-ons and stuff like that. So there's a lot of stuff to do with this. Pretty cool. So that sounds pretty dead simple, right? Just fire up TinyDB and off you go.
Yeah, dead simple.
You know what else I want? Some dead simple validation.
And so the next project I chose is called Validus. Validus is on GitHub, and it describes itself as a dead simple Python data validation library. Have you ever tried to write a regular expression to match an email or a URL or something like that?
Oh, yes. Yeah.
That's super fun, right?
No.
You think you get it working, then someone emails you like, I have a proper email address, but I can't sign up for your system. It says my email is invalid. You're like, oh, gosh. So this Validus thing kind of solves that for a class of types of data, basically simple input. You can just import this and say validus.isemail and give it a string, and it will say yes or no. And you can ask it questions like: is it an RGB color, is it a phone number, is it an ISBN, is it an IPv4 or IPv6 address, is it a number, is it a slug, like would it fit at the end of a URL without needing encoding, all that kind of stuff. That's pretty awesome.
That's cool. I'd say it's dead simple. It's even got is Mongo ID.
Nice. Yeah, yeah, that's awesome.
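For a feel of what yes-or-no validators of this kind look like, here is a small standard-library sketch of three of the checks mentioned above. It does not use Validus and is not how Validus is implemented; the function names `is_ip`, `is_slug`, and `is_rgb_color` and the exact rules they enforce are assumptions made for illustration.

```python
import ipaddress
import re

# Assumed rules: lowercase words joined by single hyphens, and "rgb(r, g, b)" syntax.
_SLUG_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
_RGB_RE = re.compile(r"rgb\(\s*(\d{1,3})\s*,\s*(\d{1,3})\s*,\s*(\d{1,3})\s*\)")


def is_ip(value, version=None):
    """True if `value` parses as an IP address (optionally of version 4 or 6)."""
    try:
        address = ipaddress.ip_address(value)
    except ValueError:
        return False
    return version is None or address.version == version


def is_slug(value):
    """True if `value` could sit at the end of a URL without needing encoding."""
    return _SLUG_RE.fullmatch(value) is not None


def is_rgb_color(value):
    """True for strings like 'rgb(255, 0, 128)' with each channel in 0..255."""
    match = _RGB_RE.fullmatch(value)
    return match is not None and all(int(part) <= 255 for part in match.groups())
```

Leaning on `ipaddress` instead of a hand-written pattern is the same point made in the conversation: the less regular-expression guesswork you own, the fewer "my valid input was rejected" emails you get.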
So you know what else I like about this?
It's Python only, no legacy Python.
3.6, 3.3.
Yeah, yeah.
Yeah, 3.3 and above.
So it's only a Python 3 thing.
So yet another sweet example of that.
I have a lot of interesting stuff to say about that at the end of the show.
Not Validus, but Python versus legacy Python.
While this works pretty well, we may still need to jump in the debugger, right?
Yeah, definitely. And I'm a command line debugger kind of person. Actually,
I don't really jump into the debugger too much.
You're a last resort, a debugger of last resort type person?
Yes. Yeah, definitely. Last resort. And so in episode 29, we talked about the ability to launch PDB, the Python debugger, from a failed PyTest. Somebody on Twitter, another Twitter person, KidPixo, I think.
Yeah, KidPixo, he runs the Geek Cookies Italian podcast, which I was a guest on like two and a half years ago. He's a great guy. He sends us lots of good stuff.
Yeah. Well, he passed this along because he said he really loves the PuDB debugger.
And my first reaction is, oh my God, this thing is ugly.
Because it does look like you're back in the 80s running on a 386 or something.
I feel like I've dialed into a BBS.
But it does have themes.
So after I played with it for a while, I switched it to a midnight theme, and it looks just like I'm in my editor.
And then it's actually pretty slick.
And one of the things that you can do with it, it's a lot better than PDB, and it's still small and fast.
And there's some documentation in it for how you can do the same thing that we did with PyTest.
You can launch it just whenever you hit a PyTest failure.
So that's pretty cool.
Yeah, it's really nice.
I mean, you can use it over SSH and stuff.
So if you're SSH'd into a server, you can debug with this,
but it actually has little windows.
I mean, it really does feel like I'm back on a BBS.
It's awesome.
You see your code and you can step through it.
You've got a variables window and a stack and breakpoints.
And it's really nice.
It's like an ASCII curses type thing.
But yeah, the local window, already having your listing up and also all your local variables, and that changing when you go up and down the stack, it's usually enough.
So I like it.
Yeah.
Yeah.
It definitely hits the sweet spot.
Like the 80% case for debuggers. It's cool.
All right. So I'm definitely going to start using that if I need to debug anything without a windowing environment like macOS or Linux or Windows.
Okay. So the next thing that I want to talk about is a really interesting, sort of wide-ranging study that the guys at pyup.io did. So pyup.io is a cool service. I'm actually a paying customer of theirs because I really think what they're doing is awesome, and I use it for my web apps. The idea is you basically give pyup.io access to your requirements file in your public or private GitHub repo. And if there's a new version of any requirement or transitive requirement that you depend upon, it will tell you, hey, there's a new release of the Pyramid web framework, and here's the changelog, and actually, this one's a security update, so get in there and fix it quick. So it'll basically watch your requirements and tell you if there are any upgrades and things like that. And it'll issue them as a pull request. So really cool.
So these guys have access to all these requirements files and many other things, right? And they studied some Django requirements files on GitHub. Now, this isn't through their business. They were able to use BigQuery to just get ahold of all of the Django requirement files that are on GitHub. And they found some interesting things. And I guess this is not the private repos, probably just the public ones.
But anyway, they said that Django is the most popular web framework.
And it's pretty old.
It's been around for 12 years, used in all sorts of different projects.
So let's look at these requirements files, which specify like all the dependencies you have to install and see what we can get from them.
So the first thing they ask is, do developers pin or freeze their requirements, right? That's where in your requirements.txt, you could say, I depend on Django and I depend on SQLAlchemy and I depend on requests. Or you could say, I depend on Django==this version, requests==that version, right? That's pinning or freezing.
And they said that 64% of Django developers pin their requirements.
That's interesting.
And another 20% or so do ranges.
So like, I'm willing to take this range of versions, but not leave it unpinned.
And then some of them are just like, give me whatever I can get when I ask for it.
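The three categories just described (pinned, ranged, unpinned) can be sketched as a small classifier over requirement lines. This is only an illustration using the standard library, not the method pyup.io used in their study; the `classify` and `summarize` helpers and the simplified regular expression are assumptions, and real requirement syntax has more cases than this handles.

```python
import re
from collections import Counter

# Package name, optional extras such as "[security]", then whatever follows.
_REQ_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*(\[[^\]]*\])?\s*(.*)$")


def classify(line):
    """Return 'pinned', 'range', or 'unpinned'; None for lines that are not plain requirements."""
    spec = line.split("#", 1)[0].strip()      # drop trailing comments
    match = _REQ_RE.match(spec)
    if not spec or match is None:
        return None                           # blank, comment, or option like "-r base.txt"
    specifier = match.group(3).strip()
    if not specifier:
        return "unpinned"                     # e.g. "SQLAlchemy"
    if specifier.startswith("==") and "," not in specifier and "*" not in specifier:
        return "pinned"                       # e.g. "Django==1.8.18"
    if specifier[0] in "<>=~!":
        return "range"                        # e.g. "requests>=2.0,<3.0"
    return None                               # URLs and anything else unrecognized


def summarize(lines):
    """Count how many requirement lines fall into each category."""
    return Counter(kind for kind in map(classify, lines) if kind is not None)
```

Run over a requirements file's lines, `summarize` gives the same kind of breakdown quoted in the episode, just for a single project instead of all of GitHub.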
So that's interesting. Another thing that they said was pretty interesting is that Django 1.8, even though I think 1.10, 1.11 is the latest, Django 1.8 is the most popular of them.
And that was pretty cool.
But one of the things I really wanted to point out here is they said that what is more worrisome is 1.9, 1.7, and 1.6 are second, third, and fourth most popular on the list.
Why is that a problem?
None of them are receiving any security updates at all.
Oh, weird.
Isn't that bad?
So 1.7 and 1.6 went end of life over two years ago.
So if you are on the web and your application listens on a socket,
you want it to have all the security patches, let me tell you.
That's bad news.
And here's like, if I add those up really quick,
that's something like 40% of Django files they found
are using these older versions.
And in fact, he said only 2% of all Django projects
they could find are actually on a secure release.
Among all the projects, more than 60% use Django releases with one or more known security vulnerabilities. And that's pretty intense, man, that only 2% of them are on a 100% known secure release.
Well, I mean, clearly it's recommended to go make sure that you're using a secure release, but I was curious about the pinning or freezing. Is that considered best practice?
So I think it depends on what you're doing. For large, complicated applications, it's definitely considered a best practice. The idea is you want to make the upgrade in your dependencies at the time of your choosing, right? Especially with major frameworks like Django, if you're going to go from Django 1.8 to 1.9, you don't want that to just happen one day when it gets released and you happen to refresh your server, because that might have breaking changes.
So you want to explicitly say, I depend on this one.
Oh, there's a new one out.
Let me test the new one and then explicitly change that number, or have something flip it for you.
Okay.
And basically that's what the PyUp service that I use does. It will automatically upgrade, like, my Pyramid web framework from 1.7 to 1.8 to 1.9, but it doesn't flip it immediately. It'll tell me and change my requirements files as a PR, and I have to accept it, basically.
Okay.
Yeah, but pretty interesting stats there. Especially if you're into Django, check that out.
Yeah, definitely.
It's kind of concerning that there's so many.
And then, I'm sorry to hang out on this so much, but was this projects or applications, and is there a difference?
So as far as I can tell, and I don't really know, Jannis, I think the guy who wrote it, could maybe chime in in the comments if he's listening. But my understanding is basically they went and studied the public repos that use Django.
Okay.
So this also may not be quite representative, because companies like Pinterest that depend on Django, they're obviously not going to make their code public, right?
So they may be doing slightly different things.
But still, it's an interesting view into at least the open source side of Django.
Definitely. It's cool.
Speaking of open source projects, do you think they should have a changelog?
Well, that's what I was curious about.
Yeah. So I kind of am warming to the idea of changelogs. I appreciate other projects with
changelogs. I actually asked some people back on Twitter again what they thought of them.
And there's a couple of things I came across, which was a website called Keep a Changelog.
I really like that site. It's so clear and compelling. It's great.
Yeah. Well, it also talks about how there really isn't a standard, but if there is a standard format for them, this is probably as close as you can get.
And it talks about different formats, in either reST or Markdown.
There's different ways to do it.
And then when I was talking on Twitter about changelogs,
some of the people from the PyTest project piped up and said that they're using a tool called towncrier to maintain their changelog.
That looks really cool, but I've never done anything with it.
What's towncrier do?
So what it does is you keep a separate directory within your project, so it works if you're using different branches and different changes go in. You keep the changes in little snippet files, and since they're separate files, they merge easily, because there's going to be a new file for each change. And then you go through and say, okay, I've pulled all these things in. I want to go ahead and take everything in the directory and add it to the changelog.
Oh, I see. You can keep separate files that say, these are the breaking changes, these are the new features or whatever, and then it'll build a changelog out of them?
Yeah.
Oh, sweet. Okay.
Well, it adds to your existing, and it can add to your existing one.
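The snippet-files workflow described above can be sketched with the standard library alone. This is not towncrier and does not reflect its configuration or commands; the `<issue>.<type>` file naming, the `SECTIONS` mapping, and the `build_section` and `prepend_to_changelog` helpers are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

# Hypothetical mapping from fragment suffix to changelog heading.
SECTIONS = {"feature": "Features", "bugfix": "Bug Fixes", "removal": "Removals"}


def build_section(fragment_dir, version):
    """Render one changelog section from files named '<issue>.<type>'."""
    grouped = {kind: [] for kind in SECTIONS}
    for path in sorted(Path(fragment_dir).iterdir()):
        issue, _, kind = path.name.partition(".")
        if kind in grouped:
            text = path.read_text(encoding="utf-8").strip()
            grouped[kind].append(f"- {text} (#{issue})")
    lines = [f"## {version}", ""]
    for kind, heading in SECTIONS.items():
        if grouped[kind]:
            lines += [f"### {heading}", *grouped[kind], ""]
    return "\n".join(lines)


def prepend_to_changelog(changelog_path, section):
    """Add the new section above whatever the changelog already contains."""
    path = Path(changelog_path)
    existing = path.read_text(encoding="utf-8") if path.exists() else ""
    path.write_text(section + "\n" + existing, encoding="utf-8")


def demo():
    with tempfile.TemporaryDirectory() as folder:
        fragments = Path(folder) / "changes"
        fragments.mkdir()
        # One small file per change, so branches rarely conflict when merged.
        (fragments / "12.feature").write_text("Add a --dry-run flag.\n", encoding="utf-8")
        (fragments / "15.bugfix").write_text("Fix crash on empty input.\n", encoding="utf-8")
        changelog = Path(folder) / "CHANGELOG.md"
        changelog.write_text("## 1.0.0\n\n- First release.\n", encoding="utf-8")
        prepend_to_changelog(changelog, build_section(fragments, "1.1.0"))
        return changelog.read_text(encoding="utf-8")
```

Each entry is still written by a person; the script only collects and orders them, which is the "halfway in between" quality discussed in the conversation.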
And one of the things I liked, if you're not doing something like towncrier: one of the recommendations from Keep a Changelog was to keep an Unreleased section at the top for changes that you haven't put a label on or done an official supported release for yet, because those are things that, I don't know, you may end up kicking out.
Yeah, they also have some things that you shouldn't do, like don't just take your git log and make that your proper changelog, things like that.
Yeah, and when I was doing some research for this, I did see various automated ways to do it, but with that sort of thing you're going to pull things out of file changes, and that's not really what you want. You really want a human-moderated list of things that went in. And that's one of the reasons why I like towncrier, because it was sort of halfway in between.
Yep. Yeah, it's definitely a nice way to sort of manage that human part.
Because you don't want "merged conflict," "took PR," "accepted this,"
"I changed the spelling."
Like, you know, you don't need all that noise.
You just want the four things that change.
Do I want to upgrade to this or not?
Whatever.
Let's just move on, right?
Yeah.
And then I guess I would lump this in with what we talked about last time, different decisions based on scaling. For projects where I'm the main maintainer, I would definitely just keep a file. But if we start getting a lot of contributors, then something like towncrier totally makes sense.
Yeah, I think it's really nice. I'm going to definitely look into it.
All right, last thing I want to talk about is asynchronous programming,
which is something that I talk about often because I'm a big fan.
This is an article called Understanding Asynchronous Programming in Python
by Doug Farrell from Dan Bader's site.
And we've had some of Doug's stuff on before.
He does good writing.
He works at Shutterfly doing Python there.
So he takes some of his experience and puts it in this article,
and it's pretty cool. I would describe this as a very friendly introduction to asynchronous programming. It starts out and says, let's imagine a web server. Could it be synchronous? Sure, it'd be fine if we had a synchronous web server, and we could optimize the heck out of it. But no matter how much we optimize it, at some point you're waiting on a thing and you want to go do other stuff. For example, just shipping the HTML back to the browser on a slow network, right? You want to be processing other requests and do that in the background. And so he's got something to the effect of eight or nine examples.
And to sort of start them off, he says, look, the real world is asynchronous.
For example, if you're a parent, kids are a long-running task with high priority, superseding any other task you might be doing, like balancing a checkbook or laundry or something like this.
So he has a lot of analogies back to real life that are pretty cool. Then he says, okay, we're going to go through some examples, like eight examples, and build them up. Start with a synchronous sort of job-doing program that has a queue: you put some work in the queue, it does the work. And then it says, all right, let's see how we can use generator methods with the yield keyword to set up cooperative multithreading, or cooperative concurrency, I guess, between those two methods. That's actually a really cool way to do it, where there's no concurrent I/O, there's no threads, there's no multiprocessing. It's just, let's interweave the work of these two methods, or multiple methods, using generators, which I thought was a really cool way to look at it. And it says, okay, well, what if some of that work is slow? That's a problem.
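The generator-based interleaving described above can be sketched in a few lines. This is not the code from the article; the `worker` and `run_round_robin` names are hypothetical, and the scheduler is a plain round-robin loop with no threads, processes, or asyncio.

```python
from collections import deque


def worker(name, steps):
    """A task that does `steps` units of work, yielding control after each one."""
    for step in range(1, steps + 1):
        yield f"{name} finished step {step}"


def run_round_robin(tasks):
    """Advance each generator in turn until all of them are exhausted."""
    log = []
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            log.append(next(task))    # run the task up to its next yield
        except StopIteration:
            continue                  # task is done, so do not requeue it
        queue.append(task)            # otherwise it goes to the back of the line
    return log
```

Calling `run_round_robin([worker("A", 2), worker("B", 3)])` alternates between the two tasks until A runs out, then finishes B. Because each task only gives up control at a `yield`, one slow step still blocks everyone else, which is the problem raised next in the conversation.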
And then he kind of takes you on a tour of different APIs and libraries to make this work: gevent, Twisted, Twisted callbacks. So you can compare all these different ways of doing things. And I should throw in there some aiohttp-type things as well. But yeah, very, very cool article if you want a super gentle introduction to asynchronous programming.
So this doesn't cover asyncio?
Asyncio, yes, exactly. Yeah. It doesn't cover basically the 3.5 stuff.
Okay.
Yeah. So this would work on any version.
I really liked this article because we've been talking about asynchronous for a while, and I have to admit, I have a hard time getting my head around how to think about it. I've been doing it for so long in C++, but I have a hard time getting my head around it in Python. And this article is really a good starter.
Yeah, I feel like it's definitely a good starter. I was happy to pick it as one of our picks this week.
All right, so that's all the news that we have that we've kind of found,
but you have extra credit, don't you?
Yeah. Well, yeah. In episode 29, I gave the wrong credit to the wrong person for cluing me into
PipCash. I'm sure they appreciated it though. Yeah. But it really was KidPixo and he reminded
me that it was him. And so sorry about that. And thanks a lot for keeping us informed.
Yeah, definitely. We really appreciate these ideas and these notes and these little
topics people send us. They're very nice.
And then I just had, I couldn't resist, this is going to be hard to do over a podcast,
but we have a link to a funny comic about Python private methods. And if you haven't seen this,
check it out. It's just, it's basically a key under the mat in front of a door.
I love it.
I love it.
That's really awesome.
Yeah, that's kind of the thing.
It's like, it's private unless you want to look for it, then it's right there.
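The "key under the mat" joke maps onto Python's name mangling for double-underscore names. A minimal sketch follows; the `Door` class is made up for illustration.

```python
class Door:
    def __unlock(self):            # leading double underscore triggers name mangling
        return "door unlocked"

    def open(self):
        return self.__unlock()     # works normally inside the class body


door = Door()

# From outside, the plain name is not there...
has_plain_name = hasattr(door, "__unlock")

# ...but the "key" is under the mat, at _<ClassName>__<method>.
under_the_mat = door._Door__unlock()
```

Mangling exists mainly to avoid accidental name clashes in subclasses, not to enforce privacy, which is why the method is right there for anyone who goes looking.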
Yeah.
Nice.
All right.
So update us on the book.
The book is coming along and taking almost all of my time. The multitasking is a hard thing. But yeah, the third beta should be out this week with the last chapter, chapter seven. And this one is using PyTest with other tools like PDB and coverage and mock and tox and Jenkins, things that I get a lot of questions about. So I'm really happy to get this chapter out.
Yeah, that's awesome.
How about you?
Yeah, last time we talked, I was recording and recording and recording Talk Python episodes. So now I'm kind of finishing up recording courses. I've actually got two eight- and nine-hour courses that I've finished recording over the last couple of weeks. I've finished recording the RESTful and HTTP Services in Pyramid course. And I've also finished writing and recording the MongoDB for Python Developers course.
So I'm working on editing the final videos for those and getting those up.
So I'm really excited to get that out.
Really fun.
I'm really excited to take a look at that MongoDB course.
That sounds very interesting.
It's a cool hands-on one.
We build like this database that represents a dealership and it's got like
millions of records in it.
We get it to where we'll like do queries in like one millisecond,
even with millions of records.
It's fun.
Nice.
Yeah.
Cool.
All right.
Well, that's our news for the week, Brian.
Thank you so much for, as always, sharing with everyone.
All right.
Thank you.
Yep.
See you all later.
Thank you for listening to Python Bytes.
Follow the show on Twitter via @pythonbytes.
That's Python Bytes, as in B-Y-T-E-S.
And get the full show notes at PythonBytes.fm.
If you have a news item you want featured, just visit PythonBytes.fm and send it our way.
We're always on the lookout for sharing something cool.
On behalf of myself and Brian Okken, this is Michael Kennedy.
Thank you for listening and sharing this podcast with your friends and colleagues.