Python Bytes - #79 15 Tips to Enhance your Github Flow

Episode Date: May 25, 2018

Topics covered in this episode: pytest 3.6.0; Hello Qt for Python; MongoDB 4.0.0-rc0 available; Pipenv review, after using it in production; Pandas goes Python 3 only; Extras; Joke. See the full show notes for this episode on the website at pythonbytes.fm/79

Transcript
Starting point is 00:00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 79, recorded May 23rd, 2018. I'm Michael Kennedy. And I'm Brian Okken. Hey, Brian, how you doing? I'm doing great. Nice. I think, as always, we've got a bunch of fun stuff to talk about. And we wouldn't be doing it without DigitalOcean. A couple reasons, but DigitalOcean is sponsoring this episode and a bunch of the upcoming ones. So thank you to DigitalOcean. Get $100 off or $100
Starting point is 00:00:25 credit at pythonbytes.fm slash DigitalOcean for new customers. Tell you more about that later. I would be totally surprised, Brian, if you wanted to cover something about, say, testing or PyTest. Yeah, yeah. So PyTest 3.6.0 just got announced. So 3.6.0 for PyTest. And this is like an inside baseball kind of release because there's not a lot that if... I think 80% of the people using PyTest won't see a difference. But however, this was a big deal for the team. Essentially, it's a revamp of the implementation of the marker system and the data type that was used to hold the markers.
Starting point is 00:01:09 So there were a couple of other things, but the big thing that's going on in the 3.6.0 release is a reworking of the markers. And there's a list in their release notes of all the different defects that they fixed with this. The takeaway for a lot of people is, if you're writing a plugin or using the plugin features and using get_marker to find out which markers are applied to a particular function, get_marker is deprecated. There's a new API: there's iter_markers and get_closest_marker. And yeah, we'll have a link in the show notes to read more on that. So most of it is a change for plugin writers, the API change, but it's exciting.
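For plugin writers, the replacement calls Brian mentions might be used roughly like this. It is only a sketch for a conftest.py on pytest 3.6 or later; the "slow" marker and the two helper names are made up for illustration and do not come from the release notes.

```python
# conftest.py -- a sketch for pytest 3.6 or later.
# The "slow" marker and both helper names are hypothetical.


def marker_names(item):
    """Names of every marker applied to a collected test item."""
    # iter_markers() yields the markers set on the test function,
    # its class, and its module.
    return [marker.name for marker in item.iter_markers()]


def is_slow(item):
    """True if the test, or anything enclosing it, is marked 'slow'."""
    # get_closest_marker() returns the nearest matching marker, or None.
    return item.get_closest_marker("slow") is not None


def pytest_collection_modifyitems(items):
    # Stable sort on a boolean key: unmarked tests keep their order and
    # run first, tests marked "slow" move to the end.
    items.sort(key=is_slow)
```

Code that previously called item.get_marker("slow") would switch to item.get_closest_marker("slow"), as in is_slow above.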
Starting point is 00:01:51 And I'm excited for the team to get that out because, kind of like the Django 2 release, it's about maintenance and going forward. And so that's great. There is one more feature, among a couple of other things: PyTest is supporting the breakpoint functionality in Python 3.7. And that is brought to you by our friend, Anthony Shaw. So he put that in.
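For anyone who has not tried the Python 3.7 built-in yet, here is a minimal sketch of what a breakpoint() call looks like in ordinary code. The function name is made up, and how PyTest wires itself into the hook is not covered in the episode, so none of that is shown.

```python
def collapse_whitespace(text):
    """Collapse runs of whitespace to a single space."""
    cleaned = " ".join(text.split())
    if cleaned != text:
        # Python 3.7+: breakpoint() calls sys.breakpointhook(), which
        # starts pdb unless the hook has been replaced or the
        # PYTHONBREAKPOINT=0 environment variable is set.
        breakpoint()
    return cleaned
```

Calling collapse_whitespace("two  spaces") would drop into the debugger right before returning, while a string that is already clean passes straight through.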
Starting point is 00:02:13 Oh, nice. Yeah. He's doing a lot of work on Python 3.7 because rumor is he may be doing a course on Python 3.7. So awesome. He was able to bring it over here. And a couple other smaller things like the, apparently I had never run into this, but if you have an assertion failure on equality and the only thing different is white space, it's kind of hard to tell. So, they now escape characters, too. So, you can see what the white space difference is a little bit better, which is kind of cool. I've never run into that. That's a little hard to print it on. You're like, those are the same.
Starting point is 00:02:46 No, no. That's two spaces, not one right there. Yeah, so for the most part, I wanted to get this out to as many people as possible. So if you are depending on reading markers in your internal code, pay attention to this. So that's it. Yeah, it sounds like a nice cleanup of the internal APIs for extension writers. And that's always good because that probably means more extensions are more likely to be built. Yep, sounds good. So have we talked about GUIs, Python GUIs, on this podcast? I'm not sure if
Starting point is 00:03:14 we have. We probably should. Yeah, we probably should. Yeah, let's do that. So there's a lot of stuff going on. You know, part of the reason I went on that rant is because stuff needs to be happening there, but also because some things are happening. Like we had the wxPython Phoenix release, which is kind of a rebirth of wxPython, which is really great. Well, we also have the same thing going on for Qt. So for a while, there was sort of a split. There was PyQt. There was PySide.
Starting point is 00:03:43 There was PySide2. There are all these ways. They depended on different versions of Qt, and it was kind of just generally a mess. So the Qt company is now officially making something called Qt for Python, which as far as I can tell is more or less like a rebirth of PySide2, for what that's worth. So it's really nice that the company that makes Qt, the cross-platform GUI framework,
Starting point is 00:04:09 is really dedicating itself to Python. Yeah. One of the things that I think is cool about the Qt space is they have the Qt designer, and I think that's really nice and important for a heavy visual way to design the UI. I know you can write code and say the position is 20, 20 and it stretches this wide, but like that is not the same as draggy droppy, press the button. You know what I mean? So I got a lot of, I'm pretty excited about this,
Starting point is 00:04:36 let's say. So that's really cool. They basically are keeping it super similar to the Qt C++ API, where that makes sense. So if you read documentation about C++, which is the native language for Qt, and you replace the pointer dereference, so the arrow, the dash arrow, with a dot, that may well be the Python API. Okay, which is good. But one of the drawbacks, let's say, is that it doesn't necessarily leverage the Pythonic features. So like maybe you call a function to do a thing rather than put a decorator on something else, things like that.
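The arrow-to-dot point can be illustrated with a tiny window. This is a sketch that assumes the PySide2 package, the module name Qt for Python ships under, is installed; where C++ documentation would show label->show(), the Python call is label.show().

```python
import sys


def main():
    # Imported inside the function so that merely loading this file
    # does not require PySide2 or a display.
    from PySide2.QtWidgets import QApplication, QLabel

    app = QApplication(sys.argv)
    label = QLabel("Hello Qt for Python")
    label.show()        # C++ docs: label->show();
    return app.exec_()  # runs the Qt event loop until the window closes
```

Running main() from a script should open a small window with the label text and block until it is closed.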
Starting point is 00:05:13 One thing that is nice is a lot of these UI frameworks are super painful to install, right? You can install them on the system and then they don't work so well. Maybe there's some big long compilation step, like wxPython takes forever to pip install onto Ubuntu, the last time I tried doing that. So they're planning on shipping a wheels version of Qt, which before you had to get like some separate installer or something. So that'll be pretty sweet that you'll just be able to pip install your thing
Starting point is 00:05:42 and it'll come with the foundational stuff you need. That's exciting. Yeah, that is pretty cool, right? So, I mean, I really hope that the company behind Qt putting a big effort into this is going to mean, like, finally a polished version. So, we'll see. I think the licensing might still be GPL and LGPL. So, as a combination, take your pick. I'm not sure what the variations are exactly
Starting point is 00:06:07 there, but I don't know. I'd like to see something more permissive, but who knows. Still nice to see some progress here. So do you know, I was trying to find it, do you know the projected release date for the official Qt for Python? So the article I'm linking to is a blog post called Hello Qt for Python. And they say they're working on a technology preview. So that's all I've seen, but they don't seem to have any further information that I easily found. It may be somewhere else. Yeah.
Starting point is 00:06:39 So it does say it'll be available under GPL, LGPL, and commercial licenses. It talks about when development started and stuff like that, but it doesn't seem to have a release date. So there it is. All right, cool. Nice. Well, speaking of sneak peeks on things,
Starting point is 00:06:56 we've talked about MongoDB, the 4.0 release that's coming. We've talked about that before, but now you can play with it. So the 4.0.0-rc0 is now available. It's the very zeroth version of the RC. Yes. There's a lot of zeros there. Yeah. So that is out and ready for testing, so people can actually get their hands on it and try working with it. Again, there's a lot of new features, but the big news is ACID transactions. Multi-document ACID transactions. Yes, that's a pretty big deal. And I actually don't know if this is a big deal, well, there's a lot of things here, but non-blocking secondary reads. I don't even know if I know what that means. So
Starting point is 00:07:41 the idea with the non-blocking secondary reads is, one of the ways you can set up MongoDB is in what's called a replica set. So there's a primary that you read and write to, and then there are other ones which are constantly just staying in sync with that server. Okay. There's a couple of benefits to that. Like you could put them into, say, different data centers. The primary thing is, if for some reason the main server, the primary server, fails, it'll automatically switch to one of the secondary ones. So it's kind of like a failover redundancy sort of thing as well. But you can configure it in a way that you can say, I would like to read from the non-primary database as a way of adding read scalability.
Starting point is 00:08:28 So like if I have five servers in the cluster, if I don't do anything, I can only talk to the primary one as a single server and I get no boost of concurrency, let's say. But if you say, I want to read from the others, well, then all of a sudden you're sort of farming that across six different servers, you know, the primary plus the other five or whatever, right? That used to block for consistency reasons, and now apparently they found a different way to ensure consistency,
Starting point is 00:08:55 maybe because of the transactions. Okay. Anyway, that's a long, long explanation for what I think that means. That does make sense, and that's cool. At least I know that there are a lot of people that choose a SQL database over a document database mostly because of the lack of transactions. And so that's one of the reasons why I brought this up, because I'm excited about transactions. Yeah, I think that's super exciting as well for the reason you just said.
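To make the multi-document transaction idea concrete, here is a rough sketch using PyMongo's session API. It assumes PyMongo 3.7 or later talking to a MongoDB 4.0 replica set; the "bank" database, the "accounts" collection, and the transfer function are hypothetical names invented for the example.

```python
def transfer(client, from_id, to_id, amount):
    """Move `amount` between two account documents in one transaction."""
    accounts = client.bank.accounts  # hypothetical database and collection
    with client.start_session() as session:
        # The transaction commits if this block exits normally and
        # aborts if it raises, so both updates apply or neither does.
        with session.start_transaction():
            accounts.update_one(
                {"_id": from_id}, {"$inc": {"balance": -amount}}, session=session
            )
            accounts.update_one(
                {"_id": to_id}, {"$inc": {"balance": amount}}, session=session
            )
```

Here client would be a pymongo.MongoClient connected to the replica set. Passing session=session to each operation is what ties it to the transaction.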
Starting point is 00:09:21 What I do think is interesting is, as people get to more serious applications, they get to a place where often they give up transactions anyway for sort of concurrency, right? Like, you know, if I go to Amazon and I go to order something, it's not just going to lock all of Amazon while I interact with, you know, my order. What it's going to do is, like, say, we'll place the order. And if it happens to be that actually the thing you ordered sort of sold out just at the moment that you pressed it, you'll get like a message or something, right? Like, hey, sorry, we couldn't fulfill it or whatever. Here's your refund. So there's a lot of these
Starting point is 00:09:59 sort of compensation things that get put into like high scalable stuff. I just grabbed Amazon as an example. I don't really know how they work. But you know, there's a lot of these large sites that sort of don't use full on transactions in the same sense that other ones do. So it's pretty interesting. It's interesting in that I don't really think transactions are something I'm going to be using in any of my sites. They just don't really seem to be necessary with a few possible exceptions. I'll get to what those might be in a little bit. But yeah, I think you're right that when people feel they need it, or there are a few situations where you really do need it, this is super interesting. One other thing that's kind of cool that's not a 4.0 thing, but it's in a 3.6, which is the one right before this, as far as I know, is actually the streaming API. So if I've got like, say, WebSockets, or something that I want
Starting point is 00:10:50 notification of like push of change to the database, you could like run a query and say, I want to stream new results that hit this query. And then as stuff is inserted to the database that matches, it'll get pushed out to you instead of repolling the database. So suppose I connect to a chat server and I set up WebSockets. You could like literally subscribe to like these change streams on like the conversation record. And you would just get them pushed back down instantly without any polling end to end. Okay. That's pretty cool.
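The chat example Michael describes might look something like this with PyMongo's change stream support. It is a sketch that assumes PyMongo 3.6 or later and a replica set; the conversation_id field and the function name are invented for illustration.

```python
def new_messages(messages, conversation_id):
    """Yield each message document as it is inserted into one conversation."""
    pipeline = [
        {
            "$match": {
                "operationType": "insert",
                "fullDocument.conversation_id": conversation_id,
            }
        }
    ]
    # watch() opens a change stream on the collection; iterating it
    # blocks until the server pushes the next matching change.
    with messages.watch(pipeline) as stream:
        for change in stream:
            yield change["fullDocument"]
```

A WebSocket handler could loop over new_messages(db.messages, some_id) and forward each document to the browser, with no polling query in between.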
Starting point is 00:11:21 It's kind of like RethinkDB's feature, primary feature was. I guess where I would probably use transactions a lot, and it's not really transactions, but because of transactions, you can do this, is I believe 4.0 also includes rollback checkpoints. For instance, you can grab a replica of a big database or something. And like, for instance, for like during testing, you can have a starting point, do a whole bunch of transactions on it, query it, and then roll back to a previous state. Yeah, that is pretty cool.
Starting point is 00:11:55 And I think maybe that secondary non-blocking read stuff has to do with that as well. You sort of begin a transaction and you start reading. Yeah, anyway. Yeah, yeah, very cool. So I'm glad to see that that's coming along. I feel like the NoSQL document database world and the relational world are kind of like merging.
Starting point is 00:12:13 They're getting closer to each other in a lot of ways, right? We have Postgres getting JSON stuff. We get MongoDB getting transactions, and they're all kind of sort of growing and intersecting in interesting ways. Speaking of interesting, DigitalOcean is pretty interesting. They're doing a lot of good stuff for us. So like the files that you're getting, when you download the podcast, the website, all that stuff is running on DigitalOcean servers. And I'm super, super happy customer of theirs. And they're sponsoring the show as well. So one of the
Starting point is 00:12:42 things that's cool, maybe I mentioned this a while ago, Brian, is their sort of one-click app server configuration. So if I want to create, say, a server with MongoDB all configured, I can go there, say, create me a droplet with this version of Mongo or with this other web framework set up, and it'll automatically create all the server configuration and have everything set up and ready to go within like 60 seconds. So really, really nice. And probably the biggest thing, if you are not using DigitalOcean, you can get a $100 credit by going to pythonbytes.fm slash DigitalOcean. So that's a pretty good deal. Yeah, that's great. Yeah, awesome. So if you're looking for nice, affordable, fair, and very fast server hosting, check them out: pythonbytes.fm slash DigitalOcean.
Starting point is 00:13:28 So, Michael, have we talked about pipenv in the show before? If I recall correctly, I think we were confused about pipenv. I was confused that pipenv became sort of the officially recommended way, from the packaging authority in Python, to manage packages. And I'm like, oh, when did that happen? That was pretty interesting. So there's been a lot of debate. And you said there was kind of a coarse Reddit thread.
Starting point is 00:13:58 Like, imagine Reddit was unkind to people. Could you imagine? Right, yeah. That's unfortunate. But I think it's too bad that that kind of stuff happens. And maybe we should all just speak up like, hey, that comment is out of bounds, right? Anyway, I'm not going to link to it. I don't want to encourage it.
Starting point is 00:14:19 But I do want to link to this thing called Pipenv review, after using it in production. So there's this team that has used pipenv in production since November 2017. So what is that? A little over half a year, maybe almost exactly half a year. And they talk about, this is what worked for us, this is what wasn't working so well for us. And in the end, they're like, at no point did anyone in the team ever mention getting rid of pipenv, which actually is a pretty strong statement, apparently. So like, no one said,
Starting point is 00:14:49 no, we got to get rid of this. It's just like, ah, it's not quite working in some way. So here, I'll give you the rundown. The article starts off pretty accurately. It says, the current state of Python's packaging is awful. I don't think anyone would disagree with that. The problem is recognized and there are many attempts to solve the mess. And pipenv was the first, and it did get a lot of traction, but not everyone loved it. And he said, one of the areas where pipenv can be a challenge is for libraries. So pipenv is more built for managing the dependencies of an application. But if you're a library author, it doesn't necessarily make a lot of sense. Yeah, I'm on the fence on that. Sure. Sorry, I forgot the guy's name.
Starting point is 00:15:26 The reason that he said this was basically that supporting multiple environments goes against pipenv's philosophy. So they want a deterministic, reproducible application environment. But, you know, if you're going to do that for, say, PyPy and Python 2.7 and Python 3.6 or whatever, well, then it doesn't really work potentially, right? Because it, you know, wants exact hashes of the exact libraries, and if those don't match, then you're out of luck, right? So that's a challenge. I think that's the primary challenge. Yeah, yeah. And I agree with that. And it's just partly, I think, it's a miscommunication.
Starting point is 00:16:15 Pipenv was never intended to work for every library's sort of use. Because libraries, by definition, don't have their dependencies pinned. It's at the application level where you pin your dependencies. So you say there's this miscommunication, and I definitely think you're right. Because when I looked at pipenv on GitHub, I really read the statement "pipenv is the officially recommended tool for managing application dependencies from PyPA" as "pipenv is the officially recommended tool for managing Python dependencies", where really the "application" should have been bolded, underlined, and all caps. Something to that effect, right?
Starting point is 00:16:52 Right. So pretty interesting. But yeah, I think generally their review of it was good. So I'll try to give you the quick rundown here. So Pipfile and Pipfile.lock really are superior to requirements.txt by a ton. And the guy said, hey, I first disliked having flake8 and a security checking tool all built into one thing, but I think it's actually great. Installing from private repositories, that works really well. Creating a new Pipfile is easy. No problems introducing pipenv to new users or installing from a mixture of indexes
Starting point is 00:17:27 and git repos. That was all really good. Virtualenv is much easier to get into and understand. Now, let's see, dependencies can be easily installed into a system like Docker. And finally, like I said, no one proposed getting rid of it. There were just a few edge cases, mostly around the library side of things. So yeah, pretty good. But if you're thinking about using pipenv in production, you know, check this article out. It's got some good discussion and a lot of follow-up as well. I want to add that, for development, I am going to start, I haven't been using it, but I'm going to start using it, not from the standpoint of handling all of the library dependencies, because the setup.py does that, but more for the transitive dependencies and
Starting point is 00:18:21 also mostly the developer dependencies. So pipenv has a developer feature where you can either create the environment for running or create the environment for development, and those can be different. And traditionally, we've had a requirements underscore dev or something like that, but you kind of have to know it's there. So for that reason, I'm going to try pipenv. The other reason is the dash dash run flag, to be able to run in the environment without activating the environment, which is going to be useful for things like Jenkins runs. I'm going to give it a shot. I don't have a report yet, but I'm going to start using this as well. Yeah, sounds good. You're going to have to give us a report after a
Starting point is 00:18:59 while. Yeah, definitely. Nice. All right. So you've got some stuff for GitHub Flow, the whole sort of working in GitHub, PRs, submitting issues, open source goodness. Yeah. I've got a development team that's going through a lot of changes in our development workflow, and one of the things is using Git more. We're using GitLab at work, but I use GitHub for open source projects, of course. So here's an article called 15 Tips to Enhance Your GitHub Workflow, or GitHub Flow. And a lot of these apply to both Git and GitHub and GitLab.
Starting point is 00:19:37 Some of them are GitHub only. But there are some things that you just sort of need to know about the culture around Git and GitHub and GitLab and everything that aren't obvious from the start. So I like having an article that calls out a lot of these things. Like one of them talks about projects. I'm not using projects yet, but I'd like to try to use projects to prioritize issues and maybe track progress and plan for what's going in which release and stuff. If that's built in, might as well try it. Using tags on issues, I've started using that. I know we have tags on a lot of open source projects, like Help Wanted and things like that. There's some standard ones.
Starting point is 00:20:21 Getting to know those are good. Templates are something that really – so a lot of this stuff isn't stuff I know about yet. It's stuff I want to start using. Templates are something like if somebody does a pull request against your project, having some predefined stuff filled in
Starting point is 00:20:38 for them to know what to fill in. And the default template is sometimes kind of lame for certain projects. Like I've got a library that the default one asks for like operating system. Well, I don't really care. It's not going to affect the library I'm using. If the issue is really hard to reproduce, I'll ask somebody and say, hey, I'm trying to reproduce it here and I can't reproduce it.
Starting point is 00:21:05 Anyway, there's a whole bunch of great things. One of the things I didn't know about at first was squashing pull requests and squashing commits. That's something that is totally foreign if you're coming to Git from other revision control systems. So there's just a good list of a whole bunch of goodies. Yeah, that's really cool. And I like the automated tests and checks on pull requests. That's really nice. Like if I do a PR to someone else's repo and my PR automatically gets tested, flake8'd, or whatever they're wanting to have checked, right? That can tell me right away, before they get back to me, oh, there's a step I missed, let me fix that. And then, you know, resubmit the PR or just update
Starting point is 00:21:46 the PR and then have it rerun. Okay, now everything's good. And I'm sure that on the other side of things, if someone is running a project and it's already passing all that before they even get to it, they can take it more seriously. Yeah. And that helps you with even, you know, you're splitting up branches, and so you can have tests running on multiple branches, which is nice too if you have a long-running development feature. And then one of the things I want to play with here is, there's a discussion in there about pre-commit hooks, and hooking things like Black up to your pre-commit hook to make sure the styling is correct. Oh, nice. Yeah. Instead of asking, just change it. Yeah. Your styling is wrong, you need to break that line. Fine. We did that for you. Yeah. That's pretty cool. All right. So the last one I got, Brian, is just a feel-good story. Python versus legacy Python, that type of thing. So pandas goes Python 3
Starting point is 00:22:36 only. No more legacy Python for pandas. Wow. That's cool. That's a pretty big deal. Like pandas in the data science space is one of the true foundational items. Maybe it's more popular than any of the others. I feel like people almost always start with pandas, and then once they get their data processed, they, like, move to another library. So pandas going Python 3 only is really, really awesome. I got this off of Twitter from Randy Olson. Thank you for that.
Starting point is 00:23:01 And basically, they're following NumPy's lead. Remember, NumPy is going Python 3 only. So officially starting January 1st, 2019, which is not that far away, seven months-ish, six months, pandas will drop support for legacy Python, and this includes
Starting point is 00:23:17 no backports of security or bug fixes. The final release will be the day before, and that one's going to support Python 2, and we're just going to leave it there, apparently. Yeah. So I feel like data science has got a little bit of an edge on the Python 3 story for everyone, partly because they've come into the ecosystem as a large group more recently than, say, the web developers or the automator folks who have been around for a long time. Like the data science stuff has really exploded 2012 and onward. So it was a slightly
Starting point is 00:23:52 easier choice, I think. Yeah, I think so. Yeah, pretty cool. All right, well, that's it for our news. Anything personally you want to share? No, I'm just excited to get back to, like, podcasting and stuff. It's been good. I know it was a lot of fun to do the live one, though, at PyCon, right? Like, nobody cheered for us today. Not totally hurt. Anyway, right? It was so fun to just be in the, and get the, yeah. And like nobody laughed at my jokes. Yeah, maybe they did. We'll just never know. Maybe we need, like, a sound, yeah, like one of those fake audience tracks. No, that'll take away from the real ones. We'll do some more live ones. We're talking about it, right? Yeah, definitely. It was so fun. We want to do more. Yeah, maybe we can do some more. We'll figure that out. So are you excited? It's GDPR Eve. Yeah, the only what? Well, yeah, GDPR.
Starting point is 00:24:35 I don't really know how that affects me. But I'm telling people that's why I forget their name so quickly is because I'm complying with GDPR. Oh, man. I have very mixed feelings about GDPR. I'm a fan of privacy and respecting data stuff. I'm not a fan of some of the ways in which they're going about it. I mean, it's a tech requirement written by non-technical people, for starters. Do you have to change, for instance, the courses site of yours? I've been doing nothing but 10 hours a day of GDPR programming all week. Oh, geez.
Starting point is 00:25:11 Yeah, and I'm not done. I got one or two more days. And what drives me crazy about this is I'm an American company, 100% in America, and Europe has these rules that apply to us. It's not about Europe or America. Like what if India decides later that they have other rules that are inconsistent with what I've done for GDPR, and then Brazil has others? Like I just think it's kind of crazy to say lawmakers in one country can impose their will on all of the world through these laws. It's kind of funky, but I'm going to do it because I pretty much have to. So basically, the reason I'm throwing this out there
Starting point is 00:25:48 is if you run a site where you've got, say, a mailing list or people buy stuff or you collect user data, just be sure to be really careful and look into this. And also, we talked about environments and we talked about pipenv and various other bits of packaging. So I just want to give a quick shout out to the xkcd Python environment cartoon, which came out a few weeks ago. So that would be xkcd.com slash 1987. It's just about the sort of madness. So, my Python
Starting point is 00:26:20 environment has become so degraded that my laptop has been declared a Superfund site. It's got Homebrew for Python, it's got the OS Python, Anaconda, it's got pip, another pip, easy_install. Okay, it's pretty good, right? Yeah, yeah. I think we'll probably try to, I'm going to link to Kenneth Reitz's, I should just stop trying to pronounce names, his PyCon talk, because there was a lot of stuff in there about the history of packaging that I didn't know about. So it's a good listen. Yeah, you should definitely link to that. That's awesome.
Starting point is 00:26:56 All right, well, thank you, Kenneth, for working on pipenv, and thank you, Brian, for sharing everything with all of our listeners. Thank you. Thank you for listening to Python Bytes. Follow the show on Twitter via at Python Bytes. That's Python Bytes as in B-Y-T-E-S. And get the full show notes at PythonBytes.fm.
Starting point is 00:27:17 If you have a news item you want featured, just visit PythonBytes.fm and send it our way. We're always on the lookout for sharing something cool. On behalf of myself and Brian Okken, this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and colleagues.
