Python Bytes - #39 The new PyPI

Starting point is 00:00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 39, recorded August 14th, 2017. I'm Brian Okken, and again, Michael is on vacation, and we have a guest host, and this week we have Mahmoud Hashemi. Hey, Mahmoud. Hi there. Great to be here. Yeah, you've been on Test and Code, and you've been on Talk Python a couple times. Yeah, a couple of my faves for sure. Yeah, well, when I was looking up

Starting point is 00:00:31 Talk Python, I noticed that you were on episode 4 and 54. Yeah, and I don't know, when Guido was on, you know, Michael was kind enough to ask my question, and I did like a panel thing. I don't know, I guess, yeah, it's been really nice to have repeat appearances. People recognize me by my voice now. It's kind of strange, but like I'm very appreciative at the same time. That's good. That's great. And so, thanks a lot for helping to do this today. Yeah, hopefully I can do Michael right, taking his spot here. Well, let's just jump right in. I'm really excited about your first topic. Oh, sure. So, let's see. First up, I mean, one thing that's been on my radar, I'm not sure if you guys talked about this before, like sometimes I'm listening to Python Bytes and

Starting point is 00:01:10 it's a little bit garbled or something. Have you guys tried calling decode? I'm kind of curious, like why it's not Python Strs. But one thing that's been on my radar is the new PyPI. So if you haven't been on distutils-sig, you may not have seen that there's actually a new PyPI, pypi.org. And this is going to be the Python Package Index going forward. So this is what we've been calling Warehouse before. Is that right? So Warehouse is the software that runs PyPI, you know. Okay. And so yeah, it's a package index. It's going to be where all of your wheels and sdists live. And there's basically a lot of development that's happening here. My friend Donald Stufft is doing an amazing job with his team. Basically, yeah, we're up to 114,598 projects at the moment.

Starting point is 00:02:12 This even lists the number of files, almost a million files with 230,000 users. And so, yeah, I would definitely check out pypi.org for yourself. But for the most part, I wanted to talk about how they're deprecating the old PyPI. So pypi.python.org is now basically just a read-only interface. And if you've tried to upload a package recently, then you may have seen an error, HTTP 410, which is like a 404, but this is 410 Gone, meaning it was here, but now it's gone. And so yeah, you basically make sure to use a new version of setuptools, and it'll automatically start using the new one, as long as your configs don't state otherwise. You might have to update a config. But this is a tremendous leap forward in a lot of ways. And they need some help doing it too,

Starting point is 00:02:55 you know, so it's all open source on GitHub. There are issues. I'm working on one right now. Yeah, it's got a lot of cool features. Have you taken a look, Brian? I've looked around a little bit. Now, one of the things I've noticed like right off the bat is it says up at the top, there's a big red bar that says... I know, it's kind of scary. Yeah. So do you know, I'm guessing eventually at some point, the other interface will just

Starting point is 00:03:18 redirect to here or is there... I mean, you know, cool URLs don't change. Personally, in my view, I'd like it if they just kept it up and put the red bar over there that this is a, you know, archive version of PyPI. But for now, all those URLs are still working. And if you ask me, PyPI.org has been in use for so long because actually, if you've paid close attention, a lot of your downloads, Pip is downloading from the new one. Oh, okay. So, yeah, it's been in production a long time. In fact, they just hit, I think, a petabyte a month in bandwidth downloads.
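
For the upload change described above, one place the old host can linger is a `.pypirc` file. The following is a minimal sketch, assuming the usual INI-style layout of that file. The helper name `stale_sections` and the sample contents are made up for illustration; the replacement upload URL shown is the pypi.org legacy upload endpoint.

```python
import configparser

# Old host (now read-only for uploads) and the upload endpoint on pypi.org.
OLD_HOST = "pypi.python.org"
NEW_UPLOAD_URL = "https://upload.pypi.org/legacy/"


def stale_sections(pypirc_text):
    """Return names of sections whose repository still points at the old host."""
    parser = configparser.RawConfigParser()
    parser.read_string(pypirc_text)
    return [
        name
        for name in parser.sections()
        if OLD_HOST in parser.get(name, "repository", fallback="")
    ]


# Hypothetical file contents, for illustration only.
example = """
[distutils]
index-servers =
    pypi

[pypi]
repository = https://pypi.python.org/pypi
username = example-user
"""

for section in stale_sections(example):
    print("update [%s] repository to %s" % (section, NEW_UPLOAD_URL))
```

Removing the `repository` line entirely is often enough as well, since recent upload tooling defaults to the new endpoint.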

Starting point is 00:03:52 So, yeah, just for a sense of the cost there, I think it's like in the tens of thousands, like $30,000, $40,000 a month to host PyPI. And that's kindly donated by the Fastly CDN. Should they stop feeling so generous, you know, we got to support our community somehow. So there is a donate button here. But I think that right now, what they need most is sort of like people to work on cool features, like one that I saw they've been working on that I'm very excited for. Not strictly pypi.org, but same team, the Python Packaging Authority. They are working on making a dependency graph between all packages. So if you've ever wondered what depends on what ahead of time, then this would enable that.

Starting point is 00:04:38 So yeah. How do I start working on it? Do I go to the GitHub page? Yeah. So I think it's github.com/pypa, or I think it might be /warehouse. Yeah. Okay. So, and you know, Donald has been very candid about, like, you know, the areas that need development, and he's been working very hard. He's at Amazon now, and he spends some time working on stuff there. Oh, one last thing, like distutils, right? So there's an email list called distutils-sig, where SIG stands for special interest group. And so, distutils-sig, you can just go join the listserv and you can read the archive and see the conversations they're

Starting point is 00:05:17 having. If you care about packaging, you're probably already on there. But if you aren't, definitely subscribe. Ah, I didn't know about it. Yeah, so we'll try to drop a link in the show notes for that. So, okay, well, that's really cool. Pretty good for a first topic, you know? I don't know. Yeah, definitely. And the one thing I want to add is I know that Donald has been vocal before about how awful the previous code was. Yeah, I mean, it's pretty old code, right? I don't even know. It may not predate WSGI, but it's pretty old.

Starting point is 00:05:50 You've looked at the new code. I've looked at the new code. I can talk about the new code if we got a second. So I've looked at it. I've used it. It's got 100% coverage. It's got a lot of CI stuff set up. It uses Docker. I had a little bit of trouble, like, you know, with the Make-based approach to running the thing, but it's pretty complex. Like it runs, I think, Elasticsearch and all this stuff. So basically, yeah, you just... People shouldn't be afraid to help out just because they've heard bad things about the old code. No, the new code is pretty idiomatic, I think. And, you know, if you're familiar with SQLAlchemy, and I think it uses

Starting point is 00:06:25 also maybe like Pyramid, I think. And it looks like the tests are in pytest too. Yeah, the tests are definitely in pytest, which is frankly the only way I've heard and have also found myself. So yeah, it's been good. I could talk about this for a long time, but let's move on to the next topic. Absolutely. So one of the things, I just read about this yesterday. I read about it on Make, I think it's the Make website, but CircuitPython is now supported by a whole bunch of Adafruit hardware. It's great news for hardware hackers and also tinkerers like myself. And so we'll put a link in the show notes to the Make article. But there's also, so I had heard Adafruit announced CircuitPython in January. And it's open source. It's based on MicroPython. So CircuitPython is also open

Starting point is 00:07:18 source. So I'm not quite sure how they differ, but they've added some things to make it easier to control hardware. And they already had like two devices, Metro M0 and Feather M0 Express versions that support CircuitPython right off the bat. And I guess they're working on a Circuit Playground Express. All of these look like really fun things. But the thing that really caught my attention was Gemma M0 that was announced at the end of July. And this thing is like the size of a quarter. It's a little small thing that you can make wearable software projects with like LEDs and whatever. And you just plug it in and into your computer and you instantly, it's like an

Starting point is 00:08:03 extra drive. You can see a main.py and it just, you can just start programming in Python right away. Yeah, right. So, so basically it's just like, it sort of functions kind of like a USB drive and there's a single main entry point in there and you can just modify it and then, you know, you don't need to install anything or anything like that. Yeah, there's no loading. Apparently it does support Arduino, but you don't, like right off the bat, you don't have to install anything. You can just start programming.

Starting point is 00:08:32 And these are, right now they're currently out of stock, but I'm sure they get new stuff in pretty quick. But it's under 10 bucks to start programming some wearable programming. So I definitely have to get one of these. Yeah, I can't wait to start wearing some running Python. That'd be taking it to the next level. And I'm also going to link to what I thought was great was they realized that, I mean, they are encouraging people to use Python if they can for programming hardware,

Starting point is 00:08:55 but they realized that a lot of people are new to the Python community. So there's a page called Creating and Sharing CircuitPython, a CircuitPython library. And it's got a whole bunch of great links, like basically just telling people what, when we say library, we mean a package or a module with a setup file and doing it all right. And there's little intros to GitHub and Read the Docs and Travis. So is it like, when you say package or module, is this their own format? Or is this like Python packages, wheels, that sort of thing? Yeah, it's just Python stuff. But it's just really quick

Starting point is 00:09:30 tutorials to get people up to speed fast. Sure. So it's like sort of a full, it's got like an end-to-end thing. It doesn't just send you left and right to other sites. Yeah, right. It's really telling you everything. And they're pretty condensed. Actually, they did a pretty good job condensing all that information. Yeah, you don't need the whole context and history of Python packaging. We've come a long way since, you know, eggs and that sort of stuff. Yeah, and then one of the things that is kind of interesting is they have a concept of bundles. And really all a bundle is, is a bunch of installable Python packages that are zipped up into a bundle. Sure. We normally don't really care about that because on a larger computer, it's not that big of a deal.

Starting point is 00:10:18 But these little tiny devices, you have to care about how big it is. So you might want to get everything that somebody cool has made, but you don't need it all. You just need like the little part that, you know, blinks the LED for you or whatever. Sure. So it sort of freezes it all together. Yeah. These embedded applications are interesting. So now that, so I maintain this one library called hyperlink, and I guess it's pretty widely used because Twisted depends on it. And so I've gotten some interesting feedback of a few things, like one code review I just went through. I promise this is related.

Starting point is 00:10:51 Basically, I'm using pytest and I'm writing my assert statements. And, you know, I love that pytest assertion rewriting with the great error messages and so forth. But I got a comment on my code review that these tests are not runnable in an embedded environment because they will run with -OO, which elides all of those assert statements. And I'm like, well, you're kind of running the tests wrong if you're not using pytest. But in these embedded environments, I don't know, maybe the convention is different. So when you get yours, definitely test it out. Maybe you'll have to put a little caveat on your pytest recommendation if that's not what we can do on hardware. I don't know.
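
To make the `-O` / `-OO` point concrete: Python drops `assert` statements when run with those flags, so a check that has to survive must raise explicitly. The following is a minimal sketch; `require` and `parse_port` are hypothetical names and are not part of hyperlink or pytest.

```python
def require(condition, message="check failed"):
    """Raise AssertionError explicitly so the check survives -O and -OO."""
    if not condition:
        raise AssertionError(message)


def parse_port(text):
    port = int(text)
    # A bare `assert 0 < port < 65536` would be removed under `python -O`.
    require(0 < port < 65536, "port out of range: %d" % port)
    return port


# __debug__ is True in a normal run and False under -O / -OO.
print("assert statements active:", __debug__)
print(parse_port("8080"))
```

Whether this trade-off is worth it for a test suite is a judgment call; with pytest run normally, plain `assert` statements keep the nicer failure messages.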

Starting point is 00:11:30 Oh, that's interesting. Yeah. Yeah, I'll definitely have to check that out. So I don't want the hardware people to not buy my book. That would be terrible. Well, that's the thing. With something like Hyperlink, which is for URLs, I'm like 99.9% sure it's going to run exactly the same everywhere.

Starting point is 00:11:47 So I'm confident that if it runs on my machine, it runs on Travis CI, it runs on CodeVeyor or whatever, it's AppVeyor, I think. It'll be fine. But at the same time, hardware people can be sticklers, as I'm sure you know. So I respect that. I respect that. Cool. Yeah, neat. Well, what do we got next, Mahmoud?

Starting point is 00:12:08 Oh, right. It's back to me. So I don't know. I mean, so I spend a lot of my time pretty deep into development of all sorts of infrastructural sorts. And I find myself subscribed to python-dev, python-ideas, distutils-sig. And, you know, you can't read everything there and still have a life. So only a few things catch my eye. But this one in particular caught my eye

Starting point is 00:12:31 because my friend Hynek has this great library called attrs. If you haven't heard of it, my other friend Glyph has a whole blog post that tells you why you have to use this library, attrs. And it's basically class decorators that make writing high-level classes very easy. So it sort of derives from this tradition of namedtuples, right? Raymond Hettinger had this great idea to make namedtuples, which let us define a class-like structured thing within just one line. But the problem with namedtuples is that if you want to add methods to it, then you have to inherit from it. And they're immutable by default. And even though they generate a dunder init for you, they don't do a whole heck of a lot of validation. So attrs comes along, fixes all these things, has a bunch of other cool

Starting point is 00:13:21 functionality, and does it with class decorators. It doesn't pollute your final object with anything you don't want, right? It doesn't, because you don't inherit from anything. So you just inherit from object. After Glyph's post took off, the core Python devs sat up, took some notice of this and said, maybe we have been neglecting a higher-level interface for quickly defining classes. You know, you just want to have four or five fields all sort of batched together. And you don't want to have a lot of functions that everywhere have to define 15 arguments. So like, how can we quickly, in a nice, concise, Pythonic way, define a Python class? And they came up with this new thing, which is still, I guess, kind of, this is what I mean, like, I don't know if this is a little bit too deep underground, but there's this GitHub repo that Eric V. Smith, who is a Python core dev, has, called data classes.
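
The namedtuple limitations mentioned above (inheriting to add methods, immutability, no validation) can be seen with the standard library alone. The following is a minimal sketch; the class and field names are made up for illustration.

```python
from collections import namedtuple

# One line gives a class with named fields, a constructor, and a readable repr.
Point = namedtuple("Point", ["x", "y"])


# Adding a method means inheriting from the generated class.
class Point2D(Point):
    __slots__ = ()  # keep instances as lightweight as the plain tuple

    def norm_squared(self):
        return self.x ** 2 + self.y ** 2


p = Point2D(3, 4)
print(p.norm_squared())

# Instances are immutable: assigning to a field raises AttributeError.
try:
    p.x = 10
except AttributeError:
    print("cannot assign to a namedtuple field")

# No validation happens: any values are accepted as-is.
odd = Point2D("three", None)
print(odd)
```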

Starting point is 00:14:13 And the issues of this have been really interesting to watch because Hynek and a bunch of core devs have been kind of debating, like, hey, should we just use attrs? If attrs is getting so popular, should it just be part of the core Python? And, you know, people seem to like it, why make something that's so close to it, that sort of thing. There's sort of a draft PEP inside of the data classes repo. And there's some examples of how it's used. It has some semantic differences, has some syntactic differences. I think that it's pretty interesting to watch. And in fact, they seem to be encouraging more experimentation in this area. Even though I like attrs, they seem to want even more options, at least from themselves. So I don't know. I had a good time reading the issues. Maybe other people enjoy it too.

Starting point is 00:15:00 Yeah. So is this, it's similar to attrs then? Yeah, it's pretty similar to attrs. The differences are sort of fine enough that you have to kind of look closely. Basically, I think that what it is, is like, there's actually an issue called "why not just attrs". And they sort of explained that they want to use like the new, I think, type hint syntax type stuff. Okay. So, yeah. Other people like kind of said that, hey, maybe naming-wise, data classes is a little bit clearer than attrs

Starting point is 00:15:34 because someone who is a new Python programmer doesn't know that attrs is attributes or something like that. That's true. So, it has some syntactic differences, yeah. And there are some big names in this discussion. There are, there are. So that's what I mean. It's sort of like the inner circle, right?
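
For a sense of what the type-hint-based syntax looks like, here is a minimal sketch using the `dataclasses` module as it later shipped in the standard library (Python 3.7). That is an assumption relative to this discussion: the draft in the repo at the time of recording may have differed in details. The class and field names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Point:
    # Fields are declared with variable annotations; defaults are allowed.
    x: float
    y: float = 0.0

    # Methods go directly on the class; no inheritance is needed.
    def norm_squared(self):
        return self.x ** 2 + self.y ** 2


p = Point(3.0, 4.0)
p.x = 6.0  # instances are mutable by default
print(p)
print(p.norm_squared())
```

The decorator generates `__init__`, `__repr__`, and `__eq__` from the annotated fields, which is the same general idea as the attrs class decorator.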

Starting point is 00:15:52 This is kind of like the sort of stuff that I have to follow. Oh, that's awesome. To be on the edge here. And it happens kind of behind the scenes, but I really do encourage people to join these email lists if you want to see the action happening. You know, you don't have to be a spectator or you don't have to sit maybe in the nosebleed section of the arena on open source, right? You can get up close on the, on like, you know, get the, get the front row seats. And before you know it, you'll actually get up. That's cool. Well, speaking of trying to get involved, unless you've had your head under a rock, data science is a thing. Is it really?

Starting point is 00:16:31 It isn't something that I have to use on a daily basis, but it's definitely something I want to pay attention to. And I ran across, there's a lot of books and tutorials that are huge because it's a huge topic. And I ran across an article called Pandas in a Nutshell. And I like it because it's a Jupyter Notebook style post so you can just see the code working. And it's mostly tutorial by example with just a little bit of extra code for explanation. And the big part of it is really just talking about a couple of data structures. It's talking about the series data structure, which is a one-dimensional array with indices. So just kind of like a vector. And then the data frame, which is like a two-dimensional

Starting point is 00:17:17 array. And all the sort of common things that you need to do with it, like specifying a custom index, or adding or combining two Series, or with matrix stuff, adding columns, adding a column that's based on another column. This sort of stuff sort of seems Excel-like, like working on a spreadsheet. I think for a lot of people, that is the natural next step, you know, when they want to get into programming. It's either going to be doing Visual Basic script of some sort inside of Excel, or, you know, maybe move into Python. Yeah, and I guess that's one of the things I like about this little nutshell article is that if somebody is already doing some things in spreadsheets, and they want to switch to working with pandas,

Starting point is 00:18:06 this might be a pretty good stepping point to try to get things going. And it's actually something I'm going to grab some of the concepts in here to try to deal with some of the large amounts of data that I deal with on a daily basis as well. Oh, for sure. So I haven't used, and I bring this up because I'm just starting. I'm trying to use pandas on a daily basis now. And it is, like, I've actually faced a lot of the same challenges. Just because it's Python doesn't mean that it, you know, doesn't require some sort of paradigm shift in your thought. Thinking about data frames is very different than thinking about lists in Python or dictionaries in Python. It's somewhere between

Starting point is 00:18:45 Python and like full-blown relational databases. And so you do have to change the way you think about how to approach a problem, especially if you want to get some performance out of the thing, because it has all this great broadcasting logic that it can perform. But it's not going to work if you just iterate over it in for loops. Yeah, and I guess that's where the data frames and series stuff comes in, because you want to do some computation on everything or searching on stuff. So it's kind of like a combination of a database and an in-memory database and something else.
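
The Series and DataFrame operations described above can be sketched in a few lines. This is a minimal example, assuming pandas is installed; the labels, column names, and values are made up for illustration and are not taken from the article.

```python
import pandas as pd

# A Series is a one-dimensional array with an index (here, custom labels).
q1 = pd.Series([10, 20, 30], index=["a", "b", "c"])
q2 = pd.Series([1, 2, 3], index=["a", "b", "c"])

# Adding two Series aligns on the index and works element-wise.
total = q1 + q2

# A DataFrame is a two-dimensional table of named columns.
df = pd.DataFrame({"price": [2.0, 3.5, 4.0], "qty": [3, 2, 5]})

# A new column computed from other columns, with no explicit for loop.
df["cost"] = df["price"] * df["qty"]

print(total)
print(df)
```

The column arithmetic is where the shift in thinking shows up: the multiplication applies to whole columns at once rather than row by row.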

Starting point is 00:19:19 Where I work, some of our data scientists are coming from an R background, and the data frame is based on an R construct, I believe. So, they, you know, find it quite natural, and the Python is what they sort of struggle with, and they come to me for that. But a Python person would want to ramp up on the data frame itself. And so, this notebook seems like a great option to do that quickly. Yeah. So, that's just a quick quickie. So that's it. Your last topic. Oh, already. So yeah, basically, just yesterday, I was at this conference, PyBay 2017. It's sort of the Bay Area Silicon Valley regional Python conference, only the second annual one. It's surprising how long it took to spin up here. Meanwhile, PyOhio has been going for who knows how long.

Starting point is 00:20:05 So anyways, but it was a great conference. Almost 500 developers, pretty good turnout, and a lot of great topics covered. I gave a packaging talk, but the thing I'm going to talk about today is actually the opening panel was on static typing. And it was quite an interesting mix. They had, first of all, it was very international. They had people from Germany, Russia, Poland, USA, and Netherlands. It seems like Europeans are big fans of static typing, for whatever reason, Guido included. So yeah, they had people from, I think, let's see, PyCharm, University of California, Berkeley, then also Quora, Google, and I think another guy too. So it was a really nice cross section of the industry and also the world.
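
As background for that panel's topic: Python 3 annotations are recorded on the function, but the interpreter itself does not enforce them, which is why a separate checker is needed. The following is a minimal sketch; the function name `repeat` is made up for illustration.

```python
def repeat(text: str, times: int) -> str:
    return text * times


# Annotations are stored on the function and readable at runtime...
print(repeat.__annotations__)

# ...but the interpreter does not enforce them: this call passes a list
# where `str` is annotated, and it still runs.
print(repeat([1, 2], 2))
```

A static checker run over this file would be expected to flag the second call, since a list is passed where `str` is declared.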

Starting point is 00:20:58 And they just talked about the state of static typing. So right now, just to bring you up to date, I'm not sure how recently you covered this stuff on the podcast, but there are currently three or four static type checkers. So in Python 3, you can specify your types however you'd like. Built into the language, it's not going to do a lot of complaining in case types don't match. First of all, at runtime nothing is checked, right? So if you want to check, it would be at like a compile-time step. The annotations are still

Starting point is 00:21:31 there at runtime, and then you have a static type checker, the most popular of which is mypy, run over that and check it, kind of like a linter or any other, I mean, static analysis tool. And so there are other ones too, though. Google has one that is not super well documented, but they use it internally. Then PyCharm has this functionality as well, which is also kind of built from scratch. And they made a pretty good case why you would want one built into PyCharm, which is that basically it can do incremental checking. So while you're still writing, it can do sort of partial

Starting point is 00:22:11 checks, maybe a little bit better than mypy. Oh, right. The last person on the panel, Łukasz Langa from Facebook. He also comes to my meetup. Anyways, so yeah, he's very opinionated about types. We'll get to that in a second. One that wasn't talked about was Pylint. So I was actually blown away. I updated my Emacs config recently, and I sort of integrated some more linting stuff. And the default Pylint these days can do an amazing amount of inference. It'll tell you you have the wrong number of arguments. It'll tell you that like, oh, this default doesn't match that type. It'll do so many different things in addition to its standard, very opinionated idea of how many arguments a function should even have and that sort of thing. Anyways, so those are our four

Starting point is 00:22:56 sort of type inference engines. And they all are slightly different, but everyone seemed to get along pretty well on stage. And they talked about, you know, potentially in the future, actually merging these things and making a PEP that would allow them to all sort of comply together, maybe even turn into a single project. So that was really nice to see. And one of the most interesting questions was basically from the audience. They said like, well, what is the real point behind the static typing? Like, what is the biggest benefit that you see? And there was a

Starting point is 00:23:31 little bit of divergence on this, right? Some people like it for the strictness of it all, being, you know, kind of the dictator of your own code base or whatever, right? But everyone else seemed to be pretty much on the same page that this is for human readability. This is a sort of documentation that can then be checked automatically at a rather large scale. So it's attached to the function, but it's more than just a doc test. And so the interesting side effect of this is that they, even though they all work on static typing stuff, they have a pretty nuanced view of how much static typing you should apply. So they say that like, you know, maybe a list of a certain type, right? But actually defining,

Starting point is 00:24:19 say, a completely recursive type is one, not supported, and two, maybe not even that desirable, because you don't want your function signatures to get super, super complex. So yeah, I mean, it was interesting that they thought the human side of this was the most important part, as opposed to, say, like a Haskell programmer or something where they want the mathematical correctness of it all. It's also interesting that there's, I would have liked to listen to the discussion of how much you should use of it. Well, it was at LinkedIn. I think that they recorded it. It should go up pretty soon. Yeah, I'll definitely, you know, it was only a couple of days ago,

Starting point is 00:24:53 but once the video is available, I'll maybe send it to you. You can add it to the show notes. Yeah. Some interesting side effects of this, by the way, like some things to consider. So Cython does not support the new Python type syntax. So even though all these guys are kind of on the same page and buddy-buddy, like, you know, for us, people who really like Cython and have used it to achieve a lot of performance and type correctness to some degree are a little bit out of luck at the moment there. I think that people are working on making a pull request to it or something that would add support for this, but it's such a big change to the syntax. And Cython has its own type syntax, which is less focused on semantic types as this is, and more focused on being in line with C types, which

Starting point is 00:25:37 allows you to have more compact memory usage. And the people on the panel were actually pretty clear that the advantage of static types is not in performance. So a project like PyPy, which actually can use types to achieve higher performance, they find that the JIT is faster without taking hints from the user in the code. So it just disregards this stuff. Oh, interesting. Yeah, because the JIT has the actual types. So just a real quick thought experiment. Like imagine that I say I'm going to pass you a list of integers. That list is three integers long. Okay. I can just check them. One, two, three, all integers. Good to go. No type error. Right. But if I pass you a list of 20,000 integers, right. Every time I pass that to

Starting point is 00:26:23 you, I have to check that every single one is an integer. Otherwise, like, you know, I want to have a type error. That sort of thing is going a little bit against the spirit of Python and being like sort of practical and duck typey and whatnot. So a friend of mine from Intel, you know, was sitting next to me and he was saying

Starting point is 00:26:40 how he came to Python so he wouldn't have to type everything. But thankfully, you don't have to type everything. Like the standard library itself, for instance, all the type definitions for that are available in this joint typeshed repo that all of these static type people sort of built together. And I'll link to that in the show notes for sure. Yeah, my favorite use so far that I've come across for my own work

Starting point is 00:27:03 is putting type hints in interface areas like an API module. That's how you interact with the package. So those are great places for type hints. Oh, for sure. And so wait, are you saying that, so there is this old thing, like they're trying to get rid of it. Basically, Python has these sort of stub files, these interface files. Some people call them the header files for Python. Like I think it's a .pyi file. Okay, .pyi. I was just thinking like I've got a package that has a whole bunch of internal code, but it has like an API module that

Starting point is 00:27:37 people interact with from the outside world. That's a great place, pretty much any interfaces that it's not you that's going to use it, that somebody else is going to use it. Those are great places to put type hints, if it matters. Oh, definitely, definitely. Cool. But I'm pretty new to it too. So, thanks for bringing that up. That was very interesting. Yeah, yeah. And I mean, I think that they're still changing this stuff quite a bit, right? So, early adopters, go nuts. But for the rest of us that like a little bit more boring technologies, you know, I'm gonna go ahead and let the auto inference engine of Pylint figure things out for me. I'm not gonna, you know, jump on

Starting point is 00:28:14 the bandwagon so quickly. And I'm glad you brought Pylint up. I've been sort of dismissing it because I've been using Flake8, but I'll have to take a look at Pylint again. Oh, yeah, they've definitely ramped up development on that again. I mean, you have to, for me anyways, right, I just blacklist a lot of the errors because I kind of don't agree with every single thing that they test for, but they make it pretty easy to do. You just change it in an INI file, no big deal. Last topic, again, comes back to me finally getting my head out

Starting point is 00:28:43 of thinking about pytest 24 hours a day. And one of the things I want to start looking at is some of the web frameworks, like Django and Flask. I haven't played with them much personally, and there's a bunch of personal projects and work projects I'd like to do with them. And also quite a few people that listen to Test and Code are web people. And so just to kind of get more understanding of that, I'm trying to learn more frameworks. And one of the things that I've had a hard time getting my head around is ORMs, or object-relational mappers. So luckily, I ran across an article on Full Stack Python, which is Matt Makai's site.

Starting point is 00:29:26 Amazing site, yeah. And basically, it's Full Stack Python. I don't remember what it's called, but I think it's just Object-relational Mappers. And it goes through what they are: some code that automates the transfer of data from your internal Python objects and classes to database tables. And they're useful so that you can write Python code instead of writing SQL queries. And he talks about that, and then also talks about why you need them and some downsides. And yeah, so the downsides actually were interesting. I didn't think that anybody would talk about

Starting point is 00:30:07 what's wrong with using ORMs. Yeah, I mean, realistically, there are some definite engineering trade-offs. So what do you say? Well, he said, well, a few things are impedance mismatch, which coming from electrical world, I was like, impedance mismatch? That's like 50 ohms to 75 ohms, right?

Starting point is 00:30:26 Yeah, yeah. But it's basically that the way a developer is using the objects can be different from how the data is stored and joined in the tables in your database. And especially if you've set up the tables in a way that's contradictory to how it's being used all the time, it might be slow, and maybe reshaping your data might speed that up. And then potential for reduced performance. And this isn't surprising to me. If you stick some code in the middle, it's not free. It's got to run. And then also shifting complexity from database to the application code,

Starting point is 00:31:04 which is something that I didn't quite understand right off the bat, but if you think about it, it's not too bad. Databases are complex pieces of software that have things like stored procedures and a whole bunch of fancy join math and stuff. Right, right. That might not be supported by an ORM. So if you have to do that stuff, you have to do it in your application instead.
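
To make the "complexity has to go somewhere" point concrete, here is a small sketch, assuming a made-up `customer` / `purchase` schema in an in-memory SQLite database. The same per-customer total is computed twice: once by the database with a join and `GROUP BY`, and once by hand in application code after fetching plain rows.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER);
    INSERT INTO customer VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO purchase VALUES (1, 1, 30), (2, 1, 12), (3, 2, 5);
""")

# Database-side: one query, the engine does the join and the aggregation.
in_database = dict(conn.execute("""
    SELECT c.name, SUM(p.amount)
    FROM customer AS c JOIN purchase AS p ON p.customer_id = c.id
    GROUP BY c.name
"""))

# Application-side: fetch plain rows, then join and sum by hand in Python.
names = dict(conn.execute("SELECT id, name FROM customer"))
in_application = defaultdict(int)
for customer_id, amount in conn.execute("SELECT customer_id, amount FROM purchase"):
    in_application[names[customer_id]] += amount
```

Both approaches give the same totals here; the difference is which side owns the join logic and how many rows cross the wire.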

Starting point is 00:31:29 So it's using a database in a simpler way, but that complexity has to go somewhere and it'll go in your application code. Yeah, almost certainly. But I mean, until you get database specialists, it makes it a little bit easier for you as, you know, a sole developer, for instance. Yeah, so I punted at first and used document databases, because I didn't have to think

Starting point is 00:31:51 about ORMs right off the bat. But I mean, the thing is, he's correct: a database is definitely a very advanced, complex tool. But a lot of that advancement and complexity you retain even when using an ORM. For instance, a lot of document databases don't have great transaction models, don't have great, you know, multi-version concurrency models. And so when they put all that work into Postgres, or even MariaDB or something like that, just by using an ORM, it seems almost as simple as a document database, but you get that operational, you know, feature. Yeah, I'd definitely heard of SQLAlchemy. But I hadn't heard

Starting point is 00:32:34 of a couple of the others that he listed here: Peewee, Pony, and SQLObject. Have you used any of these? Yeah, so SQLAlchemy is definitely my go-to. And I'll talk about why in a second. But yeah, I mean, I've used Django's ORM because I did the Django tutorial. And that's one of the first things they teach you. Django has a serviceable ORM, but there are some issues with it that SQLAlchemy actually does a much better job with. And I have used Peewee, in fact. I like Peewee.

Starting point is 00:33:03 It's sort of like a simplified version of Django. In my opinion, it basically says, look, if you're not going to be SQLAlchemy, then you can just be plain simple. And it does a pretty good job. But these days, SQLAlchemy has gotten so good that I just reach for that every single time I'm going to work with a relational database in Python. So one thing that SQLAlchemy has is that it sort of has this working copy of all the models, and they end up being kind of like singletons within a given process space. So with Django, you can actually get two copies of the same thing from the database within the same request or the same process. And that means that basically concurrently somewhere else in your program,

Starting point is 00:33:50 it could change something, save it. And then when you change it in the request handler you're actually trying to work on, that will overwrite the previous change. You know, like if you change column A in one thread and column B in another thread, whichever thread saves last is going to overwrite the other thread's change with its own unchanged value. So there's a setting that's off by default, I think, in Django called atomic requests. And you have to enable that to prevent that sort of situation. But Django is not alone in this.
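
The overwrite described above can be reproduced without any framework. This is a sketch of the failure mode only, assuming a made-up `account` table and hypothetical `load` and `save_all_columns` helpers; it is not Django's code. Two independent in-memory copies of the same row are loaded, each changes a different column, and each writes every column back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, a TEXT, b TEXT)")
conn.execute("INSERT INTO account VALUES (1, 'old-a', 'old-b')")


def load(pk):
    # Each call builds a fresh, independent in-memory copy of the row.
    row = conn.execute("SELECT id, a, b FROM account WHERE id = ?", (pk,)).fetchone()
    return dict(zip(("id", "a", "b"), row))


def save_all_columns(obj):
    # Writes every column back, including ones this copy never touched.
    conn.execute(
        "UPDATE account SET a = ?, b = ? WHERE id = ?",
        (obj["a"], obj["b"], obj["id"]),
    )


first = load(1)
second = load(1)

first["a"] = "new-a"
save_all_columns(first)

second["b"] = "new-b"
save_all_columns(second)  # still carries the stale 'old-a'

final = load(1)
```

After both saves, column `a` is back to its old value even though nobody meant to touch it in the second copy, which is the kind of bug that can take days to track down.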

Starting point is 00:34:20 I think that Rails, at least for a very long time, did the same thing. And Django, of course, is sort of Python's response to Ruby on Rails. So, yeah. Does SQLAlchemy not have this problem? So SQLAlchemy doesn't have this problem because basically, yeah, you only get one copy of that thing in your system. It has this sort of local index of primary key to the object version of that row that you're representing, for instance. Okay. So yeah, SQLAlchemy adds a lot of machinery, which makes SQLAlchemy a little bit more complex. But I had a friend who I think spent days tracking down this issue with Django. And with SQLAlchemy, that

Starting point is 00:34:58 never would have happened. So you pay some upfront costs with setup with SQLAlchemy, but I think it's definitely worth it. When it comes to this sort of ORM thing, though, like if I can provide some general advice, ORMs are sort of the tools of applications. And if you want to see, if you want to form a real opinion on object relational mappers, you should look at and compare applications. So I spent a fair amount of time reading Reddit source code, which does, I think, use SQLAlchemy. And it uses it without the declarative object mapper. It uses it with the sort of legacy or lower level SQLAlchemy tools. But you still get a real sense for where they use an ORM and where they don't. And SQLAlchemy actually makes it very easy to

Starting point is 00:35:43 pass through normal SQL text. That's another thing I really like about it. It understands that ORMs are an abstraction that's useful 90% of the time. And for that last 10%, you really want the full power of the driver or the database itself. Okay, cool.
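
The single-copy behavior described earlier for SQLAlchemy (a local index from primary key to the object for that row) is often called an identity map. Here is a minimal sketch of the idea, assuming hypothetical `Session` and `Row` classes and a made-up `account` table; this is not SQLAlchemy's real API, just the underlying pattern.

```python
import sqlite3


class Row:
    def __init__(self, pk, a, b):
        self.pk, self.a, self.b = pk, a, b


class Session:
    """Hypothetical stand-in for an ORM session with an identity map."""

    def __init__(self, conn):
        self.conn = conn
        self._identity_map = {}  # primary key -> the one live object for that row

    def get(self, pk):
        # Return the already-loaded object if there is one; query only on a miss.
        if pk not in self._identity_map:
            row = self.conn.execute(
                "SELECT id, a, b FROM account WHERE id = ?", (pk,)
            ).fetchone()
            if row is None:
                return None
            self._identity_map[pk] = Row(*row)
        return self._identity_map[pk]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, a TEXT, b TEXT)")
conn.execute("INSERT INTO account VALUES (1, 'old-a', 'old-b')")

session = Session(conn)
first = session.get(1)
second = session.get(1)
first.a = "new-a"
```

Because `first` and `second` are the same object, a change made through one is visible through the other, so there is no second stale copy to save over it.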

Starting point is 00:35:59 I don't have any opinion on these extra couple links that I put in here, but Matt has some dedicated pages for SQLAlchemy and Peewee. And one of the things I like about Matt's site anyway, Full Stack Python, is he gives his opinion and information when he has it. And when somebody else has already explained it well enough or better, he just links to their stuff and says, go read that. Yeah, absolutely. No, I mean, he's a real team player in that regard. But I also, I just got to, you know, give a shout out to him. Like he so consistently adds to the site.

Starting point is 00:36:30 It's become such a tremendous resource for someone who wants to develop an application. I'm sure that listeners of this podcast are, for the most part, already aware of it. But yeah, definitely check it out. Definitely. Well, that's all of our topics so far. We didn't address what you're up to lately other than helping out with podcasts. Yeah. No, it's funny. I'm also prepping for another podcast as well, but Partially Examined Life, I guess.

Starting point is 00:36:59 But basically, yeah, what am I up to lately? Well, I had a talk at PyBay, and because it was based on a blog post, I thought it'd be easy to put together slides. Now, just full disclosure here, it still took another 40, 50 hours to make slides from that blog post. But it seemed really well received. And so I'm very relieved right now. I've got some nice life events coming through. Parents coming to town, keeping me real busy.

Starting point is 00:37:22 I'm also working on this hyperlink library, like I mentioned earlier, URLs in Python, and it's used by Twisted and some other big projects. So fixing bugs in there is always kind of contentious, which is why I've got a lot of support for people who work on things like setuptools, which is even more widely used. So then beyond this, let's see. Yeah, writing blog posts. I think my draft count is up to like 100 now. But yeah, maybe more conferences, more talks. I don't know why I keep signing up for these things. But it's great meeting people out there. People out there should really look into PyBay and regional conferences, meetups. Oh, well, I run a meetup too. The Pyninsula meetup, the hottest new meetup in the Bay Area, Silicon Valley. And so, yeah. Pyninsula. Yeah, yeah. That's a terrible pun.

Starting point is 00:38:12 Hey, this is programming, man. It's all about the terrible puns. But yeah, Pyninsula. I think we even have the site now, pyninsula.org. And, you know, we're on Twitter and so forth. I do my best to record the talks. But for people who want to break into this type of, you know, speaking and that sort of thing, just look no further than your local meetup, right? Go make a 15-minute, 30-minute talk. See how it goes. Iterate on it, right?

Starting point is 00:38:41 Have a brown bag at your company, just keep iterating on it. And, you know, something will stick. And then you can submit something to PyCon or whatever. That's a great idea. I think a lot of people think that you just have to work really hard on a talk and give it once and then it's done. But a lot of people give them several times. Yeah. And also, if there's not a meetup in your area, just maybe start one. Python programmers are literally everywhere. So, you know, even though there's a South Bay Python meetup, which is more towards Sunnyvale, kind of south of the Mountain View area, and there's the SF Python meetup, which is up in San Francisco, we put one right in the middle. And I guess California traffic's bad enough that we sort of

Starting point is 00:39:25 have a captive audience, literally. But we'll get like, you know, I think when Guido came, there were almost 100 people at the meetup. And normally we get like 50. But it's great because everyone can socialize, and it's something a little more intimate. It's a little bit less stressful when you're trying to give the talk yourself too. Yeah. So it wouldn't be a Python Bytes episode if I didn't plug my book. By all means. So one of the things I want to bring up is that Python Testing with pytest has a nice discussion forum. It's kind of built into what Pragmatic offers for all the books. But if you ever ask a question on there, it pings me and emails me and says there's a question.

Starting point is 00:40:03 Just this morning, I answered a question. Somebody got on and said, and I love this, that the book is helping them understand testing better. And I love comments like that. But he had a question about monkeypatch versus mock, and I'm not going to get into it too much here, but I did reply to him and it's all up there for everybody else to read too. So I'll have a link in the show notes to that. So that's great. Yeah, those sorts of comments really keep you going. I wish that my O'Reilly thing had had such a discussion forum. Instead, I got my feedback through reviews for a while. Oh, yeah, yeah. But I mean, emails too. People email, and I appreciate it.

Starting point is 00:40:45 I get it from all over the place. I get it through the discussion forum. I get it from Twitter. We've got a Slack channel, so people come and tell me what's wrong in Slack. Yeah, definitely. I know, if we're just chatting here, I've been really into

Starting point is 00:40:59 Riot.im, which is a Python-based open source Slack sort of thing. And there's also Zulip, which is just everywhere these days. They're doing an amazing job. So what's the first one, Riot? Yeah, so riot.im, and it runs a sort of protocol called Matrix. And it's a very, very large thing. It's basically like you can have end-to-end encrypted chats with people who are on it, but I use it because it's an IRC bridge. Like I said, if you want to be sort of in this inner circle, see the goings-on, IRC is still very much alive. So you've got your listservs and IRC and so forth. And Riot makes that pretty easy to get into. There's a Freenode bridge and you just join a Freenode thing

Starting point is 00:41:46 and you can look at IRC through your browser while having end-to-end encrypted chats with your other friends. It also has a sort of peer-to-peer video chat that works really, really well because it's just the WebRTC open source protocol. Works great in Firefox. Well, I'm going to cut you off because we're running long. Oh, wait, yeah, we're way long.

Starting point is 00:42:05 Anyways, that's great. Also, I think this is an awesome topic. I think that you should come on to Test and Code and we can talk about IRC and communication channels. That'd be fun. That's actually a great idea. Yeah, for sure. I'm always coming up short with topics. But yeah, here we are just chatting.

Starting point is 00:42:23 That's a great idea. Again, thank you so much for coming on. I love having new voices on here. It's been my pleasure. And thank Michael. You know, when he gets back, I'll send him an email. This has been great.

Starting point is 00:42:34 Yeah, and we'll keep in touch. Thank you for listening to Python Bytes. Follow the show on Twitter via at Python Bytes. That's Python Bytes as in B-Y-T-E-S. Get the full show notes, including links at pythonbytes.fm. If you have a news story you'd like featured, visit pythonbytes.fm and send it our way. We're always on the lookout for sharing something cool.

Starting point is 00:42:59 This is Brian Ocken. On behalf of myself and Michael Kennedy, thank you for listening and sharing this podcast with your friends and colleagues.

