Python Bytes - #99 parse - the regex antidote in Python

Episode Date: October 16, 2018

Topics covered in this episode: parse; fman Build System; fastjsonschema; IPython 7.0, Async REPL; molten; A Python love letter; Extras; Joke. See the full show notes for this episode on the website at ...pythonbytes.fm/99

Transcript
Starting point is 00:00:00 Hey everyone, Michael here. I want to take a quick moment for an editorial comment before we get to this week's show. On the last episode, we covered a story from Bloomberg about China implanting hardware hacking devices into motherboards for servers. Since that article came out, there's been a lot of pushback from the organizations named Amazon, Apple, and others. And there's been a few articles that raised some doubts about the veracity of the original Bloomberg article. I'm linking to an article in Forbes called Doubts Swirl About Bloomberg's China Chip Hack Report. This doesn't mean the original article is false or implausible, but it may be. And because of that, I felt like I should add this disclaimer and warning about the coverage we had on episode 98.
Starting point is 00:00:41 Sorry about that. Now, let's get on to some Python-focused topics on this episode with Brian. Hello, and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 99, recorded October 10th, 2018. I'm Michael Kennedy. I'm Brian Okken. Brian, we're coming up really close on episode 100. Yeah, one more. This is 99. Wow. Yeah, we're going to have to do something cool for that one.
Starting point is 00:01:06 But for now, I think it is super cool that DigitalOcean is sponsoring the show, not just for today, but for the rest of the year. That is very cool. Yeah, thank you, DigitalOcean. Check them out at pythonbytes.fm slash DigitalOcean. Get $100 credit for new users. I think I had said this before as a joke to you,
Starting point is 00:01:22 and you didn't necessarily agree with it, but your first item here may elicit some agreement that if you have a problem and you solve it with a regular expression, you now have two problems. Yes. Yeah. Well, definitely. With code, at least you have two things to support. Yeah, that's right.
Starting point is 00:01:38 But you've run across some library that's actually really awesome for simple, what you might think of as regex problems. I got this from a tweet from Kenneth Reitz. He said something like, oh yeah, everybody, by the way, parse is a thing. Parse is a library whose tagline is that it's the opposite of format.
Starting point is 00:01:58 So in the general sense of it, there's a bunch of things you can do. You can parse strings, you can search inside strings, you can find all the matches of a pattern in a string. But you give it both the string that you're searching and, instead of a regular expression for what to search for, basically the same string but with parts of it replaced with curly braces, to say: if I were to have printed it with format using this string, I would have gotten this output, and reverse that out.
Starting point is 00:02:33 And then you can get the results out to see all the stuff. That's awesome. So you could say like, this is episode curly curly of Python Bytes, and then you could actually parse it. And that little curly curly would give you the number. Well, I guess the first example would give you a string, and you could put a colon d and it would actually give you an integer 99. Yeah, definitely. That is cool. It has some like cool things too. Because if you were going to do that and pass in elements of a dictionary, you can have this thing return basically things that look like dictionaries with named elements, and both positional elements and named elements.
Starting point is 00:03:09 And it's pretty neat. And I was playing with the findall. So you can put that in a loop to say like, for all the elements in. And I gave it a big file, a CSV file or something, and pulled out elements. And it works really well. And the thing I like about it is it's more readable than a regular expression. So yeah, for sure. If you've got something simple like that, that you've got multiple people that have
Starting point is 00:03:37 to be able to support it, I think this is a good choice. Yeah, I love it. It's a really cool example. And you can tell that it's probably written by somebody who understands regex well under the covers, but you don't have to think about it because it has like a compiled mode and things like that, which regex often do. Yeah. And you can pass in a pattern apparently, but if you were going to figure out patterns, then why not just use regex? Yeah, quite cool. Anyway. I like it. I'm going to see if I can use it next time I need something like this.
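To make the "opposite of format" idea concrete, here is a minimal standard-library sketch of how a format-style template can be turned into a regular expression and matched against a string. It is not the parse library's implementation or API: the names `compile_format` and `unformat` are hypothetical, and only plain `{}` / `{name}` fields and the `:d` integer type are handled. With the library itself, the equivalent call would be `parse("This is episode {:d} of Python Bytes", text)`.

```python
import re

# Supported field types for this sketch only: "" (any text) and "d" (integer).
# An unsupported type raises KeyError; the real parse library handles many more.
_TYPES = {"": (r".+?", str), "d": (r"[-+]?\d+", int)}

# Matches fields such as {}, {:d}, {name} and {name:d}.
_FIELD = re.compile(r"\{(\w*)(?::(\w+))?\}")


def compile_format(template):
    """Turn a format-style template into a compiled regex plus converters."""
    pattern, converters, pos = "", [], 0
    for match in _FIELD.finditer(template):
        # Literal text between fields must match exactly, so escape it.
        pattern += re.escape(template[pos:match.start()])
        name, spec = match.group(1), match.group(2) or ""
        regex, convert = _TYPES[spec]
        pattern += "(" + regex + ")"
        converters.append((name, convert))
        pos = match.end()
    pattern += re.escape(template[pos:])
    return re.compile(pattern), converters


def unformat(template, text):
    """Return (positional values, named values), or None if text does not match."""
    regex, converters = compile_format(template)
    match = regex.fullmatch(text)
    if match is None:
        return None
    fixed, named = [], {}
    for (name, convert), raw in zip(converters, match.groups()):
        value = convert(raw)
        if name:
            named[name] = value
        else:
            fixed.append(value)
    return fixed, named


print(unformat("This is episode {:d} of Python Bytes",
               "This is episode 99 of Python Bytes"))
print(unformat("{show} episode {number:d}", "Python Bytes episode 99"))
```

The template stays readable because the regex only exists inside the helper, which is the trade the discussion above is pointing at.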
Starting point is 00:04:08 So this next one I want to talk about has to do with GUIs. Can you believe it? Yeah, we've covered that a few times. I think we have. So this one is called the fman Build System, and it comes from the fman project, which is like a dual-pane file explorer for Windows and Mac and so on, from Michael Herrmann. So it's a pretty cool project, but I'm not interested so much in covering the desktop app that he built per se, but the tool to build it. So the fman Build System, right? So what it lets you do is it lets you create GUI apps for Windows, Mac, and Linux, as in, here is my dot app file or my dot exe, go click it. In fact, it
Starting point is 00:04:43 gives you an installer, right? And like a proper installer for Windows, or one of those, you know, drag here and it has like the applications folder for a .app file, a disk image on macOS. I mean, this is nice. Wow, that's kind of like one of the missing pieces that we've had for packaging and sending out things.
Starting point is 00:05:03 It really is, right? It's quite cool. So, like I said, Windows, Mac, and Linux works really well. It's what he uses for his project. It's open source, so you can use it for free on open source projects. It's licensed under the GPL, and for commercial stuff you can basically buy a license for it. Now, if you're using Qt, you also have to buy a license for Qt, and that's kind of a complicated story. Looking to figure out a little bit more about that,
Starting point is 00:05:31 honestly, I don't really know the full story there. It's sort of, I got this commercial side and this open source side, but at least if you're doing open source stuff, I think it's a really cool option. Yeah, I like that. Even the idea of matching the model, a similar model
Starting point is 00:05:46 to what Qt's model is, is a decent idea. Yeah, and if the point is to package Qt apps, you know, it's almost probably unavoidable. Yeah, definitely. And I also like, I got to give a shout out to Michael Herrmann. It's not trivial to say,
Starting point is 00:06:01 take a piece of your project and pull it off as he did with the build system so that it can be usable for other people. That's pretty cool. Yeah, absolutely. So, JSON, not a whole lot of validation there, is there? Well, I think there's a lot of ways to validate JSON, but I don't know if everybody does. I don't think I've, in all the times where I've used JSON to talk with different parts of an application, I usually just kind of assume it's all working right.
Starting point is 00:06:32 Yeah, for sure. But there are validators out there, and the one I want to cover has been recommended by a few people. The documentation is a little light. So I think it's called fastjsonschema. But the name is descriptive. Yeah, definitely.
Starting point is 00:06:48 And I'm not sure what the, so one of the articles I'm going to link to has got four different libraries, including fastjsonschema. And I'm not sure what they were validating. It's like way faster than everything else. So jsonschema takes five seconds. jsonspec takes seven.
Starting point is 00:07:05 And then his was the fast JSON schema, 250 milliseconds. And I'm not sure how big of a data set this is to have anything take five seconds or seven seconds. Yeah, it must be the same size one would hope, you know? Yeah, it's a compiling scheme. So the kind of the scheme is it's a pretty simple interface. I think, like I said, the documentation is a little slim, but you describe a schema in terms of what the types of each element is supposed to be like. And I think there's some optional keywords and stuff like that you can throw in there. And then you compile it.
Starting point is 00:07:43 You compile that into your own validator. So this is a, as a, like we were talking about with regex, you could compile it so that it runs faster. And that's what this does. It just comes up, you compile your own validator and then you can use that to,
Starting point is 00:07:58 to validate any, any strings that you want. Yeah. It's cool. Yeah. So JSON Schema is a separate specification, and you can even learn about it at json-schema.org. It allows you to create a secondary JSON file that is the type system for the original data exchange, right? So if I have
Starting point is 00:08:19 like an address, I could say, okay, here, my schema is, this is an object that has properties like post office box and extended address. Those are strings and so on. You can even have like dependencies and stuff like so. The post office box must be a valid street address, which is defined elsewhere. So this is pretty cool. You take those files, you feed it to this validator
Starting point is 00:08:42 and it'll take anything you get back from say a web service or something and say, yeah, this is valid or not. And this project's been around for a while. But the big news lately is that there's multiple drafts of this JSON schema. And the tool we're talking about covers drafts four, six and seven. Right. Which is pretty nice. Cool. Yeah, very cool.
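The "compile the schema into your own validator" idea described above can be sketched with closures from the standard library alone. This is an illustration of the compile-once, validate-many pattern, not fastjsonschema's code: `compile_schema` is a hypothetical helper that understands only `type`, `properties`, and `required`, and the address field names are assumed for the example. With the library itself, the entry point is `fastjsonschema.compile(schema)`, which returns a validation function.

```python
# Maps JSON Schema type names to Python types (a small subset, for illustration).
_JSON_TYPES = {
    "object": dict,
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
}


def compile_schema(schema):
    """Walk the schema once and return a validator function for it."""
    type_name = schema["type"]
    expected = _JSON_TYPES[type_name]
    required = list(schema.get("required", []))
    # Nested property schemas are compiled up front, so validating a document
    # later never has to look at the schema dictionaries again.
    properties = {
        name: compile_schema(sub)
        for name, sub in schema.get("properties", {}).items()
    }

    def validate(data, path="data"):
        # bool is a subclass of int in Python, so reject it for non-boolean types.
        if isinstance(data, bool) and expected is not bool:
            raise ValueError(f"{path} must be {type_name}")
        if not isinstance(data, expected):
            raise ValueError(f"{path} must be {type_name}")
        if expected is dict:
            for name in required:
                if name not in data:
                    raise ValueError(f"{path} must contain {name!r}")
            for name, check in properties.items():
                if name in data:
                    check(data[name], f"{path}.{name}")
        return data

    return validate


# Assumed field names, loosely following the address example in the discussion.
address_schema = {
    "type": "object",
    "properties": {
        "post-office-box": {"type": "string"},
        "extended-address": {"type": "string"},
        "street-address": {"type": "string"},
    },
    "required": ["street-address"],
}

validate_address = compile_schema(address_schema)
print(validate_address({"street-address": "1 Main St", "post-office-box": "123"}))
```

Compiling once and reusing `validate_address` for every incoming payload is where the speedup in this style of validator comes from.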
Starting point is 00:09:01 So there were other ones and they were apparently kind of stale, didn't cover the latest drafts and things. So nice, nice find. And it's way, way faster. Now, before we get to the next one, I want to tell you about a cool feature at DigitalOcean. So at DigitalOcean, you can, of course, log in and say, create me what they call a droplet, a new virtual machine or various other things, load balancers, firewalls, and so on. And it'll spin up your machine and off you go. And you get some choices like various versions of Ubuntu and other stuff. But what you can do if you'd like is you can create your own local virtual machine, whatever you want, some kind of Linux, as long as you can install a few dependencies that it needs to interact with the DigitalOcean infrastructure and upload that. And from then on, you can just click a button and say, create me
Starting point is 00:09:43 my super special private server, as many as you want with their API or whatever. Very cool. Definitely. Yeah, pretty cool. So if you want more control over how your virtual machines get created and what they even look like, check them out at pythonbytes.fm slash digital ocean. New users get a hundred dollars credit and they've got a bunch of cool stuff that you can do with all their infrastructure. Speaking of infrastructure, a lot of people use IPython these days in the whole data science space, right? Yeah, very big. And people might be tired of me going on and on about async. I know some people are not a fan, but it's just so powerful. And when it's used at the right time, very, very nice.
Starting point is 00:10:22 But until recently, IPython was a thing that you put Python code into, and Async was a thing that you did in files, you know, applications that executed Python code, but they didn't really go together. Yeah, I'm still trying to get my head around them going together, but yeah. So here's the thing. If I have an Async library that I want to use,
Starting point is 00:10:44 basically the only way to use it in IPython previously, I believe, was to spin up all the infrastructure to sort of host the async loop yourself, which is like five or six lines to just call the function. So now in IPython, you can just say await, give it a function, and it just runs it automatically. Oh, okay.
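As a rough illustration of the boilerplate being described, the snippet below hosts the event loop by hand with the standard asyncio module. `fetch_answer` is a hypothetical stand-in for any coroutine from an async library; the trailing comment shows how the same call can be typed at an IPython 7 prompt.

```python
import asyncio


async def fetch_answer():
    """Hypothetical stand-in for a coroutine from some async library."""
    await asyncio.sleep(0.01)
    return 42


# In a plain script, or in a REPL without top-level await support,
# you create and drive the event loop yourself just to call one function.
loop = asyncio.new_event_loop()
try:
    result = loop.run_until_complete(fetch_answer())
finally:
    loop.close()

print(result)

# In the IPython 7 shell the same call can be written directly at the prompt:
#     In [1]: result = await fetch_answer()
```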
Starting point is 00:11:03 That's cool. Yeah, so IPython 7 is out. And one of the big features that it has is the interactive shell now supports async and await, which is really cool. Yeah, that's very neat. Yeah. So this one came to us from Nick Spirit. Thank you, Nick, for sending it in.
Starting point is 00:11:19 And this is written by Matthias Bussonnier. And he is the guy who originally coined the term legacy Python for the world, as far as I can tell. Yeah, I think you're right. Yeah, yeah. He wrote a cool article called planning an early death for Python 2 or something, you know, friendly like that, and talked about referring to it as legacy Python, which I think is great. So he also wrote this and he works on IPython and whatnot quite a bit. Talks about how when IPython dropped support for Python 2, how they were able to sort of make these features possible, right? If you want to support these types of things, it was much harder to do so
Starting point is 00:11:57 if you want to use a Python 3.5 feature, but you also want to support Python 2. So it's cool how they talk a lot about that. Yeah. And I also think the open source community is a little, is sort of changing. We, we had this idea that kind of from, from, I think other commercial applications that you should support as many platforms as possible, or like your library should support as many versions of Python as possible. Right. If it could support 2.1, that'd be awesome. But at the same time, there's the reality that if you only support the more recent versions, you can clean up your code and have it be an easier code base for other developers to work in and increase your open source contributions. And I think that's a very real thing. And I think
Starting point is 00:12:43 that's one of, like you said, it's one of the things that they probably addressed with IPython. Yeah, definitely. It sounds like what we heard when they talked about doing the same thing for Django: we were able to delete a bunch of code. And the easiest way to maintain code is to not have it. So yeah, it's a great point. And here's another example. A lot of people are using IPython to teach Python, and there's a debate as to whether or not that's a good thing. But at least now they'll be able to teach async. Yeah, that's a good point. Yeah, there's a lot of presentations and stuff done there. And now it's nice and easy to call it. Super cool.
Starting point is 00:13:17 All right. What's the next one you got for us? I have a library called Molten and Molten is... Is it for studying volcanoes or what is this? It's an API framework. So the link we're going to include has a little video demo on it. But it's like a REST API framework, similar to API Star. And in fact,
Starting point is 00:13:38 the kind of the motivation, there's a motivation page that talks about how API Star is kind of awesome, but there's some of the implementation, like a hook model for middleware, that this author didn't quite like. So they took inspiration from API Star and Rocket, which is a Rust framework, and tried to make this one. And it's Python 3 only, because they're leveraging type hints and type annotations. Yeah, in a really nice way.
Starting point is 00:14:10 It's really clean looking. You can implement an API fairly quickly. But there's also built in, speaking of schema validation, there is schema validation built into this system so that you know the requests coming in that your code has to deal with, they're already going to be valid before it even hits you. So you won't be hit with invalid data,
Starting point is 00:14:36 which is pretty cool. Yeah, there's a lot of cool validation. So for example, the hello world type method for a web view method is just like def hello and then name colon str, age colon int. And it actually, you know, grabs the value, say, out of the URL or somewhere, puts it in there, converts it to an integer, and passes it. And you don't have to figure out, you know, how do I go get that from the route match data or other weird data sources like that. So that's really cool. And then they take it farther. You say, okay, well, you could have like a string and an integer in the function, or if you've got something
Starting point is 00:15:08 more interesting, you could define an actual class that has a little decorator, so it's a schema. Like it has an id that's an optional int, it has a description, it has a status that can be, you know, certain values and has default values, all sorts of cool stuff. And then you can just say this web method takes this. Like they have a to-do example as a class, right? It takes a to-do item and it automatically pulls that out, does the validation and checking. And yeah, I'm loving this. This is great. Yeah, and also you can define the schema on the output as well to make sure that you're complying. I think it's kind of neat. And there's a couple other neat features of it. Or at least features of it, whether or not you like it. The middleware is a functional programming based middleware.
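The annotation-driven conversion described above can be sketched with the standard library. This is not Molten's code or API: `call_with_coercion` is a hypothetical helper that reads a handler's type hints, converts raw string parameters (as they might arrive from a URL) to the annotated types, and rejects values that do not convert.

```python
from typing import get_type_hints


def hello(name: str, age: int) -> str:
    """Handler written the way the discussion describes: plain typed arguments."""
    return f"Hello {age} year old named {name}!"


def call_with_coercion(handler, raw_params):
    """Convert raw string parameters to the handler's annotated types, then call it."""
    hints = get_type_hints(handler)
    hints.pop("return", None)  # the return annotation is not an input
    kwargs = {}
    for param, expected in hints.items():
        if param not in raw_params:
            raise ValueError(f"missing parameter: {param}")
        try:
            # Works for simple types such as str, int and float in this sketch.
            kwargs[param] = expected(raw_params[param])
        except ValueError as exc:
            raise ValueError(
                f"invalid value for {param}: {raw_params[param]!r}"
            ) from exc
    return handler(**kwargs)


# Values captured from a URL are strings; the handler receives a real int.
print(call_with_coercion(hello, {"name": "Brian", "age": "99"}))
```

The handler body never touches request parsing, so invalid input is rejected before the function runs, which is the property praised in the discussion.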
Starting point is 00:15:55 And a lot of the different pieces, like if you want to have a database management, they're all set up to allow them to be isolated easily. So using dependency injection, it's a thing and it allows you to sort of test your different components by themselves or swap out new ones. So it's fun. Yeah, it looks really cool. I'm, you know, well done on that project, you guys. I think it looks like something if I was building an API, I might be pretty excited about using. So I want to round out this episode with something a little fluffy, but nice to just remind everybody why we're here and why we use the tools and the technology that we do. So this last one is called a Python love letter.
Starting point is 00:16:37 Well, I love Python. Yeah, I love it too. So this was actually a thing posted to a Reddit thread by a guy who's pretty new to Python. And he posted, it said, dear Python, where have you been all my life? Right. And the thing he posted was pretty interesting, but also the comments, right. There are many, many comments and just all the people either agreeing or disagreeing or whatever. So the guy says, look, I'm not a developer, but I've been toying with programming for, you know, with BASIC and Perl and whatnot. And for some reason he decided,
Starting point is 00:17:09 you know, he's done with shell scripts. So we've heard that before, right? He's going to go write some Python. And he said, look, I went and I learned Python. No, I didn't go from zero to production in a day, but if my coworkers will leave me alone,
Starting point is 00:17:21 I might be in production tomorrow. Which is, you know, I think that's just, like kind of sums up a lot of what happens in the Python space. That's neat. Just kind of a fun story. Yeah, it's definitely a fun story. A couple of the comments that came up that I thought were interesting were one person
Starting point is 00:17:36 said, welcome to the club. I came up on C++. My job highly trained me in C and assembly, and with every project I touch, it's: can't we do 90% of this, 95% of this in Python? And we do, right? We don't need inline assembly most of the time. Another person said, I have a chip on my shoulder. I want to do things the hard way and understand them.
Starting point is 00:17:57 So I went with C++ because that's real programming. Dang it. But later, after suffering a lot, I kind of learned that, you know, doing things smarter was way better than doing things the hard way and whatnot. So he, you know, sort of found his way to Python, I guess. One other person said, I felt exactly the same way. I decided to learn it. What a breath of fresh air. Sadly, there are few things in my life that make me feel like this. Python and Bitcoin give me the same levels of enjoyment. I've used Java, Groovy, Scala, Objective-C, C, C++, et cetera. And nothing feels as good as Python does. So definitely, definitely cool.
Starting point is 00:18:32 And then this person, this is what was notable to me, Brian, closed out his comment with, hell, my next two planned tattoos are Bitcoin and Python logos on my wrists. Way to go. Okay. That's some commitment. The Python one, fine, fine, but you're probably going to regret the Bitcoin one. Is there an abstract cryptocurrency that it's going to encapsulate, like whatever comes next? I agree. I agree. Anyway, I thought that was fun, and it just reminds us what a great community and ecosystem we have. Yeah, definitely. I also just wanted to say, assembly code? Real men program in bits.
Starting point is 00:19:07 That's right. 01110, baby. Anyway. All right. So that's it for our official items. But I see you have one little extra one here that will also bring fun, excitement, and joy to any presentation if you're just sitting down with a coworker or even at like a meetup, right? Yeah. And Oliver Best Walter got me excited about this, and it's PowerMode. And I'm linking to something called PowerMode 2, which is a plugin for PyCharm, but there's PowerModes in a couple of different editors, and it started in Atom, I think. And people have probably done it other places. It just makes programming more fun.
Starting point is 00:19:47 This is funny. You introduced this to me. So let me just sort of give people a little description. So imagine as you start typing, it's kind of like a bit like a comic book. The faster you type, the more excited your editor gets. If you copy and like duplicate a method, like a big bam pow thing will pop out. Sparks shoot off of your cursor. The faster you type, the more intense it gets.
Starting point is 00:20:11 Yeah, it's super, super productive and awesome. I've left it on. I've turned off the shaking screen, which is a little unsettling to me, and the flames. But the rest of it, the sparks flying and everything like that, I've been using it for like a week. You leave it on all the time? Yeah. That's awesome. Because I really like it when I copy and paste and it goes bam. I know, bam, pow. That's nice. That's cool. Yeah, PowerMode, if you're using PyCharm, it's definitely fun to check out, and you got the link in the show notes. Cool. Okay. All right.
Starting point is 00:20:45 Well, that's a fun one to close it out for sure. All right. Thanks, Brian. And chat with you. Yeah, thank you. Yeah, chat with you next time. Bye, everyone. Okay, bye.
Starting point is 00:20:52 Thank you for listening to Python Bytes. Follow the show on Twitter via at Python Bytes. That's Python Bytes as in B-Y-T-E-S. And get the full show notes at PythonBytes.fm. If you have a news item you want featured, just visit pythonbytes.fm and send it our way. We're always on the lookout for sharing something cool. On behalf of myself and Brian Okken,
Starting point is 00:21:12 this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and colleagues.
