Python Bytes - #99 parse - the regex antidote in Python
Episode Date: October 16, 2018
Topics covered in this episode: parse; fman Build System; fastjsonschema; IPython 7.0, Async REPL; molten; A Python love letter; Extras; Joke
See the full show notes for this episode on the website at ...pythonbytes.fm/99
Transcript
Hey everyone, Michael here. I want to take a quick moment for an editorial comment before we get to this week's show.
On the last episode, we covered a story from Bloomberg about China implanting hardware hacking devices into motherboards for servers.
Since that article came out, there's been a lot of pushback from the organizations named Amazon, Apple, and others.
And there's been a few articles that raised some doubts about the veracity of the original Bloomberg article.
I'm linking to an article in Forbes called Doubts Swirl About Bloomberg's China Chip Hack Report.
This doesn't mean the original article is false or implausible, but it may be.
And because of that, I felt like I should add this disclaimer and warning about the
coverage we had on episode 98.
Sorry about that.
Now, let's get on to some Python-focused topics
on this episode with Brian. Hello, and welcome to Python Bytes, where we deliver Python news
and headlines directly to your earbuds. This is episode 99, recorded October 10th, 2018.
I'm Michael Kennedy. I'm Brian Okken.
Brian, we're coming up really close on episode 100.
Yeah, one more. This is 99. Wow.
Yeah, we're going to have to do something cool for that one.
But for now, I think it is super cool
that DigitalOcean is sponsoring the show,
not just for today, but for the rest of the year.
That is very cool.
Yeah, thank you, DigitalOcean.
Check them out at pythonbytes.fm slash DigitalOcean.
Get $100 credit for new users.
I think I had said this before as a joke to you,
and you didn't necessarily agree with it,
but your first item here may suggest some agreement that if you have a problem and you solve it with a regular expression,
you now have two problems.
Yes.
Yeah.
Well, definitely.
With code, at least you have two things to support.
Yeah, that's right.
But you've run across some library that's actually really awesome for simple what you might think of as regex problems.
I got this from a tweet from Kenneth Reitz. He said, oh yeah, everybody, by the way, parse is a thing, or something like that.
Parse is a library whose tagline is that it's the opposite of format.
In the general sense, there's a bunch of things you can do: you can parse strings, you can search inside strings, you can find all the matching patterns in a string. You give it both the string that you're searching and, instead of a regular expression for what to search for, basically the same string with parts of it replaced with curly braces. That says: if I had printed it with format using this string, I would have gotten this output, so reverse that out.
And then you can get the results out to see all the stuff.
That's awesome.
So you could say like, this is episode curly curly of Python Bytes, and then you could actually parse it. And that little curly curly would give you the number. Well, I guess the first example would give you a string, and you could put a colon d in there and it would actually give you the integer 99.
Yeah, definitely.
That is cool.
It has some other cool things too. If you were going to do that and pass in elements of a dictionary, you can have this thing return basically things that look like dictionaries, with both positional elements and named elements.
And it's pretty neat.
And I was playing with the find all.
You can put that in a loop to say, like, for all the matches in a string.
And I gave it a big file, a CSV file or something, and pulled out elements.
And it works really well.
And the thing I like about it is it's more readable than a regular expression.
So yeah, for sure.
If you've got something simple like that, that you've got multiple people that have
to be able to support it, I think this is a good choice.
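To make the comparison concrete, here is a minimal sketch of the episode example from the conversation, written once with a regular expression and once with parse. The function names and the sample format strings are made up for illustration. The parse calls assume the third-party package is installed (pip install parse), which is why they are imported inside the functions. The two versions are only roughly equivalent, since parse ignores case by default.

```python
import re

# The regular expression you would otherwise write and maintain.
EPISODE_RE = re.compile(r"This is episode (\d+) of Python Bytes")


def episode_with_regex(text):
    """Return the episode number as an int, or None if the text does not match."""
    match = EPISODE_RE.fullmatch(text)
    return int(match.group(1)) if match else None


def episode_with_parse(text):
    """Same job with parse: the pattern reads like a format string."""
    # Imported here so the regex half of this sketch still runs without
    # the third-party package (assumption: `pip install parse`).
    from parse import parse

    result = parse("This is episode {:d} of Python Bytes", text)
    # {:d} converts the field to an int; result is None when nothing matches.
    return None if result is None else result[0]


def named_fields_with_parse(text):
    """Named fields come back in a dict-like form via result.named."""
    from parse import parse

    result = parse("{name} hosts episode {number:d}", text)
    return None if result is None else result.named


def all_episodes_with_parse(text):
    """findall yields one result per match, so it fits naturally in a loop."""
    from parse import findall

    return [match[0] for match in findall("episode {:d}", text)]
```

The regex version needs a capture group and an explicit int() call, while the parse version keeps the type conversion in the pattern itself, which is the readability point being made here.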
Yeah, I love it.
It's a really cool example.
And you can tell that it's probably written by somebody who understands regex well under the covers, but you don't have to think about it because it has like a compiled mode and things like that, which regex often do.
Yeah. And you can pass in a pattern apparently, but if you were going to figure out patterns, then why not just use regex?
Yeah, quite cool.
Anyway.
I like it. I'm going to see if I can use it next time I need something like this.
So this next one I want to talk about has to do with GUIs.
Can you believe it?
Yeah, we've covered that a few times.
I think we have.
So this one is called the fman build system, and it comes from the fman project, which is a dual-pane file explorer for Windows and Mac and so on, from Michael Herrmann.
It's a pretty cool project, but I'm not so much interested in covering the desktop app that he built per se as the tool to build it: the fman build system. What it lets you do is create GUI apps for Windows, Mac, and Linux, as in, here is my .app file or my .exe, go click it. In fact, it gives you an installer, right?
A proper installer for Windows, and on macOS a disk image, one of those drag-here ones with the Applications folder, for a .app file.
I mean, this is nice.
Wow, that's kind of like one of the missing pieces
that we've had for packaging and sending out things.
It really is, right?
It's quite cool. So,
like I said, Windows, Mac, and Linux works really well. It's what he uses for his project.
It's open source, licensed under the GPL, so you can use it for free on open source projects.
For commercial stuff, you can basically buy a license for it. Now, if you're using Qt,
you also have to buy a license for Qt,
and that's kind of a complicated story.
Looking to figure out a little bit more about that,
honestly, I don't really know the full story there.
It's sort of, I got this commercial side
and this open source side,
but at least if you're doing open source stuff,
I think it's a really cool option.
Yeah, I like that.
Even the idea of matching the model, a similar model to what Qt's is, is a decent idea.
Yeah, and if the point is to package Qt apps, you know, it's almost probably unavoidable.
Yeah, definitely.
And I've also got to give a shout out to Michael Herrmann.
It's not trivial to take a piece of your project and pull it off, as he did with the build system, so that it's usable for other people. That's pretty cool.
Yeah, absolutely. So, JSON, not a whole lot of validation there, is there?
Well, I think there's a lot of ways to validate JSON, but I don't know if everybody does. I don't think I have: in all the times where I've used JSON to talk between different parts of an application, I usually just kind of assume it's all working right.
Yeah, for sure.
But there are validators out there
and this one, the one I want to cover, has been recommended by a few people.
The documentation is a little light.
It's called fastjsonschema.
But the name is descriptive.
Yeah, definitely.
One of the articles I'm going to link to has four different libraries, including fastjsonschema.
And I'm not sure what they were validating, but it's way faster than everything else.
So jsonschema takes five seconds, jsonspec takes seven, and then this one, fastjsonschema, 250 milliseconds.
And I'm not sure how big of a data set this is to have anything take five seconds or seven seconds.
Yeah, it must be the same size one would hope, you know?
Yeah, it's a compiling scheme.
So the scheme is a pretty simple interface.
Like I said, the documentation is a little slim, but you describe a schema in terms of what the type of each element is supposed to be.
And I think there are some optional keywords and stuff like that you can throw in there.
And then you compile it. You compile that into your own validator.
So, like we were talking about with regex, you can compile it so that it runs faster, and that's what this does. You compile your own validator, and then you can use that to validate any data that you want.
Yeah.
It's cool.
Yeah.
So JSON Schema is a separate specification, and you can learn about it at json-schema.org. It allows you to create a secondary JSON file that is the type system for the original data exchange, right? So if I have, like, an address, I could say, okay, here's my schema: this is an object that has properties like post office box and extended address.
Those are strings and so on.
You can even have dependencies and stuff like that.
So the post office box must come with a valid street address, which is defined elsewhere.
So this is pretty cool.
You take those files, you feed it to this validator
and it'll take anything you get back
from say a web service or something and say, yeah, this is valid or not.
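As a rough sketch of that workflow, the address schema described above might look like the dictionary below, compiled once into a validator and then reused. The property names and the helper functions are illustrative assumptions, not taken from the show notes. The code assumes fastjsonschema is installed (pip install fastjsonschema) and relies only on its compile function and its JsonSchemaException base error.

```python
# A JSON Schema for an address, written as a plain Python dict.
# "dependencies" says: if a post office box is given, a street address
# must be present as well.
ADDRESS_SCHEMA = {
    "type": "object",
    "properties": {
        "post-office-box": {"type": "string"},
        "extended-address": {"type": "string"},
        "street-address": {"type": "string"},
    },
    "dependencies": {"post-office-box": ["street-address"]},
}


def build_validator(schema):
    """Compile the schema once; the returned callable can be reused many times."""
    # Imported here so the schema itself can be inspected without the
    # third-party package (assumption: `pip install fastjsonschema`).
    import fastjsonschema

    return fastjsonschema.compile(schema)


def is_valid(validate, document):
    """Return True if the document passes the compiled validator."""
    import fastjsonschema

    try:
        validate(document)
    except fastjsonschema.JsonSchemaException:
        return False
    return True
```

Typical use would be to call build_validator(ADDRESS_SCHEMA) at startup and then pass each decoded web service response to is_valid, so the compile cost is paid only once.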
And this project's been around for a while.
But the big news lately is that there's multiple drafts of this JSON schema.
And the tool we're talking about covers drafts four, six and seven.
Right. Which is pretty nice.
Cool.
Yeah, very cool.
So there were other ones, and they were apparently kind of stale, didn't cover the latest drafts and things. So nice, nice find. And it's way, way faster.
Now, before we get to the next one, I want to tell you about a cool feature at DigitalOcean.
So at DigitalOcean, you can, of course, log in and say, create me what they call a droplet,
a new virtual machine or various other things, load balancers, firewalls, and so on. And it'll spin up your machine and off you go. And you get some choices like various versions of Ubuntu and other stuff. But what you can do if you'd like is you can
create your own local virtual machine, whatever you want, some kind of Linux, as long as you can
install a few dependencies that it needs to interact with the DigitalOcean infrastructure
and upload that. And from then on, you can just click a button and say, create me
my super special private server, as many as you want with their API or whatever. Very cool. Definitely. Yeah, pretty
cool. So if you want more control over how your virtual machines get created and what they even
look like, check them out at pythonbytes.fm slash DigitalOcean. New users get a hundred dollar
credit and they've got a bunch of cool stuff that you can do with all their infrastructure.
Speaking of infrastructure, a lot of people use IPython these days in the whole data science space, right?
Yeah, very big.
And people might be tired of me going on and on about async. I know some people are not a fan, but it's just so powerful.
And when it's used at the right time, very, very nice.
But until recently, IPython was a thing that you put Python code into,
and Async was a thing that you did in files,
you know, applications that executed Python code,
but they didn't really go together.
Yeah, I'm still trying to get my head around
them going together, but yeah.
So here's the thing.
If I have an Async library that I want to use,
basically the only way to use it in IPython previously,
I believe, was to spin up all the infrastructure
to sort of host the async loop yourself,
which is like five or six lines to just call the function.
So now in IPython, you can just say await, give it a function, and it just runs it automatically.
Oh, okay.
That's cool.
Yeah, so IPython 7 is out.
And one of the big features that it has is the interactive shell now supports async and
await, which is really cool.
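For a sense of the difference, here is a small sketch of the loop-hosting boilerplate needed in a plain script or an older REPL. The coroutine fetch_answer and the helper run_blocking are hypothetical names invented for this example, and the sketch uses only the standard library asyncio module.

```python
import asyncio


async def fetch_answer():
    """Stand-in for a call into some async library."""
    await asyncio.sleep(0.01)
    return 42


def run_blocking(coro):
    """Host the event loop ourselves just to get one coroutine's result."""
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(coro)
    finally:
        loop.close()


answer = run_blocking(fetch_answer())

# In an IPython 7.0+ session the same call can be typed directly at the prompt:
#     answer = await fetch_answer()
```

The helper is the part that IPython 7 makes unnecessary in interactive use: the shell runs the awaited expression on a loop it manages for you.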
Yeah, that's very neat.
Yeah.
So this one came to us from Nick Spirit.
Thank you, Nick, for sending it in.
And this is written by Matthias Bussonnier.
And he is the guy who originally coined the term legacy Python for the world, as far as I can tell.
Yeah, I think you're right.
Yeah, yeah. He wrote a cool article called planning an early death for Python 2 or something,
you know, friendly like that, and talked about referring to it as legacy Python, which I think is
great. So he also wrote this and he works on IPython and whatnot quite a bit. Talks about how
when IPython dropped support for Python 2, how are they able to sort of make these features
possible, right? If you want to support these types of things, it was much harder to do so
if you want to use a Python 3.5 feature, but you also want to support Python 2. So
it's cool how they talk a lot about that.
Yeah. And I also think the open source community is a little, is sort of changing. We,
we had this idea that kind of from, from, I think other commercial applications that
you should support as many platforms as possible, or like your library should support as many
versions of Python as possible. Right. If it could support 2.1, that'd be awesome.
But at the same time, there's the reality that if you only support the more recent versions,
you can clean up your code and have it be an easier code base for other developers to work in and increase your open source contributions. And I think that's a very real thing. And I think
that's one of, like you said, it's one of the things that they probably addressed with IPython.
Yeah, definitely. It sounds like it here. When they talked about doing the same thing for Django, it was: we were able to delete a bunch of code. And the easiest way to maintain code is to not have it.
So yeah, it's a great point. And here's another example.
A lot of people are using IPython to teach Python, and there's a debate as to whether or not that's a good thing.
But at least now they'll be able to teach async.
Yeah, that's a good point. Yeah, there's a lot of presentations and stuff done there.
And now it's nice and easy to call it. Super cool.
All right. What's the next one you got for us?
I have a library called Molten and Molten is...
Is it for studying volcanoes or what is this?
It's an API framework.
So the link we're going to include has a little video demo on it.
But it's a REST API framework, similar to API Star.
And in fact, there's a motivation page that talks about how API Star is kind of awesome, but there's some of the implementation, like a hook model for middleware, that this author didn't quite like. So they took inspiration from API Star and Rocket, which is a Rust framework, and tried to make this one. And it's Python 3 only, because they're leveraging type hints and type annotations.
Yeah, in a really nice way.
It's really clean looking.
You can implement an API fairly quickly.
But there's also, speaking of schema validation, schema validation built into this system, so that you know that, in the code that you're writing to deal with requests, the data coming in is already going to be valid before it even hits you.
So you won't be hit with invalid data, which is pretty cool.
Yeah, there's a lot of cool validation.
So for example, the hello world type method for a web view method is just def hello, and then name colon str, age colon int. And it actually, you know, grabs the value, say out of the URL or somewhere, puts it in there, converts it to an integer, and passes it in. And you don't have to figure out, you know, how do I go get that from the route match data or other weird data sources like that. So that's really cool.
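To illustrate the general idea of annotation-driven conversion, here is a toy sketch using only the standard library. It is not molten's implementation or API: the helper call_with_coercion and the dictionary of raw parameters are invented for this example, and a real framework would do routing, validation, and error reporting on top.

```python
import inspect


def call_with_coercion(handler, raw_params):
    """Convert raw string parameters to the handler's annotated types, then call it."""
    kwargs = {}
    for name, param in inspect.signature(handler).parameters.items():
        raw = raw_params[name]  # KeyError if the request lacks a parameter
        if param.annotation is inspect.Parameter.empty:
            kwargs[name] = raw  # no annotation: pass the string through
        else:
            kwargs[name] = param.annotation(raw)  # e.g. int("99") gives 99
    return handler(**kwargs)


def hello(name: str, age: int) -> str:
    """The handler only declares what it needs; it never touches the raw request."""
    return f"Hi {name}! You are {age} years old."


# Values pulled from a URL always arrive as strings.
greeting = call_with_coercion(hello, {"name": "Brian", "age": "99"})
```

A value that cannot be converted, such as an age of "abc", raises ValueError before the handler runs, which mirrors the point that invalid data never reaches your own code.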
And then they take it further. You say, okay, well, you could have like a string and an integer in the function, or if you've got something more interesting, you could define an actual class that has a little decorator, so it's a schema. Like, it has an id that's an optional int, it has a description, it has a status that can be, you know, certain values, and it has default values, all sorts of cool stuff. And then you can just say this web method takes this. Like, they have a to-do example as a class, right? It takes a to-do item and it automatically pulls that out and does the validation and checking.
Yeah, I'm loving this. This is great.
Yeah, and also you can define the schema on the output as well, to make sure that you're complying. I think it's kind of neat. And there's a couple of other neat features of it, or at least features of it, whether or not you like them. The middleware is functional-programming-based middleware. And a lot of the different pieces, like if you want to have database management, are all set up to be isolated easily using dependency injection. It allows you to test your different components by themselves or swap in new ones. So it's fun.
Yeah, it looks really cool. You know, well done on that project, you guys. It looks like something that, if I was building an API, I might be pretty excited about using.
So I want to round out this episode with something a little fluffy, but nice, to just remind everybody why we're here and why we use the tools and the technology that we do.
So this last one is called a Python love letter.
Well, I love Python.
Yeah, I love it too.
So this was actually a thing posted to a Reddit thread by a guy who's pretty new to Python. He posted: dear Python, where have you been all my life? Right. And the thing he posted was pretty interesting, but also the comments, right? There are many, many comments, and just all the people either agreeing or disagreeing or whatever.
So the guy says, look, I'm not a developer, but I've been toying with programming for, you know, with BASIC and Perl and whatnot.
And for some reason he decided,
you know, he's done with shell scripts.
So we've heard that before, right?
He's going to go write some Python.
And he said,
look, I went and I learned Python.
No, I didn't go from zero
to production in a day,
but if my coworkers will leave me alone,
I might be in production tomorrow.
Which is, you know, I think that just kind of sums up a lot of what happens in the Python space.
That's neat.
Just kind of a fun story.
Yeah, it's definitely a fun story.
A couple of the comments that came up that I thought were interesting: one person said, welcome to the club.
I came up on C++.
My job highly trained me in C and assembly, and with every project I touch: can't we do 90%, 95% of this in Python?
And we do, right?
We don't need inline assembly most of the time.
Another person said, I have a chip on my shoulder.
I want to do things the hard way and understand them.
So I went with C++ because that's real programming.
Dang it.
But later, after suffering a lot, I kind of learned that, you know, doing things smarter was way better than doing them the hard way and whatnot. So he, you know, sort of found his way to Python, I guess.
One other person said, I felt exactly the same way. I decided to learn it. What a breath of fresh air. Sadly, there are few things in my life that make me feel like this. Python and Bitcoin give me the same levels of enjoyment. I've used Java, Groovy, Scala, Objective-C, C, C++, et cetera, and nothing feels as good as Python does. So definitely, definitely cool.
And then this person, and this is what was notable to me, Brian, closed out his comment with: hell, my next two planned tattoos are Bitcoin and Python logos on my wrists. Way to go.
Okay.
That's some commitment.
The Python one, fine, fine, but you're probably going to regret the Bitcoin one. Is there an abstract cryptocurrency that it's going to encapsulate, like whatever comes next?
I agree, I agree. Anyway, I thought that was fun, and it just reminds us what a great community and ecosystem this is.
Yeah, definitely. I also just wanted to say, assembly code? Real men program in bits.
That's right. 01110, baby.
Anyway.
All right. So that's for our official items. But I see you have one little extra one here that will also bring fun, excitement, and joy to any presentation if you're just sitting down with a coworker or even at like a meetup, right?
Yeah. Oliver Bestwalter got me excited about this, and it's Power Mode.
I'm linking to something called Power Mode II, which is a plugin for PyCharm, but there are power modes in a couple of different editors. It started in Atom, I think, and people have probably done it other places.
It just makes programming more fun.
This is funny.
You introduced this to me.
So let me just sort of give people a little description.
So imagine as you start typing, it's kind of like a bit like a comic book.
The faster you type, the more excited your editor gets.
If you copy and like duplicate a method, like a big bam pow thing will pop out.
Sparks shoot off of your cursor.
The faster you type, the more intense it gets.
Yeah, it's super, super productive and awesome.
I've left it on.
I've turned off the shaking screen, which is a little unsettling to me, and the flames.
But the rest of it, the sparks flying and everything like that, I've been using it for like a week.
You leave it on all the time? Yeah, that's awesome.
Because I really like it when I copy and paste and it goes bam.
I know, bam, pow. That's nice. That's cool. Yeah, Power Mode, if you're using PyCharm, is definitely fun to check out. And you've got the link in the show notes.
Cool. Okay. All right.
Well, that's a fun one to close it out for sure.
All right.
Thanks, Brian.
And chat with you.
Yeah, thank you.
Yeah, chat with you next time.
Bye, everyone.
Okay, bye.
Thank you for listening to Python Bytes.
Follow the show on Twitter via at Python Bytes.
That's Python Bytes as in B-Y-T-E-S.
And get the full show notes at PythonBytes.fm.
If you have a news item you want featured,
just visit pythonbytes.fm and send it our way.
We're always on the lookout for sharing something cool.
On behalf of myself and Brian Okken,
this is Michael Kennedy.
Thank you for listening and sharing this podcast
with your friends and colleagues.