Python Bytes - #57 Our take on Excel and Python
Episode Date: December 21, 2017
Topics covered in this episode: Testing Python 3 and 2 simultaneously with retox; Robo 3T / RoboMongo; regular expressions; MongoEngine; Introducing PrettyPrinter for Python; Excel and Python; Extras; Jok...e
See the full show notes for this episode on the website at pythonbytes.fm/57
Transcript
Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.
This is episode 57, recorded December 19th, 2017.
I'm Michael Kennedy.
And I'm Brian Okken.
And we have a bunch of really cool stuff to share with you.
As always, we've been looking through all of the news sources and finding all the Python goodies for you.
Brian, what's the first goody that you found?
Anthony Shaw, or I know him as Tony,
has been at work on GitHub. I like his handle, tonybaloney. He has been working on a tool called
Retox. You can really run anything with it, but the original intent is
to run your tests. It builds on tox, which is a tool that will run your setup
if you have a package (you can turn that off if you don't want the setup to get tested). It'll
test your setup and then run your tests in different environments. The typical example is
running all your tests against multiple Python versions or multiple
library versions to make sure all your tests pass.
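As a rough illustration of the multi-environment setup described here, the sketch below uses Python's standard configparser module to generate a minimal tox.ini. The environment names (py27, py36), the pytest dependency, and the test command are assumptions chosen for the example; none of them come from the episode.

```python
import configparser
import io

# Build a minimal tox configuration in memory.
# The env names, deps, and command are illustrative assumptions.
config = configparser.ConfigParser()
config["tox"] = {"envlist": "py27, py36"}
config["testenv"] = {
    "deps": "pytest",
    "commands": "pytest",
}

# Render the configuration as the text that would go into tox.ini.
buffer = io.StringIO()
config.write(buffer)
tox_ini_text = buffer.getvalue()
print(tox_ini_text)
```

Saved as tox.ini next to a project's setup.py, a file like this tells tox to create one virtual environment per entry in envlist, install the package into each, and run the command there.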
And there is a detox version that can distribute that across processors,
which can speed it up two to four times or more, depending on how many cores you run it on. Or if you spent like, what, $15,000 on a new iMac Pro and got 18 cores?
Oh, yeah, yeah. You could be 18 times faster or 16 or something.
Anthony put together Retox, which does all this,
but also does it with a nice GUI so you can watch your tests run,
which is really kind of cool.
And he did it for Python 2 at first,
and then very quickly within a week ported it to Python 3 with some problems along the way,
but he worked through them, which is nice.
Yeah, well done.
It also has the cool capability of watching a directory.
So you can have this GUI sitting up on a monitor somewhere
if you've got a couple monitors, or in the corner of your window,
and then you can have it watch your source code,
and as you save changes, it tests everything on multiple versions of Python or multiple hardware
or whatever you want to do. Yeah, that's really, really cool. You know, I'll just describe kind
of what it looks like. So you run it in a terminal or command prompt and you get like an ANSI art
type of interactive thing, and it shows you different columns for the different
tests that are all running, like Python 2.7, Python 3.6, lint, Pylint, all these things running in
parallel, and their status. It's pretty cool, right? Yeah. And one of the things I was trying to do
this morning, I didn't get a chance to quite get it done with meetings and all, but instead of
running Python 2 versus 3 and stuff, what I'm going to use it for
is running the same set of tests
with the same Python,
but against different instruments.
So I can run across multiple hardware.
That's cool.
So you're parallelizing the same code running,
but you're parallelizing the hardware,
not the versions of Python.
Yeah.
Yeah, that's pretty awesome.
That'll be good.
Yeah, that'll be good.
So I thought this episode I'd talk about some of the tools that I've been using for a little while that I really enjoy, but just haven't thought to bring up on the show.
So the first one I want to talk about is a database management tool for MongoDB.
It used to be called RoboMongo.
It got bought, which is an interesting story in and of itself, by a company called 3T. And so now it's Robo 3T, but in my heart, it will always be RoboMongo. Anyway,
what this thing is, is a shell for MongoDB that you type into. So it's a command-line type of interaction,
but all the results appear in a GUI. So you have this part where
you type in and run commands against Mongo, but all the results show up in maybe a tree view, and you can right-click and interact with them once
the results come back. So it's really, really nice. If you're doing anything with MongoDB, and it
runs on all the platforms, check it out: Robo 3T. Nice way to visualize your database. That's cool.
Yeah. It's one of the best, like simple visual interactions with databases because
the majority of what you do is still the CLI.
It's still the straight interface for working with it.
But all the results that come out of that interaction are GUI-based, which I think is cool.
It's also interesting in an open-source way.
On one hand, it's number 34, most popular repository on GitHub, which is cool.
It's built with Qt, which makes it look like a really nice cross-platform app, which you
could build out with PySide or PyQt, I think, and get the same looking app.
So that's also a really interesting example.
And it was an open source project that was started somewhere in Eastern Europe.
I don't remember where.
Became really successful.
And this other company bought it.
And so here's like also like an interesting business model around
open source stuff. Yeah. Cool. So anyway, if you're doing anything with MongoDB, check out RoboMongo.
Cool. Some time ago we talked about how, in the new version of Django, you don't have to do regular expressions
anymore for... For the routes, right? For the route definitions, which I think is a positive thing for
sure. Yeah, definitely. But I've been using regular expressions
for almost as long as I've been programming.
So programming's always been hard.
Yeah.
Yeah, shortly into my CS program,
I got thrown a Perl book and said, learn this.
So yeah.
Anyway, there's a couple articles that came out recently
that I thought were really good for people
that need to get a handle on regular expressions quickly.
And one of them is Regular Expressions Practical Guide.
And it's kind of a nice article that talks through using the re package to do things like parse email addresses and phone numbers and URLs.
And those are good examples, even if you don't have to do that. Everybody knows what
those look like. So it's good to sort of learn some of the regular expressions with that.
Yeah, I really like the example driven approach as well. Like here's how you match
a bunch of different things you might want to. And also the fact that it's using the Python
libraries and not just here's a random regular expression means it's really quick to just drop
in directly, which is cool.
And then there's another article called
Regular Expressions for Data Scientists that does some of this,
but it's mostly focused around parsing text,
of course, with regular expressions.
It's also a good intro,
and it dives a little bit deeper into findall and search.
And so I think check both of those out if you want to beef up on regular expressions.
Yeah, that's really cool. Nice.
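For listeners who want a feel for the difference between search and findall in the re module, here is a minimal sketch. The sample text and the deliberately simplified email pattern are made up for illustration and are not taken from either article.

```python
import re

# Sample text is invented for this example.
text = "Contact alice@example.com or bob@example.org, or call 555-0100."

# A deliberately simple email pattern; real-world addresses can be messier.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

first = EMAIL.search(text)        # first match only, as a match object (or None)
all_emails = EMAIL.findall(text)  # every non-overlapping match, as a list of strings

print(first.group(0))
print(all_emails)
```

search is the natural choice when one hit is enough, while findall suits the "who has been emailing whom" style of question where every occurrence matters.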
I like the motivational introduction for the data science one.
Sometimes this might include searching a massive corpus of text.
For example, suppose you're asked to figure out who's been emailing whom in a scandal of the Panama Papers. That's 11.5 million documents. We need regular expressions.
Let's go. Yeah, definitely. Yeah. Yeah. Very exciting. Very exciting. Cool. So before we go
on, let me just tell you about the sponsor for this week's episode. That is DigitalOcean. Thank
you, DigitalOcean, for sponsoring the show as they are, I would say, the major sponsor for the show.
They definitely sponsor it more than anyone else.
And with good reason.
They have really, really cool things going on over there.
You know, there's a lot of places you can get your cloud computing resources, but a lot of them are overly complicated, overly expensive, and so on.
With DigitalOcean, go over there, quickly set up a server, set up some storage, link them together. It's
really wonderful. So this show's website is running on DigitalOcean, among a bunch of other things
that I'm running as well. So definitely a great place to check out. And you can get started by
going to pythonbytes.fm/digitalocean. So keeping with my theme of things that I want to
talk about that I've been using and really enjoying but haven't bothered to bring up on the show,
to complement the other side of the story with the MongoDB thing is MongoEngine.
So you've heard of SQLAlchemy, right, Brian?
Yeah, definitely.
Yeah, so that's probably the most popular way in Python to talk to relational databases.
Well, the MongoDB equivalent of SQLAlchemy is this thing called MongoEngine.
There's a handful of these so-called object-document mappers, because you don't have
a relational database. So it's not an ORM, it's an ODM. And anyway, there's a handful of these for MongoDB,
but this one really is quite neat. It lets you take a class derived from a base class and map
it into MongoDB. And the way
you work with it is very similar to Django's ORM, actually. You map a class, you create an instance
of it, you call save, you can do queries on it, all sorts of stuff, but leveraging the hierarchical
document DB nature of MongoDB. It also adds new features. Like, MongoDB doesn't really have schemas
in general. They're starting to add some of these features, but there's no schema definition.
It's just whatever you put in there.
With MongoEngine, you have a schema
that matches your classes,
so you have some reliable data structure.
MongoDB has no concept of required fields
or constraints, like this number must be
between 10 and 1,000,
but MongoEngine does. Relationships,
all sorts of cool stuff,
so it really provides a lot of nice structure around working with a NoSQL schema-less database.
Oh, that's nice.
Yeah, I'll definitely check that out.
Yeah, so I've found this to be super helpful.
Like actually, a lot of my sites are implemented with it, including Python Bytes, for example.
The one thing that's a little bit of a pain is the deserialization speed is not awesome.
So if you retrieve 100,000 records from MongoDB,
that's really fast probably,
but then it'll drag on turning those
into Mongo Engine documents.
But if you're getting 50 or 100, it's totally fine.
For whatever reason,
it's just not that fast at deserializing stuff.
And usually it's plenty fast,
but if you do it 100,000 times,
it turns out to start to show up.
Okay.
Yeah.
All right, what have you got for us next?
Another pretty printer.
So this is, I guess it's called PrettyPrinter. There's an article called Introducing PrettyPrinter for Python. It's an extensible one, which is kind of nice.
It's Python 3.6 and above only, but somebody was taking a look at all the different
ways to pretty print output for debugging your code and wasn't really happy with them, so they added yet another. But this one is kind of extensible and cool, so
you can add on. If you have a particular way you want to print your classes or your calls to
different functions, you can customize that. It's a pretty simple interface, and I like it. Yeah,
it's really nice. So if you're in, say, the REPL, you can ask it to print out some stuff.
And it'll actually do things like format the dictionaries according to PEP 8 with line breaks and indentation.
It'll colorize all the values.
Like the keys will be one color.
The values will be another.
Things like that.
So it really is nice like that.
The colorizing is very nice.
That's pretty cool.
Yeah, I would say that the colorizing
and also the reformatting for human readable parts
is quite nice as well.
So it's very nice and has a nice declarative API as well.
So you can even extend it, right?
There's an example in there to be able to extend,
extend your call with comments even,
to have the comments pop out.
So let me ask you a question.
What do you think the most popular database in the world is?
MySQL.
MySQL is definitely a good one.
I'm going to say Excel.
At least in terms of amount of data,
the number of people that create new, quote, databases.
Excel is this weird thing that people who don't quite know programming or technology use it to more or less play the same role, right?
You've got relationships, you've got formulas and whatnot.
So Excel is really super important in the business world all over the place. Well, it so happens that Microsoft is considering replacing the macro scripting language for Excel
that is built into it, VBA, with Python.
Yes, that would be very nice.
So on one hand, I'm like,
whee, I don't really want to program Excel, right?
It doesn't make me super excited or super happy.
On the other, look at what the adoption of Python
in the data science space has done for Python in general.
It's really, really complemented it and made it a better place. I think that's what would happen if
Python became the main scripting language of Excel as well. There's all these people who are not
really working in this space now, and all of a sudden they'd be super interested and contributing
things, and who knows what they would actually come up with. I'm imagining, this is actually pretty cool. I'm imagining a manager working with a spreadsheet
and wanting to add some macro capabilities and getting a little bit too deep into it and getting
a little lost, grabbing a developer and say, hey, can you help me with this? And right now,
if it was me, I'd say, no, you're on your own. I don't want to do that.
I would hide under a desk like, no, Michael's busy. Michael can't talk right now. He's looking
for someone to work on VBA. Stay away. But if it's a Python interface there,
yeah, I'd help out. So I think the interaction between developers and managers would
increase dramatically if Python was in Excel. Yeah, I think it would be super cool. Who knows
what the chances of this actually happening are. I tried to get Python into Windows because, you know,
it doesn't ship with Windows. And so Microsoft has this place called UserVoice, well,
maybe it's not even a Microsoft thing, but they use UserVoice, which lets you vote on features
and requests and things like that. So I actually got over 1,000 people to vote for shipping Python 3 with Windows 10 before it actually came out, but no dice. However,
they did put Python into SQL Server. So you can do in process machine learning against your data.
So there's one example of them doing this. And if they'll do it for SQL Server, they may well do it
for Excel, which would be awesome. So everyone listening can have a voice on this.
If you go there, click the link, upvote the item, and there's a survey
you can fill out and tell them Python 3, please. We'll take more of that. That'd be awesome.
Yeah.
So if this sounds cool to you, be sure to go there, at least upvote it so that
they know this is something we all want.
And call your congressman.
That's right. No, that was net neutrality, which didn't really help so much.
But this might.
This might.
So that's awesome.
All right.
Well, that's our items for this week.
How about you?
Any personal news?
We've been getting over, still getting over being sick.
So we have a tree up now, but we haven't decorated it.
So that's our decorating for Christmas.
That's awesome.
That'll be a fun thing.
Kids going to be home from school pretty soon?
They're all home now.
So that's fun.
So work is super productive these days.
Yeah, I can actually get to work at a decent time.
So it's good.
Cool.
All right.
Well, for me, I have a webcast that I'm doing.
And I guess one more MongoDB thing.
So I'm doing a webcast that I call Let's Build Something in MongoDB in Python.
I'd have to click the link.
I don't remember when it is.
I think it's February.
It is February 22nd.
So there's a link at the bottom of the show notes
that if you're interested,
you can come attend the webcast.
And it's just free.
It might be fun.
I'll definitely go.
So do you have like an intro to Mongo also?
I do have an intro to Mongo, a free one, actually.
It's at freemongodbcourse.com.
Okay.
I just thought I'd bring that up.
Yeah.
And it uses MongoEngine and it uses RoboMongo.
How about that?
Nice.
But it doesn't use Excel.
Good.
All right.
Well, Brian, thanks so much for finding all these things and sharing with everyone.
Yeah.
Thanks.
Thank you for listening to Python Bytes.
Follow the show on Twitter via @pythonbytes.
That's Python Bytes as in B-Y-T-E-S.
And get the full show notes at PythonBytes.fm.
If you have a news item you want featured,
just visit PythonBytes.fm and send it our way.
We're always on the lookout for sharing something cool.
On behalf of myself and Brian Okken,
this is Michael Kennedy.
Thank you for listening and sharing this podcast
with your friends and colleagues.