Python Bytes - #57 Our take on Excel and Python

Episode Date: December 21, 2017

Topics covered in this episode: Testing Python 3 and 2 simultaneously with retox; Robo 3T / RoboMongo; regular expressions; MongoEngine; Introducing PrettyPrinter for Python; Excel and Python; Extras; Jok...e. See the full show notes for this episode on the website at pythonbytes.fm/57

Transcript
Starting point is 00:00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 57, recorded December 19th, 2017. I'm Michael Kennedy. And I'm Brian Okken. And we have a bunch of really cool stuff to share with you. As always, we've been looking through all of the news sources and finding all the Python goodies for you. Brian, what's the first goody that you found? Anthony Shaw, or I know him as Tony,
Starting point is 00:00:25 has some work on GitHub. I like his handle, tonybaloney. But he has been working on a tool called Retox. And this is a tool you can really run anything with, but the original intent is to run your tests. What tox does is, tox is a tool that will run your setup, if you have packages. You can turn that off if you don't want the setup to get tested, but it'll test your setup and then run your tests in different environments. And the typical example is running multiple Python versions, running all your tests against multiple Python or multiple library versions to make sure all your tests pass. And there is a detox version that can distribute that across processors,
Starting point is 00:01:18 can speed it up two to three times or two to four times faster or more if you run it on distributed processors. Or if you spent like, what, $15,000 on a new iMac Pro and got 18 cores? Oh, yeah, yeah. You could be 18 times faster or 16 or something. Anthony put together Retox, which does all this, but also does it with a nice GUI so you can watch your tests run, which is really kind of cool. And he did it for Python 2 at first, and then very quickly within a week ported it to Python 3 with some problems along the way, but he worked through them, which is nice.
Starting point is 00:01:50 Yeah, well done. It also has the cool capability of watching a directory. So you can have this GUI sitting up on a monitor somewhere if you've got a couple monitors, or in the corner of your window, and then you can have it watch your source code, and as you save changes, it tests everything on multiple versions of Python or multiple hardware or whatever you want to do. Yeah, that's really, really cool. You know, I'll just describe kind of what it looks like. So you run it in terminal or command prompt and you get like an ANSI art
Starting point is 00:02:21 type of interactive thing and it shows you different columns for the different tests that are all running like Python 2.7, Python 3.6, Lint, PyLint, all these things running in parallel and their status. It's pretty cool, right? Yeah. And one of the things I was trying to do this this morning, I didn't get a chance to quite get it done with meetings and all, but instead of running Python 2 versus 3 and stuff, what I'm going to use it for is running the same set of tests with the same Python, but against different instruments.
Starting point is 00:02:51 So I can run across multiple hardware. That's cool. So you're parallelizing the same code running, but you're parallelizing the hardware, not the versions of Python. Yeah. Yeah, that's pretty awesome. That'll be good.
Starting point is 00:03:04 Yeah, that'll be good. So I thought this episode I'd talk about some of the tools that I've been using for a little while that I really enjoy, but just haven't thought to bring up on the show. So the first one I want to talk about is a database management tool for MongoDB. It used to be called RoboMongo. It got bought by a company, which is an interesting story in and of itself, by a company called 3T. And so now it's Robo 3T, but in my heart, it will always be Robo Mongo. Anyway, what this thing is, is a shell for MongoDB that you type. So command line type of interaction with it, but all the results appear as a GUI results. So you have this kind of part where you type in and you run commands against Mongo, but all the results show up in maybe a tree view, or you can right click and interact with them once
Starting point is 00:03:50 the results come back and stuff. So it's really, really nice. If you're doing anything with MongoDB, it runs on all the platforms, check it out. Robo 3T. Nice way to visualize your database. That's cool. Yeah. It's one of the best, like simple visual interactions with databases because the majority of what you do is still the CLI. It's still the straight interface for working with it. But all the results that come out of that interaction are GUI-based, which I think is cool. It's also interesting in an open-source way. On one hand, it's the number 34 most popular repository on GitHub, which is cool.
Starting point is 00:04:22 It's built with Qt, which makes it look like a really nice cross-platform app, which you could build out with PySide or PyQt, I think, and get the same looking app. So that's also a really interesting example. And it was an open source project that was started somewhere in Eastern Europe. I don't remember where. Became really successful. And this other company bought it. And so here's also an interesting business model around
Starting point is 00:04:45 open source stuff. Yeah. Cool. So anyway, if you're doing anything with MongoDB, check out RoboMongo. Cool. Sometimes we talked about in the new version of Django, you don't have to do regular expressions anymore for... For the routes, right? For the route definitions, which I think is a positive thing for sure. Yeah, definitely. But there's definitely sometimes where I've been using regular expressions for as long as I've been programming, almost. So programming's always been hard. Yeah. Yeah, shortly into my CS program,
Starting point is 00:05:14 I got thrown a Perl book and was told, learn this. So yeah. Anyway, there's a couple articles that came out recently that I thought were really good for people that need to get a handle on regular expressions quickly. And one of them is Regular Expressions Practical Guide. And it's kind of a nice article that talks through using the re module to do things like parse email addresses and phone numbers and URLs. And those are good examples, even if you don't have to do that. Everybody knows what
Starting point is 00:05:46 those look like. So it's good to sort of learn some of the regular expressions with that. Yeah, I really like the example driven approach as well. Like here's how you match a bunch of different things you might want to. And also the fact that it's using the Python libraries and not just here's a random regular expression means it's really quick to just drop in directly, which is cool. And then there's another article called Regular Expressions for Data Scientists that does some of this, but then also it's mostly focused around parsing text,
Starting point is 00:06:16 of course, with regular expressions. But it's also a good intro, and it dives a little bit deeper into findall and search. And so I think check both those out if you want to beef up on regular expressions. Yeah, that's really cool. Nice. I like the motivational introduction for the data science one. Sometimes this might include searching a massive corpus of text. For example, suppose you're asked to figure out who's been emailing whom in a scandal of the Panama Papers. That's 11.5 million documents. We need regular expressions.
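As a minimal sketch of the `findall` and `search` calls mentioned above, the following uses Python's standard `re` module on a made-up sentence. The sample text and both patterns are assumptions for illustration only; they are not taken from either article, and real email and phone formats vary far more than these patterns allow.

```python
import re

# Invented sample text for illustration; not from either article.
text = "Contact alice@example.com or bob@example.org, or call 503-555-0142."

# Deliberately simple patterns (assumptions, not production-grade validators).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\d{3}-\d{3}-\d{4}")

# findall returns every non-overlapping match as a list of strings.
emails = EMAIL.findall(text)

# search returns the first match object, or None if nothing matches.
first_phone = PHONE.search(text)

print(emails)
if first_phone is not None:
    print(first_phone.group())
```

`findall` is the convenient choice when only the matched strings are needed; `search` returns a match object, which also exposes positions and groups.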
Starting point is 00:06:50 Let's go. Yeah, definitely. Yeah. Yeah. Very exciting. Very exciting. Cool. So before we go on, let me just tell you about the sponsor for this week's episode. That is DigitalOcean. Thank you, DigitalOcean, for sponsoring the show as they are, I would say, the major sponsor for the show. They definitely sponsor it more than anyone else. And with good reason. They have really, really cool things going on over there. You know, there's a lot of places you can get your cloud computing resources, but a lot of them are overly complicated, overly expensive, and so on. With DigitalOcean, go over there, quickly set up a server, set up some storage, link them together. It's
Starting point is 00:07:25 really wonderful. So this site's running on DigitalOcean, among a bunch of other things that I'm running as well. So definitely a great place to check out. And you can get started by going to pythonbytes.fm/digitalocean. So keeping with my theme of things that I want to talk about that I've been using and really enjoying but haven't bothered to bring up on the show, to complement the other side of the story with the MongoDB thing is MongoEngine. So you've heard of SQLAlchemy, right, Brian? Yeah, definitely. Yeah, so that's probably the most popular way in Python to talk to relational databases.
Starting point is 00:08:01 Well, the MongoDB equivalent of SQLAlchemy is this thing called MongoEngine. There's a handful of these so-called object document mappers, because you don't have a relational database. So it's not an ORM, it's an ODM. And anyway, there's a handful of these for MongoDB, but this one really is quite neat. It lets you take a class derived from a base class and map it into MongoDB. And the way you work with it is very similar to Django's ORM, actually. You map a class, you create an instance of it, you call save, you can do queries on it, all sorts of stuff, but leveraging the hierarchical document DB nature of MongoDB. It also adds new features, like MongoDB doesn't really have schemas
Starting point is 00:08:41 in general. They're starting to add some of these features, but there's no schema definition. It's just whatever you put in there. With MongoEngine, you have a schema that matches your classes, so you have some reliable data structure. MongoDB has no concept of required fields or constraints, like this number must be between 10 and 1,000,
Starting point is 00:08:59 but MongoEngine does. Relationships, all sorts of cool stuff, so it really provides a lot of nice structure around working with a NoSQL schema-less database. Oh, that's nice. Yeah, I'll definitely check that out. Yeah, so I've found this to be super helpful. Like actually, a lot of my sites are implemented with it, including Python Bytes, for example. The one thing that's a little bit of a pain is the deserialization speed is not awesome.
Starting point is 00:09:23 So if you retrieve 100,000 records from MongoDB, that's really fast probably, but then it'll drag on turning those into MongoEngine documents. But if you're getting 50 or 100, it's totally fine. For whatever reason, it's just not that fast at deserializing and stuff. And usually it's plenty fast,
Starting point is 00:09:38 but if you do it 100,000 times, it starts to show up. Okay. Yeah. All right, what have you got next? Another pretty printer. So this is, I guess it's called PrettyPrinter. There's an article called Introducing PrettyPrinter for Python. I think it's an extensible one, which is kind of nice. It's Python 3.6 and above only, but somebody was taking a look at all the different
Starting point is 00:10:01 ways to pretty print your output for debugging your code and wasn't really happy with them, so they added yet another. But this one is kind of extensible and cool, so if you have a particular way you want to print your classes or your calls to different functions, you can customize that. And it's a pretty simple interface, and I like it. Yeah, it's really nice. So if you're in, say, the REPL, you can ask it to print out some stuff. And it'll actually do things like format the dictionaries according to PEP 8 with line breaks and indentation. It'll colorize all the values. Like the keys will be one color. The values will be another.
Starting point is 00:10:42 Things like that. So it really does. It is really nice like that. The colorizing is very nice. That's pretty cool. Yeah, other things like that. So it really does, it is really nice like that. The colorizing is very nice. That's pretty cool. Yeah, I would say that the colorizing and also the reformatting for human readable parts is quite nice as well.
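To give a rough feel for the line-breaking and indentation described here, the sketch below uses the standard library's `pprint` module as a stand-in. It is not the PrettyPrinter package from the article: it does no colorizing and its layout differs. The sample record and the `width=40` setting are made up for illustration.

```python
from pprint import pformat

# Made-up record for illustration only.
record = {
    "name": "Python Bytes",
    "episode": 57,
    "topics": ["retox", "Robo 3T", "MongoEngine", "PrettyPrinter", "Excel and Python"],
}

# A narrow width forces the dictionary to wrap onto several indented lines.
# Note: this is stdlib pprint, standing in for the third-party package.
formatted = pformat(record, width=40)
print(formatted)
```

The stdlib version sorts dictionary keys by default and wraps long containers, which is often enough for quick debugging output.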
Starting point is 00:10:56 So it's very nice and has a nice declarative API as well. So you can even extend it, right? There's an example in there to be able to extend your call with comments even, to have the comments pop out. So let me ask you a question. What do you think the most popular database in the world is? MySQL.
Starting point is 00:11:14 MySQL is definitely a good one. I'm going to say Excel. At least in terms of amount of data, the number of people that create new, quote, databases. Excel is this weird thing that people who don't quite know programming or technology use it to more or less play the same role, right? You've got relationships, you've got formulas and whatnot. So Excel is really super important in the business world all over the place. Well, it so happens that Microsoft is considering replacing the macro scripting language for Excel that is built into it, VBA, with Python.
Starting point is 00:11:51 Yes, that would be very nice. So on one hand, I'm like, whee, I don't really want to program Excel, right? It doesn't make me super excited or super happy. On the other, look at what the adoption of Python in the data science space has done for Python in general. It's really, really complemented it and made it a better place. I think that's what would happen if Python became the main scripting language of Excel as well. There's all these people who are not
Starting point is 00:12:16 really working in this space now, and all of a sudden they'd be super interested and contributing things that who knows what they would actually come up with. I'm imagining, this is actually pretty cool. I'm imagining a manager working with a spreadsheet and wanting to add some macro capabilities and getting a little bit too deep into it and getting a little lost, grabbing a developer and saying, hey, can you help me with this? And right now, if it was me, I'd say, no, you're on your own. I don't want to do that. I would hide under a desk like, no, Michael's busy. Michael can't talk right now. He's looking for someone to work on VBA. Stay away. But if it's a Python interface there, yeah, I'd help out. So I think the interaction between developers and managers would
Starting point is 00:12:58 increase dramatically if Python was in Excel. Yeah, I think it would be super cool. Who knows what the chances of this actually happening are. I tried to get Python into Windows because, you know, it doesn't ship with Windows. And so Microsoft has this place called UserVoice, but maybe it's not even a Microsoft thing, but they use UserVoice, which lets you vote on features and requests and things like that. So I actually got over 1,000 people to vote for shipping Python 3 with Windows 10 before it actually came out, but no dice. However, they did put Python into SQL Server. So you can do in-process machine learning against your data. So there's one example of them doing this. And if they'll do it for SQL Server, they may well do it for Excel, which would be awesome. So everyone listening can have a voice on this.
Starting point is 00:13:45 If you go there, click the link, upvote the item, and there's a survey you can fill out and tell them Python 3, please. We'll take more of that. That'd be awesome. Yeah. So if this sounds cool to you, be sure to go there, at least upvote it so that they know this is something we all want. And call your congressman. That's right. No, that was net neutrality, which didn't really help so much. But this might.
Starting point is 00:14:08 This might. So that's awesome. All right. Well, that's our items for this week. How about you? Any personal news? We've been getting over, still getting over being sick. So we have a tree up now, but we haven't decorated it.
Starting point is 00:14:20 So that's our decorating for Christmas. That's awesome. That'll be a fun thing. Kids going to be home from school pretty soon? They're all home now. So that's our decorating for Christmas. That's awesome. That'll be a fun thing. Kids going to be home from school pretty soon? They're all home now. So that's fun. So work is super productive these days. Yeah, I can actually get to work at a decent time.
Starting point is 00:14:34 So it's good. Cool. All right. Well, for me, I have a webcast that I'm doing. And I guess one more MongoDB thing. So I'm doing a webcast that I call Let's Build Something in MongoDB in Python. I don't have to click the link. I don't remember when it is.
Starting point is 00:14:48 I think it's February. It is February 22nd. So there's a link at the bottom of the show notes that if you're interested, you can come attend the webcast. And it's just free. It might be fun. I'll definitely go.
Starting point is 00:14:59 So do you have like an intro to Mongo also? I do have an intro to Mongo, a free one, actually. It's at freemongodbcourse.com. Okay. I just thought I'd bring that up. Yeah. And it uses MongoEngine and it uses RoboMongo. How about that?
Starting point is 00:15:14 Nice. But it doesn't use Excel. Good. All right. Well, Brian, thanks so much for finding all these things and sharing with everyone. Yeah. Thanks. Thank you for listening to Python Bytes.
Starting point is 00:15:25 Follow the show on Twitter via at Python Bytes. That's Python Bytes as in B-Y-T-E-S. And get the full show notes at PythonBytes.fm. If you have a news item you want featured, just visit PythonBytes.fm and send it our way. We're always on the lookout for sharing something cool. On behalf of myself and Brian Okken, this is Michael Kennedy.
Starting point is 00:15:45 Thank you for listening and sharing this podcast with your friends and colleagues.
