Python Bytes - #94 Why don't you like notebooks?

Episode Date: September 6, 2018

Topics covered in this episode: Python Patterns; Arctic: Millions of rows a sec (time data); PyCon Australia videos; GAE: Introducing App Engine Second Generation runtimes and Python 3.7; I don’t lik...e notebooks; PEP 8000 -- Python Language Governance Proposal Overview; TIOBE jump (https://www.tiobe.com/tiobe-index/); Extras; Joke. See the full show notes for this episode on the website at pythonbytes.fm/94

Transcript
Starting point is 00:00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 94, recorded September 5th, 2018. I'm Michael Kennedy. And I'm Brian Okken. Hey Brian, how you doing? I'm doing really good. Yeah, you? Excellent. Doing very well. The sun is shining. Summer has not left us yet. It's not that great for productivity, but it's definitely good for keeping the spirits up. Yeah.
Starting point is 00:00:21 You know what else is keeping my spirits up? It's my DigitalOcean servers, the ones running this site and many others. They've been working perfectly. So they've been going really, really strong, and we'll tell you more about them later. But in the meantime, if you want to check them out, pythonbytes.fm slash DigitalOcean, get $100 credit for new users.
Starting point is 00:00:39 Brian, when I was in the C++ world, the C Sharp world, design patterns were like this massive thing. And you had to know all the design patterns. And there was like dependency injection and IOC containers and all this stuff. And I feel like Python doesn't have as much rigor around it because you don't have to jump through so many hoops to make certain things happen, I guess. What do you think? Yeah, I think so. And it's actually something that's interesting because I came from the C++ world.
Starting point is 00:01:08 So design patterns were a thing in C Sharp also? Oh, yeah. Okay. Well, I don't even know who the Gang of Four are, but there were four authors that wrote the design patterns book. Let's see, Erich Gamma, Richard Helm, Ralph Johnson, and I'm not going to try to pronounce that last one, John something. Anyway, gosh, in the 90s, if you were in C++ or C Sharp, apparently, you read this book or others around design patterns. And then when I got into Python, I was a little curious whether that was a thing in Python or not, but I haven't really heard much, other than I haven't really needed it. A lot of this stuff isn't really needed.
Starting point is 00:01:49 What I think is interesting is there's those patterns that you see from the Gang of Four and the sort of derivative ones, derivative books and thinking. And a lot of it, like you say, is not needed, but there are other patterns that are really useful,
Starting point is 00:02:04 like metaclasses, for example, or decorators, or there's other stuff, right, generator methods, all sorts of stuff that is in here that doesn't appear in the Gang of Four because, you know, C++ or Smalltalk just didn't do that. They were highly based on Smalltalk, actually, their patterns. Right. Well, one of the things that caught my attention today was a tweet by, gosh, who's this? Brandon Rhodes. Yep. And he's got a site called python-patterns.guide. And he's sort of going through a lot of different, I think he's going through the Gang of Four book, but he might also be pulling together other design pattern things that he's talked about. Yeah, he's pulling together information from talks and writing, and I think he's creating more information too.
Starting point is 00:02:55 But there are a whole bunch of these, trying to apply some of these patterns to Python and kind of sometimes different ways to do it. So you can do things in different ways. And so far he's got abstract factory pattern, the builder pattern, factory method, composite, decorator. Yeah, we definitely have decorators. And then things like monkey patches and iterators, things like that. And how that applies, I'm glad that somebody that knows what they're talking about
Starting point is 00:03:24 has tried to figure out how does this all apply in Python. And I haven't really dug too much into this. I just think it's a neat resource to try to read about some of these. Yeah, I definitely think it's a really neat resource. And Brandon has some interesting thinking on design patterns and architectures. He gave a super counterintuitive talk called Clean Architecture. I think it was at PyOhio a couple of years ago. And when I first started watching it, I was like, I just disagree with everything you're saying. This just seems so wrong. And then after 10 minutes, I'm like, but wait a minute, I think it's right. Like, I think I've been thinking about this all wrong. And it really, really caught my attention because I didn't agree with it so much.
Starting point is 00:04:07 But then I'm like, wow, this is really compelling what you're telling me. So maybe I need to rethink what I'm thinking. And whenever I have that feeling, I'm like, whoa, I need to pay attention because I might learn something really good here. Yeah. So that's a good point. I'm not necessarily saying, since I haven't really dug through this too much, I'm not sure. I mean, I respect Brandon as a smart guy.
Starting point is 00:04:29 I expect that there's some really great stuff in here, but you may not agree with all of it. So we'll try to dig up a link to that clean architecture, too, because that sounds interesting. It's super interesting. Yeah, yeah. It's definitely a good one. Cool. Well, thanks for bringing this up. I love these Python patterns, and I love sort of the how would these traditional, more formalized patterns actually look in our language? And there's a lot of interesting examples there.
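Editor's sketch: to make the point about patterns being built into the language concrete, here is a minimal Python example, not taken from Brandon Rhodes's site. The names `logged`, `add`, and `countdown` are hypothetical and chosen only for illustration. Note that Python's `@decorator` syntax is related to, but not the same thing as, the Gang of Four Decorator pattern.

```python
import functools


def logged(func):
    """Decorator: wrap a function and record each call without editing it."""
    calls = []

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        calls.append((args, kwargs, result))
        return result

    wrapper.calls = calls  # expose the call log for inspection
    return wrapper


@logged
def add(a, b):
    return a + b


def countdown(start):
    """Generator: the Iterator pattern with no hand-written iterator class."""
    while start > 0:
        yield start
        start -= 1


total = add(2, 3)
remaining = list(countdown(3))
```

In a language without first-class functions or generators, both of these would typically need a dedicated wrapper class or iterator class, which is roughly the "jumping through hoops" mentioned earlier in the conversation.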
Starting point is 00:04:53 Yeah. What do we got next? Well, we got this thing called Arctic. And Arctic is an API framework over top of MongoDB and pandas. And the idea is this is a thing that's been around since around 2012. And its sole purpose is analyzing time series data super fast. So one of their, like their headline is basically, Arctic, millions of rows per second of time data in Python. So that is really quite impressive.
Starting point is 00:05:31 I can tell you a lot of the ODMs and ORMs and stuff, they don't do millions of records per second. So the idea is that it basically bakes in pandas and NumPy and all those kinds of things. And it has an underlying data store that's backed by MongoDB. And it actually uses the binary low-level communication. So instead of trying to store all the data and then bringing it back and deserializing each row, I think what it does is it actually just stores the binary data of pandas, and it'll pickle NumPy arrays and stuff like that, and just exchanges the memory structure and
Starting point is 00:06:03 just pulls it straight back and goes, yep, here it is. Let's look at it. And it's pretty cool. Yeah. Wow. Yeah, definitely. I mean, there's a lot of applications that use just huge amounts of time series data. So, yeah. So they say the two big areas they think it's useful are IoT, little tiny IoT devices that, you know, maybe Python is running on, and financial. So, you know, it's sort of been extracted out of the work of this financial company called Man AHL. I've never heard of them, but I think they're mostly an Asian company, but also in the US, around investment and so on. So they've been working on this, and they actually have some numbers on how this thing performs relative to other types of projects that they pursued or other things that were available.
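Editor's sketch: the storage idea described a moment ago (keep a whole series as one binary blob instead of deserializing row by row) can be illustrated with the standard library alone. This is not Arctic's API and it uses neither MongoDB, pandas, nor NumPy; `to_blob`, `from_blob`, `fake_store`, and the symbol name are hypothetical stand-ins, and the prices are made-up sample values.

```python
import pickle
from array import array

# Made-up sample values standing in for one day's price series.
prices = array("d", [101.5, 101.7, 101.6, 102.0])


def to_blob(values):
    """Serialize the whole series as a single bytes object."""
    return pickle.dumps(values, protocol=pickle.HIGHEST_PROTOCOL)


def from_blob(blob):
    """Restore the whole series in one step, with no per-row parsing."""
    return pickle.loads(blob)


# A plain dict stands in for the document store: one key, one binary value.
fake_store = {}
fake_store["EXAMPLE_SYMBOL"] = to_blob(prices)

restored = from_blob(fake_store["EXAMPLE_SYMBOL"])
```

The design point, under these assumptions, is that the database only ever sees one opaque value per series, so reading it back costs one fetch and one deserialization rather than one per row.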
Starting point is 00:06:50 So they talk about the different kinds of data that they store and analyze for stock trading and analysis. And they say, look, we have this sort of data that's for one day, a whole bunch of it, maybe 10,000 rows. And they can work with those 10,000 rows in four milliseconds. And they say, compare that to what we were getting out of SQL server, which was 2.2 seconds. So, you know,
Starting point is 00:07:13 500 times slower, which is pretty incredible. And they have this other, like, tick data, like, you know, the stock ticker type of data. They say in one second they can process 3.5 megs worth of that data in Python, or 15 megs in Java. And there was some other
Starting point is 00:07:27 project that they were trying to improve over, called OtherTick, which took like 40 seconds versus one. So really, really interesting high-performance, database-backed time series. Neat. Yeah. So if you're into pandas and NumPy and you've got to
Starting point is 00:07:44 store and query a bunch of time series, whatever the reason, this is probably worth checking out. And it's also tested with pytest, which is pretty cool, right? Well, of course. Any real project is tested with pytest. That's right. Of course. So one of the things I really like about the Python community is the fact that there's so much sharing of information out of conferences and meetups and things like that. So we have another thing you
Starting point is 00:08:10 found here for PyCon, right? Yeah. So the PyCon, I don't remember when it was, but PyCon Australia wasn't too long ago, and they've already got all the videos up, and we have a link to the PyCon Australia videos. And I've got quite a few of them queued up that I'd like to listen to. I'm kind of bad about videos, actually. I often just listen to them and then go back and look at the slide parts of information that I wanted to capture. But I like listening to talks as well. But there's one from Mark Smith, who always amuses me because his Twitter handle is judy2k, and he won't tell me why. But his talk is
Starting point is 00:08:53 how to publish a package on PyPI, and that's the one I've watched so far. There's a lot of great talks there, though, but I think this one's a great one. The end punchline is use cookiecutter, but he blasts through, not using cookiecutter, all this sort of stuff you have to do to get up. And, you know, every little piece makes sense and it's not difficult, but there are a lot of different little pieces. But he goes through this entire thing in like less than half an hour, and so that's pretty impressive, to watch him talk about all the different pieces and why they're there and what they're used for. So that's a good one to sort of understand what's going on in the packaging world in a very short amount of time.
Starting point is 00:09:31 Oh, that's really cool. Yeah, there's a bunch of cool ones here. A couple on MicroPython, actually. So one is writing fast and efficient MicroPython code, and the other is asyncio in MicroPython. Both of those are pretty cool, kind of tying into what we were just talking about previously. Yeah, and then there's like, gosh, there's solid APIs, and it looks like a lot of good stuff. And I know that Australia is, since it's a big travel burden to other places,
Starting point is 00:09:59 other PyCons, you'll see some speakers there that you're not going to see other places. So that's cool. Yeah, absolutely. And they have 88 videos, so that's pretty solid. Yeah. Quite cool. That's a good one.
Starting point is 00:10:09 So before we move on, I'll tell you about another cool thing, DigitalOcean. All right. So, a big fan. And so one of the things that they've released, we talked about this just a couple of times, not very much, is this idea of projects. So when you go into, name your cloud provider, you might have a bunch of servers, a bunch of, you know, EBS storage type things, virtual storage blocks,
Starting point is 00:10:33 load balancers, all sorts of stuff. And it's really hard to know what goes with what. Do you have a staging environment and a production environment, all that kind of stuff, right? So how do you organize that? So DigitalOcean has come up with this feature called Projects that lets you group things like your droplets, that's virtual machines, and floating IPs and backing storage like Spaces into these different use cases. So you know, yeah, actually, we're done with this project, so we can turn that server off and destroy it, and not have the fear of, I don't think we're using this one, but I'm not going to destroy it. I'm not going to delete it because what if I'm wrong? Right? So a very cool feature you can take advantage of for all of their stuff. Check them out at pythonbytes.fm slash DigitalOcean,
Starting point is 00:11:15 and they'll give you a hundred dollars credit for new users. That's awesome. Hey, let's talk about another cloud provider. Right on the back of that. So one of the ways that you can run your code on the internet is like I just described with DigitalOcean, like I do for our stuff, is to create some virtual machines and various other pieces and sort of use it as so-called infrastructure as a service, right? IaaS. But you might also use platform as a service, like here's my code, run it. So Google App Engine, Heroku, those types of things. Okay. So Google App Engine has a pretty interesting announcement, and it's interesting for both
Starting point is 00:11:50 it's good now and, like, oh my gosh, I can't believe it was like that. So the announcement is that Google App Engine has released their second generation runtimes, of which the Python one is now based on Python 3.7. That's pretty awesome, right? It is. You want to run some code, boom, here's my Python 3.7.
Starting point is 00:12:08 So that's really good. You might think, oh, Michael, what was the previous one? 3.6? 3.5? No, I believe the previous one was 2.7. Oh, no. Until now. Like if you were using Google App Engine,
Starting point is 00:12:18 I believe you had to use legacy Python, period. And that was mid-2018 that I just said that. That wasn't a statement from around 2012 or something, that was just now. But let bygones be bygones. And now it's Python 3.7, which is pretty awesome. So apparently it's a pretty big upgrade and you get a bunch of new things. Like, for example, it's based on their new sandbox container, sort of Docker-like things. It removes a bunch of restrictions.
Starting point is 00:12:49 Like in addition to only running on the old Python, legacy Python, you could only use a whitelisted set of packages. And now in the new Google App Engine, you can use arbitrary packages. Just put them in a requirements file, which is pretty sweet. That's a big change. It is a pretty big change. So a lot of cool things like autoscaling and things that are a little bit easier as well.
Starting point is 00:13:09 So anyway, if you're interested in Google App Engine's platform as a service for Python, it just got many, many times better. Yeah. Neat. Yeah, yeah. So Brian, I typically write my code in Python files, not really in notebooks per se. How about you?
Starting point is 00:13:25 Yeah, mostly in files. But I'm trying to learn Jupyter notebooks some and utilize them. They're kind of fun, especially in data science-y realms or looking at plotting data and stuff. Notebooks are fun. But there was a person named Joel Grus that says he does not like notebooks. And Joel is notable because he's not like a random dude on the Internet. Joel Grus has written a book called Data Science from Scratch. He's done a lot of work in data science, things like that.
Starting point is 00:13:57 I've even had him on Talk Python many moons ago. Yeah, and this wasn't just like a one-off comment. He gave this talk at JupyterCon, and that's kind of hilarious. But the video for that is not available yet, as far as I can tell; I couldn't find it. Because that was just recently or still going on, I'm not sure. But the slides are up. He put the slides up. And for one, it puts me to shame.
Starting point is 00:14:23 This presentation has got so many animations and pictures and stuff. Plus, I haven't even got through it yet. It's like 100 pages long or more, but it's really good. But it's a serious discussion about some of the problems with notebooks that people new to notebooks don't quite get, and people old to notebooks just sort of know and don't really think about anymore. And one of the big ones is that there's hidden state. Essentially, we think of files as, like you said, we normally work in files, so they get run from top to bottom, except for, you know, functions
Starting point is 00:15:03 don't get run. They get interpreted as functions. And then when they are run, they're run top to bottom, essentially. And notebooks are not like that. You can jump around and execute different bits of code in different orders if you feel like it. And that statefulness can lead to weird, confusing things. So it's just like a gotcha to know about. And then he goes on to talk about some of the issues where if you start learning how to code with notebooks, you may end up, you know, developing some bad habits like importing notebooks instead of just trying to,
Starting point is 00:15:36 I mean, like, that's a thing apparently you can do: you can define some functions in a notebook and then import them into another notebook. Well, I mean, wouldn't it be better to just put them in a package or a library? Use the package, use the library. Exactly. Yeah. So some of those. And, you know, I'm highlighting this not because I think notebooks are evil, but because I think it's important to start to listen to people saying, you know, listen to a voice that says they aren't a silver bullet. They have their issues also. And we just need to be careful and make sure we don't fall into those traps. Yeah, these are really interesting.
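Editor's sketch: the hidden-state gotcha described above can be imitated in a plain script. This is only a toy model, not how Jupyter is implemented: a shared dict (`kernel`) stands in for the notebook kernel's memory, and the three hypothetical "cells" are strings run with `exec`.

```python
# One shared namespace plays the role of the notebook kernel's memory.
kernel = {}

# Three hypothetical notebook cells, in the order they appear on the page.
cells = {
    "cell_1": "x = 10",
    "cell_2": "y = x * 2",
    "cell_3": "x = 1",
}


def run(cell_name):
    """Execute one cell against the shared kernel state."""
    exec(cells[cell_name], kernel)


# Running top to bottom, like a script: y is computed from x = 10.
for name in ("cell_1", "cell_2", "cell_3"):
    run(name)
in_order = (kernel["x"], kernel["y"])

# Re-running only cell_2 afterwards: y is now computed from x = 1,
# even though the page still shows "x = 10" above it.
run("cell_2")
after_rerun = (kernel["x"], kernel["y"])
```

The text on the page is identical in both cases; only the execution order differs, which is why the result cannot be read off the notebook itself.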
Starting point is 00:16:13 And these are certainly issues to look out for. And wow, this is a funny presentation. I cannot wait to watch this video. Joel, if you're listening, please let us know when it's out. Or if someone else sees it come out, shoot us a note, either email or Twitter, because this is fantastic. Yeah. Plus, also, like, I can't even imagine how long it took to put together this presentation because it's, yeah, there's a lot of animations in there and it's quite a riot.
Starting point is 00:16:36 It is quite a riot. Yeah. Anyway, there's that. Just the other side of maybe notebooks aren't awesome. Yeah. It's pretty, it's pretty interesting. So we've had a couple of conversations around the various PEPs and stuff that have been maybe causing some kerfuffle
Starting point is 00:16:54 in the community. Obviously, the biggest one was PEP 572, about the in-place assignments. And that was the thing with all the stress around it that Guido said, hey, after this, this is my last one. I've given my all, I'm out of here. You guys, it's up to you. We actually had Brett Cannon and Carol Willing
Starting point is 00:17:13 on episode 87 to talk all about that, right? And one of the things that we talked about was what comes next, right? If it's not down to Guido to make the final decisions, which is how it has worked, how will the Python community decide what it's up to? So, yeah. So Barry Warsaw has published five PEPs at least around this. And I don't think this is a decision. It's sort of a structure to further the conversation and make a decision. So he just
Starting point is 00:17:47 published not too long ago PEP 8000, which is Python Language Governance Proposal Overview. And I don't know if this is common in PEPs. I haven't seen it that much. But it's like a gathering of other PEPs that are specific details. So there's PEP 8001, 8002, 8010, and 8011. The first two are about voting and ways in which this governance might work. And then the higher ones, the 8010s, are actual proposed models. And there's a third one, an 8012, that I forgot to put in the notes. And so, for the governance styles, we have the BDFL governance model as one of the proposed options, which is to elect a new person who is the final decider, right? Basically, Guido has stepped down, so who is going to take that place to now participate in that way? We also have the council governance model, which talks about interesting things like should there be an even or odd number of people in the council? And then the last one,
Starting point is 00:18:48 I think, let me pull that up. I think it is a community. Yeah, the community governance model. And that one's a little more free form. So these are all different ways of possibly arranging and solving that problem. And there's a lot of examples like, let's see how Rust did it. Let's see how OpenStack manages their organization and so on. So there's a lot of concrete stuff there. So anyway, that's pretty cool. If you have a strong thought on this and you want to participate,
Starting point is 00:19:17 get in there, make comments, let people know what you're thinking. Because it's still open. It's not anything decided, right? It's still up in the air. So if you want to have a say, now is the time to make statements. Wow. It's like government working in our own community. What? Incredible. Incredible. Yeah. So this is pretty cool. I don't know where it's going to go, but I like that it's all laid out like this. My guess is it's going to go down the council model, maybe with, I don't know. I think it's going to go down the council
Starting point is 00:19:50 model, but we'll see. Yeah. I think that whatever they do, if there's a council, they should have to, like, meet together to make decisions and pass around, like, a talking stick or something. Yes. I love it. Oh, we could come up with something weird that they have to follow. How about the Python staff of power that you were carrying around? Yeah. But then, you know, should it be the blue and yellow one or should it be the green and yellow one? That is a big question.
Starting point is 00:20:23 Yeah, I don't know. Sorry, green and gold. People in Australia say it's gold, not yellow, but it looks yellow to me. Yeah, I thought that stick was a big hit. If people don't know what you're talking about, what should they Google to find the stick? I think it's Pythonic Staff of Enlightenment.
Starting point is 00:20:40 I don't know. That's got to do it. How many hits on Google can there be for that? I don't know. Awesome. So, yeah, they should have to pass that thing around. All right. Well, that's it for our items this week.
Starting point is 00:20:51 You got anything extra you want to share with folks? I don't, actually. Just trudging along. We got a couple more Test & Code episodes out. Yeah, very nice. How about you? I've got, of course, some Talk Python stuff queued up to be released shortly. I have been recording some courses, which are going to be awesome, and I'm very excited about them, doing a bunch of stuff in parallel.
Starting point is 00:21:10 So I'll let you know when that's sort of further along. But I do have two things I want to talk about this week really quickly. One is we got a message on Twitter, and I don't have the name of who sent us. This was John, actually. Thanks, John, who sent us this heads up that Brian Granger, one of the guys behind IPython and Jupyter and all that stuff from the very early days, is giving a free webcast, and it's an ACM-sponsored thing. It says Project Jupyter: from computational notebooks to large-scale data science with sensitive data.
Starting point is 00:21:43 So if that sounds interesting to you, I put the link in there. It's this Friday; this episode probably will come out on Thursday. So you've got to take action right away if you're listening. There's probably a recording or something afterwards you can check out. The other thing is, you know, we talk sometimes about the popularity of Python. Yeah. So I don't want to beat this one to death too much. It's not really worth its own item, but Python continues to climb yet another ranking. So the TIOBE index is one of the more well-respected, more long-running ways of ranking programming languages. And I think when we started this podcast, Python was either fifth or sixth. I think it was sixth on this list.
Starting point is 00:22:20 It is now third. Probably because of the podcast. Certainly, partly because of it, yeah. That may be a very small part, or maybe it's meaningful. But what's really interesting is it's now above C++, C Sharp, JavaScript. It's way above JavaScript, and JavaScript's going down. It's above Ruby. It's above many, many things. What it's not above is Java or C. And not only is it not above them, but it's like half. So Python is like 7.6%, and C is 15.4%.
Starting point is 00:22:55 It's going to be a long time, if ever, until it gets to a 2 or a 1. But it's definitely doing quite well. Yeah. So, yeah, what is the TIOBE index? I'll have to look into that. Yeah, if you look into it, they talk about their philosophy and, like, where they measure stuff from and so on. It's been a long time since I read it, so I don't remember the details.
Starting point is 00:23:15 But they do lay out where the ranking comes from. Okay, cool. Yeah. All right, well, that's it for this week. Thanks for chatting with me, Brian. Thank you. Bye. You bet.
Starting point is 00:23:22 Bye. Thank you for listening to Python Bytes. Follow the show on Twitter via at Python Bytes. That's Python Bytes as in B-Y-T-E-S. And get the full show notes at PythonBytes.fm. If you have a news item you want featured, just visit PythonBytes.fm and send it our way. We're always on the lookout for sharing something cool. On behalf of myself and Brian Okken, this is Michael Kennedy.
Starting point is 00:23:44 Thank you for listening and sharing this podcast with your friends and colleagues.
