Python Bytes - #103 Getting to 10x (results for developers)

Starting point is 00:00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 103, recorded November 6th, 2018. I'm Michael Kennedy. And I'm Brian Okken. Hey Brian, how you doing? I am great today. Yeah, it's another wonderful day. A lot of cool news in the Python space.

Starting point is 00:00:18 I have something that I think you all will really like. I'm looking forward to sharing that. Cool. Yeah, and I know you got some good ones, so I'm looking forward to talking about those as well. Before we do, let's say thank you to DigitalOcean. Check them out at pythonbytes.fm slash DigitalOcean. Get $100 free credit for new users. I'll tell you more about them later. Right now, you've got kind of this magical C++ Python combination thing queued up for us, right? I do. And we found out about this because a listener, I think it's Sebastian Srebart.

Starting point is 00:00:51 Wait, Sebastian Srizard. Brizard. Sorry about your last name. Thank you, Sebastian, for sending that in. Yeah. Brian, if anyone else wants to send in some news, you can also butcher their name in honor of them sending that in, right? Yeah.

Starting point is 00:01:06 As I would as well. I do a lot too. It's becoming a tradition. Thank you, Sebastian. All right. Tell us about this thing. Well, I think because of the flames in the logo, they actually intended to be pronounced Phoenix. It's F-E-N-I-C-S.

Starting point is 00:01:19 It's a project. I'm glad that Sebastian sort of translated this for us. It's an open source computing platform for solving partial differential equations. This is actually really cool. And I'm going to quote right from their site. FEniCS enables users to quickly translate scientific models into efficient finite element code with the high level Python and C++ interfaces to help you get started. It's got powerful capabilities for experienced programmers, but it's easy to get started. And it runs on multiple platforms from laptops to high performance clusters. And it actually looks really pretty cool for anybody dealing with partial differential equations. It's a NumFOCUS-backed project.

Starting point is 00:02:06 So there's money behind it, which is cool. NumFOCUS seems like it's everywhere these days. Yeah. Well, especially in, it's good to, to like highlight that,

Starting point is 00:02:15 to say that this is a backed project because it's a, these sorts of things you, you wouldn't want to depend on and then have them go away. Right. Some open source projects, like depending upon them are kind of like getting a puppy, right? They're cute and fun, but then you got to walk them and stuff like that if they get dropped or don't work the way you want. Yeah.

Starting point is 00:02:35 And Sebastian said, right off the bat, it looks cool, but there's some features inside that you might not know about right off the bat. So I'm going to quote an email that he sent us. He said, FEniCS is in fact a C++ project with a full-featured Python interface. The library itself generates C++ code on the fly that can be called on the fly from Python. It's almost magical. Under the hood, it uses SWIG and recently moved to pybind11. I guess the architecture that was set up to achieve this level of automation might be useful for other situations.

Starting point is 00:03:13 Yeah, so that's crazy. You write Python code, this thing writes C++ code, and then calls it all dynamically at runtime. Yeah, that's amazing. And from the project website, being able to develop the algorithm locally on whatever computer you're on, even a laptop or desktop, and then deploying the same code to run in parallel on thousands of processes, that's just awesome. Yeah, I think there's a lot of cool stuff happening here. I mean, not everyone is solving using finite element methods to solve PDEs, right? I understand that's a limited group, but there's a lot of projects that may find what this project is doing interesting from a performance perspective and a dynamic meets compiled language perspective.

Starting point is 00:03:59 Yeah, and it's also one more example of problems being solved in Python that you wouldn't have thought you could solve with Python because they're just too – take too much high-performance computing. That's a super interesting point. So often I hear people who are not that familiar with Python say, well, Python's slow. I'm like, hold on. What do you mean that Python is slow? Like, you have to say doing this operation in Python is slow, because there are so many variations. I mean, it can be, well, Python is slow in CPython, so you can use PyPy. But it could be way more interesting, like, well, yes, but you would actually use, say,

Starting point is 00:04:36 some kind of library that has C-level compiled elements, like SQLAlchemy or NumPy or something. And so when you actually talk about that, like, you're doing C, you're not doing Python at the hotspots. And then you get way out there with things like Dask and like this and so on. And it's pretty awesome. Yeah, definitely.
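As a rough illustration of the "you're doing C at the hotspots" point, here is a minimal sketch using only the standard library. The function names are hypothetical and not from the episode. Both functions compute the same sum of squares, but the second hands the loop to `sum` and `map`, which are implemented in C in CPython, so the per-item work mostly stays out of the bytecode interpreter. Libraries such as NumPy push the same idea much further.

```python
import operator


def sum_of_squares_loop(values):
    """Pure-Python loop: every iteration runs through the bytecode interpreter."""
    total = 0
    for value in values:
        total += value * value
    return total


def sum_of_squares_builtin(values):
    """Same result, but the iteration happens inside the C-implemented sum() and map()."""
    return sum(map(operator.mul, values, values))


numbers = list(range(1000))
print(sum_of_squares_loop(numbers) == sum_of_squares_builtin(numbers))  # True
```

How large the speed difference is depends on the interpreter and the data, so it is worth measuring with `timeit` before relying on it.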

Starting point is 00:04:56 It's neat to see. You're kind of a fan of regular expressions, aren't you? Yes, I like regular expressions. Are you a fan of cursive language, like the fancy calligraphy type stuff? Like, do you write that way often? Well, I like it. I don't write that way, but...

Starting point is 00:05:09 I don't write it that way either. But this next project I want to talk about is... Its goal, I think, primarily is to make regular expressions more easy to indicate their intent and easier to maintain. I like to joke that regular expressions are kind of a write-only language. You know, you write them, but then you can't read them anymore, but they're magic and they do their thing, right? So we got this project called Cursive underscore RE.

Starting point is 00:05:37 And RE is, of course, the regular expression module in Python, right? Yeah. This comes from Chris Patti of Podcast.__init__ fame, and it's actually created by Bogdan Popa. My turn to hopefully take a shot at getting the name not too wrong. But this is a library for doing regular expressions in modern Python, so 3.6 and above. And the idea is, instead of writing in the string symbols like you normally would, you know, bracket nine zero dash nine dot, you know, close bracket dot, that sort of thing, you write in this higher level

Starting point is 00:06:15 language of combinators, they call them, and then that overrides the operators in Python to generate a regular expression. So this all sounds kind of wonky and crazy, but if you see it in an example, it's super clear. So you can go and define, say, like a hex color. And the way you define it is you say, I would like to create a, you say, beginning of line plus this hash symbol plus group repeated hex digit or repeated other hex digit exactly three times, plus end of line. And you write it in these things that are symbolic of what regular expressions do,

Starting point is 00:06:55 and they can be, you know, sort of or'd and and together and added together and so on. And then if you call the string representation of them, you get the actual regular expression. Okay, so it's a library to build regular expressions. Yeah, it's a library to build regular expressions. The way Bogdan describes it is, he says, it's a tiny Python library made up of these combinators to help you write regular expressions that you can read and modify six months down the line. Yeah. Definitely one of the problems with regular expressions is they're terse, and that's good and bad.
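To make the combinator idea concrete, here is a minimal sketch built only on the standard `re` module. It is not cursive_re's actual API: the `Expr` class and the helper names below are hypothetical stand-ins chosen to mirror the description above, and the pattern accepts six or three hex digits, which is an assumption about what a hex color should allow.

```python
import re


class Expr:
    """A regular expression fragment that can be combined with + and |."""

    def __init__(self, pattern):
        self.pattern = pattern

    def __add__(self, other):
        # Concatenation: match this fragment, then the other one.
        return Expr(self.pattern + other.pattern)

    def __or__(self, other):
        # Alternation: match either fragment.
        return Expr(self.pattern + "|" + other.pattern)

    def __str__(self):
        # The string form is the actual regular expression.
        return self.pattern


def text(literal):
    return Expr(re.escape(literal))  # escape anything that is special in a regex


def hex_digit():
    return Expr("[0-9a-fA-F]")


def repeated(expr, exactly):
    return Expr(f"{expr.pattern}{{{exactly}}}")


def group(expr):
    return Expr(f"({expr.pattern})")


def beginning_of_line():
    return Expr("^")


def end_of_line():
    return Expr("$")


hex_color = (
    beginning_of_line()
    + text("#")
    + group(repeated(hex_digit(), exactly=6) | repeated(hex_digit(), exactly=3))
    + end_of_line()
)

print(str(hex_color))
print(bool(re.match(str(hex_color), "#ffcc00")))  # True
print(bool(re.match(str(hex_color), "#ffcc0")))   # False
```

The design choice worth noticing is that each helper names its intent, so the final expression reads as a sentence, while `str()` still yields an ordinary pattern that `re` can compile.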

Starting point is 00:07:31 They're too terse sometimes. Yeah, sometimes. And then the other problem is there's a lot of special symbols. Like, regular expressions are basically nothing but symbols, right? They're like a symbol, like, exploded, and its guts came out all over the text, right? But the problem is like some of those symbols have to be escaped if you actually want to search for them, but it's always kind of hard to know, well, which ones do I have to escape? So it also does things like if you tell it, I'm looking for the text of like square bracket, square bracket, it'll escape properly in regular expression format, the text representation of that, right?

Starting point is 00:08:05 Because bracket normally means something else, like it's a set of characters or something like that. Yeah. So I really like that it's sort of a safe way as well. Like it's a more, you talk about what you want in it, if it has to be escaped or whatever it does. So yeah, it's a cool example. This is great.
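The escaping problem can be seen with the standard library alone. In this small sketch, the literal text `[[` is not a valid pattern on its own because `[` opens a character set, while `re.escape` produces a pattern that matches the two brackets literally. The helper name `compiles` is hypothetical.

```python
import re
import warnings


def compiles(pattern):
    """Return True if `pattern` is a valid regular expression."""
    try:
        with warnings.catch_warnings():
            # Some Python versions warn about "[[" before rejecting it.
            warnings.simplefilter("ignore")
            re.compile(pattern)
        return True
    except re.error:
        return False


literal = "[["
escaped = re.escape(literal)  # backslash-escapes each bracket

print(compiles(literal))   # False
print(compiles(escaped))   # True
print(re.search(escaped, "matrix[[0]]") is not None)  # True
```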

Starting point is 00:08:21 Yeah. Yeah. I do too. I can certainly see myself using this if I'm writing regular expressions. It's great. Speaking of seeing yourself. Seeing yourself, yes. Actually, for a long time, I've been following and paying attention to what Adrian Rosebar.

Starting point is 00:08:37 This is even an easy one, and I even massacred this. Adrian Rosebrock. He has a site called PyImageSearch. And essentially, Adrian is teaching people about OpenCV and Python and actually a lot of AI stuff and doing some really cool things with Python and cameras and webcams and stuff and even on Raspberry Pis and stuff and doing lots of neat things. And I don't have a particular article to point to. We just haven't covered it before, I don't think. And people should know about it.

Starting point is 00:09:10 So Adrian has both paid and free resources to teach people all about computer vision. And I think he's doing a cool job. Yeah, he's doing a real cool job. And there's so many great examples over there. I think OpenCV is great. And this is probably the best resource for OpenCV and Python intersected, right? I get emails from him every now and then with boxes, detecting things running around on videos or something like that. It's great. So if you got to do anything with computer vision, and you want to use Python to do it, and why wouldn't you want

Starting point is 00:09:40 that? Then this is a cool place, right? Yeah. And he has, like, tons of cool projects that he's done over the years of, you know, hooking up a Raspberry Pi with a camera to detect people coming to your door and stuff like that. And it's some cool stuff. So, and then you brought up that he had one of the most successful Kickstarters ever. Yeah, he did a Kickstarter called Deep Learning for Computer Vision with Python. And yeah, it's definitely one of the most successful Kickstarters ever. So if you want to check out that book, that's really good. I think he has some videos that are coming along with it, but I linked to the Kickstarter as well. Yeah, it did, uh, it did okay. Okay, well, speaking of doing very, very well, I want to tell everyone

Starting point is 00:10:22 about digital ocean before we move on and i decided it's time to think about this a little bit differently. Think about DigitalOcean and your hosting and stuff a little bit differently. So, you know, most of us, I think you brought this up. You're not Netflix. You're not Google. You're not Facebook, right? Remember that? You're not LinkedIn. You don't need these crazy architectures. And yet some of the most popular hosting platforms out there, you know, like AWS or Azure, they are built with thousands and thousands of knobs so that you could be Netflix with 50,000 servers running continuous chaos experiments and all that crazy stuff, right? But if you are actually just building what 99.9% of us are, more standard applications, then all that stuff is just overhead and complexity and cost. So you can join companies like Slack, WeWork, Docker, GitLab, and of course, us over at DigitalOcean and pay like five times less than what you would for AWS or Azure, right? So for

Starting point is 00:11:21 example, bandwidth is one cent instead of nine per gigabyte. Servers are five bucks instead of, you know, 50 or whatever reserved instances over at AWS. It's not just about price. Keep it simple. Use what you need. Don't have all these crazy, crazy features that you probably don't actually need because you're not building Netflix or Facebook or LinkedIn. Anyway, try them out at pythonbytes.fm slash DigitalOcean. Get $100 credit for new users and see why we all love it for their infrastructure. Awesome. Thanks. Indeed. So, Brian, this next one, have you watched this video here that I'm about to talk about?

Starting point is 00:11:57 I don't think I have. Maybe I have. All right. So, while we're talking, click this and open it up. Okay. And I'll tell people about it. Maybe mute your YouTube that it's going. Okay.

Starting point is 00:12:07 Okay. So this thing that I want to talk about is a visualization of Python development from original origin way back in the day up till 2012. And it's using this underlying system called Gource, G-O-U-R-C-E, Gource, I'm guessing. And the idea is Gource is a visualization library that visualizes trees and stuff like this, like graph trees, not real trees. And what you can do is you can point it at a source repository. That could be SVN. It even supports CVS, but Git and Mercurial.

Starting point is 00:12:39 And you can point it there, and it will do an animated over time visualization of that source library. So not just Python, any of these repositories you can point at, and it'll have the files as they grow, the size of the repository, and these little like animated characters come in and start editing files and interacting with it. So if you play this video that we're linking to, this is a Gource animation of Python development up till 2012. And you see it starts out and Guido is, like, cruising around adding a little bit. Then a few more people come, and then if you start skipping further and further ahead, it gives just, like, madness at the end. Like, people are swirling all over the place, and it's just a really great

Starting point is 00:13:19 way to, like, see the growth of Python visually through animation, at least in terms of who's participating and building it. What do you think? Yeah, it is cool. And one of the fun things to see is that there's sometimes some people that just sort of sit around one area, which makes sense, and then other people that fly around and edit all sorts of stuff. Yeah, it's really interesting, right? And you can see people appear and then they'll fade away. They'll come and make some contributions and then they'll like leave the scene. So anyway, I think this is really cool.

Starting point is 00:13:51 And it's, you know, there's not a lot to take away from it other than it's just nice to appreciate it. I would say watch the first minute and then just like skip minute by minute and watch a little bit because it's 14 minutes. You don't really want to watch the whole thing.

Starting point is 00:14:02 I think I do. But you could just leave it running for the rest of the show. Yeah. So here's what I, here's what I think. The call out to the audience. One, who wants to build this for a 2012 to present on Python on the, the, you know, again, and I think this would make an amazing lightning talk. If you built that video and then you like went up there and just did a four minute animation, wouldn't that be cool? It'd be cool to just have that going on in the background while you did some other talk. Right. Yeah.

Starting point is 00:14:29 Or run this at like between sessions at a conference. I think it'd be great. Anyway, I think people will appreciate checking it out. Cool. So one of the things, Brian, that you hear a lot in software development is that there's often a wide range of skills and productivity between developers. And I'm, I've done a lot of training and interacted with, you know, literally thousands of people in person. And I think it's something of a myth, but I think largely there's a lot to, to this.

Starting point is 00:14:57 Yeah. Some people that just fly and they're just focused and others that just kind of bounce around the keyboard randomly. What do you think? So I know that it's a bit controversial. There is this notion of a 10x developer. And often there's a backlash against it also. But I think people think of it as like somebody that's really 10 times better than the average good developer. And I don't think that's it at all. I think that it's just a notion that there is sometimes orders of magnitude between the most effective person in an organization and the least effective. And I don't know how you argue against that. And if you've ever been at large organizations, it just is.

Starting point is 00:15:38 At least maybe it's not ten times, but there's definitely a lot to it. It's a multiplicative factor in there, I would say for sure. Yeah. And so this is, regardless of what you want to take away, there's some good advice in this article. There's an article that I'm linking to that's what any developer can learn from the best. And I think these are good things. So one of the things, the idea around it is just this isn't magical and it isn't something that is only, it isn't just about skills and hard skills. It's other stuff too. And there's a clear path to excellence. People are not born

Starting point is 00:16:14 great developers. They get there through focused, deliberate practice. And here's a few traits. They just listed some traits of things that they see in good developers versus not so great. So great developers are a few of the traits are problem solving. They're skilled at what they're doing. They're mentors and teachers. They're excellent learners and passionate about stuff. So the problem solver bit, I think is really interesting because often the 10x or the multiplier doesn't come in from that they do the same work faster. It's sometimes they can just look at things differently because of experience, because of playing around with lots of different things and say, oh, let's just solve this problem differently. And it just gets done faster because they take a different approach.

Starting point is 00:17:04 I was talking with somebody about databases recently, and, for instance, there's some problems that can be solved with graph databases easily that are almost impossible with a relational database. It's just using the right tool for the right job sometimes. Yeah, and I think a lot of that is, it's not that knowing that means you're 10 times smarter or 10 times better. That means you are curious enough to keep looking and continuously be on the lookout for, well, okay, I now totally know all about relational databases. What else can I learn? How do I compare it? And I think that's one of the biggest traits that I,

Starting point is 00:17:42 I've noticed among people that have, you know, some number of multiplier of efficiency or skills is not that they are necessarily more skilled or have more natural talent, but they're just continuously learning and really passionate about it. And they're just, you know, they're just always picking up these little things that help you at each step, right? Yeah. And there's an honesty of just like, this isn't just about developers. I think it's in every field. For instance, somebody that's working with tools and stuff, somebody that needs to hammer in a lot of nails might be smart enough to go, hey, I should go get a nail gun. I could do this a lot faster with a nail gun. You know what that makes me think of? You could say, I need to have the rings on my engine or my transmission fixed on my car. And you take it to a regular place. You're like, great,

Starting point is 00:18:35 we'll have it back in three weeks. You go watch something like IndyCar or Formula One, they'll pull in and they'll change the transmission in two laps. Those are both mechanics. Those are not the same. Right, exactly. The deliberate practice is important. And I think some people forget that the difference between being an average or a below average developer and being an above average developer is mostly just deciding that you're going to do that

Starting point is 00:19:01 and setting aside some of your time in your life to pick something you want to improve and then go and go and do it, figure out what you want to learn and go do it. So yeah. Anyway, I guess what I would say is to take away from this article is it's really interesting. It's numerically based, right? There's a lot of, it's based on a survey that the guy did with like a thousand folks or something. And most importantly, it's about a growth mindset. Right. It's not to say, well, there's these people and they're just smarter than you or they're smarter than other people or whatever. It's here's how those people got that way and you can do it, too. And I think that's the right message.

Starting point is 00:19:39 Right. And the person that wrote the article said this is from a lot of he's taught a lot of, and there's a lot of people that want to be better, but they don't know how to. So this is just sort of some direction on what things to work on. Yeah, it's a good article. Yep, and a nice find. Let's close this out with a bit of chaos, huh? Sure. Way changed tradition, right? That's right.

Starting point is 00:20:01 So you've heard of, I'm sure you've heard, I'm pretty sure you've been out on your show, if I remember correctly, you've talked about the chaos monkey and things like that, right? Yeah. So tell people what the chaos, yeah, he's from Netflix, right? He used to be, yes. Okay. So I think the idea of the chaos monkey originated there. Tell us what the chaos monkey is. Well, I'm probably going to get this wrong, but there's this notion of taking parts of your system and intentionally breaking parts or shutting down, especially in a distributed system, taking some nodes and just killing them every once in a while and seeing how your system recovers from it. Yeah, exactly. So the idea is if you build an architecture, both in infrastructure

Starting point is 00:20:42 and software, that is supposed to take durability, it's supposed to work if this part of your cloud goes down, or it's supposed to work if one of your database nodes goes down. That's the theory. But then there's the reality of how does it actually behave if one of your nodes goes down, one of your machines reboots, random stuff like that, right? Yeah. So the chaos monkey runs around in production killing off processes, servers, et cetera. And then in production, it does it. And then, you know, you just have to build. So, cause you know, the chaos monkey is coming and it means like, well, a standard failure

Starting point is 00:21:16 is like nothing. Cause the chaos monkey is a madman, right? He's running around all the time. So this is not a bad philosophy for large organizations or large bits of software. However, you're not Netflix, probably, people listening. So how are you going to build that up, right? How are you going to create these things? So there's this cool thing I found called Chaos Toolkit.

Starting point is 00:21:37 So Chaos Toolkit is a library built in Python that will help create these chaos monkey like things. Cool. So chaos engineering, as they call it, is the discipline of experimenting on distributed systems in order to build confidence in the system's capability to withstand turbulent conditions in production. All right. So we talked about the chaos monkey and the friends, right? There's other types of chaos things that Netflix, but here's a way that you can easily build those types of experiments and systems. And it integrates with Kubernetes, AWS, Google Cloud, Microsoft Azure, some other things like that. Right? So just to give you a sense of what it can do, like if you look at the AWS API, it'll say you can do things like go to AWS Lambda and call a delete function concurrency

Starting point is 00:22:26 that removes the concurrency limit on any specific Lambda, or you can go just call stop instance on an EC2 instance, or, you know, whatever you want, right? And presumably, it's going to put that back, I'm not entirely sure. But I guess you probably got to call start instance again on it. Or set, you know, add function concurrency or set something like that. All right. But there's just this infrastructure to help you change these types of settings, these types of things around your cloud providers and, you know, make sure your system can take it. Nice. Yeah, you cannot plan for the best, so you plan for what you can. That's right. Yeah. So you build it so that it doesn't have to be perfect, and then you're in pretty good shape.
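The shape of such an experiment (check the steady state, inject a failure, check again, then roll back) can be sketched without any cloud account. This is a toy simulation, not Chaos Toolkit's API: the `Cluster` class and `run_experiment` function are hypothetical, and the "nodes" are just dictionary entries. The rollback step is included because whether a real tool restores the stopped instance is exactly the open question raised above.

```python
import random


class Cluster:
    """Hypothetical in-memory stand-in for a set of redundant service nodes."""

    def __init__(self, names):
        self.alive = {name: True for name in names}

    def stop(self, name):
        self.alive[name] = False

    def start(self, name):
        self.alive[name] = True

    def handle_request(self):
        # The service answers as long as at least one node is still up.
        return any(self.alive.values())


def run_experiment(cluster, rng):
    """Stop one random node, check the service still answers, then restore the node."""
    # 1. Steady-state check: do not start an experiment on a broken system.
    if not cluster.handle_request():
        raise RuntimeError("system is not healthy before the experiment")

    # 2. Action: stop one randomly chosen running node.
    running = sorted(name for name, up in cluster.alive.items() if up)
    victim = rng.choice(running)
    cluster.stop(victim)

    try:
        # 3. Re-check the steady state while the failure is in effect.
        survived = cluster.handle_request()
    finally:
        # 4. Rollback: bring the node back regardless of the outcome.
        cluster.start(victim)
    return victim, survived


cluster = Cluster(["node-a", "node-b", "node-c"])
victim, survived = run_experiment(cluster, random.Random(0))
print(f"stopped {victim}; service survived: {survived}")
```

With three nodes the service survives losing any one of them; constructing the cluster with a single node makes the same experiment report a failure, which is the kind of finding these experiments exist to surface.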

Starting point is 00:23:05 The stuff that the folks at Netflix are doing is insane, though. Like, they take it to another level. Well, yeah, one of the things I remember talking with them about is the reason why they test and do all this in production is because their system is so large you can't – it's essentially the world. You can't have a test bed that's similar enough to the real world environment. So they don't have that luxury.

Starting point is 00:23:32 Right. So they just test it in production. And I've not seen Netflix really go down, so I'm saying they're doing it. Yeah. All right. That's it for our main news. I have a couple of quick ones. You got anything you want to share, Brian?

Starting point is 00:23:41 Yeah. Just on the Python Bytes episode 100, one of the things we talked about was pyproject.toml, and I wanted to take a deep dive, so the last episode of Test & Code is me talking to Brett Cannon and talking about that for almost an hour or so.

Starting point is 00:23:58 Oh, nice. I definitely want to check that out. That's a good one. How about you? I have two other things that I don't think justify a whole segment, but I just want to throw them out there. Remember a while back we talked about that there was potentially some experiments or something where some college researcher had put some malicious, potentially malicious libraries on PyPI? Yeah.

Starting point is 00:24:19 Well, it turns out recently someone actually put malicious libraries on PyPI. So this is October 27, 2018, and apparently 12 packages were discovered with various levels of vulnerabilities and stuff. So I'm linking to that. People should check that out. If you don't know how to spell Django, you're going to have a bad time, by the way.

Starting point is 00:24:39 Yeah. So it's a lot of this typosquatting, like, oh, I forgot the J, so now I have a virus, something like that. Yeah, that's sort of lame. Yeah, it's a lot of this typosquatting, like, oh, I forgot the J, so now I have a virus, something like that. Yeah, that's sort of lame. Yeah, it's upsetting. At least they've done some work over at PyPI to block properly spelled things that are not actually packages, like RE, for example, right? The built-in stuff. Well, yeah, that's one of the things that was an issue, is people trying to pip install things that are in their standard library.
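One hedged way to picture a typosquatting check: compare a requested package name against a list of names you actually intend to install and flag near misses. The sketch below uses `difflib` from the standard library; the `likely_typo_of` helper and the short `KNOWN` list are assumptions for illustration, not something PyPI or pip provides.

```python
import difflib

# Hypothetical allow-list of packages a project really depends on.
KNOWN = ["django", "flask", "requests", "numpy"]


def likely_typo_of(name, known=KNOWN, cutoff=0.8):
    """Return the known package that `name` closely resembles, or None."""
    name = name.lower()
    if name in known:
        return None  # exact match: nothing suspicious
    matches = difflib.get_close_matches(name, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(likely_typo_of("djago"))   # django
print(likely_typo_of("django"))  # None
```

A check like this only catches names that are close to ones you already trust, so it complements reading the package page before installing rather than replacing it.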

Starting point is 00:25:05 You don't have to do that. Yes, exactly. Exactly. Don't do that. Then the other one was just a quick little Twitter message that someone sent my way. And it's pretty cool. This guy, not the one who sent it to me. I'm sorry.

Starting point is 00:25:21 I don't remember who sent it. I should have written it down, but this person called Xtrek, something to that effect on Twitter decided to go scanning the standard library source code for interesting things. Like what is the longest class name? What is the longest function name? And so on. So they found out that actually in CPython, the longest class name is 200 characters, just the letter a 200 times, which is some kind of test case. But for the real ones, there's one called test mutually exclusive optionals and positions mixed parent as the longest class name. The function name they believe is test_parser_regression_special_character_in_parameter_column_of_docstring_first_line, which is 84 characters long. And then there's some other examples. Someone says there's actually a test C-types that has 33 million characters in it.
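A scan like the one described can be approximated with the standard `ast` module. This is a sketch under assumptions, not the method used in the Twitter thread: `longest_names` and `scan_directory` are hypothetical helpers, and files that fail to parse or decode are simply skipped.

```python
import ast
from pathlib import Path


def longest_names(source):
    """Return (longest class name, longest function name) found in `source`."""
    tree = ast.parse(source)
    classes, functions = [], []
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            classes.append(node.name)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            functions.append(node.name)
    return max(classes, key=len, default=""), max(functions, key=len, default="")


def scan_directory(root):
    """Apply longest_names to every .py file under `root` and keep the overall winners."""
    best_class, best_function = "", ""
    for path in Path(root).rglob("*.py"):
        try:
            cls, func = longest_names(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError, ValueError, OSError, RecursionError):
            continue  # skip files that are not valid, readable Python 3 source
        best_class = max(best_class, cls, key=len)
        best_function = max(best_function, func, key=len)
    return best_class, best_function


sample = """
class Short: ...
class MuchLongerClassName:
    def tiny(self): ...
    def a_rather_long_method_name(self): ...
"""
print(longest_names(sample))
```

To try it on a real standard library, pass `sysconfig.get_paths()["stdlib"]` to `scan_directory`; the results will vary by Python version.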

Starting point is 00:26:21 Anyway, it's an interesting thread if you're just wondering what the really long names are. Yeah, it's almost competitive. It almost could compete with Java names. Yeah, exactly. Standard types right there. Awesome. All right. Well, definitely fun to share all this news with you, Brian. As always, thanks for doing it. All right. Thank you. Bye. You bet. Bye. Thank you for listening to Python Bytes. Follow the show on Twitter via at Python Bytes. That's Python Bytes as in B-Y-T-E-S. And get the full show notes at pythonbytes.fm. If you have a news item you want featured, just visit pythonbytes.fm and send it our way. We're always on the lookout for sharing something cool. On behalf of myself and Brian Okken, this is Michael Kennedy. Thank you for listening and sharing this podcast with your friends and colleagues.
