Python Bytes - #197 Structured concurrency in Python

Episode Date: September 5, 2020

Topics covered in this episode:

- Structured concurrency in Python with AnyIO
- The Consortium for Python Data API Standards
- Ask for Forgiveness or Look Before You Leap?
- myrepos
- A deep dive into the official Docker image for Python
- "Only in a Pandemic" section
- Extras
- Joke

See the full show notes for this episode on the website at pythonbytes.fm/197

Transcript
Starting point is 00:00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 197, recorded August 26th, 2020. Brian, can you believe it's the end of August? Even if I can't say it, it still is true. No, I can't. I don't know where August went. I thought this whole pandemic thing would make the summer seem long and slow. It seems like it just went faster. Yeah, I've got like a Lego kit that I was planning on doing
Starting point is 00:00:26 like the first week of summer vacation and it's still sitting there. Yeah, for sure. Yeah, there's a lot of things I want to get done before the sun goes away and rain starts for six months straight. That's a Pacific Northwest problem, but it's our problem.
Starting point is 00:00:39 All right, now this episode is brought to you by us as well. We'll tell you more about the things that we're doing that we think you will appreciate later. Right now, I want to talk about something that I think we might have covered before, but I don't know if we've ever satisfactorily covered it. Maybe this time we'll get a little closer. And that's AsyncIO. Oh, yeah.
Starting point is 00:00:55 I think that's a new topic. It's a totally new topic. Covered only less than GUIs. No. So there's a new, how should I put it? A new compatibility-layer library that allows you to work a little bit better with asyncio and some of the other async libraries that are not immediately interchangeable with it: asyncio, curio from David Beazley, and trio from Nathaniel Smith. So there's an article that talks about this I'm going to mention as part of this conversation. And it says, hey, Python has three well-known concurrency libraries built around async and await syntax.
Starting point is 00:01:38 Asyncio, curio, and trio. True, but where's unsync, people? Unsync is the best of all four of those. I don't know where unsync is. Anyway, unsync is not part of this conversation. But unsync plays a role a little bit like this thing I'm going to mention today, which is AnyIO. And it's a pretty clever name, because the idea is that it provides structured concurrency primitives built on top of asyncio. Okay, right. So one of the challenges with asyncio is you can kick off a bunch of tasks and then
Starting point is 00:02:09 not wait for them and your program can exit or you can do other things. And maybe you've seen runtime warnings like task such and such was never awaited. You're like, hmm, I wonder what that means. Well, that probably means your program exited while it was halfway done or something like that, right? Or your thing returned a value before it waited for it to finish, right? And at the low level, something that's a little bit frustrating or annoying that you've got to deal with is that you've got to make sure that all the stuff you started on the async event loop, that you wait
Starting point is 00:02:37 for that event loop to finish before your program completely shuts down or completely carries on. And so that's basically the idea of this library: it's a compatibility layer across those three different well-known concurrency libraries that provides this structured concurrency. So if you look at Wikipedia, it says structured concurrency is a programming paradigm aimed at improving the clarity, quality, and development time of a computer program by using a structured approach to concurrent programming. The core concept is the encapsulation of threads of execution by way of control flow constructs that have clear entry and exit points. In Python, this mostly manifests itself through this library as async with blocks, or async context managers.
Starting point is 00:03:28 You're like, I'm going to do some async work. So let's create a with block, do all the work in there. And then, by the way, when you leave the with block, it's going to have made sure all the tasks that were started, and the tasks started by those tasks and so on, all finished. Oh, that's nice. Yeah, that's pretty cool. So the way it works is you basically go anyio.create_task_group(), and then from the task group, you can spawn other subtasks.
Starting point is 00:03:53 And it will keep track of those. If there's an exception, I believe it will cancel the other undone ones, the unfinished ones, and so on. So it's about saying, we're just going to go through this thing, and it's all going to run here, and it enters at the top and it exits at the bottom of the with block.
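Here's a minimal sketch of that task-group idea, using the current AnyIO API (the AnyIO release that was current when this episode was recorded spelled the task-group method spawn; newer releases call it start_soon). The worker names and delays are invented for illustration.

```python
import anyio


async def worker(name: str, delay: float) -> None:
    await anyio.sleep(delay)
    print(f"{name} finished after {delay}s")


async def main() -> None:
    # Everything started inside this block is guaranteed to be finished
    # (or cancelled, if a sibling task raises) before the block exits.
    async with anyio.create_task_group() as tg:
        tg.start_soon(worker, "first", 0.5)
        tg.start_soon(worker, "second", 1.0)
    print("both tasks are done at this point")


anyio.run(main)
```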
Starting point is 00:04:08 That's pretty cool, right? Yeah. So I think that's pretty neat. It also has other primitives. So that's like a real simple example. Other things it does include synchronization primitives, locks. So if you create a reentrant lock in Python, often called a critical section in things like C++ and whatnot, it's never, ever going to help you. Well, maybe that's a little bit strong; it's likely not going to help you, because those mechanisms come from the operating system process level, and what they do is make sure two threads don't run at the same time. Well, with asyncio, it's all a bunch of stuff that's being broken apart on a single thread, right?
Starting point is 00:04:49 It's all on the one, wherever the event loop.run is, run till complete or whatever, like wherever that's happening, that's the thread. So like the thread locks don't matter. It's all the same thread. Like you're not going to block anything. So having primitives that will kind of function like threads to protect data while stuff
Starting point is 00:05:06 is happening, while it's in temporarily invalid states, that's pretty cool for asyncio. Okay. So you need it or you don't need it? You probably need it. I think people often don't really think too much about these invalid states our programs get into. And you think, well, asyncio, it's going to be fine. And a lot of times what you're doing with asyncio is kind of standalone. Like, I'm going to kick off this thing, and when it comes back, I'm going to take the data and do something.
Starting point is 00:05:30 But if you're modifying shared data structures, you could still end up in some kind of event loop, a race condition. It's not as bad as like true threading because you're not going to, I don't believe it's like a plus equals, right? Of something that actually might be multiple steps at the lower level runtime. I don't think that it would get broken up to that fine grained.
Starting point is 00:05:50 But if you say, like, debit this account this amount of money, await, debit this account this amount of money, await, put that amount into the other one, and some other task is reading in some kind of loop, that level of higher-order, temporarily invalid state, that could be a problem for asyncio, and you want some kind of lock. So this comes with that. It comes with streams, which are similar to queues, and timeouts, through things like move_on_after or fail_after a certain amount of time, and so on. So it's a pretty cool little library. Yeah, that's nice. My vote still is for unsync as the best of the four, even though it was unmentioned. Isn't unsync built on those also? It's a compatibility layer that takes asyncio, threading, and multiprocessing and turns them all into things that you can await.
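For reference, here's a rough sketch of those locking and timeout primitives mentioned a moment ago. It's a hedged example only: the names follow the current AnyIO documentation (older releases created the lock via a factory function instead), and the account values and sleeps are made up.

```python
import anyio


async def debit(lock: anyio.Lock, account: dict, amount: float) -> None:
    # Hold the async-aware lock so the two steps below can't interleave
    # with another task reading or writing the same account.
    async with lock:
        account["balance"] -= amount
        await anyio.sleep(0.1)  # pretend I/O while the state is "in flight"
        print("balance is now", account["balance"])


async def main() -> None:
    account = {"balance": 100.0}
    lock = anyio.Lock()

    async with anyio.create_task_group() as tg:
        tg.start_soon(debit, lock, account, 10.0)
        tg.start_soon(debit, lock, account, 25.0)

    # Timeouts: give up on the enclosed block after one second.
    with anyio.move_on_after(1):
        await anyio.sleep(5)
    print("moved on after the timeout")


anyio.run(main)
```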
Starting point is 00:06:35 Oh, yeah. So don't you think there should be like a standard? They should get together, like some consortium, and have a standard about this. Yeah, well, they probably should, but we're still in the early stages of figuring out what the right API is, and that's why they haven't done it. There's something else that could use some standards, and that's a lot of today's data science libraries. There's an announcement that there's a new Consortium for Python Data API Standards. So there is one, and it's happening actually quite fast. They're getting started right away, and there are activities following the announcement
Starting point is 00:07:12 right away. Then in September, I believe, they're going to kick off some work on data frames, or no, starting with arrays, and then move on to data frames. And so, okay, I'm getting ahead of myself. Their little blurb says one of the unintended consequences of the advances in multiple frameworks for data science, machine learning, deep learning, and numerical computing is that there is fragmentation in using the tools,
Starting point is 00:07:40 and there are differences in common function signatures. They have one example that shows the mean function, to get the average or mean. People are going to, like, flame me for calling average mean, but as a commoner I kind of think of those as the same thing. But anyway, they show eight different frameworks, and some of them are in common with other ones, and so there's five different interfaces across the eight frameworks for just the mean function for an array. Yeah, and what's crazy is they all are basically the same. They're so, so similar, but they're not the same.
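To make that concrete, here's a rough illustration of the kind of signature drift they're describing. The calls below reflect the NumPy, pandas, and PyTorch APIs as commonly documented; treat the exact keyword names as an approximation rather than a quote from the consortium's own comparison table.

```python
import numpy as np
import pandas as pd
import torch

data = [[1.0, 2.0], [3.0, 4.0]]

# Same concept, three spellings: a function taking axis=, a method taking
# axis=, and a function where the axis argument is called dim=.
print(np.mean(np.array(data), axis=0))
print(pd.DataFrame(data).mean(axis=0))
print(torch.mean(torch.tensor(data), dim=0))
```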
Starting point is 00:08:15 Code-wise they're not the same, but they might as well be. Yeah, and so one of the issues is people are using more than one framework for maybe different parts of their data flow, and sometimes you can kind of forget which one you're using, and having a lot of these things in common actually would just make life easier, I think. I don't know how far they'll get with this, but I think it's a really good idea. So they're not trying to make all of these frameworks look exactly the same, but commonalities in arrays and data frames, and they note that arrays are also called tensors, so trying to make some of those common is, I think, a really good
Starting point is 00:08:58 idea for some of the easy, simple stuff. Why not? It seems like a great idea. It seems like a huge challenge, though. Like, who's going to give? Whose function is going to be the one that's like, yeah, we're dropping this part of our API to make it look like everyone else's? Right. And that's why I think that they've gone through
Starting point is 00:09:15 a lot of thought on how to go about with this process and try to convince people. So they're working with, they're trying to kind of be in between the framework authors and maintainers and the community and try to do some review process for different APIs,
Starting point is 00:09:32 put a proposal out, have feedback both from the different projects and from the community to have more of a, you know, more input to try to make it so it isn't just, like, one set of people saying, hey, I think this should be this way.
Starting point is 00:09:47 Yeah, it's a good idea. It would be great if, in a lot of these applications or these frameworks, if it's the same function, if it's like, for instance, mean in this example, if it's spelled exactly the same, maybe it should be the same API. And if you want a special version of it, maybe have an underscore with an extra,
Starting point is 00:10:05 you know, some reason why it's different. You can have extra different functions. Yeah, it seems like you could find some pretty good common ground here. It's a good idea. And if they make it happen, you know, it'd just be easier to mix and match frameworks
Starting point is 00:10:18 and use the best for different situations. Because I can certainly see you're like, ah, I'm working with pandas here. It would be great if I could do this on CUDA cores with CuPy, but I don't really know it; it's close, but it's not the same, so I'm just going to keep chugging along here, as opposed to change the import statement and now it runs there. Yep. I don't know if it's ever really going to be like you can just swap out a different framework, but for some of the common stuff it'd really be great. And that's one of the reasons why we're bringing it up: so that people can get on board and start being part of
Starting point is 00:10:47 this review process, if they care about it. Yeah, it also seems like there might be some room for, like, adapter layers, like from cupy import pandas_layer or something like that, where basically you talk to it in terms of, say, a pandas API and it converts it to its internals. It's like, oh, these arguments are switched in order, or this keyword is named differently, or whatever. And there's even things like differences where, even if the API looks the same or it's very similar, the default might be, in some cases, the default might be None versus False or versus no value.
Starting point is 00:11:22 I don't know what no value means, but anyway. Yeah, cool. That's a good one. Now, also good is the things that we're working on. Brian, you want to tell folks about our Patreon? Actually, we've kind of silently announced it a while ago, but we've got 47 patrons now, and it's set up for a monthly contribution,
Starting point is 00:11:43 and we really appreciate people helping out because there are some expenses with the show. So that's really cool. We'd love to see that grow. We'd also like to hear from people, because we'd like to come up with some special thank-you benefits for patrons. And so I'd like to have ideas come from the community.
Starting point is 00:12:00 If you can come up with some ideas, we will think about it. And I'm trying to figure out how to get to it. So on our Python Bytes site, if you're on any episode page, it's there on the right. Okay, if you go to an episode page? Got it. Yep, then it says, on the right, I believe somewhere, it says sponsors. I'll have to double-check; I believe it does. Okay, we'll double-check and link it for sure if it doesn't already. And also, I want to just tell folks about a couple things going on over at TalkPython Training.
Starting point is 00:12:36 We're doing a webcast on helping people move from using Excel for all their data analysis to pandas. Basically moving from Excel to the Python data science stack, which has all sorts of cool benefits and really neat things you can do there. So Chris Moffitt is going to come on and write a course with us, and he's going to do a webcast, which I announced like 12, 15 hours ago, and it already has like 600 people signed up for it. So it's free. People can just come sign up. It happens in late September, on the 29th. I'll put the link in the extras section of the show notes so people can find it there. And also the Python Memory Management course is out for early access. A bunch of people are signing up and enjoying it. So
Starting point is 00:13:08 if you want to get to it soon, get to it early, people can check that out as well. Very exciting. So this next one I want to talk about has to do with manners. What kind of developer are you? Are you a polite developer? When you're talking to the framework, do you always check in with it to see how it feels, what you're allowed to do? Are you kind of a rebel, you're just going to do what you like, but every now and then you get smacked down by the framework with an exception? I don't want to describe what kind of developer I am, because I don't want the explicit tag on this episode. So there's an
Starting point is 00:13:41 article that talks about something I think is pretty fun and interesting to consider. And it talks about the two types of error handling patterns or mechanisms that you might use when you're writing code. And Python naturally leans towards one, but there might be times you don't want to use it. The two patterns are: it's easier to ask for forgiveness than permission, that's one, and the other one is look before you leap, or please may I. All right, and with the look before you leap, it's a lot of checks, like something you might do in C code. So you would say, I'm going to create a file. Oh, does the folder exist? If the folder doesn't exist, I'm going to need to create the folder, and then I can put the file there. Do I have permission to write the file? Yes.
Starting point is 00:14:30 Okay, then I'll go ahead and write the file, right? You're always checking: can I do this, is this in the right state, and so on. That's the look before you leap style. The ask for forgiveness style is just try, with open this thing. Oh, that didn't work. Catch exception, right? Except some IOError or something like that. So there's reasons you might want to use both. Python leans or nudges you towards the ask for forgiveness, try/except version.
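As a quick sketch of what those two styles look like in code (the file name here is just a made-up example, not something from the article):

```python
import json
import os

path = "settings.json"  # hypothetical file

# Look before you leap: check everything up front.
if os.path.exists(path) and os.access(path, os.R_OK):
    with open(path) as f:
        data = json.load(f)  # can still fail if the JSON is malformed

# Easier to ask forgiveness than permission: just try it and handle failure.
try:
    with open(path) as f:
        data = json.load(f)
except (OSError, json.JSONDecodeError) as error:
    print(f"could not load {path}: {error}")
```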
Starting point is 00:15:00 The reason is, let's say you're opening a file and it's a JSON file. You might check first, does the file exist? Yes. Do I have permission to read it? Yes. Okay, open the file. Well, guess what? What if the file's malformed, and you try to feed it over to, like, json.load and you give it the file pointer? It's not going to say, sorry, it's malformed; it's not going to return some value, like a malformed constant, weird thing. It's just going to throw an exception and say, you know, invalid thing on line seven or whatever, right? And so what that means is, even if you wanted to do the look before you leap, you probably can't
Starting point is 00:15:35 test everything, and you're going to end up in a situation where you're still going to have to have the try/except block anyway. So maybe you should just always do that, right? Maybe you should just go, well, if we're going to have to have exception handling anyway, then we're going to do exception handling as much as possible and not do these tests. So that's this article over here. It's on switowski.com.
Starting point is 00:15:59 Oh yeah, it's by Sebastian Witowski. So yeah, it's his; I didn't realize that it was his article. So it's his article. Anyway, he talks about, like, what is the relative performance of these things, and tries to talk about it from a, well, sure, it's cool to think of how it looks in code, but is one faster or slower than the other? And this actually came up on TalkPython as well.
Starting point is 00:16:30 And so he said, look, if we're going to come up with an example, let's have a class and a base class, and let's have an attribute defined on the derived class, and let's try to access the attribute. And when you only have the base class, it'll crash, right, because it's in the derived class. So let's say we have two ways to test. We could either ask, does it have the attribute, and then try to access it, or we could just try to access it. And it says, well, look, if it works all the time and you're not actually getting errors and you're doing this, it's 30% slower to do the look before you leap, because you're doing an extra test, and basically the try/except block is more or less free; it doesn't cost anything if there's not actually an error. But if you turn it around, you say no, it's not there, all of a sudden it turns out the try/except
Starting point is 00:17:18 block is four times slower. That's a lot slower. Oh really? Because the raising of the exception, figuring out the call stack, all that kind of stuff is expensive. So instead of just going, does it have the attribute, you're going, well, let's do the whole call stack thing on every error, right, and create an object and throw it and all that kind of stuff. So it's a lot slower when there are errors. And anyway, it's an interesting thing to consider if you care about performance and things like parsing integers or parsing data that might sometimes fail, might not. Sometimes it doesn't fail. Yeah, okay.
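Here's a rough sketch of that kind of benchmark; it's a hedged reconstruction of the idea rather than the article's actual code. It times the hasattr check against the bare try/except, once when the attribute exists and once when it doesn't.

```python
import timeit


class Base:
    pass


class Derived(Base):
    value = 42  # the attribute only exists on the derived class


def look_before(obj):
    if hasattr(obj, "value"):
        return obj.value
    return None


def ask_forgiveness(obj):
    try:
        return obj.value
    except AttributeError:
        return None


for obj in (Derived(), Base()):
    lbyl = timeit.timeit(lambda: look_before(obj), number=1_000_000)
    eafp = timeit.timeit(lambda: ask_forgiveness(obj), number=1_000_000)
    print(f"{type(obj).__name__}: LBYL {lbyl:.2f}s, EAFP {eafp:.2f}s")
```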
Starting point is 00:17:53 Devil's advocate here. His example doesn't have any activity in the ask for forgiveness branch if it isn't there. That's the way I saw it when I first read it as well. There's two sections. There's, like, one part where he says, let's do it with the attribute on the derived class, and let's do it again a second time by taking away the attribute and seeing what it's like.
Starting point is 00:18:14 if it doesn't exist, it just doesn't do anything. Right. Whereas in reality, you're still going to have to do something to notify the user it's wrong or whatever. Yeah, okay, yeah, for sure, that's a good point. It's just basically a try except pass yeah so what do you think about this so what i think is you're gonna have to write the try except anyway almost all the time and you don't want both like that doesn't seem good that seems like just extra complexity so
Starting point is 00:18:43 Yeah, so what do you think about this? So what I think is, you're going to have to write the try/except anyway almost all the time, and you don't want both. Like, that doesn't seem good; that seems like just extra complexity. So when it makes sense, just go with ask for forgiveness. Just embrace exceptions. Remember you have a finally block that often can get rid of a test as well. You have multiple types of except clauses based on error type. I think people should do a lot with that. That said, if your goal is to parse specific data, like I'm going to read this number I got off of the internet by web scraping,
Starting point is 00:19:09 and there's a million records here, I'm going to parse it. If you want to do that a lot, lot faster, that might make a lot of sense. I actually have a gist example that I put up, trying to compare the speed of these things in a mixed case. So the cases we're looking at here are kind of strange, because it's like, well, it's all errors or it's zero errors, right, and then it doesn't really do anything, which are both weird. So I have this one where it comes up with, like, a million records, strings, and most of the time they're legitimate numbers, like 4.2 as a string,
Starting point is 00:19:40 and then you can parse it. And what I found was, if you have more than 4% errors, I think it was 4, like 4.5% or something, errors in the data, it's slower to use exceptions. The cutoff is 4% errors. And I think if you have more than 4% errors, then the exceptions become more expensive. That's right.
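For anyone who wants to play with the same idea without digging up the gist, here's a hedged reconstruction of that kind of mixed-data test. The data, error rate, and the crude numeric check are all invented for illustration; it's not the actual gist code.

```python
import random
import timeit

random.seed(0)
ERROR_RATE = 0.04  # roughly the crossover point mentioned above
records = [
    "oops" if random.random() < ERROR_RATE else f"{random.random() * 10:.2f}"
    for _ in range(1_000_000)
]


def parse_eafp(items):
    total = 0.0
    for s in items:
        try:
            total += float(s)
        except ValueError:
            pass  # skip bad records
    return total


def parse_lbyl(items):
    total = 0.0
    for s in items:
        if s.replace(".", "", 1).isdigit():  # crude "is this a number" check
            total += float(s)
    return total


print("EAFP:", timeit.timeit(lambda: parse_eafp(records), number=1))
print("LBYL:", timeit.timeit(lambda: parse_lbyl(records), number=1))
```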
Starting point is 00:19:59 Anyway, it's something that people can run and get real numbers out of and play with it in a slightly more concrete way. But I don't know. What do you think? I think you start out by focusing on the code, making it easy and clear to understand, and then worry about this stuff. Yeah. So I don't actually put either. I don't usually do the checking stuff.
Starting point is 00:20:18 And that is one of the things that's good about bringing this up: what's more common in Python code is to not check stuff, just to, you know, just go ahead and do it. And then I write a lot of tests, so I write a lot of tests around things. Yeah. And in either case, checking for things... like, for instance, if it is input, if I've got user input, I'm checking for things. Yeah, I'm going to do checks ahead of time, because the behavior of what happens when it isn't there, or when there's a problem, it isn't really a problem. It needs to be designed into the system
Starting point is 00:20:52 as to what behavior to do when something unexpected happens. But in normal code, like, what happens if there's not an attribute? You shouldn't be in that situation, right? You shouldn't be in that situation. And I usually push it up higher. I don't have try/except blocks all over the place. I have them around APIs that might not be
Starting point is 00:21:11 trustworthy, or around external systems, or something. I don't put try/except blocks around code that I'm calling in my own code, things like that. Yeah, I'm with you on that. That makes a lot of sense. The one time that I'll do the test, the look before you leap style, is if I think I can fix it, right? Does this directory not exist? I'm going to write a file to it.
Starting point is 00:21:31 Well, I'm just going to make the directory. Then I'm going to write to it, you know? Those kinds of tests can get you out of trouble. But if you're just going to say this didn't work, chances are, you know, you still need the error handling and exception format anyway. Yeah, and you're probably going to throw an exception.
Starting point is 00:21:46 Yeah, and you're probably going to throw an exception. Yeah. Anyway, cool. So you probably should get your code right, test it, and then just stick it in GitHub. Get it in your repository and make sure it's all up to date, right? Oh, I was wondering how you were going to do that transition. So, yeah, that's good. I was following a discussion on Twitter, and I think, actually, I think Anthony Shaw may have started it, but I can't remember. But dealing with different... if you've got a lot of repositories, sometimes you have a lot of maintenance to do, or some common things you're doing for a whole bunch of repos.
Starting point is 00:22:17 And there's lots of different reasons why that might be the case, or related tools, or maybe just your work. You've got a lot of repos. But there's a project that came up in this discussion that I hadn't really played with before, and it's a project called myrepos. And on the site, it says: you've got a lot of version control repositories. Sometimes you want to update them all at once, or push out all the local,
Starting point is 00:22:42 your local changes. You may use special command lines in some repos to implement specific workflows. Well, the myrepos project provides an mr command, which is a tool to manage all your version control repositories. And the way it works is it's based on directory structures. And I usually have all of my repos that I'm working with under a common projects directory or something, so that I know where to look. And so I'm already set up so that something like this might work.
Starting point is 00:23:11 And you go into one of your repos and you type, if you have this installed, you type mr register. And it registers that repo for common commands. And then, whether you're in a parent directory or one of the specific directories, you type a command, like, for instance, if you say mr status, it'll do status on all of the repos that you care about, or update, or diff, or something like that. And then you can build up even more complex commands yourself to do more complicated things. But I would, I mean, I'm probably going to use it right away
Starting point is 00:23:49 just for just checking the status or doing polls or updates or something like that on lots of repos. So this looks neat. Yeah, it looks neat. I like the idea a lot. So basically, I'm the same as you. I've got a directory, maybe a couple of levels, but all of my github repos
Starting point is 00:24:07 go in there, right? I group them by, like, personal stuff or work stuff, but other than that, they're just all next to each other, and this would just let you say, go do a git pull on all of them. That's great. Yeah, or, like, for instance, at work I've often got, like, three or four different related repos where, if I switch to another project that I'm working on, I need to go through and check, because I'm not sure what branch I'm using or if everything's up to date.
Starting point is 00:24:34 So being able to just go through all of them, like even two or three, being able to go and update them all at once, or just even check the status of all of them, it'll save time. And then a friend of the show, or at least somebody that I interviewed for Test & Code, Adam Johnson, wrote an article
Starting point is 00:24:49 called Maintaining Multiple Python Projects with myrepos, and we'll link to his article in the show notes. Yeah, perfect. I like this idea enough that I wrote something like that already. You did? Well, what I wrote is something that will go and actually synchronize my GitHub account with a folder structure on my computer.
Starting point is 00:25:09 So I'll go and just say, like, repo sync or whatever I called it. And it'll use the GitHub API to go and figure out all the repos that I've cloned or created in the different organizations, like TalkPython organization versus my personal one, and then it'll create folders based on the organization or where I forked it from and then clone it. And if it's already there, it'll update it within, it'll basically pull all those down. That's cool. I need that.
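The script itself isn't in the show notes, but the general idea is easy to sketch. Here's a hedged, simplified version of that kind of repo-sync tool; the username, folder layout, and lack of authentication are placeholder assumptions, and a real version would need a token and pagination for larger accounts.

```python
import pathlib
import subprocess

import requests

USER = "your-github-username"                  # hypothetical account name
ROOT = pathlib.Path.home() / "github" / USER   # hypothetical folder layout

resp = requests.get(
    f"https://api.github.com/users/{USER}/repos",
    params={"per_page": 100},
)
resp.raise_for_status()

for repo in resp.json():
    dest = ROOT / repo["name"]
    if dest.exists():
        # Already cloned: just pull the latest changes.
        subprocess.run(["git", "-C", str(dest), "pull"], check=True)
    else:
        dest.parent.mkdir(parents=True, exist_ok=True)
        subprocess.run(["git", "clone", repo["clone_url"], str(dest)], check=True)
```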
Starting point is 00:25:36 It was a lot of work. This seems like it's pre-built and pretty close, so it looks pretty nice. The one thing it doesn't do is it doesn't look like, it doesn't go to GitHub and say, oh, what other repos have you created that you maybe don't have here? Maybe you want that, maybe you don't. If you've forked
Starting point is 00:25:50 Windows source code and it's like 50 gigs, you don't want this tool that I'm talking about. But if you have reasonable size things, like I forked Linux, okay, great, that's going to take a while. But normally, I think it would be pretty neat. Another thing that's neat around managing these types of things is Docker.
Starting point is 00:26:06 And did you know that Python has an official Docker image? I did not. I didn't either. Well, I recently heard that, but it's fairly new news to me that there's an official Docker Python image. So theoretically, if you want to work with some kind of Linux Docker machine that uses Python, you can go and Docker run to create the Python one. Right. So it's not super surprising. It's just called Python. Right. But it's just called Python. That's it, I believe.
Starting point is 00:26:38 So pretty straightforward working with it. But I'm going to talk about, like, basically looking through that official Docker image. So Itamar Turner-Trauring, who was on TalkPython not long ago talking about Fil, and we also talked about Fil on Python Bytes, the data science focused memory tool, he wrote an article called A Deep Dive into the Official Docker Image for Python. So basically it's like, well, if there's an official Docker image for Python, what is it, how do you set it up? Because understanding how it's set up is basically, how do you take a machine that has no Python whatsoever and configure it in a Python way? Yeah. So this is using Debian, that's just what it's based on. And it's using the Buster version, because apparently Debian names all their releases
Starting point is 00:27:27 after characters from Toy Story. I didn't know that, but yep, Buster. Buster is the current one. So it's going to create a Docker image. You create the Dockerfile. You say this Docker image is based on some other foundational one, so Debian Buster. And then it sets up /usr/local/bin as the first thing in the PATH environment variable, because that's where it's going to put Python.
Starting point is 00:27:54 It sets the locale explicitly, the LANG environment variable, to UTF-8. There's some debate about whether this is actually necessary, because current Python also defaults to UTF-8, but, you know, here it is. And then it also sets an environment variable, PYTHON_VERSION, to whatever the Python version is. Right now it's 3.8.5, but whatever it is. That's kind of cool, so you can ask, hey, what version is in this system, without actually touching Python. That's cool. And then it has to do a few things like
Starting point is 00:28:25 register the CA certificates. Like, I've had people sending messages, they're taking courses and they're trying to run the code, you know, something that talks with requests to an SSL certificate endpoint, an HTTPS endpoint, and they'll say, this thing says the certificate is invalid. The certificate's not invalid. What's going on here? And almost always it's something about the way that Python got set up on their machine: it didn't run the
Starting point is 00:28:53 certificate install command. So there's this step where Python will go download all the major certificate authorities and trust them in the system. So that happens next. And then it actually will set up things like GCC and whatnot so it can compile. It is interesting: it downloads the source code, compiles it, but then what's interesting is it uninstalls the compiler tools. It's like, okay, we're going to download Python and we're going to compile it, but you didn't explicitly ask for GCC;
Starting point is 00:29:23 we just needed it so those are gone. Cleans up the PYC files and all those kinds of things. And then it gives an alias to say that Python 3 is the same as Python. Like the command, you could do it without the 3. Another thing that we've gone on about that's annoying is like, I created a virtual environment. Oh, it has the wrong version of pip. Is my pip out of date?
Starting point is 00:29:44 Your pip's probably out of date. Everyone's pip is out of date, unless you're in a rare two-week window where Python has been released at the same time the modern pip has been released. Guess what? They upgrade pip to the new version, which is cool. Finally, it sets the entry point of the Docker container, which is the default command to run if you just say docker run this image, like docker run python:3.8-slim-buster. If you just say that by itself, what program is going to run? Because the way it works is it basically starts Linux and then
Starting point is 00:30:18 runs one program. When that program exits, the Docker container goes away. And so it sets that to be the python3 command. So basically, if you docker run the Python Docker image, you're going to get just the REPL. Interesting. Yeah, you can always run it with different entry points, like bash, and then go in and, like, do stuff to it, or run it with uWSGI or Nginx or whatever. But if you don't, you're just going to get the Python 3 REPL. Anyway, that's the way the official Python Docker image configures itself, from a bare Debian Buster over to Python 3. Neat. Yeah, neat. I thought it might be worth just thinking about, like, what are all the steps, and, you know, how does that happen on your computer, if you can. No, that's good.
Starting point is 00:31:08 Yeah, I have been curious about that. If I was going to throw Python on a Docker image, what does that get me? Yeah, exactly. That's what it is. You could also apt install python3-dev. Yeah, that might be cheating. All right, what's this final one? Oh, so it was recommended by... we covered some craziness that Anthony did an episode or two ago. And somebody commented that maybe we need an "only in a pandemic" section.
Starting point is 00:31:32 Oh yeah, that sounds fun. So I selected nannermost. No, sorry, nannernest. It's optimal peanut butter and banana sandwich placement. So this is kind of an awesome article by Ethan Rosenthal. He talks about, during the pandemic, he's been sort of having trouble doing anything. And so he really liked peanut butter sandwiches,
Starting point is 00:31:55 peanut butter and banana sandwiches, since he was a kid; he picked this habit up from his grandfather, I think. Anyway, this is using Python and computer vision and deep learning and machine learning and a whole bunch of cool libraries to come up with the best packing algorithm for a particular banana and the particular bread that you have. So you take a picture that includes both the bread and the banana you have, and it will come up with the optimal slicing and placement of the banana for your banana sandwich. Wow, this is like a banana maximization optimization problem. So, if you want, you've got to see the pictures together. So, like, if you're going to cut your banana into
Starting point is 00:32:38 slices, and obviously the radius of the banana slice varies with where you cut it in the banana, right? Is it near the top? Is it in the middle? It's going to result in different sized slices. And where do you place the banana circles on your bread to have maximum surface area of banana relative to what's left of the bread, right? Something like that? Yes, he's trying to make it so that almost all of the bites of the sandwich have an equal ratio of banana, peanut butter, and bread. Oh, yeah.
Starting point is 00:33:08 Okay. It's all about the flavor. I didn't understand the real motivation, but yeah, you want to have an equal layer, right? So you don't want that spot where you just get bread. You actually learn quite a bit about all these different processes, and there's quite a bit of math here, talking about coming up with arcs. You have to estimate the banana shape as part of an ellipse, and using the radius of that, determine banana slices, and estimate, because you're looking at the banana sideways, what the shape of the banana circle will be.
Starting point is 00:33:46 And it's not really a circle. It's more of an ellipse also. Yeah, there's a lot going on here. Some advanced stuff to deliver your bananas perfectly. I love it. Actually, this is really interesting. This is cool. I mean, it's a silly application, but it's also a neat example. Yeah, actually. And this would be, I think, a cool thing to talk about difficult problems and packing, like for teaching, like in a school setting.
Starting point is 00:34:14 I think this would be a great example to talk about some of these different complex problems. Yeah, totally. Well, that's it for our main items. For the extras, I just want to say I'll put the links for the Excel to Python webcast and the memory management course down there, and we'll put the Patreon link as well. Let's see if you have anything else you want to share. No, that's good. Yeah, cool.
Starting point is 00:34:33 How about sharing a joke? A joke would be great. So I'm going to describe the situation, and you can be the interviewer slash boss who has the caption, okay? Okay. So the first, there's two scenarios. The title is job requirements. This comes to us from Eduardo Orochana. Thanks for that. And the first scenario is the job interview where you're getting hired. And then there's the reality, which is later, which is the actual on the job day to day. So on the job interview, I come in, I'm an applicant here,
Starting point is 00:35:05 and Brian, the boss, says... Invert a binary tree on this whiteboard. Or some other random data structure, like quick sort this, but using some other weird thing, right? Something that is kind of really computer science-y, way out there, probably not going to do, but kind of maybe makes sense, right? All right, now I'm at the job, and I've got my computer. I have a huge purple buy button on my website that I'm working on.
Starting point is 00:35:29 And the boss says, make the button bigger. Yep, that's the job. Yeah, very nice. Good, good. All right, well, I love the jokes and all the tech we're covering. Thanks, Brian. Yeah, thank you. Yeah, bye.
Starting point is 00:35:44 Thank you for listening to Python Bytes. Follow the show on Twitter via @pythonbytes. That's Python Bytes as in B-Y-T-E-S. And get the full show notes at pythonbytes.fm. If you have a news item you want featured, just visit pythonbytes.fm and send it our way. We're always on the lookout for sharing something cool. On behalf of myself and Brian Okken, this is Michael Kennedy.
Starting point is 00:36:05 Thank you for listening and sharing this podcast with your friends and colleagues.
