Python Bytes - #197 Structured concurrency in Python
Episode Date: September 5, 2020. Topics covered in this episode: Structured concurrency in Python with AnyIO; The Consortium for Python Data API Standards; Ask for Forgiveness or Look Before You Leap?; myrepos; A deep dive into the official Docker image for Python; “Only in a Pandemic” section; Extras; Joke. See the full show notes for this episode on the website at pythonbytes.fm/197
Transcript
Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.
This is episode 197, recorded August 26th, 2020.
Brian, can you believe it's the end of August? Even if I can't say it, it still is true.
No, I can't. I don't know where August went.
I thought this whole pandemic thing would make the summer seem long and slow.
It seems like it just went faster.
Yeah, I've got like a Lego kit
that I was planning on doing
like the first week of summer vacation
and it's still sitting there.
Yeah, for sure.
Yeah, there's a lot of things I want to get done
before the sun goes away
and rain starts for six months straight.
That's a Pacific Northwest problem,
but it's our problem.
All right, now this episode is brought to you by us as well.
We'll tell you more about the things that we're doing
that we think you will appreciate later.
Right now, I want to talk about something that I think we might have covered before,
but I don't know if we've ever satisfactorily covered it.
Maybe this time we'll get a little closer.
And that's AsyncIO.
Oh, yeah.
I think that's a new topic.
It's a totally new topic.
Covered only slightly less than GUIs.
No. So there's a new, how should I put it, a new compatibility-layer library that lets you work a little bit better across asyncio and some of the other async libraries that are not directly built on top of it: asyncio, curio from David Beazley, and trio from Nathaniel Smith.
So there's an article that talks about this that I'm going to mention as part of this conversation. And it says, hey, Python has three well-known concurrency libraries built around async and await syntax: asyncio, curio, and trio.
True, but where's unsync, people?
Unsync is the best of all four of those.
I don't know where Unsync is. Anyway, Unsync is not part of this conversation, but it plays a role a little bit like the thing I'm going to mention today, which is AnyIO. And it's a pretty clever name, because the idea is that it provides structured concurrency primitives built on top of asyncio. Okay, right. So one of the challenges with asyncio is that you can kick off a bunch of tasks and then not wait for them, and your program can exit, or you can go do other things. And maybe you've seen runtime warnings like "task such-and-such was never awaited." You're like, hmm, I wonder what that means. Well, it probably means your program exited while it was halfway done or something like that, right?
Or your thing returned a value before it waited for everything to finish, right? And at the low level, something that's a little bit frustrating or annoying to deal with is that you've got to make sure all the stuff you started on the async event loop finishes before your program completely shuts down or completely carries on.
And so that's basically the idea of this library: it's a compatibility layer across those three different well-known concurrency libraries that provides this structured concurrency. So if you look at Wikipedia, it says structured concurrency is a programming paradigm aimed at improving the clarity, quality, and development time of a computer program by using a structured approach to concurrent programming. The core concept is the encapsulation of threads of execution by way of control flow constructs that have clear entry and exit points.
In Python, this mostly manifests itself through this library as async with blocks or async context managers.
You're like, I'm going to do some async work.
So let's create a with block, do all the work in there.
And then by the way, when you leave the with block, it's going to have made sure all the tasks that were started and the tasks started by those tasks and so on all finished.
Oh, that's nice.
Yeah, that's pretty cool. So the way it works is you basically call anyio.create_task_group, and then from the task group you can spawn other subtasks, and it will keep track of those. If there's an exception, I believe it will cancel the other unfinished ones, and so on.
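To make that concrete, here's a minimal sketch of what that looks like; this assumes AnyIO 3.x, where spawning a subtask is spelled start_soon (earlier versions called it spawn):

```python
import anyio


async def fetch(name: str, delay: float) -> None:
    # Pretend to do some async work.
    await anyio.sleep(delay)
    print(f"{name} finished")


async def main() -> None:
    # The task group is an async context manager: every task started
    # inside it is guaranteed to be finished (or cancelled) by the
    # time the async with block exits.
    async with anyio.create_task_group() as tg:
        tg.start_soon(fetch, "first", 0.1)
        tg.start_soon(fetch, "second", 0.2)
    print("all tasks done")  # nothing is left dangling here


anyio.run(main)
```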
So it's about saying: we're just going to go through this thing, it's all going to run here, and it enters at the top and exits at the bottom of the with block. That's pretty cool, right? Yeah. So I think that's pretty neat. And that's a real simple example; it also has other things, like synchronization primitives: locks.
So if you create a reentrant lock in Python, often called a critical section in things like C++ and whatnot, it's never, ever going to help you. Well, maybe that's a little bit strong: it's likely not going to help you, because those mechanisms come from the operating system at the process level, and what they do is make sure two threads don't run at the same time. Well, with asyncio, it's all a bunch of stuff being broken apart on a single thread, right?
It's all on the one thread: wherever the event loop's run_until_complete or whatever is happening, that's the thread. So the thread locks don't matter; it's all the same thread, and you're not going to block anything. So having primitives that function kind of like thread locks, to protect data while stuff is happening, while it's in temporarily invalid states, that's pretty cool for asyncio.
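As a rough sketch of that idea, assuming AnyIO's Lock primitive (the account dictionary here is made up for illustration):

```python
import anyio


async def transfer(lock: anyio.Lock, balances: dict, amount: int) -> None:
    # There's an await between the two updates, so another task could
    # run and observe the temporarily invalid state; the lock prevents that.
    async with lock:
        balances["checking"] -= amount
        await anyio.sleep(0)  # stand-in for some real async work
        balances["savings"] += amount


async def main() -> None:
    balances = {"checking": 100, "savings": 0}
    lock = anyio.Lock()
    async with anyio.create_task_group() as tg:
        for _ in range(10):
            tg.start_soon(transfer, lock, balances, 10)
    print(balances)  # {'checking': 0, 'savings': 100}


anyio.run(main)
```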
Okay. So do you need it or don't you need it? You probably need it. I think people often don't think too much about these invalid states our programs get into. And you think, well, it's asyncio, it's going to be fine.
And a lot of times what you're doing with async IO is kind of standalone.
Like I'm going to kick off this thing
and when it comes back,
I'm going to take the data and do something.
But if you're modifying shared data structures, you could still end up in some kind of event-loop race condition.
It's not as bad as true threading, because I don't believe it's like a plus-equals, right, something that actually might be multiple steps at the lower level of the runtime; I don't think it would get broken up that fine-grained. But if you say: debit this account this amount of money, await, put that amount into the other account, await, and something else is reading those balances in some kind of loop, that level of higher-order, temporarily invalid state could be a problem for asyncio, and you want some kind of lock. So this comes with that. It also comes with streams, which are similar to queues, and timeouts, through things like move_on_after or fail_after a certain amount of time, and so on. So it's a pretty cool little library. Yeah, that's nice. My vote is still for Unsync as the best of the four, even though it went unmentioned.
Isn't Unsync built on those also?
It's a compatibility layer that takes async IO, threading, and multiprocessing and turns them all into things that you can await.
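For reference, a quick sketch of how that tends to look with unsync's decorator (the function bodies here are made up):

```python
from unsync import unsync


@unsync
async def fetch_data() -> str:
    # Async functions run on unsync's ambient asyncio event loop,
    # which lives on a background thread.
    return "data"


@unsync
def crunch_numbers() -> int:
    # Plain functions run in a thread pool instead.
    # @unsync(cpu_bound=True) would use a process pool for CPU-bound work.
    return sum(range(1_000_000))


# Calls return Unfuture objects; .result() blocks until done,
# and Unfutures can also be awaited from async code.
print(fetch_data().result())
print(crunch_numbers().result())
```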
Oh, yeah.
So don't you think there should be like a standard?
They should get together like some consortium and have a standard about this.
Yeah, well, they probably should.
But we're still in the early stages of figuring out what the right API is, and that's why they haven't done it. There is something else that could use some standards, though, and that's a lot of the data science libraries. There's an announcement that there's a new Consortium for Python Data API Standards. So there is one, and it's happening actually quite fast. They're getting started right away, and there were activities right at the announcement. Then in September, I believe, they're going to kick off some work, starting with arrays, and then move on to data frames. And so, okay, I'm getting ahead of myself.
Their little blurb says,
one of the unintended consequences of the advances in multiple frameworks
for data science, machine learning, deep learning, and numerical computing
is that there is fragmentation in using the tools,
and there are differences in common function signatures.
They have one example that shows what is generally the mean function, to get the average, or mean. People are going to flame me for calling the average the mean, but as a commoner I kind of think of those as the same thing. Anyway, they show eight different frameworks, and some of them are in common with other ones, and there are five different interfaces across the eight frameworks for just the mean function on an array. Yeah, and what's crazy is they're all basically the same. They're so, so similar, but they're not the same. They're not the same code-wise, but they might as well be.
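For a taste of the kind of difference they're talking about, here's one well-known example, assuming you have NumPy and PyTorch handy:

```python
import numpy as np
import torch

x = np.array([[1.0, 2.0], [3.0, 4.0]])
t = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

# Same operation, same spelling, but the axis argument is named
# differently: NumPy calls it axis, PyTorch calls it dim.
print(np.mean(x, axis=0))    # [2. 3.]
print(torch.mean(t, dim=0))  # tensor([2., 3.])
```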
Yeah, and so one of the issues is that people are using more than one framework for different parts of their data flow, and sometimes you can kind of forget which one you're using. Having a lot of these things in common actually would just make life easier, I think.
I don't know how far they'll get with this, but I think it's a really good idea. So they're not trying to make all of these frameworks look exactly the same, but to find commonalities in arrays and data frames. And they note that arrays are also called tensors. So trying to make some of those common is, I think, a really good idea for some of the easy, simple stuff. Why not? It seems like a great idea. It seems like a huge challenge, though.
Like, who's going to give?
Whose function is going to be the one that's like,
yeah, we're dropping this part of our API
to make it look like everyone else's?
Right.
And that's why I think they've put a lot of thought into how to go about this process and try to convince people.
So they're trying to kind of sit in between the framework authors and maintainers on one side and the community on the other, and run a review process for the different APIs: put a proposal out, get feedback both from the different projects and from the community, so there's more input. It isn't just one set of people saying, hey, I think this should be this way.
Yeah, it's a good idea. It would be great across a lot of these applications or frameworks: if it's the same function, like, for instance, mean in this example, and it's spelled exactly the same, maybe it should be the same API. And if you want a special version of it, maybe have an underscore with an extra bit, you know, some marker for why it's different. You can have extra, different functions.
Yeah, it seems like you could find
some pretty good common ground here.
It's a good idea.
And if they make it happen,
you know, it'd just be easier
to mix and match frameworks
and use the best for different situations.
Because I can certainly see you're like, ah, I'm working with pandas here; it would be great if I could do this on CUDA cores with CuPy, but I don't really know, it's close but it's not the same, so I'm just going to keep plugging along here, as opposed to changing the import statement and now it runs there. Yep. I don't know if it's ever really going to be like you can just swap out a different framework, but for some of the common stuff it'd really be great.
And that's one of the reasons we're bringing it up: so people can get on board and start being part of this review process if they care about it. Yeah. It also seems like there might be some room for adapter layers, like from CuPy import some pandas layer or something like that, where you basically talk to it in terms of, say, a pandas API and it converts that to its internals. It's like, oh, these arguments are switched in order, or this keyword is named differently, or whatever. And there are even differences where the API looks the same or very similar, but the default might be, in some cases, None versus False versus no value. I don't know what no value means, but anyway.
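Something like this toy adapter, which is purely hypothetical and not a real CuPy or pandas API, just renaming a keyword on the way through:

```python
import torch


def numpy_style_mean(framework_mean, axis_keyword: str):
    """Wrap a framework's mean so it accepts a NumPy-style axis argument.

    Hypothetical adapter: it just renames the axis keyword before
    delegating to the underlying framework function.
    """
    def mean(data, axis=None):
        if axis is None:
            return framework_mean(data)
        return framework_mean(data, **{axis_keyword: axis})
    return mean


# Usage sketch: give torch.mean a NumPy-flavored signature.
mean = numpy_style_mean(torch.mean, axis_keyword="dim")
t = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
print(mean(t, axis=0))  # tensor([2., 3.])
```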
Yeah, cool.
That's a good one.
Now, also good is the things that we're working on.
Brian, you want to tell folks about our Patreon?
Actually, we kind of silently announced it a while ago, but we've got 47 patrons now, and it's set up for a monthly contribution. We really appreciate people helping out, because there are some expenses with the show. So that's really cool.
We'd love to see that grow. We'd also like to hear from people, because we'd like to come up with some special thank-you benefits for patrons, and I'd like those ideas to come from the community. If you can come up with some ideas, we will think about them. And if you're trying to figure out how to get to it: on Python Bytes, if you're on any episode page, it's there on the right. Okay, if you go to an episode page. Got it. Yep, then it says, on the right, I believe, somewhere it says sponsors. I'll double-check that it does. Okay, we'll double-check, and we can add it for sure if it isn't already there.
And also, I want to tell folks about a couple of things going on over at Talk Python Training. We're doing a webcast on helping people move from using Excel for all their data analysis to pandas, basically moving from Excel to the Python data science stack, which has all sorts of cool benefits and really neat things you can do. So Chris Moffitt is going to come write a course with us, and he's going to do a webcast, which I announced like 12 or 15 hours ago, and it already has like 600 people signed up. It's free; people can just come sign up. It happens late September, the 29th. I'll put the link in the extras section of the show notes so people can find it there. And also, the Python Memory Management course is out for early access. A bunch of people are signing up and enjoying it, so if people want to get to it early, they can check that out as well.
Very exciting.
So this next one I want to talk about has to do with manners. What kind of developer are you? Are you a polite developer when you're talking to the framework? Do you always check in with it to see how it feels about what you're allowed to do? Or are you kind of a rebel: you just do what you like, but every now and then you get smacked down by the framework with an exception? I won't describe what kind of developer I am, because I don't want the explicit tag on this episode. So there's an article that talks about something I think is pretty fun and interesting to consider.
And it talks about the two types of error handling patterns or mechanisms that you might use when you're writing code.
And Python naturally leans towards one, but there might be times you don't want to use it.
And the two patterns are: it's easier to ask forgiveness than permission, that's one, and the other one is look before you leap, or please-may-I. With look before you leap, there are a lot of checks, like something you might do in C code. So you would say: I'm going to create a file. Oh, does the folder exist? If the folder doesn't exist, I'm going to need to create the folder, and then I can put the file there. Do I have permission to write the file? Yes? Okay, then I'll go ahead and write the file. Right? You're always checking: can I do this, is this in the right state, and so on. That's the look-before-you-leap style. The ask-forgiveness style is: just try, with open, this thing. Oh, that didn't work. Catch the exception, right? Except some IOError or something like that.
So there are reasons you might want to use both. Python leans, or nudges you, toward the ask-forgiveness, try/except version. The reason is: let's say you're opening a file and it's a JSON file. You might check first: does the file exist? Yes. Do I have permission to read it? Yes. Okay, open the file. Well, guess what, what if the file is malformed and you try to feed it over to something like json.load and give it the file pointer? It's not going to say sorry, it's malformed, and return some malformed-constant weird value. It's going to raise an exception and say, you know, invalid thing on line seven or whatever, right? And so what that means is, even if you wanted to do look before you leap, you probably can't test everything, and you're going to end up in a situation where you still have to have the try/except block anyway. So maybe you should just always do that, right? Maybe you should just go: well, if we're going to have to have exception handling anyway, we're going to do exception handling as much as possible and not do these tests.
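Here's a small sketch of the two styles side by side; the file name is made up, and note that the look-before-you-leap version still needs a try/except for the malformed-JSON case:

```python
import json
import os

path = "settings.json"  # hypothetical file

# Look before you leap: check everything up front...
if os.path.exists(path) and os.access(path, os.R_OK):
    with open(path) as f:
        try:
            settings = json.load(f)  # ...but this can still blow up
        except json.JSONDecodeError as e:
            settings = {}
            print(f"Malformed JSON: {e}")
else:
    settings = {}

# Easier to ask forgiveness: just try it and handle the failures.
try:
    with open(path) as f:
        settings = json.load(f)
except (OSError, json.JSONDecodeError) as e:
    settings = {}
    print(f"Could not load settings: {e}")
```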
So that's this article over here. It's on switowski.com. Oh yeah, it's by Sebastian Witowski. I didn't realize that it was his article. So it's his article.
Anyway, he talks about, like, what is the relative performance of these things
and tries to talk about it from a, well, sure, it's cool to think of how it looks in code,
but is one faster or one slower than the other?
And this actually came up on TalkPython as well.
And so he said, look, if we're going to come up with an example, let's have a base class and a derived class, let the derived class define an attribute, and sometimes let's try to access the attribute. When you only have the base class, it'll crash, right, because the attribute is defined in the derived class. So let's say we have two ways to test: we could either ask whether it has the attribute and then access it, or we could just go ahead and access it. And he says, well, look, if it works all the time and you're not actually getting errors, it's 30% slower to do the look before you leap, because you're doing an extra test, and the try/except block is more or less free; it doesn't cost anything if there's not actually an error. But if you turn it around and say, no, it's not there, all of a sudden it turns out the try/except block is four times slower. That's a lot slower. Oh, really? Because raising the exception, figuring out the call stack, all that kind of stuff, is expensive. So instead of just asking whether it has the attribute, you're doing the whole call-stack thing on every error, creating an exception object and throwing it and all that kind of stuff. So it's a lot slower when there are errors. Anyway, it's an interesting thing to consider if you care about performance in things like parsing integers or parsing data that might sometimes fail and might not. Sometimes it doesn't fail.
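If you want to reproduce that kind of measurement, a minimal version with timeit might look like this; it mirrors the article's idea rather than its exact code, and the numbers will vary by machine:

```python
import timeit

setup = """
class Base:
    pass

class Derived(Base):
    value = 42

obj = Derived()   # flip to Base() to measure the failing case
"""

# Look before you leap: pay for a hasattr check every time.
lbyl = """
if hasattr(obj, "value"):
    obj.value
"""

# Ask forgiveness: nearly free on success, expensive on failure.
eafp = """
try:
    obj.value
except AttributeError:
    pass
"""

print("LBYL:", timeit.timeit(lbyl, setup=setup, number=1_000_000))
print("EAFP:", timeit.timeit(eafp, setup=setup, number=1_000_000))
```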
Yeah, okay.
Devil's advocate here.
His example doesn't have any activity in the ask for forgiveness if it isn't there.
That's the way I saw it when I first read it as well.
There are two sections. There's one part where he says, let's do it with the attribute on the derived class, and then let's do it again a second time, taking away the attribute and seeing what that's like.
Right, but I mean, the code,
if it doesn't exist, it just doesn't do anything.
Right.
Whereas in reality, you're still going to have to do something
to notify the user it's wrong or whatever.
Yeah, okay, yeah, for sure, that's a good point.
It's just basically a try/except pass. Yeah. So what do you think about this? What I think is, you're going to have to write the try/except anyway almost all the time, and you don't want both. That doesn't seem good; that seems like just extra complexity. So when it makes sense, just go with ask forgiveness. Just embrace exceptions. Remember you have a finally block that often can get rid of a test as well, and you have multiple except clauses based on error type. I think people should do a lot with that.
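For instance, a quick sketch of leaning on multiple except clauses plus a finally block; read_config is a made-up helper:

```python
import json


def read_config(path: str) -> dict:
    f = None
    try:
        f = open(path)
        return json.load(f)
    except FileNotFoundError:
        return {}                          # missing file: fall back to defaults
    except json.JSONDecodeError as e:
        raise ValueError(f"bad config in {path}") from e
    finally:
        if f is not None:
            f.close()                      # runs whether we returned or raised
```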
That said, if your goal is to parse specific data,
like I'm going to read this number I got off of the internet by web scraping,
and there's a million records here, I'm going to parse it.
If you want to do that a lot, lot faster, that might make a lot of sense.
I actually have a gist example that I put up, trying to compare the speed of these things in a mixed case. Because the cases we're looking at here are kind of strange: it's either all errors or zero errors, and then it doesn't really do anything, which are both weird. So I have this one where it comes up with like a million records, strings, and most of the time they're legitimate numbers, like 4.2 as a string, and then you can parse them. And what I found was, if you have more than about 4% errors, I think it was 4, like 4.5% or something, errors in the data, it's slower to use exceptions. The cutoff is around 4% errors: if you have more than that, then the exceptions become more expensive. That's right.
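Something in the spirit of that gist, though this is a simplified stand-in rather than the actual code:

```python
import random
import time

# A million strings, mostly valid floats, with a configurable error rate.
error_rate = 0.04
records = [
    "oops" if random.random() < error_rate else "4.2"
    for _ in range(1_000_000)
]


def parse_eafp(items):
    total = 0.0
    for s in items:
        try:
            total += float(s)
        except ValueError:
            pass
    return total


def parse_lbyl(items):
    total = 0.0
    for s in items:
        # Cheap validity check; real data would need something stronger.
        if s.replace(".", "", 1).isdigit():
            total += float(s)
    return total


for fn in (parse_eafp, parse_lbyl):
    start = time.perf_counter()
    fn(records)
    print(fn.__name__, time.perf_counter() - start)
```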
Anyway, it's something that people can run and get real numbers
out of and play with it in a slightly more
concrete way. But I don't know.
What do you think?
I think you start out by focusing on the code, making it easy and clear to understand, and then worry about this stuff.
Yeah.
So I don't actually put either.
I don't usually do the checking stuff.
And one of the things that's good about bringing this up is that it is more common in Python code to not check stuff, to just go ahead and do it. And then I write a lot of tests, so I write a lot of tests around things. Yeah. In either case, I'm checking for things. Like, for instance, if it's input, if I've got user input, I'm checking for things. Yeah, I'm going to do checks ahead of time, because the behavior of what happens when it isn't there, or when there's a problem, isn't really a problem: it needs to be designed into the system, what behavior to do when something unexpected happens.
But in normal code,
like what happens if there's not an attribute?
You shouldn't be in that situation, right?
You shouldn't be in that situation.
And I usually push it up higher. I don't have try/except blocks all over the place. I have them around APIs that might not be trustworthy, or around external systems, or something like that. I don't put try/except blocks around code that I'm calling in my own code.
Things like that. Yeah, I'm with you on that. That makes a lot of sense.
The one time that I'll do the test, the look-before-you-leap style, is if I think I can fix it, right?
Does this directory not exist?
I'm going to write a file to it.
Well, I'm just going to make the directory.
Then I'm going to write to it, you know?
Those kinds of tests can get you out of trouble.
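That fix-it-first case is a couple of lines; in fact, pathlib rolls the check and the fix into one call:

```python
from pathlib import Path

out = Path("reports") / "summary.txt"   # hypothetical output path

# Fix the problem we can fix: create the directory if it's missing.
# exist_ok=True means no check needed and no error if it already exists.
out.parent.mkdir(parents=True, exist_ok=True)
out.write_text("hello")
```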
But if you're just going to say this didn't work, chances are, you know, you still need the error handling and the exceptions anyway.
Yeah, and you're probably going to throw an exception.
Yeah.
Anyway, cool.
So you should probably get your code right, test it, and then just stick it in GitHub. Get it in your repository and make sure it's all up to date, right?
Oh, I was wondering how you were going to do that transition.
So, yeah, that's good.
I was following a discussion on Twitter, and I think, actually, I think Anthony Shaw may have started it, but I can't remember.
But if you've got a lot of repositories, sometimes you have a lot of maintenance to do, or some common things you're doing across a whole bunch of repos. And there are lots of different reasons why that might be the case: related tools, or maybe it's just your work. You've got a lot of repos,
but there's a project that came up in this discussion that I hadn't really
played with before.
And it's a project called myrepos. On the site, it says: you've got a lot of version control repositories; sometimes you want to update them all at once, or push out all your local changes. You may use special command lines in some repos to implement specific workflows. Well, the myrepos project provides an mr command,
which is a tool to manage all your version control repositories.
And the way it works is it's based on directory structures. I usually have all of the repos that I'm working with under a common projects directory or something, so that I know where to look. So I'm already set up for something like this to work.
And you go into one of your repos and, if you have this installed, you type mr register, and that registers the repo for common commands. And then, whether you're in a parent directory or one of the specific directories, you type a command. For instance, if you say mr status, it'll do status on all of the repos that you care about, or update, or diff, or something like that.
And then you can build up even more complex commands yourself to do more complicated things.
But I mean, I'm probably going to use it right away, just for checking the status or doing pulls or updates or something like that on lots of repos.
So this looks neat.
Yeah, it looks neat.
I like the idea a lot.
So basically, I'm the same as you.
I've got a directory, maybe a couple of levels,
but all of my GitHub repos go in there, right? I group them by, like, personal stuff or work stuff, but other than that, they're just all next to each other, and this would just let you say: go do a git pull on all of them. That's great. Yeah. Or, for instance, at work I've often got like three or four different related repos, and if I switch to another project that I'm working on, I need to go through and make sure I know what branch I'm using, or whether everything's up to date. So being able to just go through all of them, even two or three, being able to update them all at once, or even just check the status of them all, it'll save time.
And then a friend of the show,
at least somebody I interviewed for Test & Code, Adam Johnson,
wrote an article
called Maintaining Multiple Python Projects
with My Repos, and we'll
link to his article in the show notes.
Yeah, perfect. I like this idea enough that I
wrote something like that already. You did?
Well, what I wrote is something that will
go and actually
synchronize my GitHub account with a folder structure on my computer.
So I'll go and just say, like, repo sync or whatever I called it.
And it'll use the GitHub API to go and figure out all the repos that I've cloned or created in the different organizations,
like TalkPython organization versus my personal one,
and then it'll create folders based on the organization
or where I forked it from, and then clone them. And if a repo is already there, it'll update it; it'll basically pull all those down.
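A stripped-down sketch of that kind of tool, assuming the public GitHub REST API and the git command line; the account name is hypothetical, and a real version would need auth and paging:

```python
import pathlib
import subprocess

import requests

USER = "mikeckennedy"  # hypothetical account name
ROOT = pathlib.Path.home() / "github"

# List public repos for a user (one page only; real code would paginate).
repos = requests.get(f"https://api.github.com/users/{USER}/repos").json()

for repo in repos:
    dest = ROOT / repo["full_name"]  # e.g. github/mikeckennedy/some-repo
    if dest.exists():
        # Already cloned: just bring it up to date.
        subprocess.run(["git", "pull"], cwd=dest, check=True)
    else:
        dest.parent.mkdir(parents=True, exist_ok=True)
        subprocess.run(["git", "clone", repo["clone_url"], str(dest)], check=True)
```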
That's cool. I need that.
It was a lot of work.
This seems like it's pre-built and pretty close,
so it looks pretty nice.
The one thing it doesn't look like it does is go to GitHub and say, oh, what other repos have you created that you
maybe don't have here? Maybe you
want that, maybe you don't. If you've forked
Windows source code and it's like 50 gigs,
you don't want this tool
that I'm talking about. But if you have reasonable
size things, like I forked Linux,
okay, great, that's going to take a while.
But normally, I think
it would be pretty neat. Another thing that's neat
around managing these types of things is Docker.
And did you know that Python has an official Docker image?
I did not.
I didn't either.
Well, I recently heard that, but it's fairly new news to me that there's an official Docker
Python image.
So theoretically, if you want to work with some kind of Linux Docker machine that uses Python, you can just docker run to create the Python one. Right. So it's not super surprising: it's just called python. That's it, I believe.
So it's pretty straightforward to work with. But I'm going to talk about basically looking through that official Docker image. So Itamar Turner-Trauring, who was on Talk Python not long ago talking about Fil, the data-science-focused memory tool, which we also talked about on Python Bytes, wrote an article called A Deep Dive into the Official Docker Image for Python.
So basically it's like, well, if there's an official Docker image for Python, what is it and how is it set up? Because understanding how it's set up is basically understanding how you take a machine that has no Python whatsoever and configure it in a Python way. Yeah. So this is using Debian; that's just what it's based on. And it's using the Buster version, because apparently Debian names all their releases after characters from Toy Story. I didn't know that, but yep, Buster.
Buster is the current one. So it's going to create a Docker image. You create the Dockerfile, and you say this Docker image is based on some other foundational one, so Debian Buster. And then it sets up /usr/local/bin as the first thing in the PATH environment variable, because that's where it's going to put Python. It sets the locale explicitly, the LANG environment variable, to UTF-8. There's some debate about whether this is actually necessary, because current Python also defaults to UTF-8, but here it is. And then it also sets an environment variable, PYTHON_VERSION, to whatever the Python version is. Right now it's 3.8.5, but whatever it is. That's kind of cool, so you can ask, hey, what version is in this system, without actually touching Python.
That's cool. And then it has to do a few things, like register the CA certificates. I've had people send messages, folks taking courses who are trying to run code that talks to an HTTPS endpoint with requests, and they'll say, this thing says the certificate is invalid. The certificate's not invalid. What's going on here? And almost always, something about the way Python got set up on their machine didn't run the install-certificates step. So there's this step where Python will go download all the major certificate authorities and trust them in the system. So that happens next.
And then it actually sets up things like GCC and whatnot so it can compile. It's interesting: it downloads the Python source code and compiles it. But then what's interesting is that it uninstalls the compiler tools. It's like, okay, we're going to download Python and we're going to compile it, but you didn't explicitly ask for GCC; we just needed it, so now those are gone. It cleans up the .pyc files and all those kinds of things. And then it creates an alias to say that python3 is the same as python, like the command, so you can use it without the 3.
Another thing that we've gone on about that's annoying is like: I created a virtual environment. Oh, it has the wrong version of pip. Is my pip out of date? Your pip's probably out of date. Everyone's pip is out of date, unless you're in a rare two-week window where Python has been released at the same time as the most recent pip. Well, guess what: they upgrade pip to the new version, which is cool.
Finally, it sets the entry point of the Docker container, which is the default command to run if you just say docker run on this image, like docker run python:3.8-slim-buster. If you just say that by itself, what program is going to run? Because the way it works is, it basically starts Linux and then runs one program, and when that program exits, the Docker container goes away. And so it sets that to be the python3 command. So basically, if you docker run the Python Docker image, you're going to get just the REPL. Interesting. Yeah, and you can always run it with different entry points, like bash, and then go in and do stuff to it, or run it with uWSGI or Nginx or whatever. But if you don't, you're just going to get the Python 3 REPL. Anyway, that's the way the official Python Docker image configures itself, from a bare Debian Buster over to Python 3. Neat. Yeah, neat. I thought it might be worth just thinking about what all the steps are, and how that happens on your computer. No, that's good. Yeah, I have been curious about that.
If I was going to throw Python on a Docker image, what does that get me?
Yeah, exactly. That's what it is.
You could also just apt install python3-dev.
Yeah, that might be cheating.
All right, what's this final one?
Oh, so this one was recommended to us. We covered some craziness that Anthony did an episode or two ago, and somebody commented that maybe we need an only-in-a-pandemic section.
Oh yeah, that sounds fun.
So I selected Nannernest. It's optimal peanut butter and banana sandwich placement.
So this is kind of an awesome article by Ethan Rosenthal. He talks about how, during the pandemic, he's been sort of having trouble doing anything, and he really likes peanut butter and banana sandwiches; he picked this habit up from his grandfather, I think.
Anyway, this is using Python and computer vision and deep learning and machine learning and a whole bunch of cool libraries to come up with the best packing algorithm for the particular banana and the particular bread that you have. So you take a picture that includes both the bread and the banana you have, and it will come up with the optimal slicing and placement of the banana for your banana sandwich. Wow. This is like a banana maximization optimization problem. You've got to see the pictures. So if you're going to cut your banana into slices, obviously the radius of a banana slice varies with where you cut it in the banana, right? Is it near the top? Is it in the middle? It's going to result in different sized slices. And where do you place the banana circles on your bread to have the maximum surface area of banana relative to what's left of the bread, right?
Something like that? Yes. He's trying to make it so that almost all of the bites of the sandwich have an equal ratio of banana, peanut butter, and bread.
Oh, yeah.
Okay.
It's all about the flavor.
I didn't understand the real motivation, but yeah, you want to have an equal layer, right?
So you don't want that spot where you just get bread.
You actually learn quite a bit about all these different processes, and there's quite a bit of math here, talking about coming up with arcs. You have to model the banana shape as part of an ellipse, and use the radius of that to determine the banana slices. And there are estimates involved, because you're looking at the banana sideways, so you have to estimate what the shape of the banana circle will be.
Yeah, there's a lot going on here.
Some advanced stuff to deliver your bananas perfectly.
I love it.
Actually, this is really interesting.
This is cool.
I mean, it's a silly application, but it's also a neat example. Yeah, actually. And I think this would be a cool thing for talking about difficult problems and packing, like in a school setting. I think this would be a great example to talk about some of these different complex problems.
Yeah, totally.
Well, that's it for our main items.
For the extras, I just want to say I'll put the links for the Excel to Python webcast
and the memory management course down there, and we'll put the Patreon link as well.
Let's see if you have anything else you want to share.
No, that's good.
Yeah, cool.
How about sharing a joke?
A joke would be great.
So I'm going to describe the situation, and you can be the interviewer slash boss who has the caption, okay?
Okay.
So there are two scenarios. The title is Job Requirements. This comes to us from Eduardo Orochana. Thanks for that. The first scenario is the job interview, where you're getting hired, and then there's the reality, which is later, the actual on-the-job day-to-day. So in the job interview, I come in, I'm the applicant here, and Brian, the boss, says...
Invert a binary tree on this whiteboard.
Or some other random data structure, like quick sort this,
but using some other weird thing, right?
Something that is kind of really computer science-y, way out there,
probably not going to do, but kind of maybe makes sense, right?
All right, now I'm at the job, and I've got my computer.
I have a huge purple buy button on my website that I'm working on.
And the boss says, make the button bigger.
Yep, that's the job.
Yeah, very nice.
Good, good.
All right, well, I love the jokes and all the tech we're covering.
Thanks, Brian.
Yeah, thank you.
Yeah, bye.
Thank you for listening to Python Bytes.
Follow the show on Twitter via at Python Bytes.
That's Python Bytes as in B-Y-T-E-S.
And get the full show notes at pythonbytes.fm.
If you have a news item you want featured,
just visit pythonbytes.fm and send it our way.
We're always on the lookout for sharing something cool.
On behalf of myself and Brian Ocken, this is Michael Kennedy.
Thank you for listening and sharing this podcast with your friends and colleagues.