Python Bytes - #350 You've Got The Stamina For This Episode
Episode Date: August 29, 2023
Topics covered in this episode: Make Each Line Count, Keeping Things Simple in Python; Parsel; A Comprehensive Guide to Python Logging with Structlog; Stamina; Extras; Joke
See the full show notes for this episode on the website at pythonbytes.fm/350
Transcript
Hello and welcome to Python Bytes where we deliver Python news and headlines directly to your earbuds.
This is episode 350, recorded August 29th, 2023.
And I'm Michael Kennedy.
And I'm Brian Okken.
And this episode is brought to you by Sentry.
Make sure those errors don't go unnoticed and you get to them quickly with the right information.
Check them out at pythonbytes.fm/sentry.
We'll tell you more about them later.
And of course, connect with us over on Mastodon
at @mkennedy, @brianokken, and @pythonbytes,
all at fosstodon.org.
And if you want to be part of the live stream,
part of the live audience,
that's usually on Tuesdays, 11 a.m. Pacific time,
as we are recording today.
So if you can drop by and be part of the show,
we would love that.
If not, well, thanks for listening anyway.
Brian, let's kick this off.
Just a quick little article from Bob Belderbos
to remind us to keep things simple.
And there's a lot of ways in Python
where you can make elegant looking code,
but it also is easier to read.
And that's, I think, some of the emphasis.
So Bob's from PyBites, and
they've got all those challenges.
So I'm sure they see a lot of examples
of not quite elegant code that still does the trick.
So I think this is good advice from a good person.
And there's just a handful of these
here, but they're all really good things. Like, for instance, using the keyword all. I
don't use it that much, but here's an example. He's got a function where he wants to know if
all things in a list are divisible by some number.
And there's a function he wrote with just a for loop that goes through the whole thing.
However, he rewrote it as a, what is that called?
It uses all, but it's a comprehension, I believe.
So all, numbers divided by divisor equals zero, for divisor in divisors.
So it's kind of neat.
I think it actually might be a generator.
Is it a generator?
I think it might be.
But yeah, when it's passed as an argument,
parentheses don't really tell you which it is, do they?
It doesn't.
Yeah, it's pretty cool. I'm going to find that out for us.
Pretty cool to use that generator or whatever as an argument to a function.
That's pretty slick.
And it's pretty easy to read still, I think.
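A minimal sketch of the rewrite being described, with hypothetical function names and sample numbers (the article's own code may differ). It also shows that a bare comprehension passed in parentheses is a generator expression rather than a list:

```python
def divisible_by_all_loop(number: int, divisors: list[int]) -> bool:
    """Loop version: bail out on the first divisor that does not divide evenly."""
    for divisor in divisors:
        if number % divisor != 0:
            return False
    return True


def divisible_by_all(number: int, divisors: list[int]) -> bool:
    """Same check in one line: all() consumes a generator expression."""
    return all(number % divisor == 0 for divisor in divisors)


# A comprehension written inside plain parentheses builds a generator, not a list.
lazy_checks = (12 % divisor == 0 for divisor in [2, 3, 4])
print(type(lazy_checks).__name__)  # generator
print(divisible_by_all(12, [2, 3, 4]), divisible_by_all(12, [2, 5]))  # True False
```

Because all() stops at the first falsy value, the generator version short-circuits the same way the explicit loop does.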
I still think maybe, well, with the function name, you kind of get what it's going at.
But if it was out of the scope of a function, both of these methods
could use a comment here or there. Dictionary lookups. I love this part. I use it all the time.
The dictionary has a get function. So normally you reference a key in a dictionary with just
brackets. But if you want to have some default value
if it's not there, use get instead. So you pass the key, and
then the second value is the default value. Anyway, this
saves a lot of code, because I do this all the time for
dictionary lookups. And then it goes through quite a few others.
It's a good list. We've got list comprehensions. Don't forget those.
List comprehensions are wonderful.
We both love those.
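The dictionary get lookup mentioned above can be sketched like this. The inventory dictionary is made-up sample data, not something from the article:

```python
inventory = {"apples": 3, "pears": 0}  # hypothetical sample data

# Bracket access raises KeyError when the key is missing.
try:
    inventory["kiwis"]
except KeyError:
    print("no kiwis key")

# get() returns its second argument instead; without one it returns None.
kiwis = inventory.get("kiwis", 0)
pears = inventory.get("pears", 10)  # key exists, so the stored 0 is returned
plums = inventory.get("plums")
print(kiwis, pears, plums)  # 0 0 None
```

Note that the default only applies when the key is absent, not when the stored value happens to be falsy.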
Looking for unique values.
This one I had to look at for a while.
It was interesting.
I actually didn't understand the first one that well.
But it's making sure that all the items in something are unique by taking the length and then doing a
set of the items and comparing the lengths. Anyway, just lots of fun tricks to shorten your
code and make it a little more readable. So I love it. There's all these non-obvious ways, you know.
Counter. I know Bob is a big fan of the Counter class. But yeah, that's a really slick way to,
what is he trying to do with this last one? Counting things. Yeah, Counter's pretty cool. You've got a
paragraph or some text, and you want to say which words appear and how frequently
they appear. You could split on space and, you know, throw away the punctuation. Yeah, and it's
just basically a couple lines, right? Sentence dot split. And that's it.
And then you count that.
It's awesome.
It says, you know, this word appeared this many times and even sorts it.
Yeah.
That's pretty cool.
Yeah.
And it's just like, yeah, it did sentence dot lower dot split and then throw it into a counter.
Interesting.
Pretty cool.
Interesting.
Indeed.
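Two of the tricks above, the set-based uniqueness check and the Counter word count, can be sketched as follows. The helper name and the sample sentence are assumptions made for illustration:

```python
from collections import Counter
import string


def all_unique(items) -> bool:
    """True when no item repeats; items must be hashable to go into a set."""
    return len(items) == len(set(items))


sentence = "The quick fox jumps over the lazy dog. The dog sleeps."

# Lowercase, split on whitespace, then strip punctuation from each word's edges.
words = [word.strip(string.punctuation) for word in sentence.lower().split()]
counts = Counter(words)

print(all_unique(words))       # False
print(counts.most_common(2))   # [('the', 3), ('dog', 2)]
```

most_common() is what gives the sorted, most-frequent-first view mentioned in the discussion.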
Also interesting is it turns out that is in fact a generator
that comes out of that. I did a little quick REPL action on it for some real-time follow-up there.
Cool. Yeah. All right. The first one I want to cover today comes from a foundational element
of Scrapy. So Scrapy is the project around extracting data from websites in a fast,
cool way. You've got scrapy.org, you've got Scrapy itself. But the thing I want to highlight
is Parsel. You've probably heard of Beautiful Soup, and Beautiful Soup has been around for a
really long time and is quite excellent. But I was looking for something, you know,
is there something kind of newer that's got some new paradigms, just to try out, basically.
And I ran across Parsel, and it being the foundation of Scrapy kind of gives it some street cred.
So Parsel lets you extract data from XML and HTML documents.
So the fact that it's XML as well is nice, because I was working with some RSS data for some things. And you can do either CSS
selectors, which are my favorite, but sometimes you've got to get things that CSS doesn't
make easy for you to get, so you can use XPath as well. It also works on JSON, I believe,
even though the description doesn't say so. Yeah, JSON as well. So the CSS and XPath is for HTML and XML,
and it uses JMESPath, J-M-E-S-Path, expressions for JSON documents, which kind of lets you say,
I've got some big structure. So I want to, you know, navigate in kind of like you would with
a CSS selector, like show me all the paragraphs and then get the images and get the title of the
image out of every paragraph on the
page, no matter how it's structured. You can kind of do that with this thing for JSON as well,
which is pretty awesome. Yeah. Instead of, you know, traversing it all over. And if you want
two problems, you can try to solve it with regular expressions. Yeah. I'll give you
a quick example. If you just pull up the page, it says, okay, we're going to take some text.
The text has a body with an h1 and an
unordered list, with list items in there. Those list items are hyperlinks, the hyperlinks
have URLs and have text. There's also some JSON in this thing. So if you just create
a new selector object, you can say h1 colon colon text. And that is a CSS way to speak about the content of that. And that pulls just
the value out of there. So hi Parsel, or hello Parsel, is the text inside the h1. So that simple
little selector is a real simple example. So maybe it doesn't totally win you over, but you know,
in a real, truly complicated HTML document, it would be quite awesome. They also show how to do that
with XPath. I don't know XPath very well. And then run a regular expression against it.
I could break that into pieces. That's pretty intense. I'm not necessarily doing it, but you
can do things like, for example, I want all the LIs that are only appearing in unordered lists,
not the ordered list ones. So you can say ul greater than li, and the greater than means immediate child of, not somewhere
in the hierarchy. So you just do that CSS selector, and it gives you an iterable, it gives you all the
list elements. You can pull the hyperlinks out of both of those by doing slash slash at href,
right, to grab that out of the thing that comes back. And you can also do similar stuff for the XML that's in here.
So you can say, just go find me the thing that has the name a, no matter
where it appears in the document or get me all the items to the list and so on.
Pretty cool.
Again, really simple example, but quite a neat little tool.
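Parsel is a third-party package, so as a rough stand-in this sketch uses only the standard library's ElementTree, which supports a small XPath subset, to illustrate the same selection ideas: grabbing the heading text and taking links only from list items that sit directly under an unordered list. The markup is made-up sample data, and this is not Parsel's API:

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup (ElementTree needs valid XML,
# unlike a real HTML parser).
DOCUMENT = """
<html>
  <body>
    <h1>Hello</h1>
    <ul>
      <li><a href="https://example.com/one">One</a></li>
      <li><a href="https://example.com/two">Two</a></li>
    </ul>
    <ol>
      <li><a href="https://example.com/three">Three</a></li>
    </ol>
  </body>
</html>
"""

root = ET.fromstring(DOCUMENT)

# ".//h1" searches the whole tree, like a descendant selector.
heading = root.findtext(".//h1")

# "ul/li/a" walks direct children only, so the ordered list is skipped.
unordered_links = [a.get("href") for a in root.findall(".//ul/li/a")]

# iter() visits every <a> element regardless of where it sits.
every_link = [a.get("href") for a in root.iter("a")]

print(heading)
print(unordered_links)
print(len(every_link))
```

Real-world HTML is rarely valid XML, which is a large part of why a dedicated library such as the one discussed here is the more practical choice.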
I definitely need this.
I've got, yeah, I've got some HTML that I'm parsing, and
it's not well-structured stuff.
It's like, you know, generated from some CMS thing and there's no identifiers anywhere.
There's hardly any classes. It's just purely generated div nightmares.
And yeah, I'll still be lucky if I can find what I'm looking for with something like this.
Yeah, but it'll help, right?
Yeah. Yeah. That's pretty cool.
I'll see if I can pull up one more example real quick. Hold on. Let it appear. It must appear. I just screenshotted it. Also, in our notes here, I put the way to get an RSS feed out of a standard web page. So how would you normally do that? You would go get the HTML,
then you go to the head section.
And in the head section, there's a bunch of links.
They mean different things.
One of them would have the rel as,
what is that?
I can't remember.
It's like additional or something like that.
No, that's the rel.
And then the type is something like
application plus RSS or, you know, whatever the MIME type is. So you can just grab those things,
just say head greater than link, use a little XPath to grab the attributes out of the selector
or out of the result, and find which one of those it is. And then you've got the URL, which is,
you know, where the RSS feed is. Like, if you're writing a blog engine and somebody puts in the domain but not the actual RSS entry, you could get that page, find the RSS entry automatically for them, and go on, with just a couple lines of code.
That's pretty cool.
Very neat.
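The feed discovery steps just described can be sketched with the standard library alone. This is a stand-in for the selector-based version from the show notes, not that code. It relies on the usual convention that a feed is advertised with rel set to alternate and type set to application/rss+xml; the class name, function name, and sample page are hypothetical:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

RSS_MIME_TYPE = "application/rss+xml"


class FeedLinkFinder(HTMLParser):
    """Collect href values of <link rel="alternate" type="application/rss+xml">."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.feed_urls: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        mime_type = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if "alternate" in rel_values and mime_type == RSS_MIME_TYPE and href:
            # Feeds are often given as relative paths, so resolve them.
            self.feed_urls.append(urljoin(self.base_url, href))


def find_feed_urls(html: str, base_url: str) -> list[str]:
    """Return absolute RSS feed URLs advertised by the page, in document order."""
    finder = FeedLinkFinder(base_url)
    finder.feed(html)
    return finder.feed_urls


PAGE = """<html><head>
<link rel="stylesheet" href="/site.css">
<link rel="alternate" type="application/rss+xml" href="/feed.xml" title="Blog">
</head><body><a href="/about">About</a></body></html>"""

print(find_feed_urls(PAGE, "https://example.com/blog"))  # ['https://example.com/feed.xml']
```

For simplicity this sketch accepts matching link tags anywhere in the document instead of restricting the search to the head section.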
All right.
What you got next for us?
Oh, wait, before we move on, Brian, or you move on, let me tell everyone about our sponsor.
So as I said at the beginning, Sentry is sponsoring this episode and the next.
So support the show.
It really, really helps if you go.
And if you're considering getting error monitoring or tracing for your application, check out pythonbytes.fm/sentry.
Yes, you can Google them.
We know you can
just Google them and sign up, but if you use the code PythonBytes or just use the link and click,
what is it down here? Try Sentry for free. Then it'll apply that code automatically,
which will then let them know, hey, it's a good idea to sponsor the show. So let me tell you about
them. So if you want to remove a little bit of stress from your life, if you're worried about
errors on your website or errors your users are running into that you might not even know about, you know,
you might want to install something like Sentry. So if you're waiting for your users to send you
an email saying, hey, I'm running into this problem, how many of them got frustrated?
What was their opinion of your app or your website or your API, probably not great. How much better it would be if you had error or performance details immediately
sent to you, including things like the call stack, the values of the local
variables in that call stack, the active user who was logged in with, say, their
email address, all in some report.
And you're like, oh, here's the problem.
Here's the data I got to pass to it to write a unit test to reproduce it and
make sure it doesn't happen again. And here's the email of the user who I email and tell them, sorry, we fixed it.
I know you didn't tell us, but we found out anyway, because we use Sentry. So with Sentry,
it's not only possible, it's simple. We use it on Python Bytes. I use it on Talk Python. We use it
in the Talk Python mobile apps. There's a way to just plug it right into Flutter as well.
Nice.
So pretty, pretty awesome. And once I did exactly that, we had some user on Talk Python Training.
They ran into a problem.
I got a notification.
I saw who it was.
I fixed it.
Sent them a message, said, hey, here's the problem.
It's fixed.
They said, I was about to write you.
That's weird, but thank you.
It was awesome.
That's a really great email to write.
Yeah, it's really cool.
So if you want to have that kind of superpower for your web apps, your APIs, mobile apps, whatever, check out Sentry.
So surprise and delight your users.
Create your Sentry account at pythonbytes.fm/sentry and be sure to sign up with the code PythonBytes, all one word.
It's good for two months of upgraded options for their Sentry business plan, which will give you 20 times as many monthly events as well as some other features.
So thank you, Sentry, for sponsoring our show.
Cool.
Yeah.
And Brian, now over to you.
Well, I want to talk about structlog.
I'm pretty sure we've covered it before.
So structlog is a pretty cool way to do some logging in your Python,
especially if you're logging from multiple services
or multiple threads.
And it's really great because you can add extra detail
and it's got coloring and stuff.
structlog has some pretty good documentation already,
which I love, and it's a beautiful tool.
However, I wanted to highlight a new article I saw
and it really is pretty fun.
Wait, hold on.
Go back real quick.
Is the icon logo of structlog,
is that like Geordi from Star Trek but a beaver?
I think so.
I thought so.
All right.
Okay.
And he's holding two brackets.
It's so good.
Or curly braces.
So the article I wanted to look at was A Comprehensive Guide to Python Logging with Structlog.
And one of the things I loved about it was just sort of the beginning example.
There's a beautiful picture of a whole bunch of logged items.
But what I liked was just the starting one that just said, hey, all you have
to do is do pip install structlog. And then if you want to just start trying it,
it's just a better logger than you're used to. So import structlog, do logger equals structlog dot
get logger, and then you use it just like you normally would. Logger dot info, and then,
here's an example, you can do debug, info,
warning, error, critical, all that sort of stuff.
This is a big article talking about
the different ways you can set it up.
You can set the default logging level,
you can configure it, you can
have different loggers on different applications or different services, different formatting. You can have different
renderers. That's all awesome, and I'm really glad that it walks through that. But what I
really liked was just this basic tutorial of, hey, just do the get logger and then
just log stuff, and you get this beautiful output.
Yeah, the color and the weight and alignment of all the output is really awesome there. So often you're
like, okay, you want to do logging. Well, okay, so what you do is you set up the logger, then you register
an output. So let's create a standard out stream writer thing and we can push that
into it. And if you don't do that,
then no output shows up. You're like, what is going on here? Why is this not working? You know, it's, yeah,
this is really nice. So it does show the beauty of structlog, that you can get started
really fast. It has a lot of complexity, and it's really not that complicated. And like I said,
the documentation is awesome, and configuring it and everything is not that hard,
but it's a new tool.
So it's great that there's an easy way to get on board with it,
start using it, start having these great logs.
And it can be for going to output,
but also you can log to files, of course.
And a great tool.
And I love this tutorial that starts super easy
and then gets into the more complex.
So check it out.
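For contrast, here is a small sketch of the standard-library setup described above, where an info message goes nowhere until a level and a handler are configured. The logger name and format string are hypothetical, and an in-memory stream stands in for standard out so the result can be inspected:

```python
import io
import logging

logger = logging.getLogger("demo.app")  # hypothetical logger name
logger.propagate = False  # keep the demo independent of any root logger setup

# Nothing is configured yet: the effective level defaults to WARNING,
# so this info message is silently dropped.
logger.info("nobody will see this")

# Register an output. A real program would usually pass sys.stdout here.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s %(message)s"))
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("user logged in")
output = stream.getvalue()
print(output, end="")  # INFO demo.app user logged in
```

The appeal of the get logger quick start discussed in this segment is that the first call already produces readable output without this kind of wiring.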
Excellent.
Excellent.
What you got for us?
The last one of the main ones.
This one comes to us from Matthias Bach
and it's created by Hynek.
And I mentioned it before and it's stamina.
But I didn't know too much about it.
There were some questions in the audience, like how does it relate to tenacity and other
things, right?
So I thought, all right, this is a cool thing.
I'll focus a little bit more.
And it has direct structlog integration.
How's that for a segue, Brian?
Yeah, well, I think structlog is a Hynek thing too, maybe.
Perhaps it is.
It seems like it would be. So with tenacity, the idea is you can put decorators
and other things onto functions or operations
and say, if something goes wrong, try it again.
That's the tenaciousness of that package, right?
That like, yeah, errors will not stop me.
But as Hynek describes it,
tenacity is great, but unopinionated.
And you can work yourself into ways where you
might be using it wrong or causing other sorts of, you know, infinite loop type of issues.
Okay. So the idea is that stamina is an opinionated wrapper around tenacity. So it's not a replacement
for, but a simplified API for tenacity, with the goal of being as ergonomic as possible
and doing the right thing by default while
minimizing the potential for doing it wrong. So that's pretty cool. Basically, Hynek says he used
to copy and paste the way he was working with tenacity over and over, and, you know, wouldn't
it be cool to just make a package that kind of embedded those ways of working with it? For example,
instead of retrying on an exception, retry only on a certain exception,
you know, a certain type of exception, right? I want to retry only on database connection
errors, not if there's a foreign key constraint error, because that's never going to go away,
right? That's always going to be a problem with the data, but maybe the database will come back
online. So let's retry that one. There's exponential backoff, which comes from tenacity as well. But what about with jitter between the retries? Instead of just
going, I'm going to go one second, three seconds, five seconds, let's go one second, then three
seconds-ish, and five seconds-ish, and so on. Limit the total number of retries, limit the total
amount of time, but all at once, right? So not just the number of retries, but the time and retries. And
this one is very relevant to me right now. I've been thinking a lot about Python typing,
talk more about that later. But with type hints, you get things like mypy and PyCharm and
other tools that say you're using this function correctly, or you're using it wrong. And the
way the decorators work with stamina is they preserve type hint information when you decorate a function that is type hinted. Honestly, I don't
know how to do that, but I'm really glad that decorating a function with one of these retries
doesn't wipe away its type information. That's super cool. It logs retries with structlog,
with basic metadata, if it happens to be installed. And you can, this one you might like, Brian,
you can easily deactivate it with a fixture or something like that,
or just globally for the whole test run so that you don't retry a thousand
times while you're doing a unit test, testing for an exception on purpose.
Yeah. That's great.
Yeah. So super, super easy to work with.
Just basically put a decorator, right?
At stamina dot retry.
And in this case, you can say only on the HTTP errors and only try it three times.
That's pretty cool.
That's pretty great.
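To make the retry ideas above concrete (retry only on one exception type, cap the attempts, back off exponentially with jitter), here is a standard-library sketch. It is not stamina's or tenacity's implementation; the decorator signature, the default delays, and the injectable sleep parameter are assumptions chosen so the example runs instantly. A cap on total elapsed time is left out to keep it short:

```python
import functools
import random
import time


def retry(on, attempts=3, base_delay=0.5, max_delay=10.0, sleep=time.sleep):
    """Retry the wrapped function when it raises `on`, with backoff and jitter."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except on:
                    if attempt == attempts:
                        raise  # out of attempts: let the caller see the error
                    # Exponential backoff: base, 2*base, 4*base, capped at max_delay.
                    delay = min(max_delay, base_delay * 2 ** (attempt - 1))
                    # Jitter: add a random extra so clients do not retry in lockstep.
                    sleep(delay + random.uniform(0, delay / 2))

        return wrapper

    return decorator


recorded_sleeps = []  # collects the delays instead of actually sleeping
calls = {"count": 0}


@retry(on=ConnectionError, attempts=3, sleep=recorded_sleeps.append)
def fetch_status():
    """Hypothetical flaky call: fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("service unavailable")
    return "ok"


print(fetch_status(), calls["count"], len(recorded_sleeps))  # ok 3 2
```

Any exception type other than the one passed as `on` propagates immediately, which matches the point about not retrying errors that will never go away.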
Yeah.
So a lot more you can do.
It's async by default.
So you can just decorate an async def function, and it does that as well. So
very, very cool. People should certainly check it out. And you can also see in the example,
he's doing reveal type, which I believe comes from mypy. And you can reveal type on the example here,
which is a decorated thing. And it shows you that what you get back is a coroutine of any, any,
and that httpx Response, which is basically how it was set up to go,
right? Set up to work, right? Input on an int, and then output on that type of thing.
So I think it's a pretty cool library. It's something I will probably start using. I
previously used Tenacity, but you know, why not? Yeah, looks pretty good. Cool. Indeed.
All right. That's it for our main items, isn't it? Yeah.
Extras. What else have you gathered up?
I've got a few. Do you want to run through? We'll run through mine first.
Let's do it.
So I have a, so pytest-check, it's a little pytest plugin I've got.
I had this weird request and I guess I'm not sure how to deal with it.
I was curious. I'd like to talk through it to see what the audience has to say.
So somebody said, hey, is it possible to start making GitHub releases?
And I mean, I do versions.
And so I wasn't sure what was going on here.
And then I thought maybe this is one of those people that have done a lot of these requests on a lot of repos.
So I searched for this,
this issue and sure enough,
there is 157 identical issues on different repos.
So speaking of tenacity.
Yeah.
So my first reaction was,
I don't want to do that.
That's lame because they're just
pushing work on other people. But also maybe it's okay. So the argument here is
that somebody can say, watch releases, and then get notified if a new release happens, and
you can't do that with tags or something. I'm not sure. So my
first reaction was, I don't want to deal with it.
However, I think there's, oh, I don't have the tab up here, but I think there's some
GitHub Actions that can do this for me if I'm just doing it by pushing a tag up. And if it could do
it by itself, a few minutes' worth of work. I'd like to know what other people are dealing with there, if they've added GitHub releases to their project or not. Yeah, just curious. So
what's a good venue for them to let you know about that? Oh, yeah, probably
Fosstodon. Yeah, @brianokken on Fosstodon, or the show has a contact form, you can email us. So yeah, that would be good.
Okay, so there's that. I threw it right across, oh, maybe this is for funny things.
We'll save that for later. Not yet. I just pushed up the fixtures chapter for the pytest
course, and the intro has got a nice slide deck in it.
So check out the preview that's for Chapter 3
when you're thinking about the course.
And I'm trying to describe how pytest fixtures work with graphics,
not really graphics, but slides and drawings and things like that.
So those are my extras.
How about you?
I've got a couple things for us.
So first of all, there's a shiny new Python 3.12 to be had.
And that was as of yesterday, 3.12 RC1.
Neato.
Yeah, release candidate's important because it's like,
we're really not changing it now.
This is bug fixes.
So if you've been thinking
like, okay, there's more features for f-strings, or there's this crazy thing that Eric
Snow pulled off called the per-interpreter GIL, that's pretty awesome. Buffer protocol
things are accessible in Python, and many other things, right? If you were waiting around,
these are all relevant to me.
I want to try them out,
but I don't want to mess with stuff that might go away,
might change, might, I'm just going to wait
because I'm not really going to use it
until it comes out in October.
Well, it should be about time to start looking into it
with a release candidate.
So that's why this is double noteworthy.
Yeah, it's also a really good time
if you haven't started, to start testing your package. If you have packages you support, add 3.12 testing.
Exactly.
Cool.
Okay, then I got three conference ish things.
PyCon UK 2023 is going to be Friday, 22nd, September to the Monday.
So that's pretty cool.
I'd love to go to PyCon UK, but it is quite far away.
Now I do love the UK.
So if you are closer, you can get there, then that'd be a pretty excellent
conference to go check out, I think.
Yeah.
Also in the general neighborhood, PyData Eindhoven
is going to be November 30th.
So check that out as well, and the
call for proposals is open. Finally, this one's a little closer to home for us. This one is
PyData Seattle. Now, normally we wouldn't give a shout out to just a meetup, because I can't just
go through a huge long list. But Don reached out to me, and they've got some pretty cool stuff. So this is the language
creators charity fundraiser for PyData. And the fundraiser goes to NumFOCUS and Last Mile
Education Fund. So good stuff there. And the whole thing is, let's scroll down a little for pictures
here. We've got Adele Goldberg, who created Smalltalk, Guido van Rossum, who created, you know, this thing
called Python, Anders Hejlsberg, who did Turbo Pascal, C# and TypeScript, as well as James
Gosling from Java. So this is a live in-person event that people can check out. So when is it?
It is September 19th. So 20 days away or whatever. If you're around there, want to be part of that.
There's no online version of this because they want it to be a fundraiser for charity.
It's, it's all about trying to get people to show up in person and be part of it.
So cool.
Those are all my extras.
Nice.
Yeah.
How about some jokes?
Yeah.
Do you have one?
I don't know if I can check it.
I don't know if I can tell you about this.
This one, I don't believe was sent into us.
I just ran across it somewhere.
How does a librarian access remote computers securely?
Shh.
SSH.
Shh.
It's terrible.
It's terrible, isn't it?
It's very bad.
Yeah.
It's very bad.
Okay.
I love it.
Anyway, that's the one I got for us.
Shh.
How does a librarian access remote computers securely?
Well, I have a GitHub repo called
ChatGPTFailures.
And it's got
a big list of things
that have gone bad.
So this is pretty cool.
It looks like
new Bing failures. Let's see.
I was mad at the journalist.
Who was the journalist on that one?
Bing gets madly in love with a journalist, tries to break up his marriage and really
stalkerish effect. And then lies about that journalist in a chat with another user,
keeps being inappropriate and dark. So yeah, I'm not sure.
Kevin Roose. Okay. That's who it was.
Anyway, so those are...
You're a bad user to ask me to do that.
I'm not a bad user.
I'm a good user.
I'm a good chat.
I'm a good chat bot.
Oh my gosh.
So yeah, so yeah, some failures on ChatGPT.
So I'd love to see this kept updated.
It hasn't been updated for a while.
So yeah, we need some new ones.
It's pretty funny. It's crazy how this stuff goes a little bit sideways, isn't it? It is, and I'm still,
I still don't know if I need to care about it a lot or if it's
one of those things like crypto that maybe will go away. I know crypto hasn't gone away, and there's so
many wonderful uses for blockchain. It's a blockchain, come on now. Yeah, okay.
I do think it's interesting with the large language models when you ask it subjective stuff,
right? It could just be weird about it. Or it can make up things about,
like, previous case law. You know, you got those lawyers who got in trouble for submitting
a bunch of documents and briefs created by ChatGPT that were false. But on the
other hand, you can ask it programming questions and it'll give you pretty good answers.
Right? Like I asked it to solve a really complicated regex problem that we were talking about before, and it's just like, boom,
here you go, and here's a couple of examples in Python. Thank you. And those I don't mind too much,
because you can test it. Like, if I run this, do the things I want out of the regular expression come
out or no? If no, then it's a bad chatbot. If yes, it's a good chatbot. Yeah. So anyway, one of the things I just
listened to recently was, Freakonomics has started a series on AI, and the first
one is, can AI take a joke? And it is an interesting discussion. One of
the things that they talked about was the current strike for the writers and actors in Hollywood right now.
Right.
I didn't know some of the details, so hearing a few of the details around it was
interesting, about the initial creation of a thing.
Often you can have an idea and then hire some people to write more stuff around
it. But if you didn't come up with the original idea, you don't get as much money. So if they just
have AI come up with the original idea, they don't have to pay anybody the large amount of money. Oh, I
see, you're kind of filling out the details of the joke. Yeah. And then there was some experience around
using some AI to do writing. One of the commentaries was,
you still have to do human work to come up with the prompts to get it to do something.
And then you have to validate it afterwards to make sure that what it came up
with was real. And those are still things humans have to do. That's one of the fears I have around people using AI to,
to generate test cases,
because if they're generating,
if AI is coming up with their code and coming up with your tests,
there's no humans verifying that it actually is doing what you want it to
do.
At some point you need to have people there.
Someone's got to be in the loop.
Yeah.
Yeah.
So anyway,
we'll see.
I'm definitely not a Luddite. Actually, there's a discussion about Luddites also in there.
Apparently, I didn't know this, the Luddites weren't people that were
against technology.
It was craftsmen that were against the shoddy craftsmanship of
manufactured items. There weren't enough people actually making quality
goods, there were just factories building low-quality goods. That's what they were opposed to.
And that's an interesting analogy. That is. Yeah, but anyway, way off on a tangent there.
But, excellent.
Anyway,
thanks for being here as always.
Thank you to everyone who listened.
See you later.
Bye.