Python Bytes - #192 Calculations by hand, but in the computer, with Handcalcs
Episode Date: August 2, 2020
Topics covered in this episode:
Building a self-updating profile README for GitHub
Handcalcs
The (non-)return of the Python print statement
FastAPI for Flask Users
Tweet deleting with tweepy
Clingi...ng to memory: how Python function calls can increase your memory usage
* No local variable at all
* Re-use the local variable
* Transfer object ownership
Extras
Joke
See the full show notes for this episode on the website at pythonbytes.fm/192
Transcript
Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.
This is episode 192, recorded July 22nd, 2020.
Had to look that one up.
I am Brian Okken.
And I'm Michael Kennedy.
And I can't believe we're heading close to 200. This is crazy.
Oh yeah. Been at this for a while. That's going to be like four years almost.
Yeah. Again, this episode is sponsored by us, and we'll tell you a little bit more about other things that we're doing a little later in
the show. But first, some of the ways that people tell each other what they're up to is their
personal GitHub readme on their GitHub profile, right? Yeah, that's right. So I was impressed
by something that I saw recently. Simon Willison, he's the co-creator of Django.
He published a blog post called Building a self-updating profile README for GitHub.
So at the top of it, I'm going to quote this.
It says, GitHub quietly released a new feature at some point in the past few days, profile readmes.
This is news to me.
Yeah.
So if you create a repository with the same name as your GitHub account,
so in Simon's case, it was simonw.
So github.com/simonw/simonw.
So you go two deep.
And then add a readme.md or readme markdown file to it.
GitHub will render the contents at the top of your personal profile
page. So that's neat. In that case, it's just one up. So if you go to github.com/simonw,
you see his, but his looks really awesome. It's got a whole bunch of cool stuff in it
because he took it one step further. It's not a static markdown file. He's got another article
that talks about it, but this article here walks through exactly
what he does. And also it's all open source, so you can see his code. He uses GitHub Actions.
There's a button that he can push to make it happen, and any push to his own
simonw repo will also cause it to happen. Then the GitHub Actions run. He contributes to a lot of open source projects. So he takes
a certain set of repos that he has and pulls the latest releases
and latest release notes using the GitHub GraphQL
API. So there's an example of that. There's an example
of using feedparser to pull blog entries off of his blog and an
example of using a SQL query to grab,
I guess he's got a site called TIL, for Today I Learned,
it grabs a few links off of that.
So he's got a three-column setup for a readme
that is kept up to date using GitHub Actions.
How cool is that?
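As a rough sketch of the self-updating idea described above: a script run by a scheduled job can rewrite one marked section of the README and leave the rest alone. The comment markers, the function name and the sample entry below are assumptions for illustration, not details given in the episode, and fetching the entries (for example with feedparser) is left out.

```python
import re


def replace_section(readme_text, marker, new_body):
    """Swap the text between '<!-- marker starts -->' and '<!-- marker ends -->'."""
    pattern = re.compile(
        rf"<!-- {re.escape(marker)} starts -->.*?<!-- {re.escape(marker)} ends -->",
        re.DOTALL,
    )
    replacement = f"<!-- {marker} starts -->\n{new_body}\n<!-- {marker} ends -->"
    # A callable replacement avoids backslash escapes in new_body being interpreted.
    return pattern.sub(lambda _match: replacement, readme_text)


if __name__ == "__main__":
    readme = "# Hello\n<!-- blog starts -->\nold entry\n<!-- blog ends -->\n"
    print(replace_section(readme, "blog", "* A hypothetical new blog post"))
```

Keeping the markers in the output means the same script can run again on its own result, which is what a scheduled job needs.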
That is awesome.
Yeah, so normally you go to your GitHub profile
and you have your picture and how many followers you have and whatnot.
Some other cool stuff we'll talk about later.
But then it has your pinned repositories
and that green-ish heat map of how frequently you contribute
to various projects or just to GitHub in general.
But now you can have right at the top whatever you want to write,
which that's pretty awesome.
I think I might have to do this.
Yeah.
I mean, you still get all that other stuff, but it's just that other stuff is below this readme info.
That's pretty neat.
Yeah.
Very cool.
And it's super simple, right?
Anyone can write a readme.md file.
Yeah.
And one of the reasons why I brought this up is, I mean, with COVID and quarantine and stuff, I'm glad I'm not
looking for a job. And I think that if you are looking for a job, making your GitHub profile
look professional and show the content that you want to show off, and having things like, you know,
blog posts on your GitHub profile, that's pretty cool. It is really cool. And, you know,
employers say they check people's GitHub profiles, right?
So how many people are going to have these unique special ones
that show they care, right?
Not too many.
Well, the people that listen to our podcast.
Exactly.
Yeah.
All the awesome people.
Okay.
So that's really cool.
I definitely didn't know about that.
Thanks for sharing that.
It looks neat.
So we got this next one from Connor Ferster,
and he works in engineering, but also does data science-y things. And he sent over this project that he works on that is incredibly cool. So you can take math that you might write out by hand, turn it into Python code through pandas and NumPy and whatnot,
scikit-learn, or Scikit in general,
and then run it through Jupyter and get an answer. But he says
he works in design engineering
and you have to do a lot of calculations
and those have to be kept
as part of legal records to
show the project design history.
One thing you can do is do them by hand.
That's kind of crazy.
A lot of people use Excel.
That's a nightmare.
Like Excel is like unbounded go-tos you can't see,
which is always tricky.
So you could do it with Jupyter, but then you've just got this pile of code and here's the answer and so on.
Right.
So you want, like, the theoretical view to verify the formula you're using.
Right.
So he created this thing called Handcalcs, C-A-L-C-S, like calculus or
calculations. Anyway,
Handcalcs.
And the idea is you type Python code into a Jupyter cell,
and then you can do a %%render from the Handcalcs project and
it will turn it into symbolic math
This is beautiful. Yeah, as if you had written it out by hand. Yeah, as an example, in the
little video demo (we've said before we like those, and everybody does), it has, like, square
root symbols with a bunch of symbols underneath them and all sorts of symbols. Yeah, yeah, it looks
like what you would have
had to show if you were in math class, right? Yep, exactly. And it will show steps, like symbolic steps
from step A to step B to step C, and you can say show it shorthand or expand it out longhand and show me
all the steps you used to solve these problems, and all kinds of cool stuff. Wow. Yeah,
the reason it looks so good is it basically converts symbolic Python math over to LaTeX. And LaTeX is like the de facto math representation language for academic papers. So, you know, you want to have like integral signs, you want to have infinite summations, all that kind of stuff. No problem.
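To make the "Python math over to LaTeX" idea concrete, here is a toy sketch built only on the standard library's ast module. It is not how Handcalcs is implemented and it handles just a handful of operators; the function names are invented for this illustration.

```python
import ast


def to_latex(expression):
    """Translate a small subset of Python arithmetic into LaTeX source."""
    return _render(ast.parse(expression, mode="eval").body)


def _parenthesized(node):
    return rf"\left({_render(node)}\right)"


def _is_sum(node):
    return isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Sub))


def _render(node):
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant):
        return str(node.value)
    if isinstance(node, ast.BinOp):
        if isinstance(node.op, ast.Pow):
            # Any compound base needs visible parentheses, e.g. (a*b)**2.
            base = (_parenthesized(node.left)
                    if isinstance(node.left, ast.BinOp) else _render(node.left))
            return f"{{{base}}}^{{{_render(node.right)}}}"
        if isinstance(node.op, ast.Div):
            return rf"\frac{{{_render(node.left)}}}{{{_render(node.right)}}}"
        if isinstance(node.op, ast.Mult):
            left = _parenthesized(node.left) if _is_sum(node.left) else _render(node.left)
            right = _parenthesized(node.right) if _is_sum(node.right) else _render(node.right)
            return rf"{left} \cdot {right}"
        if isinstance(node.op, ast.Add):
            return f"{_render(node.left)} + {_render(node.right)}"
        if isinstance(node.op, ast.Sub):
            right = _parenthesized(node.right) if _is_sum(node.right) else _render(node.right)
            return f"{_render(node.left)} - {right}"
    if (isinstance(node, ast.Call) and getattr(node.func, "id", None) == "sqrt"
            and len(node.args) == 1):
        return rf"\sqrt{{{_render(node.args[0])}}}"
    raise ValueError(f"unsupported syntax: {ast.dump(node)}")


if __name__ == "__main__":
    print(to_latex("sqrt(a**2 + b**2) / (2 * c)"))
```

The real project goes much further (substituted values, units, longhand and shorthand steps), but the core move is the same: walk the parsed expression and emit the matching LaTeX construct for each node.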
This is really cool. Isn't it cool? And then you can also use the symbolic tag to get it to do other,
like show more symbolic stuff.
You can do longhand, shorthand.
You can have it do units.
They'll put units like millimeters cubed or whatever,
and it'll carry the units through the calculation symbolically.
Yeah, but looking at all these formulas, it's giving me nightmares.
Don't look anymore.
Okay.
Well, I guess the thing you would want to think about the trade-off
is, would you rather look at them in their proper mathematical form or in, like, programming form,
meaning, you know, where you turn it into star star and pow instead of proper
exponents and stuff? No, no, I was just kidding. This is beautiful stuff. But when we got into
integrals, that's basically where my brain left, and I never really caught it.
Yeah, yeah. Cool. All right, so if people have programming math but they want to represent it
more nicely, check out Handcalcs. Looks awesome. Yeah, nice. Oh, I'm next. Actually, I'm not. I'd like to talk to all of us about our sponsor. And our sponsor is Talk Python Training and Test & Code today. Tell me about Talk Python Training.
I'll tell you about what I'm working on. This week, I started writing a new course. We have a couple of new courses coming that are fun. And the one that I started working on is called Python Memory Management and Tips. Tell me more. Yeah, so if you've ever wondered what happens,
like how does it free up memory, what algorithms work better with Python memory, and what
algorithms can make it more expensive or slow, what are some of the tips and tricks you can do to
dramatically decrease the memory consumption, like two or three times, with almost exactly the same
code, that type of thing? Well, I'm writing a course on that. Oh, that's neat. Yeah, especially for people,
like we were talking about, since we can get Python on smaller architectures like
CircuitPython and stuff. That's important. So yeah, that's a really good point, that on the small
memory-constrained pieces you might care a lot, for sure.
How about testing code?
Well, I was interviewing somebody recently, David Lord.
His actual interview will come out sometime in August.
But he said, I was looking at Test & Code,
and a lot of the recent episodes really haven't been about testing.
What's up with that?
And I said, yeah, it's "and Code." Test and Code.
But, yeah, so there is a lot of testing focus primarily because I think that software
engineering doesn't talk about testing enough. But I do cover a lot of
stuff. I'm going to highlight a few of the last episodes. I talked
to Sebastián Ramírez on episode 120 about FastAPI and Typer.
Talked with Brett Cannon on episode 119 about packaging and pyproject.toml and what's going on there.
121 is a diversion.
It's a completely different sort of talk.
I talked to somebody about 3D printing and finite state machines and stuff,
and it's just sort of a fun people doing Python and cool things. Very cool. And then
again, thinking about
people possibly looking for jobs,
in episode 122, we
talk about better resumes for software
engineers. So there's
a lot of stuff for everybody, even if you
cringe when you think about testing, please
check out Test & Code. We are
still putting out episodes. And if you want to
hear more, I'd love to hear
what people want to hear about. Yeah, it makes our job so much easier when we get suggestions. Yeah,
suggestions and questions and things that can flow into things. But yeah, like a suggestion to
return the print statement so you don't have to put the parentheses. Yeah, so this is crazy, and I
don't really have much of a comment here, but I saw the thing by Guido and then I saw this article by Jake Edge on lwn.net.
I don't know what LWN stands for, but it doesn't matter. Anyway, the non-return of the Python print
statement. So this is odd, I thought. We have talked about the new PEG parser in Python that's
going on, but one of the things that happened with that is,
I guess one of the reasons why, from Python 2 to 3,
they went from a print statement to a print function
was it made the parsing easier.
But with the PEG parser, you can do all sorts of crazy things,
and you can have functions that syntactically look like statements
and have it work, just work, sort of.
So as an example, we could use a
print statement instead. Instead of having to put the parentheses in, you could avoid the
parentheses. Anyway, he just put it out there as an idea, and essentially people said, no, yuck.
What do you think about this? It's interesting. It would be one fewer thing that
has to happen to move to the next stage for a 2-to-3 conversion. But on the other hand,
this looks like one of the easiest conversions for that step to me. I'm not a fan of having
statements and functions in the language, because it looks to me like, you know, functions basically
solve the same problem with a little more clarity. You know, they're a little
more functional. You can span multiple lines if the arguments are super long, whereas
with the print statement you'd have to use, like, a continuation backslash and other
weirdness like that. So, you know, just because you can doesn't mean you should. I guess that's
probably how I feel about it. But yeah, I wouldn't use it if it were in the language, let me put it
that way. Yeah, I'm in the "no, yuck" camp.
I think that print statement shouldn't have been a statement in the first place.
And I think Python 3 fixed it.
Having it be a function is the right thing to do.
I wish there were more statements that were functions instead.
Also, I wish assert was a function instead of a statement.
Because people thinking that it's a function
and putting parentheses around assert causes problems.
But that's not what this is about.
It's interesting I brought it up
because people should know about this weird, wacky discussion.
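A minimal sketch of the assert pitfall mentioned above (the function names are made up for this example): wrapping the condition and the message in parentheses builds a two-item tuple, and a non-empty tuple is always truthy, so the check can never fail.

```python
def check_with_parentheses(value):
    # Same effect as writing: assert (value > 0, "value must be positive")
    # The tuple is built first so the example runs without a SyntaxWarning.
    condition_and_message = (value > 0, "value must be positive")
    assert condition_and_message  # always passes: the tuple is non-empty
    return "passed"


def check_without_parentheses(value):
    # The statement form: condition, then an optional message after the comma.
    assert value > 0, "value must be positive"
    return "passed"


if __name__ == "__main__":
    print(check_with_parentheses(-1))
    try:
        check_without_parentheses(-1)
    except AssertionError as error:
        print("caught:", error)
```

Recent Python versions warn about the parenthesized form at compile time, which is why the sketch builds the tuple on a separate line.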
Yeah, that's funny.
I'm glad that it got thumbed down.
And I don't think it's going to happen.
You're willing to make a statement about it?
Yes.
All right.
Well, I'm going to make a statement about Flask.
I think Flask, you just had David Lord on the show, right?
That's not out yet, but pretty cool.
Yeah.
And he's lead maintainer of Flask these days.
So Flask is, at least at the API level,
got to be the most popular web framework there is,
because it's slightly more popular than Django
if you look at some of the
recent surveys. But if you look at the other frameworks, many of them are Flask-esque, if you
will, right? Things like Responder or Sanic or whatever, they have this idea of, like,
sort of the same style, right? So there's an article called FastAPI for Flask Users. And I'm actually a big fan of FastAPI.
I'm hoping to have some opportunity to use it soon.
Like the APIs that I've worked on, they've been around for a while.
They predate FastAPI.
And I don't really want to go create a whole new site just so I could use a different framework.
That sounds like maintenance to me.
So I haven't got a chance to use it in production yet.
But FastAPI looks awesome.
So there's an article called FastAPI for Flask users, and it says, look, you probably know the Flask API.
Here is the equivalent for FastAPI.
Okay.
Yeah, and so there's talk about some of the advantages, and they're pretty awesome.
So automatic data validation in FastAPI, which doesn't exist in Flask,
generally speaking; automatic documentation generation; built-in best practices like
type annotations and Pydantic schemas and whatnot. It comes, ships, or recommends, I guess, in
terms of, like, a requirement, you have to have an ASGI server. So it comes with
Uvicorn, which is one of the, it's like Gunicorn plus uvloop for async stuff. And in a lot of
ways, it's super similar. So if you want to create a view method, instead of app.route, you would say
app.get. And so FastAPI, as you'd imagine, the name indicates it's mostly for building APIs?
Yes.
All right.
So when you talk about functions and what they're going to do, you say not just here's
a URL, but here's a URL and an HTTP verb.
So app.get forward slash or app.put slash account or something like that, which is pretty
cool.
In the route, you can specify variables.
So in Flask,
you could have a user ID, and in the string route you would say int colon user ID if you want Flask
to convert that to an integer, right? That's fine. It works. Okay. But the rest of the tooling
doesn't help you know it's an integer; it's an integer just because Flask knows it's going to be an integer,
right? So in FastAPI, you put the variable up there in the route as well.
But then in the function, you give the parameter a type annotation, and then it will actually convert that to an integer using the Python language tools or specification rather than the string
API thing. That's handy. If you want a query string in Flask, you just have a URL,
you can go to request.args, and you can get the value out of the query string.
In FastAPI, you just put the query string values, or sorry, the keys, as arguments, and they just get passed in.
That's pretty cool.
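For reference, the Flask route style described above is written as `@app.route("/users/<int:user_id>")`, while FastAPI uses `@app.get("/users/{user_id}")` on a function whose parameter is annotated `user_id: int`. The sketch below is a toy, standard-library-only illustration of the underlying idea, converting raw string parameters by reading a function's type annotations. It is not FastAPI's implementation, and `call_with_typed_params` and `get_user` are hypothetical names.

```python
import inspect
from typing import get_type_hints


def call_with_typed_params(handler, raw_params):
    """Convert raw string parameters using the handler's type annotations."""
    hints = get_type_hints(handler)
    converted = {}
    for name in inspect.signature(handler).parameters:
        if name not in raw_params:
            continue  # not supplied: fall back to the handler's default
        target_type = hints.get(name, str)
        converted[name] = target_type(raw_params[name])
    return handler(**converted)


def get_user(user_id: int, page: int = 1):
    # By the time this runs, both values are real integers.
    return {"user_id": user_id, "page": page}


if __name__ == "__main__":
    # Path and query values always arrive as strings.
    print(call_with_typed_params(get_user, {"user_id": "42", "page": "3"}))
```

Because the type lives in the signature rather than in a route string, editors and type checkers can see it too, which is the tooling benefit mentioned above.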
If you have an API that takes a JSON post, like it's accepting a JSON document, you can just say it takes a dictionary and that
gets passed in. But you can go way, way further, which is awesome. You can define a Pydantic model,
which is a class that has types and validation on the class, right? Yeah. And then you can say my
view method or my API method takes, like in the example they have, a sentence that has got, like,
various components, nouns, verbs, and whatever. You can say here's a
function and it has an argument called sentence, and it'll take that JSON document, parse it into
the Pydantic model, and pass it to you pre-validated. That's definitely one of the benefits of FastAPI,
this data validation. Yeah, this is like built-in data validation, because
how many times do you spend effort, like,
oh, I got a string but I've got to convert it to an integer, I've got to make sure that this value is here,
I've got to make sure that this one, you know, like, whatever, matches some set of
substrings or whatever. Just let the framework handle it. It also has the equivalent of blueprints,
which it calls routers, and this automatic validation I talked about. So anyway, there's a nice article that says, you know Flask,
let's teach you FastAPI real quick by just doing a this-equals-that.
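The pre-validated model idea can be sketched with a plain dataclass standing in for a Pydantic model. This is a simplified stand-in under stated assumptions, not Pydantic's API: the `Sentence` fields and the `parse_body` helper are invented for illustration, and real Pydantic does far more (coercion, nested models, detailed error reports).

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Sentence:
    subject: str
    verb: str
    word_count: int


def parse_body(model, raw_json):
    """Parse a JSON document into `model`, rejecting missing or mistyped fields."""
    payload = json.loads(raw_json)
    values = {}
    for field in fields(model):
        if field.name not in payload:
            raise ValueError(f"missing field: {field.name}")
        value = payload[field.name]
        if not isinstance(value, field.type):
            raise ValueError(f"{field.name} must be {field.type.__name__}")
        values[field.name] = value
    return model(**values)


def describe(sentence: Sentence):
    # The handler only ever sees an already-validated object.
    return f"{sentence.subject} {sentence.verb} ({sentence.word_count} words)"


if __name__ == "__main__":
    body = '{"subject": "Python", "verb": "rocks", "word_count": 2}'
    print(describe(parse_body(Sentence, body)))
```

The handler body stays free of checking code, which is the "let the framework handle it" point made above.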
Yeah.
I love this because there's a lot of people that have been writing APIs in
flask for a long time.
And so it's just second nature to them and having something to say, Hey,
I want to try FastAPI, but is the learning curve going to be a problem?
Well, with something like this, it's a decoder ring.
And it is set up for, you can just sort of skim through it and go,
well, how do I do URLs?
Oh, this is how you do it.
And URL variables and different things.
It's set up really nice.
Yeah.
Yep.
Definitely fun.
Definitely useful.
So do you use Twitter?
I do use Twitter,
sometimes happily. Sometimes I get dragged into stuff. Sometimes I use it in a write-only mode
where I want to make a statement, but I don't really want to go read it. But yeah, definitely.
Yeah. Well, I have, and this is probably common, sort of a love-hate relationship with Twitter.
I use it a lot in like keeping track of other people, but sometimes I don't
really like that it's a pain to delete old stuff because I think of it as a current conversation.
I don't really look at what somebody wrote a year ago and I don't really care what I wrote a year
ago. So I have used Twitter deletion tools before. They seem kind of weird that I have to go out
and give my credentials to some other website or something, though.
But I know how to.
I'm sure that'll be fine.
It'll be fine.
Don't worry about it.
It'll be fine.
Yeah.
Anyway, but there's APIs, so you could use the Twitter API.
But how?
And so I thought it was really cool that Chris Albon,
who is somebody that tweets about data science a lot, posted a little snippet that he said he uses, or at least he did at one point. But
it's a cool little example of using a library called Tweepy to interact with Twitter
and to delete old tweets for your account. So it's just this really short little Python script,
but it deletes tweets.
It has defaults, but you can change those.
Obviously, it's just a Python script,
so you can change whatever you want.
But it's set up to delete tweets that are older than 62 days
and that have fewer than 100 likes
and that you haven't liked yourself.
So the idea being, if you go through some of your old tweets
and the ones that you're like, oh, yeah, that was cool.
I want to keep that around.
Just like your own tweets, and then run this script,
and it'll delete some old stuff.
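The selection rule described above can be sketched without touching the Twitter API at all. The `Tweet` class and `should_delete` function below are hypothetical stand-ins, not Tweepy objects or the actual script; only the three conditions (older than 62 days, fewer than 100 likes, not liked by you) come from the discussion.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Tweet:
    created_at: datetime
    like_count: int
    liked_by_me: bool


def should_delete(tweet, now, max_age_days=62, min_likes_to_keep=100):
    """Return True when a tweet is old, unpopular, and not liked by its author."""
    is_old = now - tweet.created_at > timedelta(days=max_age_days)
    is_unpopular = tweet.like_count < min_likes_to_keep
    return is_old and is_unpopular and not tweet.liked_by_me


if __name__ == "__main__":
    now = datetime(2020, 7, 22, tzinfo=timezone.utc)
    old_tweet = Tweet(datetime(2020, 1, 1, tzinfo=timezone.utc), 3, False)
    print(should_delete(old_tweet, now))
```

Keeping the thresholds as parameters makes it easy to turn the like count down, as suggested in the conversation that follows.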
I would definitely have to change that 100 count to something else
because I don't think I've ever had a tweet liked by 100 people.
That's a big number.
You know, Twitter used to show how many tweets you actually had,
and I don't think it shows it anymore.
On my profile, at least, I don't see it immediately,
how many tweets I had.
Just followers and following and likes and stuff like that.
But, yeah, pretty cool.
It's like keep the highlights, right?
Just keep my highlights.
I don't need every random thing of,
oh, I went out and had a hamburger today.
People don't need that as a piece of history.
Yeah, and you don't know what's going to stick
and what's going to not.
And I was actually reading an article recently
about Twitter, about what that says to you
if somebody, like for instance, you're trying to get a job,
and somebody looks at your Twitter account,
having the junk in there that nobody really related to,
having that automatically culled out and just having the highlight reel,
that's not a bad idea for some of the old stuff.
Yeah, and you could turn it way down.
You could say, look, if there's no likes or no retweets, just drop it. Yeah, it might
even be good for me just to go back a couple days. But if nobody's liked
it in a couple days, maybe just take it away. That didn't happen.
Yeah, that didn't happen. So people could end up
clinging to their old tweets, but they probably shouldn't. Right. Yeah. So I want to talk
about an article by Itamar Turner-Trauring. Now, we spoke about him, sort of, not by name, I don't think, but
we talked about Fil, the data science memory profiler, a little while ago. Okay. All right, so
he's the guy who wrote that. I actually had him on Talk Python on episode 274 as well, talking about
that. So that was pretty cool. But independent of that, he wrote this article that I came across
that I liked called Clinging to Memory:
How Python Function Calls Can Increase Your Memory Usage.
And this is part of my research for working on that course
that I was talking about, that Python memory management stuff.
So he talks about, hey, we're going to have this thing. It's going to load
up some NumPy data, and then it's going to pass it to a function. The function is going to make some
changes. Take the return value, pass that to
another function, and it's going to make some more
changes. So basically three steps. And he said, look, we'd expect that we've loaded two gigs of memory,
and yet when you run Fil against it, you end up with three gigs of maximum memory
usage, which is a little bit weird. And the reason is those initial, like, intermediate values that
you're working with on step one and step two. The way Python decides a variable goes out of scope
is, in this case, when the function returns. Not when it's never used again, but just when the function
returns, in which case it's going
to hang on to all the intermediate copies all the way to the end. Interesting. Right. Like, some
languages, they determine that and they get rid of it. Like in C#, the JIT compiler will notice,
like, okay, a variable is not used after halfway through, so we're going to make it eligible for GC,
basically, unless it's
in debug mode. Then it keeps it around in case somebody sets a breakpoint and wants to see it. So there are
a lot of tricks that things can do. Python doesn't do them. So it's going to stick
around for the length of the function. So what can you do to make it not stick around as long?
Because maybe you only have two gigs and you don't want to use three gigs or whatever.
Right. So he talks about three different solutions. One is to not hold on to the intermediate variables and just chain it into one
massive function call, like pass the results of step one to step two, pass the results of step two to
step three, and there are no variables holding on, so they'll be gone. Right? That's an option. Another
one is to, like, iteratively change the variable. Say, like, data equals load data from the
first step, data equals step two of processing of data, data equals step three of processing of data.
And that way you're dropping the reference count to the intermediate steps along the
way. Right? So that's an option. And then there's a third one that's more complicated, about creating
like a sort of ownership management type of thing that people can check out as well. But I just thought it was interesting to think about how long do these things stick around and what techniques might you use that are incredibly simple, like just reuse the variable name, problem solved, in terms of having too much memory usage.
Interesting. Yeah. When I look at these, they all look kind of the same,
but having the answer be that they use different amounts of memory is not obvious.
Right, it's not obvious at all.
But you could easily look at this one where you're iteratively changing the variable
and say, oh, you shouldn't do that.
You should name it more clearly because maybe the type is changing along the way
and it would be weird.
But you could say, yeah, but this one works because it
will fit into RAM and the other one won't. So we're willing to accept this, like, slightly
imperfect code because it works better. Anyway, there are a lot of interesting trade-offs you can
make here. It's only the tip of the iceberg for things like this you could do, I think,
but it's interesting to just put it on your radar. Yeah, that is interesting.
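The first two options discussed above (chaining the calls, and re-using one variable name) can be compared with a small experiment. This is a sketch under assumptions: plain lists stand in for the NumPy arrays in the article, tracemalloc stands in for the Fil profiler, and all function names are made up. The third, ownership-transfer option is not shown.

```python
import tracemalloc


def load_data(size=100_000):
    return list(range(size))


def modify1(data):
    return [value * 2 for value in data]


def modify2(data):
    return [value + 1 for value in data]


def process_keeping_locals():
    data = load_data()
    doubled = modify1(data)
    # 'data' and 'doubled' both stay alive while modify2 builds a third list.
    return modify2(doubled)


def process_chained():
    # No local names: each intermediate list is released once the next call has it.
    return modify2(modify1(load_data()))


def process_reusing_name():
    data = load_data()
    data = modify1(data)  # rebinding drops the last reference to the loaded list
    data = modify2(data)  # and here to the doubled list
    return data


def peak_bytes(func):
    """Run func and report the peak traced allocation in bytes."""
    tracemalloc.start()
    func()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


if __name__ == "__main__":
    for func in (process_keeping_locals, process_chained, process_reusing_name):
        print(f"{func.__name__}: {peak_bytes(func):,} bytes at peak")
```

All three functions return the same result; only how long the intermediate lists stay referenced differs, and that is what moves the peak.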
Yeah, actually, like we said, I think that more and more
as we start using Python for other applications
or non-desktop kind of things, like when we're in non-server things,
if we're using it for, there's a couple ends of it.
If you're using small devices like in CircuitPython or something,
you're going to care about this stuff. But also, if you're using very large data sets, then we care about it again.
And it doesn't matter how much memory your computer has, having multiple copies of gigabytes
of data when you don't have to will slow things down. Yeah, for sure. Or even if it's like an API
and what you happen to be doing is not extreme, but you happen to be doing a thousand of them at a time. Same story.
Yeah, exactly. And as we use Python more and more in more applications, we're going to start caring about that again.
Yeah, absolutely.
That's the end of our six. I actually have been just so swamped with stuff. I don't have anything extra to talk about. Do you have any extra items?
I do.
And this is just a follow-up email we got from a listener named Adam.
Thank you, Adam.
And you had talked about pickling things.
Apparently you're a fan of dill pickles and, no wait, pickled strings, pickled lists,
pickled dictionaries.
No, we were talking about pickling and how it didn't make sense most of the time.
But there might be some use cases, and you're like, what might be a use case that we really need? Right.
So Adam said, hey, I've got a use case that worked for us. He worked on an API that spoke to a third-party
service that was wonky, and it was over raw sockets, so you'd have to create these byte arrays
and send them along. And the thing was also not available 24/7. It would sometimes crash, things like that. So what they would do is they ... details and whatnot. So it was like, uh-oh, we've got to save this exactly as we would have sent it. Let's
just pickle it. That's pretty cool. Yeah, it seems like a pretty good one. Yeah, and there was a feature
flag they could turn on and off, which was kind of cool. Yeah, they could also do that for the messages
they got from the service. Pretty cool. Real quick, Python 3.8.4 is out. I've already brew upgraded mine, so that's all good. And big news,
big news, I can't believe it, I've been selected. If I go to my GitHub profile, I don't have the cool
README thing that you're talking about, but under my picture it says I have a Pro account, because
I had to pay for some stuff. But I noticed that I'm an Arctic Code Vault Contributor. Wow. So remember
we spoke about the Arctic Code Vault
where GitHub is taking a bunch of the popular projects
and then like sticking them over in some vault in Norway
or somewhere like that, Greenland, to preserve it.
And if the code that you've contributed to GitHub
was selected, then now you get this cool little highlight
that's like a snowflake that says Arctic Code Vault Contributor.
And you can hover over it and it'll say why.
So yeah, I've contributed apparently to a couple of things.
And you might be as well.
Well, yeah, you, the listener might,
but I just checked mine and I am too.
So that's neat.
Yeah, awesome.
Neato.
Yeah, so.
I think we covered that once, the Code Vault thing.
Yeah, we definitely covered the Code Vault.
But yeah, I think this you are a contributor thing,
little badge is new.
And I don't know, it makes me happier than it probably should.
Yes, it's so cool.
What else is cool?
Yeah, it's super neat.
Testing is cool.
I love testing.
And having good code coverage is cool
Yeah. So I've got a joke for you, a cartoon if you will. Okay. It's from this place called Geek and Poke.
They have all sorts of good stuff there, and you can click on the picture and it'll take you to
the actual comic. So there are two people, a woman developer and a man developer, staring at each
other, and the woman is the more senior one. They're
looking at each other, and it says QA best practices. She's looking at the guy and says, never
just remove a failing test. The guy stares back blankly for a second and says, uh, only remove the
assert statements. Yep. It says, how to sustain a decent code coverage. Yeah, you can fix a test.
You can make a test not fail.
Remove the assert statements.
It's good.
Yeah, that's funny.
You said you actually test for failure, though, on yours,
that they potentially could fail.
Yeah, well, I think that's one of the reasons why we do code coverage on all,
or not code coverage.
We do code coverage, but we also do-
Code review.
What's that word again?
Review.
Yes, code reviews.
Yes, we do code reviews on tests
because we have had tests show up before that exercise.
With test equipment, we do a lot of complicated things.
You set up everything, run some stuff,
and then we often have people forget to check anything at the end.
Passes.
So it is important to look at the end to see,
is there any way this can actually fail?
Or is it just exercising things?
I mean, actually exercising things isn't a bad thing
because you can get asserts in your code or exceptions.
Or an exception, yeah.
You still test something, but you're not testing very much.
Yeah.
You're testing it runs, basically.
Yeah.
Awesome.
Well, yeah, just never remove a failing test, only the assert statements.
That's terrible.
We should not give that idea to people.
No, we should totally delete this joke.
It didn't happen.
It wasn't funny.
Yeah, it wasn't funny.
Thanks a lot, Michael.
Yeah, you bet.
Great to see you, as always.
Thank you for listening to Python Bytes.
Follow the show on Twitter at Python Bytes.
That's Python Bytes as in B-Y-T-E-S.
And get the full show notes at pythonbytes.fm.
If you have a news item you want featured,
just visit pythonbytes.fm and send it our way.
We're always on the lookout for sharing something cool.
This is Brian Okken,
and on behalf of myself and Michael Kennedy,
thank you for listening and sharing this podcast
with your friends and colleagues.