Python Bytes - #58 Better cache decorators and another take on type hints
Episode Date: December 26, 2017
Topics covered in this episode:
* Instagram open sources MonkeyType
* cachetools
* Going Fast with SQLite and Python
* The graphing calculator that makes learning math easier
* Installing Python Packages... from a Jupyter Notebook
* Videos from PyConDE 2017 are online
* Extras
* Joke
See the full show notes for this episode on the website at pythonbytes.fm/58
Transcript
Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.
This is episode 58, recorded December 19th, 2017.
I'm Michael Kennedy.
And I'm Brian Okken.
And we have a bunch of awesome stuff to share with you.
First, I want to say this episode is brought to you by DigitalOcean, so thank you, DigitalOcean.
Yeah, thanks, DigitalOcean.
Yes, thank you, DigitalOcean.
Indeed. Love their stuff. Tell you more about it later.
Let's start with monkeying around a bit.
I like it.
There's so much monkey stuff in Python; I'm not sure why.
You're right.
Yeah, we've got monkey patching and things like that.
But this is MonkeyType from Instagram.
And in episode 54, we talked about pyannotate, which is a way to add type annotations to your code while it's running.
That was from Dropbox, but at the moment it's Python 2 only. This one is a similar sort of
thing, but it's from Instagram and it's Python 3 only. And it doesn't do the comment style; it does the
Python 3 style type annotations. So I'm really excited to try this out.
Yeah, that sounds really cool.
You know, I'm definitely heartened to see
a lot of people who have large code bases,
Dropbox, Instagram, and so on,
making these tools.
They're going to bring everybody along
to modern Python really nicely.
It's very good.
And I like the way the types are moving.
I was kind of lukewarm on types for Python at first,
but using them to try to solidify the quality of large code bases
makes total sense.
And I like what they're doing with it.
I really like adding the type hints.
Just in a couple of places where the autocomplete falls down,
like last week we talked about MongoEngine.
You do a query with MongoEngine,
and as far as the tooling is concerned,
that's just some random thing that came back.
It has no idea what it is.
But if you add just a few type hints,
the editor can detect that the rest of your application
is working with one of these concrete types.
So just a little type hint here and there
goes a long way.
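For instance, here's a minimal sketch of the kind of hint we mean; the User model and find_user helper are hypothetical, not from the show, but one return annotation like this lets the editor follow a MongoEngine query result everywhere it's used.

```python
# Hypothetical MongoEngine model and query helper; the point is the
# return annotation, which tells the editor what the query yields.
from typing import Optional

import mongoengine

class User(mongoengine.Document):
    email = mongoengine.StringField(required=True)

def find_user(email: str) -> Optional[User]:
    # Without the annotation, tooling sees "some random thing";
    # with it, autocomplete works wherever the result is used.
    return User.objects(email=email).first()
```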
And I guess that's probably for new code, right?
But for older code, if you want to switch from Python 2 to Python 3,
having that as a solid foundation,
so you really know what you're working with as you move it around,
I think that's really valuable right there on its own.
One of the things the article we link to talks about is
how it works: you're actually running your code,
and it pays attention to which types
are going through different parts of your code.
You can run it
while you're running tests,
but it did have a note,
which I thought was interesting,
saying that if you have the style of testing
where you're using a lot of mock objects,
the types are going to be all messed up.
So be aware of that, and you may want to generate your types some other way.
Oh, right. Because it will see the mock. And I don't think that's what's supposed to go there.
Not the thing it is mocking.
Yeah, exactly.
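To make that caveat concrete, here's a small hypothetical sketch, not from the article, of how a runtime type collector ends up seeing a Mock instead of the real dependency:

```python
# Hypothetical example of the mock caveat: under test, the observed
# argument type is Mock, not the real mailer class you'd want in
# your annotations.
from unittest.mock import Mock

def send_welcome(mailer, address):
    mailer.send(address, "Welcome aboard!")

# A test like this exercises send_welcome with a Mock, so a runtime
# tracer would record send_welcome(mailer: Mock, address: str) -> None.
send_welcome(Mock(), "user@example.com")
```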
Interesting. Cool. So sometimes you can make your code fast by optimizing it. Maybe it talks to a
database and you're going to put the
right kind of indexes in there. And of course that should be fast, but other times you're working
with things that are out of your control. You need to call the web service to get some kind of stock
quote or whatever. And you can only be as fast as that web service or whatever, unless you're
willing to hang on to it for a little bit and do some caching, right? Yeah.
So Python, I don't know how many people know,
but there's something called functools in Python.
And in there, there's a decorator called lru_cache.
And you can put that onto any function.
And it will look at the arguments going in.
And if it sees that same series of arguments again, and it can take multiple arguments,
if it sees all the same arguments it's seen before, it will just return that value instantly.
So that's pretty cool, right? I did not know that. That is cool. Yeah. Yeah. So suppose I'm calling
like a weather API and I'm doing it from my website and I've got all these different people
coming in and calling it and it turns out to be slow, I could actually throw that onto it,
and it would say if two people ask for the same zip code,
potentially it's just going to return that instantly, just out of memory.
So that's really cool, but it only works in certain ways.
For example, if that method takes a list,
well, lists are not hashable,
and the lru_cache decorator requires all the arguments it's going to key on
to be hashable, for example.
So it's cool, but it's kind of limited in certain ways.
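As a quick illustration, here's a minimal lru_cache sketch; the slow weather lookup is made up, but the decorator usage is straight from the standard library.

```python
# Minimal functools.lru_cache sketch; current_temp stands in for a
# slow web-service call.
import functools
import time

@functools.lru_cache(maxsize=128)
def current_temp(zip_code: str) -> float:
    time.sleep(2)      # pretend this is a slow weather API
    return 21.5        # made-up reading

current_temp("97201")  # slow the first time
current_temp("97201")  # instant: same argument, served from the cache
```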
So there's this other project
that's kind of inspired by this idea,
and it goes much further
and has a lot more options, called cachetools.
I'm guessing you probably haven't heard of cachetools
if you haven't heard of lru_cache.
I have not.
So this is a project
that has a bunch of different cache implementations as well as a more flexible decorator. Actually,
a couple of decorators you can use in exactly the same way. So it defines a basic cache,
an LFU cache, that's least frequently used, because eventually your cache may get full
and run out of memory, or you can say only hold 100 items.
Well, when you get to 101,
which one do you throw away?
Well, you can use the LFU cache
and throw away the one that was least frequently used,
or the LRU cache, which throws away the one that was least recently used,
or have what's called a TTL cache,
which is time to live,
like cache everything for five minutes.
I'm sure we have the memory for this.
How could that be bad, right?
Yeah. And then you get that call in the middle of the night. Server's down.
The TTL cache seems like a fit for, for instance, your example of grabbing some weather data or something. I mean, you could hold onto the weather data for at least a few minutes before
refreshing it. Right. And what's really cool about the TTL one is it naturally expires the data
in a way that you can predict and understand.
So if you're like, look,
the weather's not gonna change that much in 10 minutes,
just cache everything for 10 minutes
and it automatically will go get a new one
after the 11th minute, right?
So that's a really nice way to do it.
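For instance, here's a minimal sketch with cachetools' TTLCache; the forecast lookup is hypothetical, but TTLCache and the cached decorator are the library features being described.

```python
# Sketch of a time-to-live cache: hold at most 100 entries, each for
# 10 minutes, then fetch fresh data automatically.
import cachetools

weather_cache = cachetools.TTLCache(maxsize=100, ttl=600)

@cachetools.cached(weather_cache)
def get_forecast(zip_code: str) -> str:
    return "sunny"  # stand-in for a slow weather-API call
```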
So these are cache implementations,
and then there's what's called memoizing.
So memoization is this concept that we've been talking about.
There are memoizing decorators, like cachetools.cached, which works like the one we talked about before.
But you can plug in all sorts of stuff.
You can plug in any of the cache implementations we've talked about, or even a straight dictionary.
It takes a function that will generate the key, so it can cache non-hashable things, because you could generate some kind of indicator, like an ID out of a database object or something.
You can pass interesting things like a weakref.WeakValueDictionary, so you don't actually hold on to the memory of the things, which is pretty cool.
And it even has a locking object you can pass for thread safety, if you've got to recreate stuff in the cache.
So really, it's the idea of lru_cache in functools, but way more flexible and configurable.
Oh, this is nice.
It's insanely easy to use them, right? You just throw a decorator on a slow function and now it's a fast function. Just make sure you understand what that means.
Yeah, so definitely use it in conjunction with some measurement before you prematurely optimize.
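Here's a small sketch of those options, with a hypothetical search function that takes an unhashable list; the key function and lock arguments are the cachetools features just mentioned.

```python
# cachetools.cached with a custom key function (so an unhashable list
# argument can still be cached) and a lock for thread safety.
import threading

import cachetools
from cachetools.keys import hashkey

cache = cachetools.LRUCache(maxsize=256)
lock = threading.Lock()

@cachetools.cached(cache, key=lambda tags: hashkey(tuple(tags)), lock=lock)
def search(tags: list) -> str:
    # Stand-in for an expensive lookup keyed by a list of tags.
    return ",".join(sorted(tags))
```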
Yeah, it's cool though.
Speaking of going fast.
Yeah, speaking of going fast.
One of the things that people often do with a new project,
instead of deciding which database they're going to use down the road,
is throw in SQLite. I don't know quite how you pronounce that.
But it's built into Python
and you don't have to install anything extra.
I guess Python calls the module sqlite3.
I don't know, were there two of them before that?
Yeah, I guess so.
That's something folks use,
and then they sometimes migrate to something else.
And there are a lot of applications
that just stick with it.
There's an article called
Going Fast with SQLite and Python
that talks about some of the ways this fellow came up with to make it go quickly.
Make it run faster.
Yeah, that's great.
So SQLite is really awesome.
Like it's an embedded database that ships with Python.
You don't have to do anything to have it.
You just have it.
It runs in process.
So there's like zero latency over the network or overhead or
anything like that. It's actually really powerful if you're willing to, you know, have a sort of
in-process database. Yeah. And I had the impression that it was simpler than it is, but it
does quite a few cool things, as I was reading through this. Like, I didn't know you could do
user defined functions and you've got control over transactions
and auto commits and things like that. It's gotten to be pretty feature-rich. It's cool.
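As a rough sketch of a couple of those features (the table and the reverse function here are just for illustration, not from the article):

```python
# sqlite3 ships with Python: register a user-defined SQL function and
# run in autocommit mode by setting isolation_level to None.
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit

# Make a Python function callable from SQL.
conn.create_function("reverse", 1, lambda s: s[::-1])

conn.execute("CREATE TABLE words (w TEXT)")
conn.execute("INSERT INTO words VALUES ('python')")
print(conn.execute("SELECT reverse(w) FROM words").fetchone())  # ('nohtyp',)
```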
It's a pretty dense article, but I think it's a good way to throw yourself into the deep end if you want
to jump into sqlite3. Yeah, yeah, it's definitely cool. Certainly a good way to get started, so you
don't have to worry about extra servers and network connectivity and keeping all that safe. Very cool. So before we get on to the next item,
I want to talk quickly about DigitalOcean. This website that hosts the podcast, delivers the
podcast feed, and a lot of the other stuff that I'm doing online runs on DigitalOcean.
Very, very nice experience. They've got incredibly fast, reliable, and cheap servers. $5, $10.
You can have servers
based on SSDs. Really
up and running in 30 seconds.
You just SSH in, get
them all set up and ready to roll. So,
if you want nice, fast servers,
check them out at digitalocean.com
and let them know that we sent you their
way. And they've been really good about sponsoring the show
and I really appreciate that. It's great.
Yeah, definitely. Thank you. Thank you, DigitalOcean.
One of the things
I think many people
got their first programming experience
on is a graphing calculator.
Right? Remember back when you were in middle
school or something and you had like a TI whatever,
and you could make it do dumb things?
Well, yeah. I even started programming
my HP calculator.
It didn't even graph.
So I was programming that.
Yeah, exactly.
Yeah, so HP or Texas Instruments or whatever.
So one of the really cool finds we've dug up recently
is this graphing calculator by this company called NumWorks.
And you might think, okay,
why is this graphing calculator interesting? Well, the way that you program it is in Python.
Yeah, that's cool.
It's really cool. So the programming language literally is Python, and it, you know, will do
sort of visual math. It's even got a free emulator, so you can run it on your Mac or your PC and check it out
there. It does graphing, all kinds of stuff. And then it even has a way for you to work with the
hardware; it gives you some stats on 3D printing and things like that if you want to
do other, more hardware-oriented things with it. But definitely, this concept of a full-blown
graphing calculator where you program it in Python, that's awesome.
And hackable and all. And yet, supposedly, it's still going to be okay for
use on the SAT, even in August next year.
Yeah, that's actually a big deal. Like, I know some of the graphing calculators are banned
because people use them to cheat or they do too much or whatever. At least for now, until they figure out how to pip install an SAT helper.
Yeah, exactly.
Nice.
Yeah, it's pretty fun.
Check that out.
It's nice to see that showing up in the calculator space because that really is like the first programmable thing lots of kids really have to interact with.
We don't really see that too much anymore in consumer things, but in test and measurement, we see
some programmable features show up in different sorts of devices, and having the programming
language be Python in more places is good. Yeah, definitely. Last week we talked about
having Python be the programming language of Excel, right? For example, it seems like a really,
really great choice if you want to add a little programmability
to whatever it is you're doing. Python seems like a great choice for that language. So it's nice to see
it here as well. One of the problems you might have in the data science space: if you work with
a Jupyter notebook and you just have access to the notebook, but you need a library that's not
on the server, what are you going to do? I didn't know this was a problem, actually. I haven't been using Jupyter Notebooks enough to run into this issue.
But a lot of people get their Python and Jupyter from installing a conda package or some other bundled thing.
And you can't just go off and pip install.
I didn't know you couldn't just go off and pip install. But Jake VanderPlas
wrote up this article
on installing Python packages.
And there's a couple of them,
and I'm not even going to try to recite them,
but I pulled out some of the cheats
for pip and conda
on how to install from within a notebook.
And there's some magical incantations.
But the article also goes through
all the different reasons why you have to do this.
Yeah, and they're not obvious at all.
Like I would have never figured those out.
Yeah.
So it's good that somebody figured it out.
Yes.
Thank you, Jake.
That's awesome.
So if you're doing data science and Jupyter Notebooks, this is really, really cool.
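The gist of one of those incantations, roughly: in a notebook cell you'd use the shell-escape form the article shows, but this plain-Python sketch makes the same underlying point, that you should install against the interpreter the kernel is actually running.

```python
# Rough sketch of the idea: install into the notebook kernel's own
# interpreter (sys.executable), not whatever "pip" is first on PATH.
import subprocess
import sys

subprocess.check_call([sys.executable, "-m", "pip", "install", "requests"])
```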
So the last thing I want to share is that the videos from PyCon DE, as in Deutschland, 2017 are online.
Miroslav let us know on Twitter about this, and there are a ton of interesting talks
over there. Quite a bunch of cool talks; I'm not sure how many, but I would guess
50 or so. Are they all in German? Here's the thing.
As far as I can tell, I've only seen English ones, and I looked through a bunch of
them. So there's a cool talk called Technical Lessons from Pythonic Refactoring at Yelp by a
woman named Jenny Chung, and a bunch of other ones. It's kind of hard to read all the titles right
here, but I've looked through, and I'm definitely filling up my playlist with stuff that I need to start watching, because
there's a lot of cool stuff here. Yves Hilpisch, who was on my podcast, talks about why Python
has taken over in finance, for example, right? And we don't even have it in Excel yet. So there's
lots of cool stuff here. It was in Karlsruhe in Germany, which is a lovely place.
I wish I could have gone to the conference, but second best thing, watch it online.
I'm really glad that we have those.
Can't wait.
Yeah, all the PyCons do such a good job of getting their content online straight away,
you know, within a day or two of the presentation.
So it really makes a big difference, especially since they sell out.
Yeah, definitely.
I'm definitely going to be at the one in Cleveland, PyCon US.
Are you going to make it?
Yeah, I think so.
That's the plan.
That's the plan.
It's going to be fun.
All right.
But for now, we'll enjoy the ones in Germany.
Any news?
End of the year?
End of the year.
Yeah.
No, I'm trying to come up with some fun Python projects to work on in my free time.
Get a Raspberry Pi and do something with it.
Maybe plug it into Home Assistant.
Things like that.
But I haven't done anything.
Yeah, me either.
I've got two of them sitting there, but yeah.
If you could just put some code on them and make them do some cool stuff.
Right on.
All right.
Well, Brian, thanks again for sharing the news with everyone.
It's great to chat with you as always.
Yeah, thank you.
Thank you for listening to Python Bytes. Follow the show on Twitter via @pythonbytes. That's Python Bytes as in B-Y-T-E-S.