Python Bytes - #75 pypi.org officially launches
Episode Date: April 28, 2018
Topics covered in this episode: numba; pip 10 is out!; Pandas only like modern Python; Extras; Joke
See the full show notes for this episode on the website at pythonbytes.fm/75...
Transcript
Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.
This is episode 75, recorded April 25th, 2018.
I'm Michael Kennedy.
And I'm Brian Okken.
And we've got a bunch of awesome stuff lined up for you, like always.
But before we get to it, I want to say thank you to Datadog.
They're sponsoring this episode like they have many, and they're a big part of why the show keeps going.
So thank you, Datadog. So Brian, people often
refer to the thing that runs your Python code when you type Python as the interpreter, but only
sometimes does it actually interpret. Sometimes it compiles, sometimes it JIT compiles, depending on
what underlying thing you're using to run it, right? PyPy versus CPython versus, say, Python.net.
You found an interesting one like that for this week, right?
I had to go look it up.
I was surprised we hadn't covered it yet.
There's a package called Numba,
and it's built on top of NumPy,
but it's a fairly easy way to speed up a bunch of your code
by adding a just-in-time compiler to it.
And I've also linked,
so I've got a link to the library itself
and also to an introduction article
that I found that's pretty nice.
It's pretty easy to use,
and the article we're linking to
has some speed-up tests
where you just put a little JIT decorator
on a function that you want to speed up.
And there's a bunch of flags to it.
You can try to do it ahead of time or not,
but it'll compile down some of your code to C code as good as it can.
And you can specify data types too if you want to even allow it to be even faster.
And I'm pretty blown away with how trivial it is
to add some of this stuff to your code and speed it up.
Some of the speed ups are like 78 times faster and stuff,
depending on your algorithm.
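As a minimal sketch of the decorator approach described here: the function name and the loop are hypothetical, and the fallback decorator is an assumption added only so the example still runs where Numba is not installed. Any speedup depends heavily on the algorithm, so no timing is claimed.

```python
# Use Numba's njit decorator if available; njit is shorthand for jit(nopython=True).
try:
    from numba import njit
except ImportError:
    # Assumed fallback for this sketch: a no-op decorator, so the code
    # still runs (as ordinary, uncompiled Python) without Numba.
    def njit(func):
        return func


@njit
def sum_of_squares(n):
    """Add up i*i for i in range(n) with a tight numeric loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


# With Numba installed, the first call triggers compilation;
# later calls reuse the compiled version.
result = sum_of_squares(1000)
print(result)
```

Tight numeric loops like this one are the kind of code the decorator is aimed at; general-purpose Python code is outside its scope.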
That is really awesome.
And this thing lives in such an interesting place.
Like when I set the stage, it's not CPython,
but it's also not a full compiler.
It's not a general-purpose JIT compiler either.
It doesn't attempt to be a general solution.
Something like PyPy, P-Y-P-Y, is a general purpose JIT compiler trying to make all your Python fast.
So they say that this is a compiler for Python arrays and numerical functions, period.
That's pretty awesome, right?
There's a lot of people working with that sort of stuff that a lot of the code doesn't have to be
really fast, but then they've got a big blob of data that they've got to deal with and run some
algorithm over it. And in those cases with a big array, that's a great place to apply this.
Yeah, it's really cool. And so if you have tight loops that are slow because of some kind of
NumPy thing or arrays or other numerical operations, slap a decorator on it, make it go fast.
Pretty sweet. Yeah. And we had a comment from somebody that didn't know whether or not we
talked about libraries. They thought we were just talking about articles and stuff. And
I do want to include libraries. And so I've been trying to make sure going back and some of the
ones that people really
ought to know about if they don't know already, we'll try to get those in here and there.
So it's good.
Yeah, sure.
Absolutely.
If we haven't talked about it, it's definitely fair game for being covered on the show.
So speaking of libraries that people care about, what's the most common way to get them?
pip install.
pip install.
Well, as of recently, you pip install with pip 10. So if you
haven't yet, you should pip install dash dash upgrade pip, maybe put a pip3 on the front
depending on what OS you're on. But yeah, so there's a brand new version of pip, and a major version as
well. That's pretty cool, right?
Yeah. So what do we get that's new, do you know?
Yeah, so we no longer have Python 2.6 support. It is out. If you're living, I don't know,
way, way in the past, like 2006, "it was good enough for me, it's good enough for my code," then you
should stick on pip 9, because it's no longer going to support it. It supports this new feature
that's described by PEP 518, which allows you to specify what packages are required
to build from source.
So that's kind of cool.
You have like a,
you can say these are requirements
to start from source
and these are other requirements.
It improves Unicode on Windows.
It has a pip config command.
I've never used pip config.
Have you?
I don't think so.
I think it's new.
So apparently you can set up
like default behaviors
and stuff like that.
And it can be based on virtual environments or it can be based on users or it can be based
on machines.
So this is probably going to be a nice handy thing to dig into.
Maybe we should do a whole section on just leveraging pip config.
Also, if you pip install dash dash upgrade a thing, that used to try to upgrade all the
dependencies, if I remember correctly. Now the default upgrade strategy will be only to upgrade dependencies if needed.
Right?
So if the thing says I require docopt 6 and you have docopt 4, it'll upgrade it.
But if it doesn't say it requires that version, then it'll just leave it alone.
So that could be for better or worse.
I'm not sure.
And then a bunch of bug fixes.
Okay, neat.
Yeah.
So if you're out there and you're not living super far in the past,
be sure to upgrade your pip. It's nice. And there's even been a point release since then.
One of the things I really like in pip 10, this is a minor thing, is that when you type pip list,
it lists your packages. And it used to, in pip 9, give you a warning
that there's a table configuration and you can do the old style or the new style.
And that's gone. And in pip 10, it just gives you the table, and I like that.
Yeah, that's really awesome.
I didn't like that warning there because if you're doing screencasts or videos or anything, it looked like something was wrong, but it wasn't actually wrong.
It was just like, hey, we're changing how this works, so I'm glad that that's over.
Yeah. So when I have functions that just have a bunch of positional arguments,
right? And it's like true, true, false, true. Like what the heck does that mean? Who knows,
right? So you've got a nice way to fix that. That's a modern Python only thing, right?
I'm chuckling because I'm just leaving Michael hanging on the transitions between
articles today, but I'll try to work on that.
Sorry, I got it.
This is actually pretty awesome.
There's a, I don't know how I missed this.
There's an article from Trey Hunner called Keyword (Named) Arguments in Python: How to Use Them.
And basically, I think he's right.
I was talking to him earlier and a lot of people looked at this article and went, yeah, I know how to use keyword arguments.
So keyword arguments are very useful.
You always have to name the arguments for a function.
And the caller of the function can use those.
You can specify them.
And if you don't specify them, you have to send them in order.
But if you do specify them, you can rearrange the order if you want. One of the ways they're often used is when you've got a whole bunch of default values for arguments and a caller just needs to override
like one or two. You can just name it that way. But one of the things in Python 3 is the
different use of the asterisk in all of this. And there's a way to kind of separate your
positional arguments from your named only arguments.
And if you stick a star in as one of the variables, it forces everything to the right to be only named.
So a caller has to provide those with names.
And that's something that I completely missed.
And it's pretty cool.
I like it.
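A short sketch of the bare star described here. The function and its parameters are hypothetical; the point is only that everything to the right of the star must be passed by name, so a call like true, true, false can't sneak in positionally.

```python
def send_report(recipient, subject, *, urgent=False, include_charts=True, dry_run=False):
    """Hypothetical function: everything after the bare * is keyword-only."""
    return {
        "recipient": recipient,
        "subject": subject,
        "urgent": urgent,
        "include_charts": include_charts,
        "dry_run": dry_run,
    }


# The caller has to spell out what each flag means.
report = send_report("ops@example.com", "Weekly numbers", urgent=True, dry_run=True)

# Passing the flags positionally is rejected with a TypeError.
try:
    send_report("ops@example.com", "Weekly numbers", True, False, True)
except TypeError:
    positional_call_rejected = True
else:
    positional_call_rejected = False

print(report["urgent"], positional_call_rejected)
```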
I really like this feature, and I implore people who are using star star kwargs, and that's what you pass to the
function, maybe a star args thrown in there for additional pain and suffering, to look at this
and think about how you might be able to restructure your code like this. So a lot of times
you'll have a function and you want to take a variable amount of things.
And so the way they do it is they say star star kwargs.
You look at that signature and you're like,
what do I do here?
I have no idea.
I sure hope the documentation tells me
because the function says I take anything,
you make up a name, I'll take it.
But there's usually a handful of things it actually takes.
Yeah.
Right, there's a bounded set, like these are the five things you can pass. They're not all required. Put them in as keyword arguments, right? But if it's star star kwargs, you're just saying to everybody, you know, good luck, go hunting. Hopefully I documented this right.
Yeah, or you find something on Stack Overflow. So you could say star, comma, and then keyword, the argument name
equals the default value, argument name equals default value. And then it has exactly the same
behavior, except you can see both in your tooling or editor, as well as in looking at the function,
what it takes, and you call it exactly the same way. So I really love this feature.
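Here is a sketch of that restructuring, with hypothetical function names and options. Both versions are called the same way, but only the explicit one shows its options in the signature, and only the explicit one catches a misspelled option.

```python
import inspect


def draw_label_kwargs(text, **kwargs):
    """Opaque version: the accepted options are hidden inside the body."""
    bold = kwargs.get("bold", False)
    color = kwargs.get("color", "black")
    return (text, bold, color)


def draw_label(text, *, bold=False, color="black"):
    """Explicit version: the same options, visible as keyword-only parameters."""
    return (text, bold, color)


# Callers use both versions in exactly the same way.
same_result = draw_label_kwargs("hi", color="red") == draw_label("hi", color="red")

# Tooling can only discover the options of the explicit version.
opaque_params = list(inspect.signature(draw_label_kwargs).parameters)
explicit_params = list(inspect.signature(draw_label).parameters)

# A misspelled option is silently ignored by the **kwargs version...
silently_ignored = draw_label_kwargs("hi", colour="red")

# ...but raises TypeError with explicit keyword-only parameters.
try:
    draw_label("hi", colour="red")
except TypeError:
    typo_caught = True
else:
    typo_caught = False

print(opaque_params, explicit_params, typo_caught)
```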
I love it too. And I like your explanation of other reasons to use it.
I'm with you.
The star star kwargs, it just makes it so the user of that function has no idea what they're supposed to pass in.
The worst offender of this is the AWS API.
It's super frustrating.
Like sometimes it's dictionaries of dictionaries.
Sometimes they're not.
It's just, don't get me started.
And the documentation doesn't cover it, right? The documentation will say, well, this sub part is this name,
but it's this type, which is really a dictionary, but there's no mention or description of what goes in that sub dictionary. It's just, they can be totally avoided by doing cool stuff like this,
which has the same effect. Other stuff that's cool, by the way, is Datadog. So if you have an app that spans multiple processes,
maybe is using different services,
it's a pieced together larger app,
you should definitely check out Datadog.
It's a monitoring solution for providing deep visibility
and it tracks down issues across distributed Python apps.
So it's pretty awesome.
Within just a few minutes,
you'll be able to investigate bottlenecks in your code
by exploring graphs and rich dashboards. So if you want to check that out,
if you think having better insight into how your overall app is working, visualize your Python
performance today. Get started with Datadog, do a free trial, and they'll give you a free
Datadog t-shirt. It's very cute and gray and white, and I like it. So check it out over at
pythonbytes.fm slash Datadog.
One of the things that we haven't talked about for a while is how to install packages.
There's a really great new way with pip 10 to pip install them, but there's more to it than that, right?
It's not just pip install the thing and the client-side behavior is different.
For the first time in a
very long time, ever maybe, I don't know, it goes farther back than I know the history of it,
that means something different on the server side. So finally, finally, finally, pypi.org
officially launches. Yay! More importantly, legacy PyPI, the one at pypi.python.org slash pypi, that one is now over at
legacy.pypi.org. And that's like an old stick-around version, and they're actually shutting
it down. So pypi.python.org will redirect over to what's called Warehouse. That's the code name
for the new implementation running at pypi.org.
That's nice. I like it.
Yeah, it's super awesome.
This is no longer based on a sort of prototype code
that grew before the web frameworks even existed.
This is based on Pyramid.
It's based on a lot of the common programming APIs
and styles that we know today.
It even uses Kubernetes and Docker
and all sorts of amazing stuff.
It uses Elasticsearch, maybe Postgres.
I'm not entirely sure about that.
So yeah, if people are thinking about adding features to PyPI
and then they've thought,
oh, but this is some pretty gnarly code.
I'm not going in there.
Well, the new version is the official version
and it's much easier to add features to.
You talked with some of the people involved with this, right?
I did.
So I talked with three of the folks that worked on it.
And we talked about a lot of the stories and when it launched and stuff.
And that's coming out on TalkPython this week.
So it's a race.
Does Python Bytes beat TalkPython episode 159?
I'm not sure.
But one of them is going to be just about the same time.
So if you're
interested in this, go listen to it. You'll hear a lot of interesting stories. Like Brian,
do you remember when it used to be there was pypi.io? Do you remember that in the early days?
And then later it became pypi.org. I figured out what the story was around that. I thought it was
indecision, right? Oh, we'll do .io. No, no, actually we should make it more organizational, we'll do .org, right?
It turned out that what had happened was pypi.org was owned by someone else. The PSF didn't own it,
and they had to try to get a hold of it. So in the meantime they used pypi.io, but their ultimate goal was
always to have pypi.org. It just took them a while to actually acquire the domain. So if stories like
that are interesting we got like a whole hour of it on TalkPython.
So people can check that out.
So packages in Python.
We got a theme going on here.
I think we do have a theme.
I got one more package after this, by the way.
So one of the things I forgot about is when you're kind of coming into, there's a lot
of people that, as we know, come into Python from other languages and they'll
rush in and figure out how to do something quickly. And then they think, yeah, I'm a Python
expert because I wrote a script. It isn't that people are trying to become experts or anything. But
one of the things that tripped me up right at first is getting my head around what is a module,
what is a package, really, how do users make those and, you know,
dealing with dunder inits and dunder init files and where do they go and all that stuff.
And so there's a nice tutorial now. I think it's at Real, yeah, it's at Real Python,
Python Modules and Packages: An Introduction. And I just really liked this. It's a good rundown.
It's pretty simple.
And this is something I'm bookmarking.
A lot of the people I work with that are new to Python and trying to figure this stuff out,
I can point them to this.
And it's a good introduction.
Yeah, it's a great introduction.
Yeah, and it's definitely that kind of stuff that when you're new,
it's a little bit weird to figure out the difference, right?
Like use packages and modules kind of the same,
but they're not the same.
What's the story?
And especially learning how to package up your project
to make it accessible to other programs is tricky.
This isn't going into packaging itself,
which is a bummer that we have those as two different games.
Really, this is talking about a directory
with a dunder init in it and subdirectories and stuff.
We're talking packages and subpackages.
But it is the initial part.
Once you have this down,
then you can jump into packaging
and distributing code to share.
But this is a good starter.
So it's a good learning place.
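For a concrete picture of a directory with a dunder init in it and a subdirectory, here is a small sketch that builds a throwaway package in a temporary folder and imports from it. The package name demo_shapes, the subpackage flat, and the circle module are all hypothetical, and this covers only the import side, not packaging for distribution.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    # Layout being created:
    #   demo_shapes/            <- package
    #       __init__.py
    #       flat/               <- subpackage
    #           __init__.py
    #           circle.py       <- module
    package_dir = Path(root) / "demo_shapes"
    subpackage_dir = package_dir / "flat"
    subpackage_dir.mkdir(parents=True)

    (package_dir / "__init__.py").write_text("")
    (subpackage_dir / "__init__.py").write_text("")
    (subpackage_dir / "circle.py").write_text(
        "import math\n\n"
        "def area(radius):\n"
        "    return math.pi * radius ** 2\n"
    )

    # Make the temporary folder importable, then import the module
    # by its dotted path through the package and subpackage.
    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        circle = importlib.import_module("demo_shapes.flat.circle")
        unit_area = circle.area(1.0)
        parent_package = circle.__package__
    finally:
        sys.path.remove(root)

print(parent_package, unit_area)
```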
Yeah, it's definitely a good learning place.
So remember last week, we spoke about the joint project that the PyCharm team and the Python Software Foundation did with their 2017 Python survey?
Yeah.
Right. Data scientists do more Python 3, modern Python that is, as a percentage of their projects
than, say, web developers or other types of Python developers. It's partly because they're starting
from scratch, more of them. It's like a more new type of thing. So why start with the old, right?
Well, the big news today is that pandas, one of the major, most important foundational data science libraries,
is going modern Python only at the end of the year.
That's a big thing. Yeah, definitely.
Yeah, I would say this is as big of news as Django 2 going Python 3 only on similar timescales.
Well, they're already out with their new stuff.
So I guess just in terms of their old support ones.
So that's pretty cool, right? One more major, major building block. It used to
be, well, if I move to Python 3, I won't be able to get my libraries. Now the story is, if you
don't move to Python 3, you don't get the best, newest libraries. So pretty awesome.
And it isn't just getting the newest, it's also security updates and things like that.
Yeah. Like part of their announcement, on December 31st, 2018,
pandas will drop support for Python 2.7,
presumably stuff before that's not even supported.
This includes no backports of security or bug fixes, period.
They are open to letting somebody take that
to become their job,
but they're like, we're not doing that.
So we're moving on.
Is pandas like a volunteer thing?
Is it open source just run by open source people?
Do you know?
I think so.
I think it might be part of the HiSci stuff.
HiSci, sorry.
Part of the support thing is, I mean, you can kind of expect some companies to have back,
like, you know, security and bug fixes on old releases for a while.
But, you know, open source projects, these are just people volunteering their time.
I think it's completely reasonable for people to say,
hey, we're done trying to support 2.7.
Let's move on.
Yeah, it makes sense.
So I looked while you're talking.
Pandas is a NumFOCUS-sponsored project.
So I think it's sort of donation-supported,
but through a more formal science organization.
But still, those people are contributing
features. They can either focus on bug fixes and making Python 2 stuff work, or just moving forward
and getting the latest, greatest stuff going. So that's pretty cool.
Yeah, it's a good focus.
So Brian, that's it for this week for our items. Got anything in particular you want to talk about with
everyone?
I've got a couple episodes coming out on TalkPython,
and I'm getting ready for PyCon.
I'm looking forward to that.
Yeah, you have a talk going there, right?
That's going to be pretty awesome.
Yeah, I'm getting a little nervous for that, but it'll be good.
Don't worry, it's not like it's recorded,
and then also there's a bunch of people there.
Oh, yeah, it's going to be forever.
No stress.
Just kidding.
Yeah, no stress. Thanks for that.
Yeah, I'm just teasing you because I know you'll do great. We also have a booth there,
you and I and a few others. So people can come visit us at the booth like the last year.
I'd like people to try to reach out if they want to just come on and if there's something they want
to record. I think I'm going to take some recording equipment so we can do some short
recordings.
Oh, that'd be awesome. Yeah, I'm definitely bringing my mic and going to do some stuff
like that as well. What's up with you? Well, remember last time Matt Harrison was on the
show and we were joking about trying to get some course or something other done?
Yeah. Well, it's taken me a ton of work the past seven days, but nonetheless,
we now have a brand new course to announce, Python 3, an illustrated tour.
Nice.
So the idea is it basically only covers the features that were new to Python as of Python 3.
So it says here are a couple of PEPs that talk about, say, type annotations.
And here's some graphics and other sort of walkthrough of what that means for you.
So if you feel like you are not using all of the features of Python 3,
or you're coming especially from Python 2,
and you're like, I need a quick show me the new stuff sort of in a practical way,
then check out Python 3, an Illustrated Tour.
That's done by Matt Harrison, and it came out really well.
So check it out at talkpython.fm slash illustrated.
Okay, I'll check that out.
Yeah, yeah.
I learned a lot.
It's really, really fun.
We keep learning things.
Like, for example, you talked about the star at the beginning of the function arguments, right?
That kind of stuff is in there.
And I didn't know about that until too recently either.
Okay, cool.
All right.
It's been a good talk today.
Thanks.
Yeah, you bet.
Very nice one.
We really dug into the packages this time, didn't we?
And we don't plan this.
It just happens.
It just happens.
That's right.
All right.
Well, thanks, Brian.
And talk to you all later.
Bye.
Thank you for listening to Python Bytes.
Follow the show on Twitter via at Python Bytes.
That's Python Bytes as in B-Y-T-E-S.
And get the full show notes at PythonBytes.fm.
If you have a news item you want featured, just visit PythonBytes.fm and send it our way. We're always on the lookout for sharing something cool.
On behalf of myself
and Brian Okken, this is Michael Kennedy. Thank you for listening and sharing this podcast with
your friends and colleagues.