Python Bytes - #266 Python has a glossary?
Episode Date: January 13, 2022
Topics covered in this episode: Python glossary and FAQ; AnyIO; Vaex: a high performance Python library for lazy Out-of-Core DataFrames; Django Community Survey Results; Extra, Extra, Extra, Extra; Extras; Joke.
See the full show notes for this episode on the website at pythonbytes.fm/266
Transcript
Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.
This is episode 266, recorded January 12th, 2022.
I'm Michael Kennedy.
And I'm Brian Okken.
So great to be here again.
And we had this whole survey about having guests, Brian.
And this week, we don't have a guest. It's just you and me, which I think is cool.
That's all right. It's good.
People out there listening, if they really want to be a guest, they can shoot us a message. For now, we've got
so many cool things to speak about, we're going to need like a glossary or an FAQ
or something. Yes. Well, actually, I don't know how I missed this for so long, but
there was a tweet by, who was it? Trey Hunner had a tweet that
mentioned and actually referred to the glossary. And I'm like, what? We have a glossary? I never
checked it out before. So on the python.org website, there at docs.python.org, there's a
glossary, and it's actually pretty cool. There's a whole bunch of stuff. Like
if you forget what abstract base classes are, it's there. So there's
Python stuff, there's programming stuff, and it even defines what the three arrows mean. Yeah, like the
three arrows, >>>, that's the first one, the default Python prompt. But also the dot dot dot, what is
that? The ellipsis.
And 2to3.
See, this threw me off once when I first started.
I was like, what's this 2to3 thing?
Is this a third party package?
And it wasn't obvious to me that it was built in.
So that's kind of neat.
But it shouldn't be an issue anymore because everybody's on Python 3 now, right?
So anyway, the glossary, just a shout out that this is here.
It's fun, so check it out. The other thing is that it refers to the other documentation in Python
a lot, and one of the things it refers to sometimes is the FAQ. I also didn't know that
was there. And we have an FAQ? Yeah. And it's split into a whole bunch of stuff like general Python and programming and history and design and stuff.
And I ran across it because one of the things I looked up when I followed from the glossary was this question of what's the difference between arguments and parameters?
And it's something that I've always messed up.
And now I think I have it. Parameters are the names
of things that appear in the function definition, and the arguments are the values that get passed in.
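A minimal sketch of that distinction. The greet function and its values are hypothetical, made up for illustration and not something from the episode or the FAQ:

```python
import inspect


def greet(name, punctuation="!"):
    # `name` and `punctuation` are PARAMETERS: names in the function definition.
    return f"Hello, {name}{punctuation}"


# The parameter names can be read straight off the definition.
parameter_names = list(inspect.signature(greet).parameters)

# "Brian" and "?" are ARGUMENTS: the values passed in at call time.
message = greet("Brian", punctuation="?")
```

The same function always has the same parameters, while the arguments change from call to call.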
Neat. I don't know why sometimes people use them interchangeably, but they kind of talk about
different ways of working with that data. Yeah, but like, let's say you're new, either new to
Python or new to programming.
Some of these are worth perusing,
some of these are great things.
Why did my changing list Y also change list X?
Well, this will help you understand
why there's the naming system in Python
and stuff like that.
So it's pretty great.
Yeah, it talks about references
and all sorts of stuff.
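A small sketch of the behavior behind that FAQ question, using made-up list names. Assignment binds a second name to the same list object, so a change through either name is visible through both:

```python
x = [1, 2, 3]
y = x              # y is a second name for the SAME list, not a copy
y.append(4)        # mutates the one shared list object

same_object = x is y   # both names refer to one object

z = x.copy()       # an explicit (shallow) copy is a separate list
z.append(5)        # only z changes; x is untouched
```

If an independent list is what you want, copy it explicitly, as with `x.copy()` here.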
Yeah, quite cool. I didn't know that we had it, but yeah, that's cool.
You did know it was there? No, I did not know it was there. That's great.
Yeah. I mean, I didn't know anything about it. So I want to talk about something else.
I want to talk about AnyIO. As I'm sure you and a lot of listeners know, I'm a big fan of
asyncio and async and await. I think it really
unlocks a lot of potential when you're waiting on things. There's been a lot of analysis saying,
oh, I did this computational thing and it didn't make it any faster. It made it slower. It's like,
yeah, because it only scales waiting and you're not waiting. So when you're talking about waiting,
it usually has to do with IO with external systems, right? I'm waiting on the file system.
I'm waiting on the database.
I'm waiting on whatever.
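To illustrate the "it only scales waiting" point, here is a minimal sketch using only the standard library. The fake_io coroutine is a hypothetical stand-in for a database or network call, and the delays are arbitrary:

```python
import asyncio
import time


async def fake_io(delay):
    # Stand-in for waiting on a file system, database, or web service.
    await asyncio.sleep(delay)
    return delay


async def main():
    started = time.perf_counter()
    # Three waits overlap, so the total is close to the longest single wait
    # rather than the sum of all three.
    results = await asyncio.gather(fake_io(0.2), fake_io(0.2), fake_io(0.2))
    elapsed = time.perf_counter() - started
    return results, elapsed


results, elapsed = asyncio.run(main())
```

Replacing the sleeps with CPU-bound work would show no such gain, because there would be no waiting to overlap.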
So there's this cool library called AnyIO. So I indirectly learned about this from Sebastián Ramírez from FastAPI
because he talked about this thing called Asyncer,
which extends a few things that are ultimately probably going to make it back to AnyIO.
So AnyIO is an asynchronous networking and concurrency library that works on top of either asyncio, which is the one we all
know and love, or Trio, which is similar to asyncio, but it has more of an
understanding of dependencies between tasks and things, how you can say, I'm going to create a
set of work that
is made up of these tasks. And this task is actually a child of that other task. So if I
cancel the top level one, cancel its children, it's a little bit more complicated, but it solves
this structured concurrency story that people sometimes need. So you can use this to get some
libraries that will do nice things with stuff you might wait on, right? So some of the
features include task groups. That's the thing I was describing with the parent-child
relationship type of things, with Trio. It has high-level networking, TCP, UDP, an API for byte
streams and object streams, inter-task synchronization and communication, like locks and conditions and events and semaphores,
worker threads, subprocesses, all kinds of stuff.
So you go over and you can sort of see
some real simple ways for it to run.
So one of the things that's sometimes not entirely obvious
is how do you run something on asyncio?
Because you've got to make sure
you've got an asyncio event loop running.
And if there's already one, you should call get loop.
But if there's not one, you should create one.
And so with this, you know, I have an async method, which can be a task, and just say,
you know, anyio.run.
Or you can run and just say the backend is trio, which is pretty cool.
So all sorts of cool stuff like that.
And it just sort of simplifies working with these different things.
If we go and look at the sockets example, you can just say async with await connect_tcp.
And that'll allow you to do like await receive, await send, and so on.
So some nice libraries that come out of AnyIO for doing TCP, UDP, all that kind of stuff.
You know, the things you would wait on.
Yeah.
So if you know you're going to use asyncio,
would this buy you anything?
I think that it has those additional higher level libraries
for like talking to TCP and byte streams
and stuff like that.
And also the subprocess thing.
So I think it does have like some utility stuff on top
of it. But it's pretty cool. You can say like await run_process, which is pretty cool. That's
actually, that's really cool. Yeah, I've not seen this one before, and that one kind of makes
me excited now. Yeah, that's cool. Yeah, nice. Cool. So not a whole lot more to say about it than that,
but if those are the types of things you're doing, then, you know, come check it out. It's a cool library.
Do you know what else is cool?
I do not.
Tell me about it.
Oh, I thought we were doing something else.
Wait.
Oh, yes.
I've got one more thing to talk about before we move on because we have a different number of things.
I'm not sure what we're slating.
I'll slot it in here.
So what else is cool is that this episode
is brought to you by Datadog.
Thank you, Datadog, for supporting the show.
They've been big supporters of Python Bytes
for a really long time.
So that's fantastic.
Plus really great t-shirts.
Exactly.
They've got cool t-shirts.
I mean, I definitely want to get one of those.
So Datadog does a lot of things.
One of their things they're focusing on now
is real-time monitoring.
So they have a real-time monitoring platform
that unifies metrics, traces, logs into one tightly integrated platform. Their APM empowers developers to identify
anomalies, resolve issues, and improve application performance. We just finished the Talk Python
episode talking about running in production and everyone there on the panel was like,
you need to make sure you're monitoring in production for things that change
in your performance profile, because you get too much data, as your infrastructure changes,
as the way your app is being used changes. It could hit these scenarios and run into problems
that you would just never see in testing. So if you had Datadog APM, you would have caught it.
So you can begin collecting stack traces, visualize them as flame graphs, organize
them into profile types, such as these are the CPU metrics, these are IO and so on. Teams can
search for specific profiles, correlate them with distributed traces if you're doing microservices
and identify slower underperforming code for analysis and optimization. Plus with Datadog's
APM live search, you can perform searches across the full stream of
ingested traces generated by your app over the last 15 minutes. So try Datadog APM for free with a
14 day trial. And if you do, you get that t-shirt that Brian mentioned. So just go to
pythonbytes.fm/datadog or click the link in your podcast player show notes or in this chapter.
Remember we talked about chapters and links. I'll have this have a chapter as well. So thank you, Datadog, for supporting the show. Now
let's talk about your next item, Brian. Yeah, I think it's Vaex. Vaex, Vaex, I don't know.
Oh, people are gaining traction for the idea of putting a pronunciation
on a GitHub repo for projects that are not obvious.
I saw this on Twitter.
Let's do it.
Let's make it happen.
So this was suggested by Glenn Ferguson.
This is a library that's a high-performance Python library for lazy,
out-of-core data frames.
Hmm.
I don't know what out-of-core is.
So I looked it up in a glossary.
After the FAQ.
Yeah.
Out of core typically refers to processing data that is too large to fit in the computer's memory.
So, yeah, that's what this is.
So for data processing, often you're trying to do some analysis, do some statistics, maybe explore the data a little bit, but you don't
want to read it all in, because they're huge data sets and you've got like maybe a limited computer.
And so that's what this is set up to do. The main features of it, so you've got like big data sets,
it has statistics like mean and sum and count and standard deviation, etc. But it also has some visualizations that are fast because of how they've sped things up and not kept things in memory.
And they're using memory mapping and some tricks inside to try to avoid any memory copies and to keep the computation as lazy as possible.
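To make the memory mapping idea concrete, here is a rough sketch using only the standard library. This is not how Vaex is implemented; it only illustrates the general technique of mapping a file and touching just the rows you ask for. The file layout (a flat column of native float64 values) and the function names are assumptions for illustration:

```python
import mmap
import os
import tempfile
from array import array


def write_doubles(path, count):
    # Write `count` native float64 values: 0.0, 1.0, 2.0, ...
    with open(path, "wb") as f:
        array("d", (float(i) for i in range(count))).tofile(f)


def mean_of_rows(path, start, stop):
    # Map the file instead of reading it; the OS pages in only the parts
    # of the file that the slice below actually touches.
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        with memoryview(mm) as raw, raw.cast("d") as doubles, doubles[start:stop] as window:
            return sum(window) / len(window)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "column.f64")
    write_doubles(path, 100_000)
    window_mean = mean_of_rows(path, 10, 20)  # mean of rows 10..19
```

Opening the mapping is cheap regardless of file size, which is the same property the "instant opening of huge data files" feature relies on.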
And this is actually pretty impressive. I was watching some of the demos.
So there's a SciPy 2019 video where the person that started this library,
which is now a company also, does a demo of this.
And it's really impressive how fast things are.
He's pulling things up.
Because of the memory mapping, you can even have multiple, you know,
multiple Jupyter notebooks.
Yeah, that's it, multiple Jupyter notebooks looking at the same huge data set,
and it doesn't slow things down even when things are working on it.
It's pretty neat.
So I definitely think this is worth checking out. One of the things on
the README that I like is the key features. So it's instant opening of huge data files, because
it's memory mapping the data file. It actually doesn't do any reads when you
open it, but when you pull some data out, it does lazy reads, jumps ahead, and it's pretty
impressive. So this also has an expression system, so that
you can do lazy transforms of data.
So that's neat. Out-of-core data frames, like we said,
fast group by and aggregations, a whole bunch of stuff.
The fast and efficient joins are interesting.
I was looking at another comparison of pandas and Dask
and other things versus Vaex, and the joins of huge tables are pretty fast and seamless
here, and those will blow up some projects. So yeah, this is, yes, it is similar to Dask. Somebody asked, lazy like Dask?
Yes, but.
That's a good thing.
Yeah.
Oops.
But it, yeah, a bunch of fun things.
It's good to have, it isn't the same as Dask,
so it's worth checking out to see if maybe this one
might be a good fit for you.
Yeah, it's cool.
It's the lazy that makes the magic, right?
You don't have to load it all from disk.
You can distribute it.
There's all kinds of interesting things.
And a billion samples, or rows, per second of operations,
that sounds pretty good.
Yeah.
Watching the demo, it's incredible how fast he's
popping up things and loading, even to be able to visualize things by pulling out samples out of the set.
Wait a minute, Brian.
I heard people told me that Python was slow, so it didn't make sense to do this kind of stuff with it.
What's going on here?
No, no, no.
Python's fast.
I know.
Pick the right libraries.
All right.
One of the things that is definitely well known
in the Python world is Django. I've even had people tell me I came to become a Django developer.
And so I had to learn Python, which is a really interesting perspective. So I want to talk about
the Django developer survey results for the 2021 survey because that just recently happened. So
I'll highlight a couple of things that are interesting over here. One of the questions was, what is the main reason you use Django?
Is it for work, for personal projects, or for both? Only 15% said just for work. Does that seem like a
lower number than you expect? Yeah. Yeah. I thought more people would just like, they'd go
to work and do Django and they'd go home and they'd, I don't know, watch Game of Thrones or
something. But Django developers love it. And they use it a lot for all sorts of things.
So by far, the biggest group here, 66%, is using it for both.
So that stood out to me.
Another one that's interesting is how many people are on the latest version.
So web apps often sort of get stuck in the past
because once you get them up and running, people don't want to touch it.
But 75% of the people are using the 3.2, which at the time of asking, I believe was the latest version.
Okay. I'm like, I thought we were up to four now. What's going on there?
Four is in beta. I'm not 100% sure. I don't think it's totally released, but yeah, this is still,
remember, it's from 2021. And then also Django has this concept of the latest stable
release and then a long-term support release.
So if you go to just the latest stable release and it's not LTS, you may have to upgrade
sooner if you want security fixes and so on.
And yet 71% of the people use the latest stable release because they're upgrading frequently,
I'm guessing.
And then 27% are on the latest LTS and 2% are just like, how do I upgrade this again?
I don't know,
but that's pretty interesting. And then the next question was how often.
So 44% of the people upgrade every stable release, other people less so, and it kind of
breaks down. 5%: I use an unsupported version of Django. I'm okay with that. On databases, people
doing Django have a very strong bias to use a relational database because
much of the magic of Django depends upon the Django models, right? Like the admin section
is driven by that and so many things, and those are all relational. So with that in mind,
the most common database, 77% of the time, is PostgreSQL, which is cool. And then does number
two there surprise you, Brian? No, not really. SQLite.
Yeah, if you've got very simple deployment stories, you're just going to put it on one server,
not much data, you just need something relational. SQLite. Well, a lot of internal tools
and stuff too. Exactly. I mean, you wouldn't run like a major tech company on SQLite and get away with it
without scars and tears.
But, you know, for simple internal apps, that might just be what you need. You're going to make some
SQLite enemies by saying that. Yeah, but if you had a hundred thousand users concurrently using
SQLite, that might be bad. Oh, somebody else said, possibly because SQLite is the default setting. Yeah, certainly that's a big push.
The other one is, do you do caching?
So caching is another layer between the database and your web app where you get the database stuff back and then you stash it in the memory somewhere so that you don't have to do queries again.
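A minimal sketch of that caching layer, using a plain in-memory dictionary rather than Redis, Memcached, or Django's cache framework. The TTLCache class, the slow_query function, and the key name are all hypothetical, for illustration only:

```python
import time


class TTLCache:
    """Tiny in-memory cache: stash a result, reuse it until it expires."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self._entries = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = time.monotonic()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # cache hit: skip the query entirely
        value = compute()            # cache miss: do the expensive work once
        self._entries[key] = (now + self.ttl_seconds, value)
        return value


calls = []  # records how many times the "database" was actually hit


def slow_query():
    # Stand-in for a real database query.
    calls.append(1)
    return ["row-1", "row-2"]


cache = TTLCache(ttl_seconds=60)
first = cache.get_or_compute("recent_rows", slow_query)
second = cache.get_or_compute("recent_rows", slow_query)
```

The second lookup returns the stashed result without running the query again. A shared store such as Redis plays the same role when several web processes need to see the same cache.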
So they said, do you do that?
And if so, what do you use?
47% Redis, 43% I don't do that, and then the only other really notable thing is Memcached. So interesting there. And I guess
people, if they're really interested, they can come through and look. There's a lot of, I don't want
to go through it because there's so many details, but it's like, what are your favorite components? Like models or admin or auth, or what contrib apps do you find most useful?
Like humanize or whatever.
So pretty interesting.
No surprise, people are using Django templates,
not Jinja as their main templates.
And then look, it's a race between pytest and unittest
as the top two most common frameworks.
With pytest above unittest, that's pretty cool,
especially since unittest is the default.
Yeah, yeah, absolutely.
Let's see, I'll just wrap it up with some front-end stuff.
What JavaScript front-end frameworks do you most use?
jQuery, number one.
And I don't mean that in a negative way.
Like sometimes you've just got some simple problems
and you don't need a whole CLI to build a SPA to, like, you know, focus the text box. All right, React is tied at 37
as well, and then Vue, and then Angular, and then, wow, HTMX made the list. Look at this. That's pretty
cool, actually. That's brand new shininess getting in there. That's pretty cool. But still, yeah. And
then CSS, we got Bootstrap way out front and then Tailwind and then pure CSS.
All right.
So that's the survey results.
Pretty interesting.
Nice.
All right.
What do you got next for us?
Next, we've got more extras.
We've got extras.
Okay.
Yes.
Extra, extra, extra, extra, extra, extra.
So I've got so many extras, I decided to make it one of my topics.
Brian, got anything else before I go on another rant?
No, I'm just ready to listen to all these extras.
All right, I got a bunch of good stuff.
So don't let the bad guys into your web apps.
Django just had security releases for 4.0.1,
3.2.11, and 2.2.26.
Oh, does that mean 4.0 is out?
Yeah.
It does look like 4.0.
Nobody's using it.
Yeah, well, they didn't use it in the past when it wasn't out.
Paul Everitt and I teamed up to create a course over at Talk Python called Static Sites with
Sphinx and Markdown.
So this course is free.
Everyone can go take it.
All you got to do is have an account and go here and it teaches you how to do Markdown and Sphinx
and generate static sites.
There's a cool little demo app that we build over here
that you can go and do search and look around
and see how you document your code
and do all kinds of stuff.
It's nothing too complicated,
but sort of neat to see how to use Markdown with Sphinx
because typically Sphinx is about reStructuredText.
So check out the course over there. I'll put that in the show notes. I'm going to definitely check that out because I've got a project that I wanted to use Sphinx for, but I was a little
intimidated. So cool. Yeah. Paul does a great job with it. So, and it's only an hour and 25 minutes
or something. So it's, it's not a huge investment in time. Something that's bothered me basically ever since USB-C, what is this,
four years or something, is I need more ports on my computer and I want them to be USB-C ports
because I have USB-C things these days, because I want them to go into the ports that I already have.
Until Thunderbolt 4, you've not been able to get a dock that has more than one USB-C or Thunderbolt port, which is super weird to me.
But recently they've come out with Thunderbolt 4
and I just got this thing called
the CalDigit Thunderbolt 4 USB 4 Element Hub.
Oh man, this thing is fantastic.
Brian, I'm talking to you on my computer here
and I have my 4K monitor, my 1080p camera,
my microphone, my stream deck, the lights, keyboard, mouse,
track, like seven different things, including the monitor plugged in with one cable through
this thing.
That's really pretty cool.
And so sweet.
So basically it has on the front, it has three USB-C Thunderbolt 4 and a power in.
And then on the side, it has the Thunderbolt that goes to the computer
and then also four USB, high-speed USB-A,
but the good ones.
So really, really cool
if you need to expand out your new-ish computer.
What are you using to plug into the monitor then?
I have a Thunderbolt to DisplayPort adapter.
And so that way, if I come with my new MacBook,
I can just unplug one thing from my mini,
click it over, and then boom, I'm ready to go.
Everything's configured.
I'm going to get one of these then.
Yeah, they're not super cheap.
They've been out for about four or five months,
but they've been sold out. Supply chains.
You know, what's going on with supply chains, everything.
But they finally came out, they're on Amazon.
So I linked to it over on Amazon.
I also linked to this video by Doc Rock
talking about, like, what the heck is this thing?
And why is it different?
All right.
I also tweeted about how we use the stream deck
to do our live stream, which was fun.
So I shared a bunch of pictures of that,
like how we like put the website.
So it says how it's streaming,
how we tweet automatically,
how we do this sharing and all that kind of stuff. I'm now going to be working on how to use that
thing for software development. Like how do you use it for Jupyter notebooks? So every button on
the Stream Deck, which has 14 free buttons, basically, like, what are the 14 Jupyter
operations you'd like to have? Like run all cells could be a button, or, you know,
format with Black could be a button,
all sorts of stuff.
So very cool.
Oh, you just have a black button with no logo.
Yes.
Yes.
That should absolutely be black as well.
So anyway, people are interested in that.
That's there.
I did a talk at PyBay quite a while ago.
Now the talk is out.
Carson was kind enough to retweet that and pointed out that, hey, the talk is
actually out. So I'm linking to my PyBay talk, which was an in-person talk at a conference. Imagine that.
Wow. In San Francisco. That was really fun. People can check that out. Speaking of conferences, we are
a media sponsor of Python Web Conference, and so you can definitely check that out. This is
honestly becoming one of the bigger online conferences.
It's five days, all day.
You know, a lot of these online conferences are like,
oh, half day, a little thing here.
So a lot of tracks, a lot of things going on
with the Python Web Conference.
I'm also speaking there as well.
Are you speaking there?
No.
I ought to get you pytesting something up there.
Well, I should probably do some web stuff.
Yeah, absolutely.
Absolutely.
And so there's a code that you can use.
It's in the show notes, Python Bytes at pwc2022,
and it'll give you 15% off.
Also, in our neighborhood, sort of, because it's virtual,
does that still have meaning?
We have PyCascades coming up,
and PyCascades is February 5th and 6th.
So that's going to be remote.
So it's not really local until things settle down.
So people wherever can take it.
Well, it's in our time zone.
So it's good.
That's true.
Time zone still matters.
It absolutely still matters.
All right.
That's it for all of my extras.
And we have Patrick out in the audience
pointing out PyCon Italy is also happening in June, so that's fantastic.
Yeah.
Awesome. So before we get
to the joke, I wanted to
like ask you this brain teaser that
my daughter brought home.
On the spot?
Yeah, I'm totally putting you on the spot.
So she came home,
she's in junior high.
She came home and she said,
we have this cool brain teaser.
I just want to ask you,
just tell me what you think.
So it's
a math problem.
So I'm going to go out,
I'm going to buy a baseball bat and a baseball.
The total for
both the baseball bat and the baseball is a dollar ten.
It's pretty cheap.
Yeah.
The baseball bat is a dollar more than the baseball. So how much is the baseball? So you don't have to answer it right
now, but it tripped me up for a little bit. I'm like, why is this difficult? It turns out it's
like five cents, because five cents plus a dollar five is a dollar ten. But my brain went,
it's a dollar and 10 cents.
That's what mine said, yeah.
Yeah, but that's a 90 cent difference.
I don't know why this is difficult,
but it's a fun brain teaser to ask people.
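The arithmetic behind the teaser, as a short sketch. The function name is made up, and amounts are in whole cents to avoid floating-point rounding:

```python
def solve_bat_and_ball(total_cents, difference_cents):
    # ball + bat = total  and  bat = ball + difference
    # Substituting: ball + (ball + difference) = total
    # so 2 * ball = total - difference.
    ball = (total_cents - difference_cents) // 2
    bat = ball + difference_cents
    return ball, bat


ball_cents, bat_cents = solve_bat_and_ball(110, 100)

# The tempting answer (ball = 10 cents, bat = 100 cents) adds up to 110,
# but the difference is only 90 cents, so it fails the second condition.
tempting_difference = 100 - 10
```

The ball comes out at 5 cents and the bat at 105 cents, which satisfies both the total and the one dollar difference.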
Indeed.
Ha, funny.
Very cool.
Well, you know what else is funny?
That feeling of joy that we get as software developers,
but is mixed in with,
I kind of remember myself
screaming at my computer yesterday,
like out loud,
because something was so frustrating.
I was just like,
how is it possible that this is not working?
Like what is going on?
It wasn't actually about programming.
It was with some app or something.
It was, it's always something else.
Yeah.
Yes.
Sometimes it's my fault,
but anyway,
so the joke is expressing that feeling, and the sticker says: I hate programming. I hate programming. I hate programming. It works! I love programming. This is amazing. It's like childbirth. Like you forget all the horror and pain. Like, oh, look at my amazing app. Like, do you remember that you cried for two days because you couldn't get it to query the database right in production? Yeah. And you love, right, but you love it because now it works. So I love this.
I was there this morning. I was fighting Jenkins, trying to create a Jenkins job with
four repos and different branches, and I just, like, I hate Jenkins. But if it works, or when it works,
I'm like, sweet.
I am the smartest person in the world.
I'm ready to do this all the time.
Fantastic.
All right.
Well, it never ends.
It never ends.
We've been doing this a long time and we still have these feelings, don't we?
Yeah.
So, uh, well, yeah.
So thanks a lot, Michael, for, uh, another great show.
Yeah.
You as well.
It's always fun.
Thank you everyone for listening.
Catch you all later.
Thanks for listening to Python Bytes.
Follow the show on Twitter
via @pythonbytes.
That's Python Bytes as in B-Y-T-E-S.
Get the full show notes
over at pythonbytes.fm.
If you have a news item we should cover,
just visit pythonbytes.fm
and click submit in the nav bar.
We're always on the lookout
for sharing something cool. If you want to join us for the live recording,
just visit the website and click live stream to get notified of when our next episode goes live.
That's usually happening at noon Pacific on Wednesdays over at YouTube. On behalf of myself
and Brian Okken, this is Michael Kennedy. Thank you for listening and sharing this podcast with
your friends and colleagues.