Python Bytes - #45 A really small web API and OS-level machine learning
Episode Date: September 29, 2017
Topics covered in this episode: pico; High Sierra ships, first major OS with machine learning built in?; A guide to logging in Python; Let me introduce: slots; pipenv revisited; Stack Overflow gives an ...even closer look at developer salaries; Extras; Joke
See the full show notes for this episode on the website at pythonbytes.fm/45
Transcript
Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.
This is episode 45, recorded September 27th, 2017.
I'm Michael Kennedy.
And I'm Brian Okken.
And as usual, we got a bunch of cool news items lined up for you.
So, Brian, let's first say thanks to Rollbar.
Yeah, thanks Rollbar.
Yeah, thanks for sponsoring this episode.
We'll tell you guys more about Rollbar if you don't know about them later.
But let's start with something super small.
Like I don't want to start anything big.
This was recommended by a listener, Ivan.
I'm not going to try his last name.
But thanks, Ivan.
A little micro framework called Pico.
And there was a lightning talk given at EuroPython 2017.
And we have a link to it.
But this is just a very simple, very, well, I don't know
how simple the code is. I haven't looked, but it's simple to use. It's a little web framework that
you can use for actual web pages. It does have some CSS and JavaScript serving, I think. But
the main idea of it is a very simple, easy-to-use web framework for people that are not web developers. So let's
say, I think it was developed in a scientific community. So people can just hook up,
really, a web endpoint with just a decorator that says pico.expose,
and you've got a function, and there you've got a web service you can use. So it's pretty amazingly simple.
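To make the decorator idea concrete, here is a minimal sketch that uses only the Python standard library rather than Pico itself. The names `expose`, `EXPOSED`, `app`, and `call` are hypothetical and chosen for illustration; this is not Pico's API or implementation, just one way a decorator could turn a plain function into a web endpoint.

```python
import json
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults

# Hypothetical registry for this sketch: URL path -> function.
EXPOSED = {}


def expose(func):
    """Register func so it can be called over HTTP at /<function name>."""
    EXPOSED["/" + func.__name__] = func
    return func


@expose
def hello(who="world"):
    return "Hello " + who


def app(environ, start_response):
    """Tiny WSGI app: find the function by path, pass query args as keywords."""
    func = EXPOSED.get(environ.get("PATH_INFO", ""))
    if func is None:
        start_response("404 Not Found", [("Content-Type", "application/json")])
        return [json.dumps({"error": "not found"}).encode("utf-8")]
    # parse_qs returns a list per parameter; keep the first value of each.
    query = parse_qs(environ.get("QUERY_STRING", ""))
    kwargs = {key: values[0] for key, values in query.items()}
    body = json.dumps(func(**kwargs)).encode("utf-8")
    start_response("200 OK", [("Content-Type", "application/json")])
    return [body]


def call(path, query=""):
    """Exercise the app in-process, without starting a server."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    environ["QUERY_STRING"] = query
    statuses = []
    chunks = app(environ, lambda status, headers: statuses.append(status))
    return statuses[0], json.loads(b"".join(chunks).decode("utf-8"))


print(call("/hello", "who=Brian"))
```

To serve it over real HTTP, the same `app` could be handed to `wsgiref.simple_server.make_server("", 8000, app).serve_forever()`.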
Yeah, it is quite simple.
And one of the things that is unique about it, well, relatively unique, is that it comes
with a JavaScript client that automatically generates a proxy for objects described in
your API.
And that's pretty trick.
Oh, I missed that.
That's cool.
Yeah, isn't that cool? So instead of having to, like, define the REST call and then actually just do that,
you know, like with direct Ajax calls, whatever framework you're using, however you do that,
if you have something that has like a hello function and takes a string, you can create
one of these clients and just say hello and pass it a string.
And then it gives you a promise,
which is really cool.
I think that's kind of unique.
Yeah.
It's one of the simplest,
very little boilerplate
you have to throw in some code.
I was looking at this
because if I had some services at work
trying to pull out some database objects,
I think non-developers
could maintain it fairly well, okay?
I mean, probably not non-developers,
but not web developers. Right, right. So I think it's, I think it's pretty interesting. I actually
haven't heard of it, so I don't know like how durable it is, how good it is for building rich
applications that have lots of requirements, but it looks pretty cool to me. It's definitely worth
checking it out and it's small and easy to get started with. So that's always nice. Like there's
not a lot of mental overhead to use the thing, right? Yeah, and the link we have, which thank Ivan for this also,
is to the exact part of the lightning talk.
So it's just a few minutes of one of the maintainers
talking about this project, and it's a really good overview.
That's cool. You always hear the, why did I build it?
Yes, I know Django REST framework and other things exist. I still built it. Things like that, right?
Yeah. And it's specifically not REST compliant, but for a lot of cases, you don't really care.
Yeah. It's interesting. It's almost more like a traditional XML web service
proxy type looking thing. Anyway, pretty cool. Definitely check that out if that sounds
interesting to you. So I got a question for you and everyone. Brian,
have you installed Mac OS High Sierra? No. It came out yesterday. I already installed it. It was a
bit of a risk, but I had stuff backed up, so why not give it a shot, right? And we're talking today,
so apparently it went okay. And we're talking on the same computer I installed it on, so it all
went okay, and it was all pretty smooth and seamless. So super excited to have a new Mac OS,
but this one is actually more like a
foundational release. So there's a bunch of underlying systems and things that have been
changed to make it able to build more cool stuff. So like one of the popular things people will
talk about maybe is APFS, a new Apple file system that is like a modern built in 2017 type file
system, not like, you know, 30 year old file system. So really, really cool type stuff like that. But one thing, the reason we're talking about on this show is it comes with something
that I think is actually a kind of a big deal. It comes with something called Core ML. So that's
Core, like all the systems are like, you know, Core Storage, Core whatever, right? Core ML is
Core Machine Learning. So here with the latest Apple operating system, maybe the first major OS to come with, like,
built-in machine learning. Wow, is that crazy or what? Well, Core ML is a set of APIs that you
can use, and basically it packages up a lot of the stuff that they're already doing in things like
Photos, where Photos can, like, identify people, you know. So you can say, show me all the pictures of Brian.
And it would just like find those magically
in all my photos.
Siri, text to speech, all those types of things, right?
So they want to make it possible
for you to use some of those.
So basically with Core ML,
it comes with pre-built machine learning models.
You can create your own models
and then package it up with your app and send it on
so you could train it to do whatever.
And they even offer some default ones.
It's pretty cool.
Yeah.
Yeah, so another thing that's pretty sweet about it is it will use,
basically on any of the Macs from 2012 or later,
it will use a mix of CPU processing and GPGPU processing,
depending on the task, and it will just figure that out for you.
So this whole, do I use them?
I'm guessing that makes it slicker.
Well, how many cores does your Mac have?
I have no idea.
Probably four with hyper-threading, right?
Probably.
So it's either two or four plus hyper-threading,
which would double that, right?
Some of the GPUs have like thousands of cores, thousands.
So if you want to do something in parallel,
which a lot of machine learning is,
if you have either eight cores or 2,000 cores, that's a big difference.
So it's really cool that that's built in.
Yeah, anyway, so I think this might be the first major OS
to come with machine learning built in.
It's just a sign of the times, right?
All right, you probably got to log your code
and figure out what's happening when your machine learning models
don't do what you want, right?
I don't have a list, but we've covered several simple logging modules
on the show so far.
But this right now, we're just talking about plain old logging, the built-in logging library.
Am I getting that right?
I think it's just logging.
Yep.
Just the logger.
Yep.
Logger.
Import logging.
Yeah.
The reason why I haven't really used it too much before, to be honest, is I have had trouble
getting my head wrapped around all the
little pieces. And it's a fairly complex module. And for good reason, it does a lot. But
I'm pointing to a blog article called A Guide to Logging in Python. And it walks through
using logging very simply, and then adding on, changing our mental model
to include all the different pieces
like logging file handlers and memory handlers
and filters and all that stuff.
And it's the first time I've read about logging
where from start to finish, I wasn't lost the entire time.
So it's a good introduction.
Yeah, it's cool.
And it talks about why not just do print,
right? There's all sorts of things like multi-threading support, categorization,
and different logging levels, time rotating files, all kinds of stuff better than just print. So
yeah, this is really cool. I do feel like there's a lot of configuration and stuff in the built-in
logging module that kind of tries to do everything. So it can make it tricky, and this is nice.
Yeah. And there's some things that it does that I didn't even know it did.
I didn't know it does automatic file
rotation just built in. That's cool.
Yeah, that's really nice. Anyway, if you're
trying to figure out the built-in
logging module, check this one out. I can tell you that
time-based rotating file is
important when your website generates gigabytes
of log files. You don't want that to be
one file.
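As a rough illustration of the pieces mentioned here (levels, handlers, formatters, and time-based rotation), the following sketch uses the standard library `logging` module. The logger name, file location, rotation schedule, and messages are assumptions made for the example; it is not taken from the article being discussed.

```python
import logging
import logging.handlers
import os
import tempfile

# Write to a throwaway directory so the sketch is safe to run anywhere.
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "app.log")

logger = logging.getLogger("demo")   # hypothetical logger name
logger.setLevel(logging.DEBUG)       # the logger lets everything through...
logger.propagate = False             # ...and does not echo to the root logger

# Start a new file at midnight and keep the seven most recent old files.
handler = logging.handlers.TimedRotatingFileHandler(
    log_path, when="midnight", backupCount=7, encoding="utf-8"
)
handler.setLevel(logging.INFO)       # this handler ignores DEBUG records
handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(name)s %(threadName)s: %(message)s")
)
logger.addHandler(handler)

logger.debug("not written: below the handler's INFO level")
logger.info("model loaded")
logger.warning("prediction took longer than expected")

# Close the handler before reading the file back.
logger.removeHandler(handler)
handler.close()
with open(log_path, encoding="utf-8") as f:
    lines = f.read().splitlines()

for line in lines:
    print(line)
```

Compared with print, each record here carries a timestamp, a level, and the thread name, and the handler decides where it goes and when the file rolls over.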
Speaking of websites, it kind of sucks when your websites crash for your users, right?
Yeah. They don't like it, but they might not tell you. They might just go away.
So that's why you want to get Rollbar, right? So like we said, Rollbar is supporting the show.
Visit them at pythonbytes.fm slash Rollbar, and you can install it in just a few minutes.
pip install rollbar, few lines of configuration,
and all the errors in your website are captured with tons of detail.
Things like local parameters, arguments passed to methods when it crashed,
all that kind of stuff.
Notifications, Slack, email, whatever.
It's beautiful.
So definitely install that if you're running a web app based on Python.
So speaking of web apps, you might care about memory, right?
A lot of times one of the things that puts a lot of pressure on your web apps
is not the CPU, but it's actually memory.
And I'd say that's true certainly for my web apps.
It seems like memory is more of a pressure than CPU by quite a bit.
So one of the things that I thought was interesting
is somebody wrote an article called Let me introduce: dunder slots.
So slots are alternative backing store for class data, I guess is maybe the simplest way to put it.
Have you played with these, Brian?
No, I haven't.
This is really crazy. When you create a regular Python class and you implement a dunder init, and then there you say
self.name equals something passed in, self.email equals some email address passed in,
that goes into dunder dict, right? Like each instance of that class has a dictionary that
has the name email, the name name, and then the two values that you passed in. And every instance of
the class gets a separate instance of the dictionary. They're one-to-one. That makes it super easy to do
lookups, right? Order one. It's super easy to make it dynamic. Like if you just interact with the
class and you try to add new stuff to it, it just goes into that dictionary. So that's cool,
but what's not so cool is if I have 10 million instances of that class,
I have 10 million copies of that dictionary,
which has 10 million strings, each one that says email,
and another 10 million strings that say name.
Why do I need to store those?
I probably don't, right?
If I'm really not going to be dynamic, I probably don't.
So you can use this thing called dunder slots,
and you would say the slots of this class are name and email.
And then that slot is stored on the type, not the instance. So instead of having 10 million names and 10 million emails
in terms of the field name, right? You just have the two, and otherwise they're just stored like
in an array, a positional thing. So super good for performance. Like the test they
did in this article, 57% less memory usage just
by adding that one line. And it's a little bit faster for access, but it's definitely better
on memory. Can you use both? No. Well, you could still do the self dot whatever and assign to it.
But basically if you try to assign to something that's not declared in the slot, it'll say it
doesn't have that property. It wasn't pre-allocated in the type, basically, or predefined in the type.
So yeah, it's pretty cool. I actually go into this in depth in my Write Pythonic Code course,
and you'll see that this is even better in terms of memory than an unnamed tuple. You wouldn't
think you could do better than an unnamed tuple for space, but this is actually even better.
And you get all the type, class, inheritance behavior that you'd expect.
Very cool. Seems like more of the mental model of classes I have in the first place.
Yeah, yeah, for sure. It's very much like C++, C#, traditional, like these are the things that are in here and they never change, a static language type of thought of a class.
Yeah. Well, I'm definitely going to have to go and re-watch your seasoned developer course and do these again.
Yeah, it's pretty cool. Yeah, it's super easy. Like you shouldn't use it all the time, but when it makes sense, it can save you tons of memory.
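A small sketch of the difference being described, with two hypothetical classes (`PersonDict` and `PersonSlots`) and made-up values. The size comparison at the end is only a rough per-instance indication; exact numbers vary by Python version, and it does not reproduce the article's 57% measurement.

```python
import sys


class PersonDict:
    """Regular class: attributes live in a per-instance __dict__."""

    def __init__(self, name, email):
        self.name = name
        self.email = email


class PersonSlots:
    """Slotted class: the attribute names are declared once, on the type."""

    __slots__ = ("name", "email")

    def __init__(self, name, email):
        self.name = name
        self.email = email


regular = PersonDict("Brian", "brian@example.com")
slotted = PersonSlots("Brian", "brian@example.com")

# A regular instance accepts new attributes; they go into its __dict__.
regular.nickname = "B"

# A slotted instance only accepts the names listed in __slots__.
rejected = False
try:
    slotted.nickname = "B"
except AttributeError as err:
    rejected = True
    print("slotted instance rejected it:", err)

print(hasattr(regular, "__dict__"), hasattr(slotted, "__dict__"))

# Rough footprint of one instance, counting the dict for the regular class.
regular_size = sys.getsizeof(regular) + sys.getsizeof(regular.__dict__)
slotted_size = sys.getsizeof(slotted)
print(regular_size, slotted_size)
```

The trade-off is the one mentioned in the conversation: the slotted class gives up adding attributes on the fly in exchange for a smaller footprint per instance.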
Well, that's cool. Hey, a long time ago in episode 11, we covered pipenv from Kenneth, and I always get his name wrong, so you say it.
Kenneth Reitz.
Okay.
Maybe that was one of like the 10 things he did that week. I don't know.
Yeah. So he's been doing a lot, but the, uh, one of the things that the first time I took a look
at this, I gave it an honest college try and, to just be honest, I didn't know
why I needed it. You're like, I already got this covered. Whatever. Yeah, I already got this covered. But one of the things that changed
my mind is not too long ago, he put up a video.
And so if you go to docs.pipenv.org, there's a four minute
screencast of him just using it. And that
video got me convinced. I'm like, oh, wow, this is really a lot easier
than I've done before. And actually I've been doing a lot more virtual environments than I used to, and I kind
of lose track of which ones are where, so this helps. So pipenv, if you haven't
listened to episode 11, is something that deals with your virtual environments and pip install
and all that for you.
And it's just a way of working that if you give it a try, you might like it.
So the video is one thing that's new that convinced me.
But there's also a bunch of other stuff that he's done recently.
He also included security checks. So our scare from last week of whether or not you're going to install a problem package: with pipenv check,
you can look through all your dependencies and make sure
that you don't have any security vulnerabilities installed in any of them.
That's awesome. And that's not like you have an old version of Django, so it has a security
vulnerability. That's like somebody called it Django without the D and put a virus in it, right?
That type of thing, right?
Yeah.
And one of the things that it's had from the start: a lot of packages
have these hash values to compare your actual install against what's been published.
And pipenv deals with that and checks those, which is hard to do manually.
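To show why doing this by hand is tedious, here is a minimal sketch of a hash check using the standard library `hashlib`. The helper names `sha256_of` and `matches_published_hash`, the `sha256:<hex digest>` string form, and the stand-in file are assumptions for illustration; this is not pipenv's implementation.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published):
    """Compare a local file against a 'sha256:<hex digest>' string."""
    algorithm, _, expected = published.partition(":")
    if algorithm != "sha256":
        raise ValueError("this sketch only handles sha256")
    return sha256_of(path) == expected


# Stand-in for a downloaded package archive.
fd, archive = tempfile.mkstemp(suffix=".tar.gz")
with os.fdopen(fd, "wb") as f:
    f.write(b"pretend this is a package archive")

# Stand-in for the hash the publisher recorded at release time.
published = "sha256:" + sha256_of(archive)
ok_before = matches_published_hash(archive, published)

# Simulate the file changing after publication.
with open(archive, "ab") as f:
    f.write(b" plus something unexpected")
ok_after = matches_published_hash(archive, published)

os.remove(archive)
print(ok_before, ok_after)
```

Repeating that for every dependency, and every dependency of a dependency, is the part a tool can automate.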
And then one of the things it does recently is also it allows multiple package indexes.
So you could have PyPI, of course, but also maybe a company index and a group index and maybe even one for your project.
That's really cool.
The features are piling up, and he recently said that it's nine months old but it's had 192 releases. So he's not sleeping a lot, I don't think.
No, probably not. Yeah, that's really cool. My favorite thing is pipenv lock -r will generate a requirements.txt file. That's cool, right?
And that's actually the thing that turned me off the first time. It's because pipenv uses a thing called Pipfile and Pipfile.lock, which I don't really follow why I need those, but I know sometimes I need a requirements file, and this allows you to use this and still get your requirements file.
Yeah, very cool. Pretty cool. All right, so the final thing I want to talk about is a little bit of a softer, more squishy concept, right? Not just an API or something.
But Stack Overflow, they're doing some interesting data science.
I think they actually have full-time data scientists that are just mining these and
generating reports and analysis on the industry.
So that's pretty cool.
And what I'm actually pointing to for this is not Stack Overflow, but to an Ars Technica article,
which is a follow-up to this kind of unfortunate article they did called Tabs and Spaces,
Who Gets Paid More, or something like that.
And they made the claim that, like, well, people who are uninformed use spaces,
and for some reason they get paid more than people who use tabs.
Don't know why.
That was something they found in the survey. Well, the reason why is because those are Python developers,
right? Whereas the other ones aren't. So that's an interesting thing in and of itself. But this
is like a follow up to say, like, let's look at not programming languages, but like just different
locations. So if you live in New York versus you just live in a random place in the U.S.
versus Germany or France, basically the U.S. versus Europe.
Well, U.S. and Europe all compared against each other.
So it talks about like in these different places,
if you're a DevOps or a data scientist, you earn really well,
probably using Python.
Surprisingly, if you do graphics programming like OpenGL or something,
you're not paid very
well, even though that's super hard to do. The reason is, I think, and they sort of hinted this
as well, is you're probably working in a game company. And there's a lot of young people working
at game companies who are just, they want to work on games, period. It doesn't matter if they have
to work 80 hours a week and get paid a little for it, right?
Okay. Yeah. So that's pretty tough. I have heard that the game industry is a pretty hard place to work.
But you know, that's sort of one part of it, right? You don't get paid tons. But the most surprising fact was really that in the U.S.
developer pay is significantly more than in Europe. And it's not like 10% more or something
like that. It's like, I don't know, close to double.
It's really like quite a bit more.
So they say things like, hey, people in the U.S. have substantially higher median income,
even regardless of experience.
So they say, for example, a median salary of a developer in the U.S. is comparable to
somebody with 20 years experience in Canada or Germany.
And it isn't even quite higher than people in France or the U.K. with 20 years experience. Like a new, like, hey,
I just graduated. What can I do now? Sort of job. So pretty interesting. The comments are also super
interesting because people coming from all over the place and they're thinking about like, well,
okay, salary is not everything. There's cost of living, there's cost of healthcare, there's social
support. There's a lot of stuff.
So this is kind of partly interesting for the article, but also partly interesting for
the way people are analyzing it.
Yeah.
Well, actually, it's kind of nice to have some good news for being an American.
Yeah, it's been a little sketchy lately, but hooray.
We've got the weirdest president, the highest health care costs.
But hey, we get paid a lot. Yeah. And healthcare actually makes a big part of the conversation
at the end, like, hey, well, you guys pay so much more for healthcare and maybe the salary
doesn't offset it, but we don't pay like half our salary yet in healthcare. So it doesn't offset it
yet. Anyway, pretty interesting. So if you're thinking about this kind of stuff, here's an
article with a lot of data to back it.
All right, that's our news items, Brian.
Got anything else you want to share with the folks?
No.
Oh my gosh, you're not doing anything, right?
You're just like chilling now that the book is done and you're just kicking back?
I think some people have already received the book,
although I haven't.
I'm waiting for my box to show up this afternoon.
Oh, how exciting.
Yeah, I've seen a lot of Twitter messages,
people like posting that they've shipped
and things like that.
That's great.
Congrats.
Thank you.
How about you?
Not too much going on right now.
I'm working on a free MongoDB course
and that is super close to done.
So I'm hoping to have some announcements soon,
but not there yet.
One of the things,
I'm going to try to start some new projects
and not talk about the book so much every episode, but I'd really love to hear from people when they get
them and what they think. Go ahead and send me a shout out on Twitter at @brianokken and say,
hey, I got your book. And that'd be cool to hear from people. Yeah, that'd be awesome. Yeah,
it's really cool. People are excited about it. I've been watching from the sidelines. All right.
All right. Well, thanks for joining me for another one of these chats.
Thank you.
Yep.
Talk to you later.
You bet.
Bye.
Thank you for listening
to Python Bytes.
Follow the show on Twitter
via @pythonbytes.
That's Python Bytes
as in B-Y-T-E-S.
And get the full show notes
at PythonBytes.fm.
If you have a news item
you want featured,
just visit PythonBytes.fm
and send it our way.
We're always on the lookout
for sharing something cool. On behalf of myself and Brian Okken, this is Michael
Kennedy. Thank you for listening and sharing this podcast with your friends and colleagues.