Python Bytes - #35 How developers change programming languages over time
Episode Date: July 19, 2017
Topics covered in this episode: Python Quirks: Comments; Python 3.6.2 is out!; Contributing to Open Source Projects: Imposter Syndrome Disclaimer; The Dark Secret at the Heart of AI; Arrange Ac...t Assert pattern for Python developers; Analyzing GitHub, how developers change programming languages over time; Extras; Joke
See the full show notes for this episode on the website at pythonbytes.fm/35
Transcript
Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds.
This is episode 35, recorded July 18th, 2017. I'm Michael Kennedy.
And I'm Brian Okken.
And we got some really cool stuff to start out with. Like, do you have any comments maybe to kick us off, Brian?
Well, I like to comment about just about everything.
But to kick off, we've got an article from, I think it's Philip Trauner, and it's called Python Quirks: Comments.
I kind of like this article.
When you're looking at source code, there's definitely comments that start with pound that are obviously just comments to other coders.
But then also sometimes, since we have doc strings in there, some people have learned that you can just put strings or other
objects in your code. And as long as they're not referenced by anything else, it just acts like a
comment. But this is an article taking a look at the abstract syntax tree and also taking a look
at timing them. And obviously... Yeah. The fundamental question was,
is there a reason to prefer one over the other? Is there a performance difference between them? Things like that, right? Yeah, definitely. And I have seen it even,
I haven't seen it in a lot of big open source projects, but I've seen it in just random,
like Python code that I look at from coworkers or whatever. People will comment out, even commenting
out a chunk of code with the three quotes just to block it out. It's not good. It just, it actually leaves that
object in your code and can slow things down a bit. Yeah. So he did a bunch of testing and the
hash or pound comments, the ones that actually get grayed out, literally do not appear in the
abstract syntax tree. So once the .pyc file is generated, they're gone, right? They literally
don't appear in the resulting executed code.
However, if you have triple quotes for docstrings,
that actually gets set to the dunder doc (__doc__) attribute, I think.
And if you have other string literals that aren't docstrings,
those just execute: they get allocated and then they immediately get dereferenced
and garbage collected.
But those steps happen, right?
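The difference between hash comments and bare strings in the syntax tree can be checked with a short sketch like the one below. It is only an illustration, not the code from the article: the `greet` function and its contents are made up, and it looks only at the parsed tree, not at timing or at what the compiler does with the string afterwards (which may vary by interpreter version).

```python
import ast

# Hypothetical source, invented for this illustration.
SOURCE = '''
def greet(name):
    """Return a greeting for name."""
    # A hash comment: dropped before the syntax tree is built.
    "A bare string used as a comment: still parsed as an expression."
    return "Hello, " + name
'''

tree = ast.parse(SOURCE)
func = tree.body[0]

# The docstring is recoverable from the tree.
docstring = ast.get_docstring(func)

# Statement node types in the function body; the hash comment adds nothing.
body_types = [type(node).__name__ for node in func.body]

# At runtime the docstring ends up on the function's __doc__ attribute.
namespace = {}
exec(SOURCE, namespace)

print(docstring)                     # Return a greeting for name.
print(body_types)                    # ['Expr', 'Expr', 'Return']
print(namespace["greet"].__doc__)    # Return a greeting for name.
```

The two `Expr` entries are the docstring and the bare string; the hash comment has no node at all.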
Yeah, one of the things that made me think about,
because I've been adding docstrings to the code examples for the book, is that I'd like to
do a similar test. I think I might take his code example and run a similar test on the size of
docstrings: if you have, I don't know, a 10-character one-liner versus a huge docstring of 100 or 200 characters,
does it make a difference for performance at all?
Yeah, pretty interesting.
I'm definitely a fan of the hash comment unless, you know, it's literally for docstrings.
Yeah, and one of the things, and this is outside of the article, but it came up in a discussion recently,
I can't remember where, was that docstrings are information for your user, for the user of a function or something.
But the pound or hash comments are information for a future developer.
And I like that.
Yeah.
Yeah.
That's the thing: imagine a psychopath who's having a bad day 10 years from now is inheriting your code, and they know where
you live. You want to leave them some comments to help them feel happy.
Yeah.
All right, so you know what I get if I open up my terminal and type python3 -V?
I don't know, what do you get?
I get 3.6.2, because Python 3.6.2 is out, I think, today even.
Awesome.
Yeah, very, very cool. And I was blown away at how much stuff is in here.
And I think these are mostly fixes.
I didn't see, I don't think there were any new features.
I think that's coming in 3.7.
But holy moly, there are a lot of changes.
And it's pretty interesting.
So I pulled out a few just to highlight.
And I'm highlighting for a variety of reasons.
So I broke them into, well, they broke them into categories, and I grabbed items from four categories here.
There are a few others that I decided not to touch on, like changes to IDLE, don't care.
But security, that I very much care about.
So the security ones, we have these changes, and they've got a bunch of numbers. You guys can look them up, but: prevent environment variable injection into subprocesses on Windows, right? So prevent other
things from freaking out or taking over what the system looks like for Python. Or this one is
kind of scary: upgrade the expat copy from 2.2.0 to 2.2.1 to get fixes for multiple security vulnerabilities, including infinite loops, integer overflow, regressions
of other bugs, counter hash flooding, all these things where, you know, probably
somewhere in there, there's a really bad vulnerability.
Also parsing the host in urllib and things like that.
So there's a bunch of security fixes, not just features or whatever.
So if you can, you should probably install this.
Yeah, just from my first glance at it this morning,
it seemed like, for the security fixes and other changes, you should definitely install it.
But I didn't see anything that jumped out of like, oh, I was waiting for this.
Now I can use it or anything like that.
Yeah, yeah, yeah.
I think it's only bug fixes and security fixes, no features
until 3.7.
Okay.
And maybe you run a Mac, right?
So maybe I'm just dense, but to upgrade, you just download the new one and install it.
There isn't a way to just upgrade, is there?
Well, you can use Homebrew.
I actually installed it off of python.org.
So to upgrade that version, I think you've got to keep rolling.
You just get the next one and run it.
That's fine.
But if you brew install python3,
then you can brew upgrade python3,
which is kind of maybe what I'll do on my next Mac.
Or I don't know.
I'm happy with what I've got now,
but I'm going to try to use Homebrew more.
I'm starting to love it.
So if you have 3.6.1 and you install 3.6.2,
you just have both of them there, right?
Yes, I think so.
But certainly if I type python3,
now that's 3.6.2.
Okay.
Well, I like having both around anyway
for testing multiple versions, but.
Yeah, sure.
And then there are some tools,
like virtualenvwrapper,
but slightly different.
I'm forgetting the name right now.
That'll let you get all the versions and flip between them and whatnot.
It's pretty cool.
All right.
All right.
So some other ones, core and built-in stuff: the parsing of f-strings with backslashes
apparently was broken.
Segmentation faults when you are working with dictionaries.
Those are never used in Python anywhere.
When you're changing them while searching, you know, your
Python just goes away, your web app keeps crashing, all sorts of bad stuff. Ctrl-C when
you're inside of a yield from or an await call gets fixed, and all these different things. So tons of
fixes there. The library gets race condition fixes for some signal delivery and wakeup for file stuff.
lib2to3 now understands f-strings.
Race conditions. Windows.
Oh my God, this one is awesome.
If you work on Windows or you teach people Python who work on Windows, you can cheer for this.
This is amazing.
Windows now will locate msbuild.exe
instead of vcvarsall.bat.
It is so much more reliable to find MSBuild on Windows than it
is that stupid old vcvarsall.bat thing for, like, all the C compilation. So that means pip install of a
thing on Windows should get more reliable. So there are about 40 more of these types of
fixes. So what I wanted to share is the news: how awesome is this?
I also wanted to hit on some of those things, especially the security stuff, because we're
coming up quickly on the end of legacy Python, right?
Yes.
Yeah.
Legacy Python has to have some of these in there.
Like people discovered these and now here are these problems that are uncovered.
In 2020, these problems are going to stay in Python 2.
So the sooner that you can get to Python 3, so these changes keep coming to you rather
than become just, oh, that's a security vulnerability.
Sorry, you have to live with that.
Just one more reason to upgrade to Python 3 for those holdouts out there.
Yeah, definitely.
And one of the things looking at this list, I just have to say, give a big thank you to
everybody that worked on all this so that I don't have to work on things like this. Yes, thank you. It's awesome. It's all getting better. Yeah. Cool. All right. Speaking
of contributing to open source projects, a lot of us feel like we're not good enough,
or maybe we don't know enough, or our experience isn't rich enough, or whatever, right?
That's a huge problem. Yeah. I think everybody has gone through that. I mean, definitely everybody
that's now contributing to open source has had an initial time where they wondered whether they knew enough about something.
And so, to make sure I get her name right:
Adrienne Lowe, who does Coding with Knives and has spoken at a couple of PyCons and other places.
Yeah, she's great.
She wrote a contributing to open source projects.
Well, she wrote a thing on GitHub called imposter syndrome disclaimer.
Essentially, it's in places where you have how to contribute to your project.
She'd like you to add this or think about adding this little disclaimer to people that
maybe don't think that they're ready to do it.
And it's kind of great wording.
It has things including saying, I want your help.
No, really, I do.
There might be a little voice, and I'm just quoting right out of this,
there might be a little voice inside you that tells you that you're not ready,
that you need to do one more tutorial or learn another framework
or read a few more blog posts before you're ready.
But I assure you that's not
the case and goes on to like tell you to point to your contributing guidelines and then also to
comment about other stuff. And here's another quote, and you don't just have to write code.
You can help out by writing documentation, tests, or even giving feedback about this. And we talked
about this in one of our previous episodes of
many ways you can contribute to open source projects. But I think that this is a great
idea to put it right in your contributing guidelines for your project. Yeah, really
nice work, Adrienne. If you guys were at PyCon, she was the host of the art museum dinner. And this
is really great. She does a bunch for the community and contributes to many projects. So I know she's been on both sides of this.
And I do think having this on your projects will help.
She'd like to collect examples.
So I've got a link.
We've got a link in the show notes for where she's collecting them, or just get a hold of her and say you're taking this and contributing.
So anyway.
Yeah, sounds good.
Yeah, nicely done.
And you just grab that and you can drop it into your
project. It's just like a markdown file or something like that. Yeah. So Michael, do you
have any dark secrets that you want to share? I think we all have dark secrets and I don't really
want to talk about it, but it's time to get it out in the open. And so the next thing I want to talk
about is an article, a pretty deep article from MIT Technology Review called The Dark Secret at the
Heart of AI. So we've touched
on this a few times. It's kind of a nice follow up from last week. There's a huge problem with AI.
And we've had statistical models, and we can look at the model, we can see what it is
predicting. But as we move farther and farther into things like deep learning,
the machine doesn't know why it knows a thing. We don't know why it knows a thing,
but we can teach it a thing and then it does that thing, right? Even the creators of these
deep learning models can't explain why it makes a decision. You can't like set a break point and
step through and go, Oh, this is the if case. Yes, of course here, there's none of that. It's like,
I've taught it a bunch of stuff and now it somehow knows. And then I ask it a question. So
they gave a really interesting example to kick off this article. They said, last year, an experimental
vehicle made by NVIDIA, which looked just like any other automated car, was released somewhere in New
Jersey, I think it was. And they said, but it was unlike anything demonstrated by Google, Tesla,
or GM. And it shows the rising power of AI. The car didn't follow instructions provided by a programmer or engineer.
They basically taught this car how to drive by having it watch humans drive.
And then they put it out on the road.
Oh, wow.
Yeah.
And so it was really weird.
Like, the results seemed to match what human drivers did.
But if it did something different, how do you understand or debug
it, or even change it to make those decisions differently? Like if it crashes into a tree, or it
sits at a light. Or there's always the hilarious joke that people seem to play on these cars, which is
to draw what looks like painted white lines in a circle around it so it can't get out, you know?
But if it does an unexpected thing, how do you debug it or change it?
That's really the secret: even the developers of AI, and the AI itself, don't know how it works.
Yeah, and there are also things, when I think about this stuff... I'm fairly optimistic
about self-driving cars, and I'll be one of the first to grab one if I can afford one. But there's always the question of, okay,
if a car has to decide whether to crash you
and your family into a tree or take out a whole glob of school children,
what does it do?
Yeah.
Yeah.
That's the sort of moral questions.
I don't know how people are going to debug that.
For sure. And even if you get the AI to do that, how do you know it's always going to make the right choice? You don't. Right now, it's difficult to design a system so that it can explain what it does.
People can't explain always why they do what they do precisely.
And so it's interesting.
One of the consequences that might be coming really soon, this is in the EU,
is there's an argument being made that you have to be able to get machines and AIs
to tell you why it reached a conclusion as a
fundamental legal right. Oh, wow. Okay. So if I'm told I have cancer and I go crazy and I burn all
my life savings, oh, sorry, glitch in the Whopper core. You're fine. You want to know why. If I'm
denied a loan, if I'm denied the ability to buy a house, if I'm denied a job, right?
These are like serious, serious questions.
So basically, they kind of round out.
It didn't go all the way.
There's a lot to cover in this article.
But last thing for us is, they said, we've never before built machines that operate in ways their creators don't understand.
How well can we expect to communicate and get along with intelligent machines that could
be unpredictable and inscrutable? Crazy, huh? Yeah. Yeah, definitely. I'm optimistic with you
as well, but it's just a very, it's interesting that philosophy and morality is starting to become
part of programming. Yep. We definitely have machines now that, I think, no one person
fully understands, but yeah, I think the biggest consequence for us is that we're going to have programs
we can't debug or understand why they do things.
That's going to be a bizarre problem in the future.
Before we move on, did you say
the Whopper Core?
I did say the Whopper Core.
Is there a computer based on a hamburger?
No, no, that's from
WarGames. Oh, okay.
Remember they had to hack into the Whopper Core
because that machine, they had to teach it to play tic-tac-toe against itself or something.
Nice reference.
Thank you.
Yeah, I always think it's great that in that movie you can get from Colorado Springs to Bainbridge Island in a helicopter.
Four stops for gas.
On one tank of gas.
Not possible.
Anyway.
Awesome.
All right, so let's proceed safely back to the three A's of testing patterns and away from this philosophy.
Yeah, so actually I
loved seeing this. This is an article, I didn't write his name down, sorry, called
Arrange Act Assert pattern for Python developers. And the Arrange, Act, Assert pattern
is a structure for how to set up test cases. And this is a fairly gentle,
easy introduction, basically just telling people not to have big, long spaghetti test code.
Your test code should be structured, and this is a decent structure.
And the arrange part is where you get yourself
ready to do whatever you're going to do; it's the setup part. Act is whatever thing you're testing.
And the assert part is where you check. So the important thing is, don't go back and do a
whole bunch more: write as many test cases as you need so that all of the asserts are at the end and
you don't do more actions and then more asserts. I wrote a list. There are other names that people might know
it by, like given, when, then. That's often attributed to behavior-driven
development, but it's essentially the same pattern. And I did cover it in a couple
places, on pythontesting.net and also in Test & Code.
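As a rough sketch of what an Arrange, Act, Assert test can look like in a pytest-style function: the `ShoppingCart` class below is hypothetical, invented only so the test has something to exercise, and is not taken from the article.

```python
class ShoppingCart:
    """Hypothetical class under test, invented for this illustration."""

    def __init__(self):
        self._items = []

    def add(self, name, price, quantity=1):
        self._items.append((name, price, quantity))

    def total(self):
        return sum(price * quantity for _, price, quantity in self._items)


def test_total_sums_price_times_quantity():
    # Arrange: build the state the test needs.
    cart = ShoppingCart()
    cart.add("apple", 0.50, quantity=4)
    cart.add("bread", 3.00)

    # Act: perform the single action being tested.
    total = cart.total()

    # Assert: all checks come last, with no further actions after them.
    assert total == 5.00
```

In given, when, then terms, the three comments would read "given a cart with two items", "when the total is computed", "then it equals 5.00". A second behavior, such as an empty cart, would get its own test rather than more actions tacked on after the assert.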
Yeah, the links are both in the show notes there, yeah, for the episode and the article.
But mostly, I'm pleased to see more people, more than one so far,
other Python developers, writing for developers and teaching people how they should set up their tests.
Yeah, and it's such a simple pattern.
But I find when I follow it, my code and my tests are more focused and less rambly.
So I think it's good.
Yeah.
And also, when a test fails,
you pretty much know what's wrong, instead of a test failing
and it might be one of 15 different things.
Yep, for sure.
All right.
So last thing I want to cover is to shine a bit of a bright light on the future of Python. So everyone
out there listening, you are in a good place, let me tell you, in terms of being interested in
working in Python right now. So there's another really deep article by this company called source{d},
"sourced", not entirely sure how you say it. They're not super Python focused, I think they're doing mostly Go stuff, but their mission is to build the first AI that understands code.
Speaking of AI, pretty interesting.
So they wrote this really long blog post.
There's a decent amount of data science and math in there, and it's called Analyzing GitHub,
How Developers Change Programming Languages Over Time. So we've talked before, Brian, about how Python is the number one, number two, sorry, most active language on GitHub, right? For active
non trivial projects, things like that. And I think JavaScript was number one, because
everybody has JavaScript in their web apps, right? Yeah. Yeah. So this is a different question,
but kind of similar: not just what is most popular, but how is it changing
over time? Where are those trends going? If people are changing languages,
where do they change from? And so they have these cool Gantt charts.
They've studied 4.5 million GitHub users over 393 different languages and 10
terabytes of code. And they said, given one of those 4.5 million users, how do we
visualize them? How do we think about them? And they've got a Gantt chart of them as they transition from one language to another over time.
And this is based on an original article by Erik Bernhardsson. And he's at Google. And the name of
his article is pretty interesting as well: The eigenvector of why we moved from language X to
language Y. All right? So this takes us...
Oh, I love me a good eigenvector.
I do love me a good eigenvector as well.
It tells you where you're going.
So this is a slightly different approach,
more of a data science,
less of a statistical approach, I believe.
And they said, look, first of all,
we're going to not include JavaScript
because JavaScript is like spread
amongst all these projects, right?
Hey, my Pyramid project has JavaScript.
That Ruby on Rails project, it's got JavaScript.
Everything has JavaScript.
So it's super hard to make reasonable claims about JavaScript
because it's such a complementary language.
So they said, like, we can't reason about this, so put it to the side.
Take that for what it's worth.
And they said, we're going to look at the most popular languages on GitHub.
And they do a whole bunch of work.
And they come up with this stationary distribution of a Markov chain.
How about that?
And what they find out is the number one most stable language on GitHub is Python.
And interestingly, its stability level is almost 50% higher than its share of the code on GitHub.
So it's really, really stable.
And then behind that... So what do you mean by stable?
People, once they get to Python, are least likely to move away from Python.
Oh, okay.
Yeah.
I believe that's the right interpretation.
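For anyone curious what a stationary distribution of a Markov chain looks like in practice, here is a minimal sketch. It is not the article's code or data: the three languages and every probability in `TRANSITIONS` are made up for illustration, and power iteration is just one simple way to compute the result. Each row gives the assumed chance that a developer currently using that language is using each language at the next step.

```python
def stationary_distribution(transitions, iterations=1000, tol=1e-12):
    """Approximate the stationary distribution of a row-stochastic matrix
    by repeatedly applying it to a starting distribution (power iteration)."""
    n = len(transitions)
    dist = [1.0 / n] * n  # start from a uniform distribution
    for _ in range(iterations):
        nxt = [sum(dist[i] * transitions[i][j] for i in range(n))
               for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, dist)) < tol:
            return nxt
        dist = nxt
    return dist


LANGUAGES = ["Python", "Java", "C"]

# Hypothetical transition probabilities (each row sums to 1).
# Row = language now, column = language at the next step.
TRANSITIONS = [
    [0.90, 0.06, 0.04],  # from Python
    [0.10, 0.85, 0.05],  # from Java
    [0.12, 0.08, 0.80],  # from C
]

pi = stationary_distribution(TRANSITIONS)
for language, share in zip(LANGUAGES, pi):
    print(f"{language}: {share:.3f}")
```

The stationary distribution is the long-run share of developers in each language if these switching probabilities held forever. A language ends up with a large share when people both tend to stay in it (a large diagonal entry) and tend to arrive from elsewhere, which is roughly the sense of "attractive" used in the discussion.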
Then there's Java, which is also very stable.
There's C, then C++, then PHP, then Ruby, then C#, and
then it goes on and on and on. So they make some claims based on this. They say Python, at
16.1 percent, appears to be the most attractive language, followed closely by Java. It's
especially interesting since 11.3 percent of all code on GitHub is written in Python. So it's more
attractive than its level of code would imply. They said there are
some languages that are repulsive. That's my wording, not theirs. But they said, although there
are 10 times more lines of code on GitHub in PHP than in Ruby, they have the same level of
stationary attractiveness. So much less reason to be attracted to Ruby. But if you're there,
you're more likely to stick. And so they said, what about sticking to a language? Developers coding in one of the five most popular languages,
Java, C, C++, PHP, and Ruby, are most likely to switch to Python, with a 22% chance on average.
How about that? Yeah. So people who like Python are most likely to stay there. And people who
are in one of the five most popular languages are most likely to move there as well.
I think it also, and I haven't read this article, but I think it also goes
to how easy it is to think of something that you could solve and share
with somebody else on a project with Python.
For instance, I've programmed C++ all my career, but I've never contributed any C++ code to GitHub.
Yeah, that's for sure.
It's definitely a more open source friendly language as well.
A few more random stats.
They say Visual Basic developers are very likely to move to C#, with a 92% chance of that.
And people using numerical and statistical environments such as Fortran, MATLAB, or R are most likely to switch to Python using this measure of analysis.
Whereas Erik's post, the original blog,
was suggesting they might move to C.
So pretty interesting little article there
about stability and attractiveness of projects.
That correlates with other, I guess, anecdotal things
that we've heard of more people migrating,
especially in data science, to Python.
Yeah, it seems totally believable to me,
given all the other pieces of information and studies we've, we've seen and talked about. Yeah.
Nice. Nice. So I think I would say, you know, if you're thinking about where do I bet my career,
that's another positive sign that Python's probably a good spot to hang on to for a while,
long while. All right. Well, that's the news. Anything else going on, Brian?
I wasn't there, but EuroPython got wrapped up last week.
But I did have some stickers, some rocket stickers, hitch a ride.
Nice.
They blasted all the way over to, what was that, Spain?
Yeah, Italy, I think.
Oh, Italy.
Yeah, that's right.
And had a bunch of stickers handed out to promote the book.
So that's fun.
So I got one more week to finish it and then it'll be done.
You must be looking forward to that.
Yeah. So how about you? What's going on?
Awesome. Not too much. I'm really enjoying summer. I'm actually working on some apps,
some very interesting apps for my training courses. Not that I'm going to teach them,
but to deliver stuff. So more to come on that. Right now I'm just writing.
Okay.
See how it comes out in the end.
Yeah.
Very fun.
Hey, one of the things that you brushed by fairly,
I know you talked about it a lot somewhere else,
maybe, but your Python for Entrepreneurs course,
it's freaking awesome.
Thank you, man.
Thank you very much.
I think more people should check it out.
And I don't think it's just for entrepreneurs.
I think it's a good top to bottom
Python for web plus front end and back end. It's a nice thing for people to look at.
Thank you so much. Yeah. And it's officially, officially done as of last week. So it's finally
ready for the world. Thanks a bunch for the shout out.
Yeah. And Matt Makai helped you with that, I believe.
Yep. Matt Makai from Full Stack Python. We are happy to be done and we're planning our next thing
that we're going to do.
Yeah, you guys did a good job on that.
So, cool.
Thanks a bunch.
And yeah, it was super good
to see your book coming out as well.
It's very fun, isn't it?
Yeah.
Awesome.
All right.
Catch up next week.
You bet.
Catch you next week.
Thanks for being here, Brian.
Thanks everyone for listening.
Thank you for listening to Python Bytes.
Follow the show on Twitter
via @pythonbytes.
That's Python Bytes as in B-Y-T-E-S.
And get the full show notes at pythonbytes.fm.
If you have a news item you want featured, just visit pythonbytes.fm and send it our way.
We're always on the lookout for sharing something cool.
On behalf of myself and Brian Okken, this is Michael Kennedy.
Thank you for listening and sharing this podcast with your friends and colleagues.