Python Bytes - #182 PSF Survey is out!

Episode Date: May 19, 2020

Topics covered in this episode: PSF / JetBrains Survey; Hypermodern Python; Open AI Jukebox; The Curious Case of Python's Context Manager; nbstripout; Write ups for The 2020 Python Language Summit; Extra...s; Joke. See the full show notes for this episode on the website at pythonbytes.fm/182

Transcript
Starting point is 00:00:00 Hello and welcome to Python Bytes, where we deliver Python news and headlines directly to your earbuds. This is episode 182, recorded May 13th, 2020. And I am Brian Okken. And I'm Michael Kennedy. And this episode is brought to you by Datadog. There's two surveys that I feel really do a good job of keeping, you know, their finger on the pulse of the community. And I would say one is the PSF JetBrains combination.
Starting point is 00:00:27 That one's really good for Python in particular. And the other is the Stack Overflow Survey. Well, the Stack Overflow Survey recently came out a couple months ago or something like that. So now it's the PSF JetBrains Survey. And when I first saw this, I thought, oh, this is last year's because it was 2019. But they do a ton of analysis on it. And the survey was
Starting point is 00:00:45 done in 2019. But it's released now. So anyway, I want to talk about some of the results in there, because there's some interesting takes. And I also want to say thank you to Jose Nario, who sent this in, because, like I said, I thought it was last year's. I'm still waiting for the 2020 edition. All right, so let's talk about some of the results. First of all, one of the first questions asked was, do you primarily use Python as your main language, or are you interested in Python as some other, like, I also happen to use it in addition to JavaScript or something. 84% of the people who took the survey,
Starting point is 00:01:18 and that was mostly folks who visited python.org, so 84% said it's my primary language, and that's unchanged from last year. Okay, interesting. A lot of the analysis here was, how has this changed over the last year or so? So there's a lot of interesting trends to be pulled out from here. They said, what other languages do you use?
Starting point is 00:01:40 Well, there was JavaScript, which has gone down in usage. There's Bash, which has gone down in usage. There's HTML, which has gone down in usage relative to last year. And C++, which has gone down relative to last year. So the people who took the survey, the same number of people said they're primarily using Python, but the other languages they're employing, all the ones which were popular, seem to be going down. And I would say this means maybe people are starting to lean more on Python, those who use it. I don't know what conclusions we should draw there. There's also some growth in some other languages as well, I'm guessing.
Starting point is 00:02:16 Interesting to see all the other ones down. Yeah. Well, this is down within the Python community, right? This is not necessarily down overall, right? So I would say we should look at the Stack Overflow survey, or really more like even Stack Overflow Trends. But if you look at that one, all the other languages are relatively down as well. So, interesting. Now, a lot of the divide happens around web versus data science here. And so a lot of the
Starting point is 00:02:43 questions said, are you a web developer primarily or a data scientist primarily? And then if you are, what type of tools do you use? So for example, if you're a data scientist, you use Python, of course, but you also make larger use of C++, Java, R, and C# as a data scientist. But if you are a web developer, you make more use of SQL, JavaScript, and HTML, which is, you know, no surprise because it's hard to write the web without HTML and JavaScript. Yeah. So those are pretty interesting. And they said 58% of the people use Python for both work and personal projects. And they said, all right, well, what do you use it for, right?
Starting point is 00:03:21 So there was a bunch of answers. I'm going to give you the top four because the average person who filled this out chose four things that they primarily use Python for, right? It's like I could use it for web and DevOps or something like that, right? It's not exclusive. So the top four were data analysis, which is exactly the same as last year. 59% of the respondents said that. Web development, this is pretty interesting. 51% of the people said they do web development, but this is down 4% year over year. Interesting. That's a big drop. And I feel like web is a big component of what the Python space is. So I don't really know what to draw from that. Like maybe just more people are sort
Starting point is 00:04:01 of backfilling in the data science side, maybe. Maybe the tools that they use are better, so things like Streamlit and stuff, they're like, well, I don't have to do web now. I'm not really sure. Yeah. Machine learning, 40%, that's up 1%. DevOps, 39%, that's down 4% as well, which is the same number as web development. And these are kind of similar things. So what you use Python for is pretty interesting. And then more broadly, it says, what do you use it the most for? Web, data analysis, and machine learning. Web is down 1% as the primary thing you use it for. Data analysis is up 1%.
Starting point is 00:04:36 Machine learning is up 2%. Also, not surprising, I guess, but good news on the Python versus legacy Python story. 90% of people are using Python 3. That's good. That's a huge change from five years ago when it was like, oh yeah, there's that one weird guy that uses Python 3, but everyone else... Now, more of the data scientists are using Python 3, as a percentage, than the web developers. I think that's because in data science, the libraries and the tools are changing a lot. It's not like you have legacy data science. Like, oh yeah, we're using that machine learning library
Starting point is 00:05:10 from like 10 years ago because we just don't want to change it. It's like, those have fundamentally changed and you just have new tools. Whereas web development, I think there's a lot of folks that are like, yeah, we've still got that Django 1 app going on Python 2 or something.
Starting point is 00:05:25 I'm not surprised there's still some Python 2, but 10% still seems a little high. I think it's mostly legacy code. I'm not entirely sure, but you can use your legacy Python for your legacy code. All right. And then web frameworks. Flask was kind of neck and neck with Django last year.
Starting point is 00:05:41 Now it's like totally ahead. So 48% Flask developers and 44% Django developers, and everything else is pretty small. Data science, 63% of the people are using NumPy, 55% Pandas, 46% Matplotlib. Here's one I threw in for you, Brian, testing. 49% of the people are using pytest. Yeah, 30% of the people are using the built-in unittest and 34% are using this other framework I hadn't heard of before. It's called the none framework. Yeah, that's unfortunately high, but it's funny that it's higher than unittest. Yeah, I know. That's what I thought as well. Pretty funny. All right, a couple more and we'll be done with this one. So cloud,
Starting point is 00:06:21 the cloud platforms people are using, AWS is in the lead. No surprise there with 55%. What did surprise me is Google Cloud, GCP and whatnot, is actually number two at 33%. DigitalOcean, shout out to them, is at 22%. And then Heroku is 20% and Azure is 19%. So pretty interesting. And then how do you run your code in the cloud in a production environment? Do you run it in containers, VMs,
Starting point is 00:06:49 platform as a service like Heroku or something? So containers, 47%. That's pretty high. VMs, 46%. And then platform as a service is 25%. What editor do you use? PyCharm, 33%. VS Code, 24%.
Starting point is 00:07:04 Vim, 9%, everything else is like a super small margin after that. Yeah, and that's pretty much it. Yeah, so I threw in an extra one on there at the end. The containers, I thought containers was number two below VMs last year.
Starting point is 00:07:20 And now it's jumped to number one. Oh yeah, like that's probably Kubernetes like hosted Kubernetes clusters, right? That people are throwing their Docker images into. Oh yeah, what's the last one? Take us through this one. And then the tool use. They also listed a handful of things of people using.
Starting point is 00:07:36 90% of people using version control, that's good. 80% write their tests, that's good. 80% code linting. And 65% of people use type hinting, which actually is a little higher than I... Oh yeah, way to go. That's nice. And about half the people are using code coverage tools, 52%. Yeah, thanks for adding that. I'm very familiar with legacy code and non-legacy code, which I guess you call modern, but you're going to take it to 11. What's up with this? Yes, this level, it's hypermodern. Yeah, this is from Claudio. Cool name. But anyway, so he wrote a... actually, it's like a book. This is an incredible series of blog posts, and he actually writes them
Starting point is 00:08:20 out. They're all linked together, called Hypermodern Python, and he sets them up in six chapters. He's got setup, testing, linting, typing, documentation, and then the last chapter is CI/CD. And he just wrapped it up. I've been watching this, and I was going to announce it when he was done. It's a really fun series to learn about, you know, you've learned some of the basics of Python, but you want to get some best practices and take it up a notch. And I think this is good. It's opinionated, of course. I actually like opinionated things, and some of the opinions I don't quite follow.
Starting point is 00:08:56 Like, he likes pyenv. I'm not really a fan of pyenv, but that's all right. Also, Poetry, he uses Poetry. I use Flit, usually. But anyway, he does recommend the src layout, which is good. But for setup, there's a whole bunch of neat stuff in here. And some tools that, like for linting, he talks about Flake8, Black, import-order, Bugbear, which is fun. I don't know if we've covered that. But there's a whole bunch of tools that I haven't even heard of, like
Starting point is 00:09:23 Safety and Desert and data validation with Desert and Marshmallow. So I'll have to look that sort of stuff up. That looks neat. So just a good run through it. In the CI/CD section, he talked about using GitHub Actions and reporting your coverage with Codecov, uploading to PyPI and using TestPyPI servers and documenting on Read the Docs. I think this is a fairly good representation for modern projects. Yeah, it covers a lot of stuff that people probably should be doing
Starting point is 00:09:53 and maybe haven't taken the time to set up or dig into. Yeah, and one of the things I want to highlight also is an incredible use of pictures. The imagery that he uses on these posts, like the CI/CD section was some 1970s space station or space colony images, and they're beautiful. So it's worth it just for the pictures. It is worth it just for the pictures.
Starting point is 00:10:20 They're great. Yeah, very nice one. Well, another thing that's really nice is Datadog. Indeed. Yep. This episode of Python Bytes is brought to you by Datadog. And let me ask you a question. Do you have an app in production that is slower than you like?
Starting point is 00:10:34 Is its performance all over the place, sometimes fast, sometimes slow? Now, here's the important question. Do you know why? Well, with Datadog, you will. You can troubleshoot your app's performance with Datadog's end-to-end tracing. Use the detailed... Datadog. And you even get a free t-shirt. So I really like my purple Datadog t-shirt. I wear it a little too much, probably. Yeah, it's a cute one.
Starting point is 00:11:11 Awesome. Yeah, thanks, Datadog. This next one that I want to cover, Brian, is a little bit just kind of a fun thing. Dan Bader shared this with me the other day. He's like, man, you got to check this out. Look at this thing. And I'm like, what is it? It's called the OpenAI Jukebox. And so it's this AI that creates music. And it doesn't just create music,
Starting point is 00:11:34 it creates different genres of music in the styles of certain artists with lyrics and musical accompaniment. So it's wild, right? I mean, I had you listen to a couple of the samples, and folks out there, you should just go click on the OpenAI Jukebox link in the show notes and go to the curated samples and play a couple. How would you classify them, Brian?
Starting point is 00:12:00 I'd classify them as, you can tell they're music. None of these I would want to rush out and buy the album right away. Yeah, neither would I. I mean, to me, they sound like sort of bad recordings of an artist, you know, maybe taken on like a phone at a live concert where you've kind of got like not a good audio setup. There's like a little bit of a, I think I remember this song, bit to it, even though there's no way. Yeah, but it was created by an AI, which is insane. So one, they've got a country song in the style of Alan Jackson, who is a country singer.
Starting point is 00:12:37 And you could convince me that that was Alan Jackson singing that song if I didn't know better. I'm like, oh, I've never heard that song and I'm not really super familiar with his music, but I kind of know what the guy sounds like. That's probably one of his songs. Because it sounds like he's singing it, which is crazy. It's got Elvis Presley. It's got Katy Perry. It's got some heavy metal in the style of Rage. I've not actually heard them. You've got some other crazy stuff like alternative metal in the style of Disturbed or jazz like Ella Fitzgerald. And it's really interesting how accurately these things reproduce what those artists' style of singing is, their voice, what their voice sounds like, the style of music they write. I mean, I would definitely not want to go listen to this to, like, relax or whatever, but as an AI example, it's pretty crazy. Yeah. And one of the things I really appreciate is the music, while you're listening to it, it shows you the words, the lyrics, while you're... Yeah, the lyrics highlight. Because some of them, it's easy to understand, but others, like
Starting point is 00:13:41 the Disturbed one, it's not so easy to understand. It's like a highlighted thing, but my brain wanted to see the little bouncy ball, like, I don't know, the kids' music. Yeah, exactly. So the code for this is available on GitHub. The data set they used to train the model: they crawled the web and curated a data set of 1.2 million songs. Oh, wow. That's got to use some bandwidth to get a hold of those. And then 600,000 of those were in English. And then it paired those with lyrics and metadata from LyricWiki. Okay. So it went and found the songs and said, okay, we need the written text so we can teach the model what the words are so that it can make up its own lyrics, I guess. Yeah. And then it says the top-level transformer is trained on the task of predicting compressed audio tokens, and they
Starting point is 00:14:32 provide additional information like the artist and genre for each song and said they get two advantages from that first it reduces the entropy of the audio prediction so the model's able to get better audio quality for the given style and And at generation time, they're able to guide it in the style of their choosing. Like, no, no, we want some hard rock versus Elvis or whatever. Anyway, if you're into AI, this seems like a pretty wild project to check out. Yeah, definitely. Yep. I'd say it's even curious.
Starting point is 00:15:00 Curious. Or if you're curious, you should go check it out. You should. And you should also check out this next post called The Curious Case of Python's Context Manager. So Redowan Delowar went through this because context managers are really important. And when you start really making some elegant, really readable code,
Starting point is 00:15:21 it's good to make use of context managers. If you're not familiar with what they are, anytime you see a with, like with open file as F or something like that, that's a use of a context manager. And the file one is probably the most notable one. So when you leave, what it does is it hooks up the keeping track of the data so that when you exit the block, it can clean up after itself. And so really handy things. I've seen a lot of different tutorials on how to write them. And a lot of them also, they go through the class. So in general, the punchline is use the context manager, contextlib.contextmanager decorator, and use a yield statement in the middle. And that will help you out. That's really
Starting point is 00:16:05 the working way to do it. But if you want to write, you write a class based one with the dunder init, dunder enter, dunder exit. I think those are good examples. One of the things I liked about this tutorial is that it went through that. And it didn't seem artificial. It seemed it's like, this is one way to do it. But it went through it pretty quickly and went through some other stuff. And it's a pretty quick tutorial. But then it gets into some really fun stuff that I really appreciated, like context managers as decorators and writing your own and then create so that you're not using a decorator to make a context manager. You're actually creating a decorator that is a context manager. And that's a pretty interesting thing.
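For readers following along in text, here is a minimal sketch of the two styles described above: a generator function wrapped in contextlib.contextmanager with a yield in the middle, and a class with __init__, __enter__, and __exit__. The names managed, Managed, and events are made up for illustration and are not taken from the article.

```python
from contextlib import contextmanager

events = []  # records the order in which things happen


@contextmanager
def managed(name):
    """Generator style: code before the yield is setup, code after is cleanup."""
    events.append(f"enter {name}")
    try:
        yield name  # this value is bound by "with ... as value"
    finally:
        # runs even if the body of the with block raises
        events.append(f"exit {name}")


class Managed:
    """Class style: the same behavior spelled out with dunder methods."""

    def __init__(self, name):
        self.name = name

    def __enter__(self):
        events.append(f"enter {self.name}")
        return self.name

    def __exit__(self, exc_type, exc, tb):
        events.append(f"exit {self.name}")
        return False  # do not swallow exceptions


with managed("a") as value:
    events.append(f"body {value}")

with Managed("b") as value:
    events.append(f"body {value}")


@managed("c")  # objects made by contextmanager can also be used as decorators
def work():
    events.append("body c")


work()
```

The decorator form at the end works because context managers produced by contextlib.contextmanager also behave as function decorators, so the whole function body runs inside the with block.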
Starting point is 00:16:48 And then wrapping things and nesting context managers with a with block or with statement. Yeah, that's cool to have like three of them instead of just one. Rather than like having three with blocks, right? Somehow I just missed that. Yeah. Yeah. And I've just nested the with blocks, but you don't have to do that. You can have them all on one line, which is cool. Yeah, there's a lot
Starting point is 00:17:10 of cool little tips here. And then combining them, so like creating a context manager that's really a combination of two other context managers, or more than one, which is nice. And then you kind of get into this, right, and you think, well, what am I going to do with this? Where would I use it? And so there's three examples that he lists and shows. There's a context manager with a SQLAlchemy session, which is a really cool idea. Yeah, I love this one. Yeah, how to use rollback and sessions so that when you're testing and stuff, you can just automatically undo the thing that was in the block. Sweet idea. Using it for exception handling, and that in combination with using it as a decorator is a neat idea, to have a policy for how to deal with certain exceptions during
Starting point is 00:17:56 parts of your code and then just decorate those functions with exception handling. That's pretty cool. And then the last one was talking about persistent parameters across HTTP requests. So it goes from very gentle to really deep into using this well pretty quick, but it's really easy to read. Yeah.
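A rough sketch of two of the ideas just mentioned: an exception-handling policy written as a context manager and applied as a decorator, and several context managers combined in a single with statement. This is not the article's code; handle, step, divide, and log are hypothetical names chosen for the example.

```python
from contextlib import contextmanager

log = []


@contextmanager
def handle(*exc_types):
    """Hypothetical policy: record and suppress the listed exception types."""
    try:
        yield
    except exc_types as err:
        # not re-raising here means the exception is swallowed
        log.append(f"handled {type(err).__name__}")


@contextmanager
def step(name):
    """Log when a block starts and ends."""
    log.append(f"start {name}")
    try:
        yield
    finally:
        log.append(f"end {name}")


@handle(ZeroDivisionError)  # decorate a function with the policy
def divide(a, b):
    return a / b


result = divide(1, 0)  # the error is suppressed, so the call returns None

# several context managers in one with statement instead of nested blocks
with step("outer"), step("inner"):
    log.append("body")
```

The managers in the single with statement are entered left to right and exited in reverse order, the same as if the blocks had been nested by hand.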
Starting point is 00:18:16 Well, you know, I realize reading through this that I've not been using the decorator style nearly enough. I'm always like, oh, I'll just add a class with an enter exit type of thing. But yeah, the context managers, you just get a function and do, you know, it's pretty clean. Yeah. It reminds me a lot of pytest fixtures. Yeah, for sure. That's a great article. I like it. I'll definitely have to study that one up. So previously we spoke about nbdev, which takes Jupyter notebooks and allows you to do a whole bunch of cool things
Starting point is 00:18:46 with them, right? You can export the stuff into a script. You can have it strip out some of the metadata, the saved output, which is like the bane of all GitHub-committed notebooks, because every time you rerun it, if it's taking variable input data, it's going to have different metadata. So every run is a conflict, a merge conflict, which is no fun. So that thing solved a bunch of those types of things. But Clement Roberts sent over a message and said, hey, that's really cool, and nbdev is great, but if you're looking just to do the stripping of the metadata, there's a project called nbstripout, which is pretty clever. And yeah, you can just set it up as like a git pre-commit hook, and then every time you commit your stuff to GitHub, it automatically just strips it out so that you never run into any of those merge conflicts. Oh, hooking it up as a pre-commit hook, that's a great idea. Yeah. Yeah, so you can either run it from the command line
Starting point is 00:19:43 or once you're in a git repo and you've installed nbstripout, you just say nbstripout --install. That's it. Now it's a git pre-commit hook, and it'll take care of doing all the things for Jupyter notebooks. Oh, nice. It's like clear all output in the UI, but it just does it, you know, only as you save it to GitHub, which is cool. And then there's also a YouTube tutorial, right? We've said that it's really cool to have screenshots of like UIs. Well, this is also really nice. So if you go actually to the PyPI listing, there's like a YouTube video right there that shows you like a four minute tutorial of like why you should care about this. And I've done that. It's super useful. I'm like, oh, this is kind of interesting. Let me watch it. I'm like, yep, that looks useful. We should talk about it. Yeah. Nice. Yeah. Anyway, so if people are working with notebooks and they're having this merge conflict issue,
Starting point is 00:20:38 it's the saved output, people forgetting to clear the output before they commit. Don't make them remember, just do nbstripout --install, and you're golden. In episode 179, we had Guido on, which was totally a lot of fun. One of the things he brought up was the 2020 Python Language Summit. And it was really interesting to listen about that. But if you want to read more about it, there's a pretty good write-up of all the topics they talked about. So we're linking to
Starting point is 00:21:05 a post that has links to other posts. We've got things like, should all strings become f-strings? And using the PEG parser and replacing CPython's parser with the PEG parser and different things like that. And even some of the lightning talks, just little snippets of kind of what was talked about and like a news article feed of what's going on. I found it really interesting and helpful to be able to kind of pay attention to what's going on at the Language Summit and what's going on with the language going forward to keep up with everything. I also wanted to bring up that there's been notifications recently about a vote coming up for the board of directors for the Python Software Foundation. And so the PSF and some of the board of directors did a video on what this feels like and what does it mean to do that. So we're linking to the video
Starting point is 00:21:57 and a link to the information about nominations because nominations are open for new board members up until the 31st of May. Oh, that's cool. Yeah, it is a little bit mysterious to me what the PSF board of director folks do. I mean, I can imagine, but I don't really know for sure. So it's cool that they've got a video talking about that. Yeah, if you know people who should be part of it, nominate them. That's cool. Yeah. Assuming they want to be nominated. Awesome. All right, well, I guess that's it for all our items. You got anything extra, Brian? I don't. I've just been working along. How about you? Two quick things. A follow-up: we talked about Austin, the profiler, which is awesome. Does CPU
Starting point is 00:22:38 profiling, also memory profiling. Remember, it had the TUI. It had the web GUI. It had all the different user interfaces. And one of the ways you could view it, one of the many ways you could view it, was through SpeedScope. It's cool. So Wendell Bauman sent in a script that he'd created called PySpeedScope, which will let you generate one of these SpeedScope files
Starting point is 00:23:03 that you can then visualize from Python, from running Python. So anyway, I'll just link over to his GitHub repo for PySpeedScope, which looks pretty cool. Also, I updated our search engine. I realized that when you search for stuff over at Python Bytes, it would give you good results,
Starting point is 00:23:19 but it would just present them kind of as if they were all equal. So now it ranks things. So I added ranking to our little search engine. So now if you search for stuff, it's more likely to give you good results. Well, that's cool. Without ranking,
Starting point is 00:23:35 Isn't it just pulling up random lists of things? Well, I mean, everything fit in there, right? Like if you search for it, everything that came up had, like if you search for,
Starting point is 00:23:45 I don't know, Jupyter, it would have Jupyter in it if it came up as a result. But for example, if Jupyter was in the title, or just like a random "and here's an example Jupyter notebook" versus the topic is about Jupyter, like, they would show up in whatever order they just came back. Now it's like the stuff that's about it more specifically shows up first. Oh, very helpful. Yeah, indeed. So hopefully that's a little bit helpful. For me too. I use that all the time. I know.
Starting point is 00:24:08 That's why I did it. I'm like, there's this thing. I know it's in here. And why are there so many results? It should be right at the top because the title is basically what I searched for. Yeah, exactly. So this is for us, but people can benefit as well. Definitely.
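The episode does not say how the Python Bytes search ranking is actually implemented, so the following is only a guess at the general idea: score a match in the title higher than a passing mention in the body, then sort by score. The weight of 10, the score and search functions, and the sample episodes are all invented for this sketch.

```python
def score(query, episode):
    """Hypothetical relevance score: title matches outweigh body matches."""
    q = query.lower()
    title_hits = episode["title"].lower().count(q)
    body_hits = episode["body"].lower().count(q)
    return 10 * title_hits + body_hits  # the weight of 10 is an arbitrary choice


def search(query, episodes):
    """Return only matching episodes, most relevant first."""
    matches = [e for e in episodes if score(query, e) > 0]
    return sorted(matches, key=lambda e: score(query, e), reverse=True)


# made-up sample data, not real show notes
episodes = [
    {"title": "Some other topic", "body": "Here is an example Jupyter notebook."},
    {"title": "Stripping Jupyter output", "body": "Jupyter notebooks and git."},
    {"title": "Unrelated", "body": "Nothing to see."},
]

results = search("jupyter", episodes)
titles = [e["title"] for e in results]
```

With this scoring, the episode whose title is about Jupyter comes back ahead of the one that only mentions it in passing, and the unrelated one is dropped.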
Starting point is 00:24:20 All right. You got some jokes? Yeah. These are definitely groaners, but they're submitted by friends of the show on Twitter. So I appreciate this. So a couple of them. Due to social distancing, I wonder how many projects are migrating to UDP and away from TLS to avoid all the handshakes? Well, we have to, right? Yeah.
Starting point is 00:24:41 You don't want to get a computer virus. Next, a chef and a vagrant walk into a bar. Within a few seconds, it was identical to the last bar they went into. I got it. So Vagrant manages virtual machines and Chef helps set up and configure those environments. Got it. Okay. Nice.
Starting point is 00:25:00 Anyway, so. Yeah. I like how you're leaving somewhat understanding these as an exercise for the reader. And partially I'm ruining their joy of solving the problem. Yeah. No, it's all good. Both of these took me a little bit of Googling
Starting point is 00:25:17 to understand. Beautiful. I like them. They're definitely groaners, but they do kind of make you think for a second as well, which is great. Yeah. So thanks a lot.
Starting point is 00:25:25 Yeah. You bet. I think we're done. Thanks. See ya. Bye. Thank you for listening to Python Bytes. Follow the show on Twitter at Python Bytes.
Starting point is 00:25:32 That's Python Bytes as in B-Y-T-E-S. And get the full show notes at pythonbytes.fm. If you have a news item you want featured, just visit pythonbytes.fm and send it our way. We're always on the lookout for sharing something cool. This is Brian Okken, and on behalf of myself and Michael Kennedy, thank you for listening and sharing this podcast with your friends and colleagues.
