The Changelog: Software Development, Open Source - Maintainer spotlight! Ned Batchelder (Interview)

Starting point is 00:00:00 Bandwidth for Changelog is provided by Fastly. Learn more at Fastly.com. We move fast and fix things here at Changelog because of Rollbar. Check them out at Rollbar.com. And we're hosted on Linode cloud servers. Head to Linode.com slash Changelog. Welcome back, everyone. This is the Changelog, a podcast featuring the hackers, leaders and innovators of software development. I'm Adam Stachowiak, Editor-in-Chief here at ChangeLog.

Starting point is 00:00:29 Today, we're shining our maintainer spotlight on Ned Batchelder. Ned is one of the lucky ones out there that gets the double dip. His day job is working on open source at edX, working on the open edX community team. Ned is also a single maintainer of Coverage.py, a tool for measuring code coverage of Python programs. This episode with Ned kicks off the first of many in our maintainer spotlight series where we dig deep into the life of an open source software maintainer. We're producing this series in partnership with our friends at Tidelift.

Starting point is 00:01:00 Huge thanks to Tidelift for making this series possible. And for the uninitiated, Tidelift is the first managed open source subscription that pays the maintainers of the exact open source projects you depend on while giving you the commercial support you've been looking for. Learn more at Tidelift.com. And now on to the show. So, Ned, when it comes to maintaining open source, you have two contexts that you do quite a bit of. The first one is Coverage.py, which is a code coverage measurement tool for Python. And then the second one is Open edX, which is the software that powers edX.org and a whole bunch of other online learning sites. So kind of cool. You have both the micro view and kind of a macro view

Starting point is 00:01:42 of open source maintainership? Yeah, I'm deeply embedded in the open source world. edX is my day job. So I work on the community team, the Open edX team at edX, and we try hard to encourage and enable contributions from people to this very large code base that, as you say, powers edX.org. It's very exciting. edX gives away free education, and there's a thousand or so other sites out there using the software to also do their own online education. And working on open source is a noble cause, and working on open source that

Starting point is 00:02:19 educates the world is, I guess, a doubly noble cause. Right, double dipping on nobility there. Double dipping on nobility, exactly. So the software that powers edX.org, can you tell me a little bit about the technical details and then maybe just how people contribute and whether it's self-deployed? Go ahead.

Starting point is 00:02:38 Yeah, so it's a large Python, Django, and of course JavaScript code base. The software was started about six years ago in sort of the classic Django style then, with a lot of server-side rendered templates. We use Mongo and MySQL databases. These days we're doing a lot of work on the front end to move away from that style of server-rendered HTML to React and what we're calling micro-frontends. So there's a lot of technology there. When you install the software,

Starting point is 00:03:11 you basically, you either find someone who can help you install it or will run it for you, because we've got a couple of dozen companies out there that make their living running Open edX sites and customizing the sites and helping people write courses on the sites. But if you want to install it yourself, there are instructions. You get yourself an Ubuntu machine and then you run some commands and it pulls down some Ansible playbooks and installs all the software. It's a little complicated. I wouldn't recommend it to someone who's new to that kind of thing, but it can certainly be done. One of the challenges we have is that the type of people that are drawn to Open edX

Starting point is 00:03:49 are not necessarily technologists. They're educators. A professor someplace tells their grad student, hey, on your free time, can you go and download and install Open edX? And that doesn't always go so well because chemistry PhD students don't know what I mean when I say Ansible. Right. So on the community side, we try hard to make that clear

Starting point is 00:04:12 and help people find the right pathways. But it is open source, and so they can install it and run a course, and they don't need permission from us. They don't owe us any money. We don't even know where these sites are until we go out with our web scraper and find the sites, which is kind of exciting. You know, you run the web scraper, it finds a new Open edX site, and you can go and see what kind of courses people are running out there. It's pretty cool. That is pretty cool. On another show we do called JS Party,

Starting point is 00:04:38 we were just talking with George Mandis, who wrote this kind of silly JavaScript library called Konami JS, which is just the Konami cheat code. It adds it to your website and calls an arbitrary function callback, and you can do whatever you want. People use it for Easter eggs. And he didn't really track who was using it all that much when it was super active. And then recently he's been giving talks about it, so he went back to archive.org and scraped a bunch of old websites

Starting point is 00:05:06 to find all the places where Konami.js is being used and he was pleasantly surprised that a lot of big sites were putting Easter eggs in. So that always feels good when you find somebody using your software and you didn't even know it. Right, exactly. And the great thing about it is that

Starting point is 00:05:20 edX is doing a lot to educate a lot of people but our design center, our strategy is to get large educational institutions and corporations putting their courseware on the site for a very broad audience. So we've got Harvard and MIT and Microsoft and the Linux Foundation all putting courses on our site, and that's great, but there's a ton of education that needs to happen that doesn't fit that model. One of the sites I found through our web scraper is in Indonesia, the Ministry of Education has a site that has 160 different courses.

Starting point is 00:05:56 They're pretty short courses all focused on vocational skills that will help lift people out of poverty. So there's courses like how to raise chickens and how to fix motorcycle engines and how to be a hairdresser. And edX.org is never going to run a course about how to raise chickens. But that site in Indonesia is probably doing a lot for its students. And it's just really great to see our software being used for that kind of education. So while it's great to see large sites using the software, it's also great to know that

Starting point is 00:06:28 there's a long tail of different kinds of education happening because people can run courses on whatever they want using our software. So in terms of community building and open source, there's overlap there. But it's not 100%. Like you said, a lot of people aren't necessarily interested in the open source software. They just want to get the software running, or they're just using it to create courses.

Starting point is 00:06:52 Are there takeaways from community building that you use in your open source work, or vice versa, that are crossover skills that you found have served you well? So one thing is that it takes a lot of work to make a contribution easy.

Starting point is 00:07:09 You know, sort of the old school model of running an open source project was, well, it's on GitHub, and you can click the make pull request button, and, you know, that's all I have to do. Yeah. Right, and then someone makes a pull request, and you ignore it for a long time and

Starting point is 00:07:25 you don't give them good feedback and you're not very friendly. You're being sort of a typical engineer about it. And that makes contribution difficult because people don't feel welcome. They feel confused. They're not sure what to do. They don't know how you feel about their work. They're not sure when they're going to hear from you. Making contribution really successful takes a lot of people skills. It's not a technical problem. I mean, there are technical challenges to it. Your code base might be obscure or poorly documented or it's under-tested. But in order to get the contributions to really flow, you have to have a lot of people skills up front to make sure that people are welcome, people are supported, people know what kinds of things you'd like to see them work on.

Starting point is 00:08:12 They know how you feel about things. You're not being too stringent in your rules before you can merge the pull request. And I've been learning this on both sides of it, both at work with Open edX and on Coverage.py. Coverage.py, I mean, to be perfectly honest, I'm probably a lot more like that bad side description that I just said. If you go look at Coverage.py on GitHub, there are some really old pull requests, and there are some bugs

Starting point is 00:08:37 that have been written a while ago that have no comments from me yet. That's just one of the challenges of being a single maintainer in your spare time of an open source project. But at work, at edX, we've been working a lot on trying to improve our contribution process

Starting point is 00:08:53 just making sure that the pathways are as smooth as they can be. One of the things that we've been doing at edX: it's a large Python code base, and of course it was written six years ago, I mean, it was started six years ago. So it's been running on Python 2

Starting point is 00:09:10 all that time. And Python 2's end of life is in about six months. So we've been working on getting our code base to Python 3, and a lot of that work is actually kind of low-level work, meaning it can be automated, and it just requires kind of someone

Starting point is 00:09:26 to push the button on the tool and babysit the pull request to see what the tests do, make sure it didn't do anything really crazy. But there's nothing controversial about the change, for instance. One of the difficulties with contribution to Open edX is someone says, hey, I want to build a new feature. Well, now you've got to have a big discussion. Is this the right feature?

Starting point is 00:09:46 We have 30 million learners on edX.org, so the feature that you think is going to work great for your 100 students, how is that going to scale up? It becomes a long discussion, and for good reasons. The good thing about work to convert from Python 2 to 3 is we all know that that's exactly what we want. We don't have to have a big discussion up front about what's the design, what does it look like, what's the user experience,

Starting point is 00:10:10 all those questions that are really difficult. So we've built a separate contribution process at edX specifically for that kind of incremental, uncontroversial work. And that's worked out really well to sort of build a separate lane for those kinds of contributions. So is there just like on a website somewhere there's a big if condition, like is this a controversial,

Starting point is 00:10:35 is this a feature that you want to add or is this a small thing? How do they actually funnel into those places? Right, so we use JIRA for issue tracking and so what we did is we automated the job of looking at all of our files and identifying which ones had to be run through the Python futurized converter that sort of does the mechanical Python 2 to Python 3 changes. And our tool wrote a JIRA ticket for all of the files that need to be converted in kind

Starting point is 00:11:04 of bunches of 10 or something. And so there's one Jira board that people can go to, and if there's a ticket on that board, then it tells you exactly what to do, and we know it's not going to be controversial. And so you can take one of those tickets, and you can make a pull request based on it and make a contribution. What about big features? Because you have an entity behind this, like you said, 30 million

Starting point is 00:11:27 learners on edX.org. How does that decision-making process go? Is there a product team, ultimately? Yeah. And how is it communicated back to potential contributors? Like, this is a good idea, but not for us? Or this is a terrible idea?

Starting point is 00:11:43 How does that all work? This is one of the things that makes Open edX as an open source software project very different from other potential models, other projects that we might try to be like. And that is that edX as an organization pays roughly 100 engineers to work on the software all day, every day, and runs a business based on that software. The software is deployed live to production at least once a day, sometimes more. So if a pull request gets merged and it brings the site down, people are going to get mad.

Starting point is 00:12:22 So we have to be very concerned with exactly what goes into the contributions. So you asked about product decisions. We have a product organization, of course. I mean, edX, although all of our software, almost all of our software is open source, if you just walked around the hallways here, it looks like any software business that has a website that it's running. There's product people that talk about what the feature should look like and the engineers take their directions from there and they've got

Starting point is 00:12:47 tickets of what to work on and the DevOps team is making sure the deployments are going well and all of that stuff. So when someone suggests a change, it can become a big discussion and it can be hard for them to get our attention because we're all

Starting point is 00:13:03 heads down making sure edX.org is doing what it's supposed to do for our business. And that is a big asymmetry and an unusual characteristic of Open edX. Honestly, fighting that is one of the big persistent challenges for the Open edX team here: figuring out how to bridge that asymmetry, to make the borders around edX as porous as possible, to give a voice to the community, to find ways for them to get done what they need to get done with or in spite of edX people.

Starting point is 00:13:43 You know, again, it's really a people challenge. There's plenty of technical challenges in the Open edX code base. It's big and old. There's tech debt there. It's complicated. But it's the people challenges that really are the limiting factor in the contributions. Has edX been open from the start? Not quite the start.

Starting point is 00:14:05 We actually open sourced on June 1st, 2013. So it's been quite a long time. We've been open sourced for six years. I've been saying it started six years ago. I guess at this point, it was about seven and a half years ago that the first commit went into GitHub. Time flies.

Starting point is 00:14:18 Yeah, time flies, exactly. So pretty early on. And you've been there since the beginning? I've been here since October of 2012. So yeah. And when I came in the door, the plan was we're going to open source. Okay, that was a nice question. We have to get around to it.

Starting point is 00:14:33 Yeah. Because edX was spun out from MIT. So we've got a culture behind us of sharing. And the whole point was to open up institutions of higher education, to help get their teaching out onto the internet. And we're a non-profit. Technically, edX Incorporated is a non-profit. So sort of from the ground up, it's been built as an open source kind of organization. Well, that probably serves it well, because if it wasn't, and then there was debate internally, and then maybe it was open sourced in haste or in anger,

Starting point is 00:15:16 buy-in is an important thing. So that's why I was trying to drill down on how long it has been open, and if it was at least planned from the start, that seems like a recipe for success, more so than the other way around where some organizations will open source for reasons like they read in a magazine that they should do it and help them get business or whatever.

Starting point is 00:15:34 It's what all the cool kids do, so we should do it. No, no, we've got a strong culture of that kind of sharing. And that doesn't mean that everyone here can easily recite an elevator pitch about why we're open source. I mean, in some ways,

Starting point is 00:15:48 having it as almost sort of background culture noise in a way almost hurts the mission because people aren't quite sure why. It's just like, well, yeah, of course we're open source. But okay, so what does success look like for the open source part of edX? Are we measured on how many sites are running or how many contributions we get

Starting point is 00:16:08 or how many people are chatting in Slack every day? Like, what is the actual success metric? So it's a very interesting, to me, it's a very interesting open source experiment to be doing open source inside what is otherwise a classic business on a website kind of software organization. So what are your metrics?

Starting point is 00:16:28 What do you gauge as success for Open edX, you personally? Right now, we are looking to maximize contributions. And for good reason. If we can get contributions into the code base, then that can feel tangible to the people who are maybe at the farther end of the open source is, of course, a good thing spectrum. So if there are people who are like, well, I'm running a business here.

Starting point is 00:16:58 Why do we bother with this? Well, we got this feature because we're open source. And if we can point to those kinds of things, then it's a very clear win, right? We don't have to get into subtle moral arguments or, you know, try to be altruistic, right? We can be capitalists about it. It's so interesting that an organization

Starting point is 00:17:19 with 100 engineers would be trying to optimize for contributions, because you would think, like, we got this covered over here. I got 100 engineers on this, you know? Well, so that's interesting. But if there are 100 more engineers out in the community? Yeah. And some of them are very good engineers. You know, I always make that joke about the chemistry PhD student, but there are, as I said, a couple of dozen firms who are filled with software engineers whose business is running our software for their own profit. And they make good contributions.

Starting point is 00:17:51 So we want to make sure that they continue making those contributions. Hence the efforts at making your contribution flow and onboarding better, right? Exactly, exactly. Well, let's turn our focus to Coverage.py, because unlike edX, which is 100 engineers, this is basically one engineer, and that engineer is you. That's right, that's right. Tell me when it started, how long you've been maintaining Coverage.py, and maybe how many

Starting point is 00:18:15 people are using it that kind of stuff so it's um this is the part of the story where I start spitting out numbers and people's eyes get really wide. So I've been, first of all, just to set the record straight, I didn't write Coverage.py. I did not start the project. I picked up, I was a user of the project in 2004 and

Starting point is 00:18:38 it wasn't doing a thing that I wanted it to do. And I tried to contact the author, Gareth Rees, and for whatever reason I couldn't get in touch with him. So I made the change to Coverage.py and I put it up there, and he seemed okay with that. I've been maintaining it ever since. So to answer your question, I've been maintaining it for almost 14 years, no, almost 15 years, 14 and a half years. So anyone out there using a project and thinking, hey, I could just make one small tweak to it: watch out, you might be the maintainer for the next 15 years.

Starting point is 00:19:15 That's kind of the beauty of open source though, right? Like somebody else is interested and then they can just take the ball and run with it. It's beautiful. Absolutely, yeah. So I've been maintaining it for a long time. It's used by a lot of people. So in the Python world, it is pretty much the only game in town for coverage measurement. In fact, many people don't realize this, but there is a coverage measurement tool

Starting point is 00:19:35 in the Python standard library that many people have never heard of because they use Coverage.py. Wow, that's got to feel good. Yeah, it's very good. I mean, I love the fact that I can make a thing and a lot of people get benefit from it, right? That's sort of the original motivation for getting into this, right? That's sort of the lone engineer working on open source.

Starting point is 00:20:00 That's their motivation. They don't think they're going to get rich. They don't even necessarily think they're going to get famous, although that seems cool. They just think, hey, I wrote some code, and then this guy, I didn't even know he's using my code, and he seems to like it. Yeah. That's cool. Yeah. So Coverage.py, you ask how many people are using it. So GitHub now has a "used by" thing on the top of... I love that. Yeah. I'm trying to type... I got the number for you if you want me to fill you in, because I'm staring at it. Tell me what it is. 68,760. These are repositories, I assume, that are dependent upon Coverage.py somehow, or maybe just include it in there.

Starting point is 00:20:38 I'm not sure exactly the way they count it, but they seem to know how to examine the Python requirements or setup.py files to decide that. So yeah, 68,000. The funny thing about my GitHub metrics is that that number is up at like 68,000, but I only have 700 stars. So I think I might be setting a record for the ratio of used by to

Starting point is 00:21:05 stars. That's interesting. I don't know that that's a thing to be proud of. And the reason it's got so few stars is because I only moved on to GitHub about a year ago. So I was on Bitbucket for years and years. And I moved to GitHub and there's just a

Starting point is 00:21:22 dynamic about, you know, if you're not making a splash on Hacker News, you're not going to get stars. And so I just kind of quietly moved over. All my users don't know it's even moved because they're getting it from PyPI. So I don't have that many stars. Listen up, Python people out there listening to the changelog.

Starting point is 00:21:39 Coverage.py is on GitHub now. You need to head over there and star it, helping it out, because he's got 14 years of effort into this thing. It needs more stars. Get me some stars. Yeah, so Coverage.py is run like the classic guy in his bedroom

Starting point is 00:21:55 open source project. I work on it in the evenings or in the mornings over my cereal bowl on the weekends. It's been very gratifying to see the use and to see it become the de facto that it is and to know that people are getting benefit out of it. The downside, of course, is it can be hard to keep up with people's desires for it. I don't seem to get much drama in it.

Starting point is 00:22:24 A lot of open source maintainers seem to find that when their project becomes popular, it also becomes a magnet for drama. And I'm not sure why I haven't gotten that kind of infamy on Coverage.py. But people ask for things, and I think, that does seem like a good idea, but honestly, it's going to be two years before I get to that. And that's not a good feeling. And like I said, there are issues that are languishing there and pull requests that, like, seem fine, maybe. I don't even know. I don't have time to kind of look into it and think about it. So you do have 58 contributors over the years, at least in the Git history. Maybe

Starting point is 00:23:00 there's more, you know, way back in the day when it was on some other version control. But are many of those, like you still say it's like one person coding over your cereal bowl, are there other major contributors? Are there any, maybe they're not even major, but they're in the issues? Or has it really just been casual contributions

Starting point is 00:23:20 over the years? So most of them are casual, but there have been some things that stand out. So for instance, way back in the history, the coverage.py only had a text-based report on your terminal. And the beginnings of the HTML report were contributed by George Song, who just by coincidence years later worked here at edX with me for a year or two. So that's a small world kind of story. But so he contributed that.

Starting point is 00:23:51 Recently, I've been working on the 5.0 alpha series of Coverage.py, where the big new feature is going to be, and this is a long-requested feature, so I'm glad to finally be able to get to it: instead of just telling you which lines of your product code were covered, it will tell you for each of those lines which tests covered that line, so that you can do analysis like, all right, I did a whole test run, but now I just want to see these tests, what covered it. Or I can see that only one test covered that line,

Starting point is 00:24:19 so I want to think about whether to do more tests that would get to there, et cetera, et cetera. So that feature has been a long time coming, and Stephan Richter and his coworkers at Shoobx have made some significant contributions this year to that. He added the HTML changes, some of the fixes for the SQLite code that's in there. So they made a lot of contributions, which I'm really grateful for.

Starting point is 00:24:44 And a year and a half ago, a guy I didn't know named Loïc Dachary from France wrote to me and said basically that his way of working in open source is, he picks a project and he commits to it for like three months. And he's fully embedded in that project for three months. And then he moves on. And I didn't know what to make of that, but sure enough, suddenly he was commenting on all my open issues and triaging them and trying to reproduce them and trying to fix them, and there were just dozens of contributions from him all over the project. I love that. Which was, yeah, it was amazing. And it was amazing not only because people were getting responses and I was getting contributions,

Starting point is 00:25:25 but his energy just sort of helped me with my energy, right? Just having him doing things, I was in there doing things too. So the lone maintainer, not only can you only do as much as one person can do, but it can feel literally lonely, and having someone to bounce things off of or just see that they're making progress too can really be energizing. So I was really thankful to Loïc for that. And again, just by coincidence, now Loïc is doing work for one of those companies that I said runs Open edX sites for profit. So I'm glad to have him back in my circle. That's such a cool thing.

Starting point is 00:26:07 A man with a plan. He's like, I'm going to go out. I'm going to do three months. I'm going to really dive in and go all in for three months, and I'm going to move on to the next person. I mean, that's really cool. That's exactly what happened. I mean, at the end of the three months, I was like, no, don't go away.

Starting point is 00:26:21 But he said he was going to do it, and he did it. And I was really glad for that time. And maybe it wasn't three months, I'm forgetting the exact timeframe. But there was that period where Loïc was all over everything. And I was really thankful for it. Well, he sounds like he might be a future guest, because I've got to hear this. He probably has stories from all sorts of projects that he's gone into and helped out. And until I'd heard from him, I had never encountered anyone who worked that way. I haven't either.

Starting point is 00:26:47 Right, and my way of working, so I make lots of tiny pull requests on things that I need fixed. So I use a Vim plugin, and it doesn't work quite right, and I'll go and make a fix, or I'll go to a library, I'll make a fix. So I will make little changes all over the place, but I'm not just going to pick a project almost at random (I don't know how he picked Coverage.py) and commit to it. So that was a very interesting style of

Starting point is 00:27:10 working and something that I really liked. One of the other difficulties I find with being a maintainer is just the context switching. So if I'm working on Coverage.py with my cup of coffee in the morning, well, then I go to work, I've got to forget about all that Coverage.py excitement that I might have had in the morning and, you know, become excited about Open edX. And, you know, I'll do that. And during the workday, I'm embedded in those concerns and I'm thinking about what to do and I'm plotting out where I can go from that.

Starting point is 00:27:43 And then in the evening, well, now I'm switching back to Coverage.py. And then on the weekend, it's sort of the same dynamic, but with much bigger shifts. And that kind of context switching can be difficult, not only because you might forget, you know, lose the thread,

Starting point is 00:27:59 the technical thread of what you were thinking about, but you get excited about, like, the next thing I'm going to do, oh, now I have to wait eight hours or whatever before I can do that thing. All right. I hope you've been enjoying this conversation between Jared and Ned, the first of many in our maintainer spotlight series. Special thanks to Tidelift. We're producing this podcast series in partnership with Tidelift because we both deeply care about supporting the maintainers of open source software. Our goal with this series is to dig deep into the life of an open source software maintainer, to learn what challenges they face, the highs and lows of

Starting point is 00:28:48 being a maintainer, how they financially support their projects, how they maintain balance between life, day job, and open source, and also how they're supporting and encouraging contributions and community. For the uninitiated on Tidelift, they're the first managed open source subscription model that pays the maintainers of the exact open source projects you depend on while giving you the commercial support you've been looking for. Tidelift's mission is simple, to support the open source software you depend on and pay the maintainers. Learn more at Tidelift.com. I have to ask for your opinion on code coverage since we're here and you write a code coverage tool. And I'm seeing that you have 90% code coverage

Starting point is 00:29:39 on coverage.py. Sounds kind of ironic, right? Yeah. Why isn't it 100? Yeah, you're not a 100% kind of guy. Well, it's not that. It's that. Well, I don't know if that's the question you wanted to ask.

Starting point is 00:29:50 I have a couple of questions. That's one of them. Yeah, go ahead. The trick, the problem here is that there is a significant amount of code in coverage.py which runs inside the Python trace function, which is code that cannot itself be covered because, or can't be measured because you are inside the measurement

Starting point is 00:30:09 and Python is not set up for it to measure its own measuring function. And so there's a lot of code there that cannot easily be seen by Coverage.py. It's like a doctor operating on himself. Yeah, exactly. Yeah, something like that.
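The self-measurement problem can be seen with Python's standard `sys.settrace` hook. The following is a minimal sketch, not Coverage.py's implementation: the names `tracer`, `sample`, and `executed` are made up for illustration, and the behavior shown assumes CPython, which suspends tracing while the trace function itself is running.

```python
import sys

executed = set()  # (function name, line number) pairs reported to the tracer


def tracer(frame, event, arg):
    # The interpreter calls this for each traced frame. Its own lines never
    # generate events, because tracing is suspended while it runs.
    if event == "line":
        executed.add((frame.f_code.co_name, frame.f_lineno))
    return tracer  # keep receiving line events for this frame


def sample(n):
    total = 0
    for i in range(n):
        total += i
    return total


sys.settrace(tracer)
try:
    result = sample(3)
finally:
    sys.settrace(None)  # always remove the hook

traced_functions = {name for name, _ in executed}
print(traced_functions)  # contains "sample" but never "tracer"
```

The lines of `sample` are recorded, while the lines inside `tracer` are invisible to the measurement it performs, which is the situation described for the code that runs inside the trace function.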

Starting point is 00:30:24 So that's where that 10% comes from. I mean, there's a couple of percent that are probably just me not pushing quite hard enough on the lever to get the percentage up, but the bulk of that 10% is because of that problem. And honestly, I've thought about tricky ways to get at it, but I also recognize that it's probably not worth it. So do you feel pressured to go to 100%

Starting point is 00:30:43 because you build a code coverage tool or do you believe in that level of coverage as a practice? I do believe in that level of coverage as a practice. I have myself personally been in a situation where I had a file that had only one line that wasn't covered

Starting point is 00:30:59 and I looked at that line and I thought, well, that's fine. There's no need to test that weird case. But okay, let's go ahead. And I write the test and there's a bug in that line. And so I have found it to be useful to get to 100% coverage. I know it can be very difficult and it means dealing with weird edge cases and maybe contorting a bit to get at those edge cases.

Starting point is 00:31:26 The other thing about 100% coverage is, in a way, once you get there, then you're really out of luck because the coverage tool can... Well, the coverage tool can no longer tell you things about your code, and there's probably still plenty of things you don't know about your code. For instance, code coverage tools can't tell you whether you covered the full range of data that you have to cover in your function, only whether you covered the full range of code in your function. And there's probably tons of edge cases in your data that are missing from your tests, even when your function is 100% covered.
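[Editor's sketch] A small Python illustration of the gap between covering all the code and covering all the data. The function average is a made-up example, not something from the conversation: one test executes every line of it, yet an input the test never tried still fails.

```python
def average(values):
    # A single straight-line path: any one call executes every line here.
    return sum(values) / len(values)


# This one test runs every line of average(), so a line-coverage
# report for the function would show 100%.
assert average([2, 4, 6]) == 4

# An input the test never tried still breaks the function.
try:
    average([])
except ZeroDivisionError:
    empty_list_fails = True
else:
    empty_list_fails = False

print(empty_list_fails)
```

The coverage report has nothing left to flag here, which is why a 100% figure says nothing about the empty-list case.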

Starting point is 00:32:00 There's lots of downfalls to believing in 100% coverage. Gotcha. So one question, I guess, about Python community stuff, because you're in there and you've been a part of it for a long time, and I'm on the fringes of that, looking in sometimes, talking with people who use Python but not using it on a day-to-day basis.

Starting point is 00:32:19 By the way, just to fully flesh out how deeply embedded I am, I'm also the organizer of the Boston Python Meetup. Okay, so you're deep in the community. Love it. I'm deep in the community. That's awesome. A great community, by the way.

Starting point is 00:32:31 I love all the Pythonistas we talk to. We always have a great time. Is code coverage, is that 100% goal? Do you find that to be a norm inside the Python community? One thing I always think of the docs, that great documentation is like something that Python needs to strive for. And I love that about Pythonistas,

Starting point is 00:32:49 even though that term, I can't say it too many times, I start to feel strange, but I'm hitting the ratio. But what about code coverage? Like is testing that important or is it just for you, Ned? No, I think Python has a pretty good track record of testing as a good thing. One of the things Python people will say when they're debating with static type language people is you don't need static type checking if you have good tests.

Starting point is 00:33:20 You could do a whole hour about getting into the details of that. Yes. But certainly because we don't have the types, we can't find the types of errors that static typing at compile time can find you, we do rely more on tests to find those kinds of problems. And that's also shifting a bit because now we've got gradual typing in Python that can be checked by static type checkers, you know, separate from the compilation phase. But that's still fairly new to the community. So it'll be interesting to see how that seesaw tilts as gradual typing

Starting point is 00:33:59 becomes more and more used. So we mentioned a couple times you've been doing this maintenance thing for 14 years on coverage.py. Yeah. Curious how you stay motivated. I like the story about, was it Loic, who comes in and kind of gives you this spurt of motivation. Yep. But even on a technical level,

Starting point is 00:34:19 just like working on the same code for such a long time. I'm curious if you've had spits and spurts or if you've just been slow and steady with the race. How do you stay motivated all these years? Well, so one thing, my personality, I will stick with a thing for a very long time. So I've been here at edX for six and a half years,

Starting point is 00:34:41 which is longer than probably everyone but five people here. I've been in the Python world since 1999. I'm about to celebrate my 35th wedding anniversary. I pick things and I stick with them. That's awesome. Thank you. So just by my personality, once I start a thing, I probably am fine sticking with it.

Starting point is 00:35:03 And also, I enjoy the polishing aspect of projects. There's people who just want to start new things and just be throwing out new things all over the place. I like being able to say, you know, I really nailed that. And if it took a while, that's okay.

Starting point is 00:35:19 But we're going to make it really, really good. So, I don't mind sort of, I've been working on this project for 14 years. The place where that bothers me is when there's a thing that I still don't understand about my own code, and once a year I'm revisiting the same thing, and I feel like, why can't I internalize this finally

Starting point is 00:35:42 after all this time? So there is that aspect to it. So there's my personality. But the other thing is hearing from people who use the project, getting contributions, knowing that it's helping people to improve their code in various ways. Because I work in a Python world at work, we use coverage at work, and so I see how it's being used there and

Starting point is 00:36:07 that helps inform, you know, what I think is important to add to the tool. So it's that kind of thing, seeing it actually get used and actually have some benefit, which again, to go all the way back, that's why most people get started writing software in their spare time and then giving it away. You sort of can't explain that in pure economic terms. No, you can't. It's about the sharing

Starting point is 00:36:33 and having the benefit reflected back to you from others. Now I'm going to ask you just a series of maintainery questions. And so you can just use whichever project makes the most sense or helps answer the question, whether it's Coverage.py or Open edX, whichever one you choose. So I guess the first one, you may have already answered this, but I'll just ask explicitly and see if this is true.

Starting point is 00:37:00 I'm going to ask you, what do you like the most about being an open source maintainer? It sounds like maybe that feeling you get when somebody's using your thing, but I'm wondering if that's truly the number one or if that's just one of the things you like. What would you say if you're like, well, the reason that I do this or the thing I like the most about this thing I do with my free time, what would it be? That's a good question. On the Coverage.py side, I really like being able to build a thing and have it do it well. You know, it's sort of just a pure hacker feeling of it.

Starting point is 00:37:32 You know, you tell people, you like coverage, that's cool, but what if it could tell you which tests covered each line? The challenge. Well, that's kind of magic. How would you do that? That would be amazing. So it's cool to just, all right, let's think it through. What would it take?
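[Editor's sketch] One way to picture "which tests covered each line" is to remember the name of the running test while a trace function records executed lines. The Python below is only an illustration of that idea, not a description of how coverage.py does it; classify, the two tests, who_tests_what, and current_test are all invented names.

```python
import sys
from collections import defaultdict

# Line offset within classify() -> names of the tests that executed that line.
who_tests_what = defaultdict(set)
current_test = None


def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"


def tracer(frame, event, arg):
    # Ignore every frame except calls to the function being measured.
    if frame.f_code is not classify.__code__:
        return None
    if event == "line":
        offset = frame.f_lineno - classify.__code__.co_firstlineno
        who_tests_what[offset].add(current_test)
    return tracer


def test_negative():
    assert classify(-1) == "negative"


def test_non_negative():
    assert classify(5) == "non-negative"


for test in (test_negative, test_non_negative):
    current_test = test.__name__  # the "context" for the lines about to run
    sys.settrace(tracer)
    try:
        test()
    finally:
        sys.settrace(None)

for offset in sorted(who_tests_what):
    print(offset, sorted(who_tests_what[offset]))
```

Offsets count lines from the def line of classify, so offset 1 is the if statement and offsets 2 and 3 are the two return statements.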

Starting point is 00:37:47 And how can we make all that happen? And so I like the building aspect of it. But the other thing, and I keep coming back to this, I also like the people aspect of it. And I think as I get older and older, I find people more and more interesting, both in terms of what I get back from them and also the challenges in working with them. And that's on the Open edX side. Honestly, I'm not as technical in the Open edX code base as I was six and a half years ago when I started because I've been doing a lot of community work. But we do an annual conference every year. And it's just amazing to fly to that place and

Starting point is 00:38:25 see all those people from around the world who are there because of this project. And they're people that I've known for years now, and I know what sites they're building and the kinds of education they're doing. And it's just, it's a community of people that really appreciate what I'm helping them get and what I'm helping them do. And that's really rewarding. So flip that on the other side. What do you like the least about being an open source maintainer? So I don't like the feeling that I'm not doing a good job at it, but I try,

Starting point is 00:38:57 I'm not, I'm trying not to beat myself up. Right. I mean, it's not like coverage.py has to do whatever I think it should do. You know, it's sort of got a safe position as a popular project now, but even if someone were to make a new project and that were to become the one, okay, that would be okay.

Starting point is 00:39:16 So I try not to beat myself up about it. One of the things I don't like about being an open source maintainer is that people have gotten into open source for that sort of pure sharing idea. And there's a lot of people getting value from open source projects who do not think that way for a variety of reasons. And it can be easy to feel bad about that imbalance, but I'm trying to think more realistically about it, and at sort of a deeper level about why that imbalance is there and what could be done about it. Do you have any, over the years, quote-unquote war stories or any crazy things that have happened or bad things? You said you haven't had too much drama, which is

Starting point is 00:40:05 nice. No, but anything that other maintainers might relate to or enjoy hearing about? Well, I'll tell you the craziest thing that happened with coverage.py. I mean, of course there's stories like, oh, there was that day that I released 4.3.1 and then also realized that it was broken, so I had to release 4.3.2, but that fix was also broken. So there's days like that. Everyone's been through that. But the craziest thing that happened with coverage.py, so coverage.py has an HTML report, so it generates HTML pages. And for whatever reason, I was using single quotes around the attributes in my HTML tags, just because it's visually less obtrusive than the double quotes. And I got a bug report that said,

Starting point is 00:40:46 could I please change to double quotes because I've got a tool here at work which is copying the files around and it needs to find the style sheets and it can't find the style sheets because it only finds style sheets that have double quotes around the URL. And I was like, who's writing tools that are parsing HTML

Starting point is 00:41:08 and doesn't know that both styles of quotes are okay? So I was like, no way. I am not changing for that. But then I went to PyCon, and at PyCon, there are sprints after the conference, and I was there for a day of sprints. And someone comes up to me and says, hey, I'm looking for something easy to do. I say, well, there are the issues, and he pulled up

Starting point is 00:41:27 that issue and says, well, I can change all the double quotes to single quotes. I mean, it's the other way around, single quotes to double quotes. And I thought, do I want to let him do that? Like, this is just the dumbest change ever. But okay, he's going to do it, he'll feel good about it, whatever. And so he made the change, and I wrote the entry in the changelog. I said, change the quotes to double quotes to capitulate to the absurd request from, quote, software engineers who don't know that single quotes exist. Love it. So I got a little snarky in the changelog, but the change was there and, you know, everyone's happy now. So that's awesome. That's funny how we can go about such trivial things, like such small

Starting point is 00:42:12 nitpicky, you know. Yeah, I know. Why did I care? Like, okay, double quotes, what's it to me? Right. It's because it's for such a wrong reason. Yeah, exactly. It is the principle of the thing, not the style. The principle of the thing, that's right. Do you have any tips or tricks that you've learned over the years that make your life easier as a maintainer, or maybe like TextExpander snippets or scripts you use or anything like that you can share? So I haven't... well, I do use GitHub pull requests, uh, issue templates. So if you go to write an issue on coverage.py, it'll offer you either this is a bug report or a feature request

Starting point is 00:42:50 and then it prompts you for what to fill in there. I'm not sure it's making a huge difference in the quality of the bug reports, but it seems like a good first step. GitHub's doing a lot of good little things like that that should make open source work better.

Starting point is 00:43:08 Again, from my point of view, my main tip is to really think about the person on the other side of that issue or pull request and try to be good to that person, whether that means using more words when you tell them why you're not going to take their pull request, or answering them quickly, even if it's to say, thanks, but I can't get to this for two weeks. Which again, I'm not doing that well myself, but I'm trying.

Starting point is 00:43:35 I feel like I've been saying the word people more than I've been saying code during this podcast and I think that's for a reason. I think the whole point is people ultimately so the more you can think about it as a people effort than a code effort I think the better off it'll go. Absolutely. Speaking of people, are there any people out there

Starting point is 00:43:56 that are maintainers or they provide you tools or services that you admire or appreciate, you want to give them a shout-out, say thanks, or maybe even point somebody towards a tool that you use and helps you in your day-to-day maintenance? Yeah, yeah, sure.

Starting point is 00:44:13 So one tool that I haven't been able to use on Coverage.py, but I have used on other projects, is called Hypothesis. And it's maintained by David MacIver. I'm not sure that I'm pronouncing his name right. And it is a property-based testing tool, which takes a little getting used to, but when you get to the point of knowing how to make it work for your code,

Starting point is 00:44:37 it can do a really great job. Instead of writing explicit tests of this is the input and this is the output I expect, you write code that expresses what properties you expect in the results. And it tries to generate input test cases that fail those properties. So is it kind of like a fuzzer? It's kind of like a fuzzer. It's a little bit more advanced than that.

Starting point is 00:45:02 So, for instance, you can say I need a list of integers at least 10 long as input to this function, and it will start generating lists of integers, and it will start doing things like the list is a million long or the list is exactly 10 long, but all of the numbers are bigger than 2 to 64 or whatever. It just tries to find all those weird edge cases. And then if it does find a failure, it tries to walk back to a simpler case that still fails

Starting point is 00:45:31 to try to get at sort of exactly where that line is between what works and what doesn't work. So it's the same idea as a fuzzer, which is put some intelligence into the randomization of the inputs and then detect whether something failed. That's cool. Yeah, it's very cool. It's very cool.
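[Editor's sketch] The generate-then-shrink loop described above can be imitated in a few lines of plain Python. This is a hand-rolled toy, not Hypothesis and not its API; wrapping_sum is a deliberately buggy function made up for the example, and holds, generate, shrink, and find_counterexample are hypothetical helper names. It follows Ned's example of "a list of integers at least 10 long", mixes in awkward values such as 2 to the 64, and then walks a failing input back to a simpler one that still fails.

```python
import random

MIN_SIZE = 10  # the input must be a list of at least 10 integers


def wrapping_sum(xs):
    # Deliberately buggy function under test: keeps only the low 64 bits.
    return sum(xs) & (2**64 - 1)


def holds(xs):
    # The property we expect of the result: it equals the true sum.
    return wrapping_sum(xs) == sum(xs)


def generate(rng):
    # Mix ordinary values with awkward ones (negative, huge, long lists).
    size = rng.choice([MIN_SIZE, MIN_SIZE + 1, 100])
    pool = [0, 1, -1, 2**64, rng.randint(-1000, 1000)]
    return [rng.choice(pool) for _ in range(size)]


def shrink(xs):
    # Greedily simplify a failing input for as long as it keeps failing.
    changed = True
    while changed:
        changed = False
        for i in range(len(xs)):
            candidates = []
            if len(xs) > MIN_SIZE:
                candidates.append(xs[:i] + xs[i + 1:])        # drop one element
            if xs[i] != 0:
                candidates.append(xs[:i] + [0] + xs[i + 1:])  # zero one element
            for candidate in candidates:
                if not holds(candidate):
                    xs, changed = candidate, True
                    break
            if changed:
                break
    return xs


def find_counterexample(seed=0, tries=200):
    # Try random inputs; return the first failure and its shrunk form.
    rng = random.Random(seed)
    for _ in range(tries):
        xs = generate(rng)
        if not holds(xs):
            return xs, shrink(xs)
    return None, None


original, shrunk = find_counterexample()
print(len(original), shrunk)
```

A real property-based testing tool is far smarter about both generating and shrinking, but the shape of the loop is the same: state a property, search for an input that breaks it, then simplify that input.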

Starting point is 00:45:49 And I've used it on other projects to good effect. I haven't been able to use it for coverage.py yet. Now, if we could just hypothesize on the actual code required to pass the test, then my job here would be done. No, no, you've still got to record podcasts. Oh, that's true. Anybody else, maintainers you admire, appreciate, maybe some sort of effort that you've seen put together? A maintainer does this thing that I really liked and I stole it and I do it

Starting point is 00:46:16 as well, or anything like that? So another name that's in my head, I've never met him, Daniel Hahler, I think. I don't know how to pronounce his last name. His GitHub handle is blueyed. He just seems to pop up on a lot of projects. He's been helpful on coverage, not in quite as large a way as the other people I mentioned, but he's been sort of a consistent presence. And I find when I go and look at other projects, I see his name in their pull requests too. So I think he deserves a shout out because he seems to be doing a good job at spreading his efforts around to a lot of projects and just improving things all over the place. And Julian Berman is like that too.

Starting point is 00:46:59 Awesome. I keep running into Julian. I had dinner with him. He was in Boston. We got together and that was really cool. But I've known him online as a faceless maintainer of code for a long time, and it's good to see his name pop up in various places. Isn't that fun when you know somebody online for years, and you've never actually met them,

Starting point is 00:47:21 and then you finally get to meet them in the flesh? It's always so interesting. Yeah, well, the real trick is, do you call him by his real world name or by his online nick? You tell me. What do you do? Well, it feels weird to call someone, you know, Daniel if you've only known him as blueyed, but you're not going to call him blueyed when you're sitting across the table in a restaurant. So you just got to get used to that cognitive shift between the online world and the real world. Or just the social awkwardness of calling him blueyed

Starting point is 00:47:51 and dealing with the consequences. Yeah, you got to hope he doesn't have a weird, too weird in there. Exactly. Well, Ned, this has been lots of fun. I love the two perspectives that you bring with Coverage.py and with Open edX. Any final words to maintainers out there

Starting point is 00:48:07 or the open source community writ large? If you had a call to action or anything you'd like to say before we call it a show? Yeah, keep up the good work. Stay optimistic. Don't let the bad stuff get you down, whether that's people yelling at you

Starting point is 00:48:24 at your issues or feeling like someone should be contributing when they're not contributing, in whatever form that contribution might take. Open source started from a really, really good impulse, and it's those good impulses that are going to keep it going. Awesome, Ned. What's the best way people can reach you online? All right, so I'm on Twitter as nedbat. It's the first three letters of my first and last names,

Starting point is 00:48:48 Ned Batchelder. Coverage.py is on GitHub as nedbat slash coveragepy. I have a blog that I've been running, again, for way too long on nedbatchelder.com where I write about open source. One of my recent pieces was about me getting over that feeling that big corporations should be doing more to help open source, or at least understanding more about that dynamic. So you might want to read about that.

Starting point is 00:49:18 Those are good ways to stay in touch. Awesome. Well, listeners, as you know, links in the show notes, all the ways you can get in touch with Ned will be there, as well as links to all things discussed and to the people who shouted out. So hit up your show notes for those things. Ned, this has been a lot of fun. Thanks so much for coming on the show. Thank you, Jared. It's been fun. Thanks for tuning into this episode of The Changelog. Guess what? We have comments on every single podcast episode. Head to changelog.com, find this episode, and you can discuss with the community. Huge thanks to Tidelift for their support of our maintainer spotlight series. And of course, thanks to Fastly, Rollbar, and Linode for making everything we do possible. Our music is produced by the one and only Breakmaster Cylinder. If you want to hear more episodes like this, subscribe to our master feed at changelog.com slash master or go into your podcast app and search for Changelog Master.

Starting point is 00:50:10 You'll find it. Subscribe, get all of our shows, as well as some extras that only hit the master feed. It's one feed to rule them all. Again, changelog.com slash master. Thanks for tuning in. We'll see you next week. Bye.
