The Changelog: Software Development, Open Source - Maintainer spotlight! Ned Batchelder (Interview)
Episode Date: June 28, 2019
In this episode we’re shining our maintainer spotlight on Ned Batchelder. Ned is one of the lucky ones out there that gets to double-dip: his day job is working on open source at edX, on the Open edX community team. Ned is also a “single maintainer” of coverage.py, a tool for measuring code coverage of Python programs. This episode with Ned kicks off the first of many in our maintainer spotlight series where we dig deep into the life of an open source software maintainer. We’re producing this series in partnership with Tidelift. Huge thanks to Tidelift for making this series possible.
Transcript
Bandwidth for Changelog is provided by Fastly. Learn more at Fastly.com.
We move fast and fix things here at Changelog because of Rollbar.
Check them out at Rollbar.com.
And we're hosted on Linode cloud servers. Head to Linode.com slash Changelog.
Welcome back, everyone.
This is the Changelog, a podcast featuring the hackers,
leaders and innovators of software development.
I'm Adam Stachowiak, Editor-in-Chief here at Changelog.
Today, we're shining our maintainer spotlight on Ned Batchelder.
Ned is one of the lucky ones out there that gets to double-dip.
His day job is working on open source at edX, working on the Open edX community team.
Ned is also a single maintainer of Coverage.py,
a tool for measuring code coverage of Python programs.
This episode with Ned kicks off the first of many in our maintainer spotlight series
where we dig deep into the life of an open source software maintainer.
We're producing this series in partnership with our friends at Tidelift.
Huge thanks to Tidelift for making this series possible.
And for the uninitiated, Tidelift is the first managed open source subscription that pays the maintainers of the exact open source projects you depend on while giving you the commercial support you've been looking for.
Learn more at Tidelift.com.
And now on to the show.
So, Ned, when it comes to maintaining open source, you have two contexts that you do quite a bit of.
The first one is Coverage.py, which is a code coverage measurement tool for Python. And then the second
one is Open edX, which is the software that powers edX.org and a whole bunch of other online
learning sites. So kind of cool. You have both the micro view and kind of a macro view
of open source maintainership?
Yeah, I'm deeply embedded in the open source world.
edX is my day job.
So I work on the community team, the Open edX team at edX,
and we try hard to encourage and enable contributions from people to this very large code base that, as you say, powers edX.org.
It's very exciting. edX gives away free education,
and there's a thousand or so other sites out there using the software to also do their own
online education. And working on open source is a noble cause, and working on open source that
educates the world is, I guess, a doubly noble cause.
Right, double dipping on nobility there.
Double dipping on nobility, exactly.
So the software that powers edX.org,
can you tell me a little bit about the technical details
and then maybe just how people contribute
and is it self-deployed sites?
Go ahead.
Yeah, so it's a large Python, Django,
and of course JavaScript code base. The software was started about
six years ago in sort of the classic Django style then, with a lot of
server-side rendered templates. We use Mongo and MySQL databases. These days we're doing a
lot of work on the front end to move away from that style of server-rendered
HTML to React and what we're
calling micro-frontends. So there's a lot of technology there. When you install the software,
you basically either find someone who can help you install it or will run it for you, because
we've got a couple of dozen companies out there that make their living running Open edX sites and
customizing the sites and helping people write
courses on the sites. But if you want to install it yourself, there are instructions. You get
yourself an Ubuntu machine and then you run some commands and it pulls down some Ansible playbooks
and installs all the software. It's a little complicated. I wouldn't recommend it to someone
who's new to that kind of thing, but it can certainly be done.
One of the challenges we have is that the type of people that are drawn to Open edX
are not necessarily technologists.
They're educators.
A professor someplace tells their grad student,
hey, on your free time, can you go and download and install Open edX?
And that doesn't always go so well
because chemistry PhD students don't know what I mean when I say Ansible.
Right.
So on the community side, we try hard to make that clear
and help people find the right pathways.
But it is open source, and so they can install it and run a course,
and they don't need permission from us.
They don't owe us any money.
We don't even know where these sites are
until we go out with our web scraper and find the sites, which is kind of exciting. You know, you run the
web scraper, it finds a new Open edX site, and you can go and see what kind of courses people
are running out there. It's pretty cool.
That is pretty cool. On another show we do called JS Party,
we were just talking with George Mandis, who wrote this kind of silly JavaScript library called Konami JS,
which is just the Konami cheat code.
It adds it to your website and calls an arbitrary function callback,
and you can do whatever you want.
People use it for Easter eggs.
And he didn't really track who was using it all that much when it was super active.
And then recently he's been giving talks about it,
so he went back to archive.org and scraped a bunch of old websites
to find all the places where Konami.js is being used
and he was pleasantly surprised
that a lot of big sites were putting Easter eggs in.
So that always feels good
when you find somebody using your software
and you didn't even know it.
Right, exactly.
And the great thing about it is that
edX is doing a lot to educate a lot of people
but our design center, our
strategy is to get large educational institutions and corporations putting their courseware
on the site for a very broad audience.
So we've got Harvard and MIT and Microsoft and the Linux Foundation all putting courses
on our site, and that's great, but there's a ton of education that needs to happen that doesn't fit that model.
One of the sites I found through our web scraper is in Indonesia,
the Ministry of Education has a site that has 160 different courses.
They're pretty short courses all focused on vocational skills
that will help lift people out of poverty.
So there's courses like how to raise chickens
and how to fix motorcycle engines and how to be a hairdresser.
And edX.org is never going to run a course about how to raise chickens.
But that site in Indonesia is probably doing a lot for its students.
And it's just really great to see our software being used for that kind of education.
So while it's great to see large sites using the software, it's also great to know that
there's a long tail of different kinds of education happening because people can run
courses on whatever they want using our software.
So in terms of community building and open source, there's overlap there.
But it's not 100%.
Like you said, a lot of people aren't necessarily
interested in the open source software.
They just want to get the software running,
or they're just using it to create courses.
Are there takeaways from community
building that you use in your open source
work, or vice versa,
that are crossover skills that you've
found have served you well?
So one thing is that
it takes a lot of work
to make a contribution easy.
You know, sort of the old school model
of running an open source project was,
well, it's on GitHub,
and you can click the make pull request button,
and, you know, that's all I have to do.
Yeah.
Right, and then someone makes a pull request,
and you ignore it for a long time and
you don't give them good feedback and you're not very friendly. You're being sort of a typical
engineer about it. And that makes contribution difficult because people don't feel welcome.
They feel confused. They're not sure what to do. They don't know how you feel about their work.
They're not sure when they're going to hear from you. Making contribution really successful takes a lot of people skills.
It's not a technical problem. I mean, there are technical challenges to it. Your code base might
be obscure or poorly documented or it's under-tested. But in order to get the contributions
to really flow, you have to have a lot of people skills up front to make sure that people are welcome,
people are supported, people know what kinds of things you'd like to see them work on.
They know how you feel about things. You're not being too stringent in your rules before you can
merge the pull request. And I've been learning this on both sides of it, both at work with Open edX and on
Coverage.py. Coverage.py, I mean,
to be perfectly honest, I'm probably a lot
more like that bad side description
that I just said. If you go look
at Coverage.py on GitHub, there are some
really old pull requests, and there are some bugs
that have been written a while ago that have no comments from me
yet. That's just one of the challenges
of being a single
maintainer in your spare time
of an open source project.
But at work, at edX, we've
been working a lot on
trying to improve our contribution process
just making sure that
the pathways are as smooth
as they can be.
One of the things that we've been doing at edX:
it's a large Python code base, and of course
it was written six years ago,
I mean, it was started six years ago.
So it's been running on Python 2
all that time.
And Python 2's end of life is in about
six months. So we've been
working on getting our code base to Python
3, and a lot of that work
is actually kind of
low-level work, meaning it
can be automated, and it just requires kind of someone
to push the button on the tool
and babysit the pull request to see what the tests do,
make sure it didn't do anything really crazy.
But there's nothing controversial about the change, for instance.
One of the difficulties with contribution to Open edX
is someone says, hey, I want to build a new feature.
Well, now you've got to have a big discussion.
Is this the right feature?
We have 30 million learners on edX.org,
so the feature that you think is going to work great
for your 100 students, how is that going to scale up?
It becomes a long discussion, and for good reasons.
The good thing about work to convert from Python 2 to 3
is we all know that that's exactly what we want.
We don't have to have a big discussion up front about what's the design,
what does it look like, what's the user experience,
all those questions that are really difficult.
So we've built a separate contribution process at edX
specifically for that kind of incremental, uncontroversial work.
And that's worked out really well to sort of build a separate lane
for those kinds of contributions.
So is there just like on a website somewhere
there's a big if condition,
like is this a controversial,
is this a feature that you want to add
or is this a small thing?
How do they actually funnel into those places?
Right, so we use JIRA for issue tracking
and so what we did is we automated the job of looking at all of our files and identifying
which ones had to be run through the Python futurized converter that sort of does the
mechanical Python 2 to Python 3 changes.
And our tool wrote a JIRA ticket for all of the files that need to be converted in kind
of bunches of 10 or something.
And so there's one Jira board that people can go to,
and if there's a ticket on that board, then it tells you exactly what to do,
and we know it's not going to be controversial.
And so you can take one of those tickets,
and you can make a pull request based on it and make a contribution.
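A minimal sketch of the batching idea Ned describes, for readers who want to picture it. This is not edX's actual tool: the file names, the ticket fields, and the `make_tickets` helper are all hypothetical, and no real Jira API is called. It only shows how a list of files could be split into groups of about ten, with one ticket description generated per group.

```python
from itertools import islice


def batches(items, size):
    """Yield successive lists holding at most `size` items each."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def make_tickets(paths, size=10):
    """Build one hypothetical ticket (a plain dict) per batch of files."""
    tickets = []
    for number, chunk in enumerate(batches(sorted(paths), size), start=1):
        tickets.append({
            "summary": f"Python 3 conversion: batch {number} ({len(chunk)} files)",
            "description": "Run the converter on:\n" + "\n".join(chunk),
        })
    return tickets


# Hypothetical file list standing in for the real code base.
files = [f"app/module_{i:02d}.py" for i in range(23)]
tickets = make_tickets(files)
print(len(tickets))  # 23 files in batches of 10 -> 3 tickets
```

Each ticket is small and mechanical, which is what makes this lane of work uncontroversial: a contributor can pick one up without a design discussion.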
What about big features?
Because you have an entity behind this, like you said, 30 million
learners on edX.org.
How does that
decision-making process go?
Is there a product team, ultimately?
Yeah.
And how is it communicated back to potential contributors?
Like, this is a good idea, but not for us?
Or this is a terrible idea?
How does that all work?
This is one of the things that makes Open edX
as an open source software project very different from other potential models,
other projects that we might try to be like.
And that is that edX as an organization pays roughly 100 engineers
to work on the software all day, every day, and runs a business based on that software.
The software is deployed live to production at least once a day, sometimes more.
So if a pull request gets merged and it brings the site down, people are going to get mad.
So we have to be very concerned with exactly what goes into the contributions.
So you asked about product decisions.
We have a product organization, of course.
I mean, edX, although all of our software, almost all of our software is open source,
if you just walked around the hallways here,
it looks like any software business that has a website that it's running.
There's product people that talk about what the feature should look like and the engineers
take their directions from there and they've got
tickets of what to work on and the
DevOps team is making sure the deployments
are going well and all of that stuff.
So when someone suggests a
change, it can become a
big discussion and it can
be hard for them to get our attention
because we're all
heads down making sure edX.org
is doing what it's supposed to do for our business. And that is a big asymmetry and an
unusual characteristic of Open edX. Honestly, fighting that is one
of the big persistent challenges for the Open edX team here: figuring out how to try to bridge that asymmetry,
to make the borders around edX as porous as possible,
to give a voice to the community,
to find ways for them to get done what they need to get done
with or in spite of edX people.
You know, it's really, again, a people challenge.
There's plenty of technical challenges in the Open edX code base.
It's big and old.
There's tech debt there. It's complicated.
But it's the people challenges
that really are the limiting factor in the contributions.
Has edX been open from the start?
Not quite the start.
We actually open sourced on June 1st, 2013.
So it's been quite a long time.
We've been open sourced for six years.
I've been saying it started six years ago.
I guess at this point,
it was about seven and a half years ago
that the first commit went into GitHub.
Time flies.
Yeah, time flies, exactly.
So pretty early on.
And you've been there since the beginning?
I've been here since October of 2012.
So yeah.
And when I came in the door, the plan was we're going to open source.
Okay, that was a nice question.
We have to get around to it.
Yeah.
Because edX was spun out from MIT.
So we've got a culture behind us of sharing, and the whole point was to open up institutions of higher education, to help get
their teaching out onto the internet. And we're a non-profit. Technically, edX Incorporated
is a non-profit. So sort of from the ground up, it's been built as an open source kind of organization.
Well, that probably serves it well, because if it wasn't,
and then there was debate internally,
and then maybe it was open sourced in haste or in anger,
buy-in is an important thing.
So that's why I was trying to drill down on how long it has been open, and if it was at least planned from the start,
that seems like a recipe for success,
more so than the other way around
where some organizations will open source
for reasons like they read in a magazine
that they should do it
and help them get business or whatever.
It's what all the cool kids do,
so we should do it.
No, no, we've got a strong culture
of that kind of sharing.
And that doesn't mean that everyone here
can easily recite an elevator pitch
about why we're open source.
I mean, in some ways,
having it as almost sort of background culture noise
in a way almost hurts the mission
because people aren't quite sure why.
It's just like, well, yeah, of course we're open source.
But okay, so what does success look like
for the open source part of edX?
Are we measured on how many sites are running
or how many contributions we get
or how many people are chatting in Slack every day?
Like, what is the actual success metric?
So it's a very interesting, to me,
it's a very interesting open source experiment
to be doing open source inside what is otherwise
a classic business on a website kind of software
organization.
So what are your metrics?
What do you gauge as success for Open edX, you personally?
Right now, we are looking to maximize contributions.
And for good reason.
If we can get contributions into the code base, then that can feel tangible to the people
who are maybe at the farther end of the open source
is, of course, a good thing spectrum.
So if there are people who are like,
well, I'm running a business here.
Why do we bother with this?
Well, we got this feature because we're open source.
And if we can point to those kinds of things,
then it's a very clear win, right?
We don't have to get into subtle moral arguments
or, you know, try to be altruistic, right?
We can be capitalists about it.
It's so interesting that an organization
with 100 engineers would be trying to optimize
for contributions because you would think, like,
we got this covered over here. I've got 100 engineers on this, you know.
Well, so that's interesting, but
if there are 100 more engineers out in the community... And some of them are
very good engineers. You know, I always make that joke about the chemistry PhD student, but
there are, as I said, a couple of dozen firms who are filled with software engineers
whose business is running our software for their own profit.
And they make good contributions.
So we want to make sure that they continue making those contributions.
Hence the efforts at making your contribution flow
and onboarding better, right?
Exactly, exactly.
Well, let's turn our focus to coverage.py,
because unlike edX, which is
100 engineers, this is basically one engineer, and that engineer is you.
That's right, that's right.
Tell me when it started, how long you've been maintaining coverage.py, and maybe how many
people are using it, that kind of stuff.
So this is the part of the story where I start
spitting out numbers and people's eyes get really wide. So I've
been, first of all, just to
set the record straight, I didn't write Coverage.py.
I did not start the project.
I picked up,
I was a user of the project in
2004 and
it wasn't doing a thing that
I wanted it to do and
I tried to contact the author,
Gareth Rees, and for whatever reason I couldn't
get in touch with him. So I made the change to coverage.py and I put it up there, and he seemed
okay with that. I've been maintaining it ever since. So to answer your question, I've been maintaining
it for almost 14 years. No, almost 15 years, 14 and a half years, I've been maintaining coverage.py. So anyone out there using a project and thinking, hey,
I could just make one small tweak to it, watch out. You might be the maintainer for the next 15 years.
That's kind of the beauty of open source though, right? Like somebody else is interested and then
they can just take the ball and run with it. It's beautiful.
Absolutely, yeah. So I've been maintaining it for a long time.
It's used by a lot of people.
So in the Python world,
it is pretty much the only game in town
for coverage measurement.
In fact, many people don't realize this,
but there is a coverage measurement tool
in the Python standard library
that many people have never heard of
because they use Coverage.py.
Wow, that's got to feel good.
Yeah, it's very good.
I mean, I love the fact that I can make a thing and a lot of people get benefit from it, right?
That's sort of the original motivation for getting into this, right?
That's sort of the lone engineer working on open source.
That's their motivation.
They don't think they're going to get rich.
They don't even necessarily think they're going to get famous, although that seems cool. They just think, hey, I wrote some code,
and then this guy, I didn't even know he's using my code, and he seems to like it. That's cool.
Yeah. So coverage.py, you ask how many people are using it. So GitHub now has a "used by"
thing on the top of...
I love that.
Yeah. I'm trying to type...
I got the number for you if you want me to fill you in, because I'm staring at it.
Tell me what it is.
68,760. These are repositories,
I assume, that are dependent upon coverage.py somehow, or maybe just include it in there.
I'm not sure exactly the way they count it, but they seem to know how to examine the Python requirements
or setup.py files to decide that.
So yeah, 68,000.
The funny thing about my GitHub metrics
is that that number is up at like 68,000,
but I only have 700 stars.
So I think I might be setting a record
for the ratio of used by to stars.
That's interesting.
I don't know that that's a thing to be
proud of.
And the reason it's got so few stars
is because I only moved on to
GitHub about a year ago. So I
was on Bitbucket for years and years.
And I moved to GitHub and there's just a
dynamic about, you know, if you're not
making a splash on Hacker News,
you're not going to get stars.
And so I just kind of quietly moved over.
All my users don't know it's even moved
because they're getting it from PyPI.
So I don't have that many stars.
Listen up, Python people out there listening to the Changelog.
Coverage.py is on GitHub now.
You need to head over there and star it, helping it out,
because he's got 14 years of effort into this thing.
It needs more stars.
Get me some stars.
Yeah, so coverage.py is run
like the classic
guy in his bedroom
open source project.
I work on it in the evenings
or in the mornings over my cereal bowl
on the weekends.
It's been very gratifying to see the use and to see it become the de facto that it
is and to know that people are getting benefit out of it.
The downside, of course, is it can be hard to keep up with people's desires for it.
I don't seem to get much drama in it.
A lot of open source maintainers seem to find that when their project
becomes popular, it also becomes a magnet for drama. And I'm not sure
why I haven't gotten that kind of infamy on
coverage.py. But people ask for things, and I think that does seem
like a good idea, but honestly, it's going to be two years before I get to that. And that's not a good
feeling. And like I said, there are issues that are languishing there and pull requests
that, like, seem fine maybe. I don't even know. I don't have time to kind of look into it and think
about it.
So you do have 58 contributors over the years, at least in the Git history. Maybe
there's more, you know, way back in the day when it was on some other version control.
But are many of those,
like you still say it's like one person
coding over your cereal bowl,
are there other major contributors?
Are there any, maybe they're not even major,
but they're in the issues?
Or has it really just been casual contributions
over the years?
So most of them are casual,
but there have been some things that
stand out. So for instance, way back in the history, coverage.py only had a text-based
report on your terminal. And the beginnings of the HTML report were contributed by George Song,
who just by coincidence years later worked here at edX with me for a year or two.
So that's a small world kind of story.
But so he contributed that.
Recently, I've been working on the 5.0 alpha series of Coverage.py. The big new
feature is going to be, and this is a long-requested feature, so I'm glad to finally
be able to get to it:
Instead of just telling you which lines of your product code were covered,
it will tell you for each of those lines which tests covered that line
so that you can do analysis like, all right, I did a whole test run,
but now I just want to see these tests, what covered it.
Or I can see that only one test covered that line,
so I want to think about whether to do more tests that would get to there,
et cetera, et cetera.
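To make the "which tests covered this line" idea concrete, here is a toy sketch built on Python's standard `sys.settrace` hook. It is not how Coverage.py implements the feature; the `clamp` function, the two tests, and the `run_with_context` helper are invented for illustration. Lines are keyed by their offset from the `def` line so the result does not depend on where the code sits in a file.

```python
import sys
from collections import defaultdict


def clamp(value, low, high):
    if value < low:        # offset 1
        return low         # offset 2
    if value > high:       # offset 3
        return high        # offset 4
    return value           # offset 5


def test_low():
    assert clamp(-5, 0, 10) == 0


def test_inside():
    assert clamp(5, 0, 10) == 5


# Maps a line offset inside clamp() to the names of the tests that ran it.
who_tests_what = defaultdict(set)


def run_with_context(test):
    """Run one test, recording which lines of clamp() it executes."""
    target = clamp.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not target:
            return None  # do not trace lines in other functions
        if event == "line":
            offset = frame.f_lineno - target.co_firstlineno
            who_tests_what[offset].add(test.__name__)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        test()
    finally:
        sys.settrace(previous)


for test in (test_low, test_inside):
    run_with_context(test)

for offset in sorted(who_tests_what):
    print(offset, sorted(who_tests_what[offset]))
```

In this toy run, offset 2 is reached by only one test and offset 4 by none, which is the kind of question the per-test data is meant to answer.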
So that feature has been a long time coming,
and Stephan Richter and his coworkers at Shoobx
have made some significant contributions this year to that.
He added the HTML changes,
some of the fixes for the SQLite code that's in there.
So they made a lot of contributions, which I'm really grateful for.
And a year and a half ago, a guy I didn't know named Loïc Dachary from France, he wrote to me
and said basically that his way of working in open source is, he picks a project and he commits
to it for like three months. And he's, like, fully embedded in that project for three months.
And then he moves on. And I didn't know what to make of that, but sure
enough, suddenly he was commenting on all my open issues and triaging them and trying to reproduce
them and trying to fix them, and there were just dozens of contributions from him all over the
project.
I love that.
Which was, yeah, it was amazing. And it was amazing not only because
people were getting responses and I was getting contributions,
but his energy just sort of helped me with my energy, right? Just having him doing things,
I was in there doing things too. So the lone maintainer, not only can you only do as much
as one person can do, but it can feel literally lonely. And having someone to bounce
things off of, or just see that they're making progress too, can really be energizing. So I was
really thankful to Loïc for that. And again, just by coincidence, now Loïc is doing work for
one of those companies that I said runs Open edX sites for profit. So I'm glad to have
him back in my circle.
That's such a cool thing.
A man with a plan.
He's like, I'm going to go out.
I'm going to do three months.
I'm going to really dive in and go all in for three months,
and I'm going to move on to the next person.
I mean, that's really cool.
That's exactly what happened.
I mean, at the end of the three months, I was like, no, don't go away.
But he said he was going to do it, and he did it.
And I was really glad for that
time. And maybe it wasn't three months, I'm forgetting the exact timeframe. But there was
that period where Loïc was all over everything. And I was really thankful for it.
Well, he sounds like he might be a future guest, because he probably has stories
from all sorts of projects that he's gone into and helped out.
And until I'd heard from him, I had never encountered anyone who worked that way.
I haven't either.
Right, and my way of working,
so I make lots of tiny pull requests on things that I need fixed.
So I use a Vim plugin, and it doesn't work quite right,
and I'll go and make a fix, or I'll go to a library,
I'll make a fix.
So I will make little changes all over the place,
but I'm not just going to pick a project almost at random
(I don't know how he picked coverage.py) and commit to it. So that was a very interesting style of
working and something that I really liked. One of the other difficulties I find with being a
maintainer is just the context switching. So if I'm working on coverage.py with my cup of coffee
in the morning, well, then I go to work, I've
got to forget about all that coverage.py excitement that I might have had in the morning and,
you know, become excited about Open edX.
And, you know, I'll do that.
And during the workday, I'm embedded in those concerns and I'm thinking about what to do
and I'm plotting out where I can go from that. And then in the evening,
well, now I'm switching back to Coverage.py.
And then on the weekend,
it's sort of the same dynamic,
but with much bigger shifts.
And that kind of context switching can be difficult because,
not only because you might forget,
you know, lose the thread,
the technical thread of what you were thinking about,
but you get excited about,
like, the next thing I'm going to do,
oh, now I have to wait eight hours or whatever before I can do that thing.
All right.
I hope you've been enjoying this conversation between Jerod and Ned, the first of many in our maintainer spotlight series.
Special thanks to Tidelift.
We're producing this podcast series in partnership with Tidelift because we both deeply care about supporting the maintainers of open source software. Our goal with this series is to dig deep into the life of an open source software maintainer, to learn what challenges they face, the highs and lows of
being a maintainer, how they financially support their projects, how they maintain balance between
life, day job, and open source, and also how they're supporting and encouraging contributions
and community. For the uninitiated on Tidelift, they're the first managed open source subscription
model that pays the maintainers of the exact open source projects you depend on while giving you the commercial support you've been looking for.
Tidelift's mission is simple, to support the open source software you depend on and pay the maintainers.
Learn more at Tidelift.com.
I have to ask for your opinion on code coverage
since we're here and you write a code coverage tool.
And I'm seeing that you have 90% code coverage
on coverage.py.
Sounds kind of ironic, right?
Yeah.
Why isn't it 100?
Yeah, you're not a 100% kind of guy.
Well, it's not that.
It's that.
Well, I don't know if that's the question you wanted to ask.
I have a couple of questions.
That's one of them.
Yeah, go ahead.
The trick, the problem here is that there is a significant amount of code in coverage.py
which runs inside the Python trace function, which is code that cannot itself be covered
because, or can't be measured
because you are inside
the measurement
and Python is not set
up for it to measure its own
measuring function. And so
there's a lot of code there that
cannot easily be seen by
Coverage.py.
It's like a doctor operating on himself.
Yeah, exactly.
Yeah, something like that.
So that's where that 10% comes from.
I mean, there's a couple of percent
that are probably just me not pushing quite hard enough
on the lever to get the percentage up,
but the bulk of that 10% is because of that problem.
And honestly, I've thought about tricky ways to get at it,
but I also recognize that it's probably not worth it.
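The blind spot Ned describes can be seen with a few lines of standard-library Python. This is only an illustrative sketch, not Coverage.py's code: `tracer`, `helper`, and `workload` are made-up names. The trace function runs on every event, yet its own lines never show up among the observed functions, because CPython suspends tracing while a trace function is executing.

```python
import sys

seen = set()  # names of functions whose lines were observed


def helper():
    return 1 + 1


def tracer(frame, event, arg):
    # This body runs for every traced event, but it never produces events
    # of its own: tracing is switched off while the tracer itself runs.
    if event == "line":
        seen.add(frame.f_code.co_name)
    return tracer


def workload():
    return helper() + 1


previous = sys.gettrace()
sys.settrace(tracer)
try:
    workload()
finally:
    sys.settrace(previous)

print(sorted(seen))           # the traced functions appear here
print("tracer" in seen)       # the measuring function does not
```

A coverage tool whose measuring code lives inside such a function has the same problem at scale: those lines execute constantly but cannot be reported as covered by the same mechanism.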
So do you feel pressured to go to 100%
because you build a code coverage tool or do you believe
in that level of coverage as a practice?
I do believe in that level
of coverage as a practice.
I have
myself personally been in a situation
where I had a file
that had only one line that wasn't covered,
and I looked at that line and I thought, well,
that's fine. There's no need to test that weird case.
But okay, let's go ahead.
And I write the test and there's a bug in that line.
And so I have found it to be useful to get to 100% coverage.
I know it can be very difficult and it means dealing with weird edge cases
and maybe contorting a bit to get at those edge cases.
The other thing about 100% coverage is, in a way,
once you get there, then you're really out of luck
because the coverage tool can...
Well, the coverage tool can no longer tell you things about your code,
and there's probably still plenty of things you don't know about your code.
For instance, code coverage tools can't tell you whether you covered the full range of data that you have to cover in your function,
only whether you covered the full range of code in your function. And there's probably tons
of edge cases in your data that are missing from your tests, even when your function is 100% covered.
There's lots of downfalls to believing in 100% coverage.
Gotcha.
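To make the code-versus-data distinction concrete, here is a small hypothetical function (average is invented for this example) whose single test executes every line, so a line-coverage report would show 100%, while an untested input still breaks it.

```python
def average(values):
    """Return the arithmetic mean of a sequence of numbers."""
    total = sum(values)
    return total / len(values)


# This single check runs every line of average(), so a line-coverage
# report for the function would read 100%.
assert average([2, 4, 6]) == 4

# An input the test never tried still breaks the function.
try:
    average([])
    empty_input_ok = True
except ZeroDivisionError:
    empty_input_ok = False

print("handles empty input:", empty_input_ok)
```

Coverage measures which lines ran, not which values flowed through them, so the empty-list case stays invisible to the report.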
So one question, I guess, about Python
community stuff, because you're in there
and you've been a part of it for a long time,
and I'm on the fringes of that, looking
in sometimes, talking with people
who use Python but not using it
on a day-to-day basis.
By the way, just to fully flesh
out how deeply embedded I am, I'm also
the organizer of the Boston Python Meetup.
Okay, so you're deep in the community.
Love it.
I'm deep in the community.
That's awesome.
A great community, by the way.
I love all the Pythonistas we talk to.
We always have a great time.
Is code coverage, is that 100% goal?
Do you find that to be a norm inside the Python community?
One thing I always think of is the docs,
that great documentation is like something
that Python needs to strive for.
And I love that about Pythonistas,
even though that term, I can't say it too many times,
I start to feel strange, but I'm hitting the ratio.
But what about code coverage?
Like is testing that important
or is it just for you, Ned?
No, I think Python has a pretty good track record
of testing as a good thing.
One of the things Python people will say when they're debating with static type language people is you don't need static type checking if you have good tests.
You could do a whole hour about getting into the details of that.
Yes.
But certainly because we don't have the types,
we can't find the types of errors that static typing at compile time can find you,
we do rely more on tests to find those kinds of problems.
And that's also shifting a bit because now we've got gradual typing in Python that can be checked by static
type checkers, you know, separate from the compilation phase. But that's still fairly
new to the community. So it'll be interesting to see how that seesaw tilts as gradual typing
becomes more and more used.
So we mentioned a couple times you've been doing this maintenance thing for 14 years on coverage.py.
Yeah.
Curious how you stay motivated.
I like the story about, was it Loic,
who comes in and kind of gives you
this spurt of motivation.
Yep.
But even on a technical level,
just like working on the same code
for such a long time.
I'm curious if you've had spits and spurts
or if you've just been slow and steady with the race.
How do you stay motivated all these years?
Well, so one thing, my personality,
I will stick with a thing for a very long time.
So I've been here at edX for six and a half years,
which is longer than probably everyone but five people here.
I've been in the Python world since 1999.
I'm about to celebrate my 35th wedding anniversary.
I pick things and I stick with them.
That's awesome.
Thank you.
So just by my personality, once I start a thing,
I probably am fine sticking with it.
And also, I enjoy the
polishing aspect of projects.
There's people who just want to
start new things and just be throwing out new things
all over the place. I
like being able to say, you know,
I really nailed that.
And if it took a while, that's okay.
But we're going to make it really, really good.
So I don't mind, sort of, that
I've been working on this project for 14 years.
The place where that bothers me is when there's a thing
that I still don't understand about my own code,
and once a year I'm revisiting the same thing,
and I feel like, why can't I internalize this finally
after all this time?
So there is that aspect to it.
So there's my personality.
But the other thing is hearing from people who use the project,
getting contributions,
knowing that it's helping people to improve their code in various ways.
Because I work in a Python world at work,
we use coverage at work, and so I see how it's being used there, and
that helps inform, you know, what I think is important to add to the tool. So it's
that kind of thing, seeing it actually get used and actually have some benefit, which again,
to go all the way back, that's why most people get started writing software in their spare time
and then giving it away.
You sort of can't explain that
in pure economic terms.
No, you can't.
It's about the sharing
and having the benefit
reflected back to you from others.
Now I'm going to ask you
just a series of maintainery questions.
And so you can just use whichever project makes the most sense or helps answer the question,
whether it's coverage.py or Open edX, whichever one you choose.
So I guess the first one, you may have already answered this, but I'll just ask explicitly
and see if this is true.
I'm going to ask you, what do you like the most about being an open source maintainer?
It sounds like maybe that feeling you get when somebody's using your thing, but I'm
wondering if that's truly the number one or if that's just one of the things you like.
What would you say if you're like, well, the reason that I do this or the thing I like
the most about this thing I do with my free time, what would it be?
That's a good question.
On the Coverage.py side, I really like being able to build a thing and have it do it well.
You know, it's sort of just a pure hacker feeling of it.
You know, you tell people, you like coverage, that's cool,
but what if it could tell you which tests covered each line?
The challenge.
Well, that's kind of magic.
How would you do that?
That would be amazing.
So it's cool to just, all right, let's think it through.
What would it take?
And how can we make all that happen?
And so I like the building aspect of it.
But the other thing, and I keep coming back to this, I also like the people aspect of it.
And I think as I get older and older, I find people more and more interesting, both in terms of what I get back from them and
also the challenges in working with them. And that's on the Open edX side. Honestly, I'm not
as technical in the Open edX code base as I was six and a half years ago when I started because
I've been doing a lot of community work. But we do an annual conference every year. And it's just
amazing to fly to that place and
see all those people from around the world who are there because of this project. And they're
people that I've known for years now, and I know what sites they're building and the kinds of
education they're doing. And it's just, it's a community of people that really appreciate what
I'm helping them get and what I'm helping them do. And that's really rewarding.
So flip that on the other side.
What do you like the least about being an open source maintainer?
So I don't like the feeling that I'm not doing a good job at it,
but I try,
I'm not,
I'm trying not to beat myself up.
Right.
I mean,
it's not like coverage.py has to do whatever I think it should do.
You know, it's sort of got a safe position as a popular project now,
but even if someone were to make a new project
and that were to become the one, okay, that would be okay.
So I try not to beat myself up about it.
One of the things I don't like about being an open source maintainer
is that people have gotten into open source for that sort of pure sharing idea.
And there's a lot of people getting value from open source projects who do not think that way for a variety of reasons.
And it can be easy to feel bad about that imbalance, but I'm trying to think more realistically about it, and at
sort of a deeper level, about why that imbalance is there and what could be done about it.
Do you have any, over the years, quote-unquote war stories, or any crazy things that have happened,
or bad things? You said you haven't had too much drama, which is nice.
No.
But anything that other maintainers might relate to or enjoy hearing about?
Well, I'll tell you the craziest thing that happened with coverage.py. I mean, of course there's
stories like, oh, there was that day that I released 4.3.1 and then also realized that it was broken, so
I had to release 4.3.2, but that fix was also broken. So there's
days like that. Everyone's been through that. But the craziest thing that happened with coverage.py,
so coverage.py has an HTML report, so it generates HTML pages. And for whatever reason, I was using
single quotes around the attributes in my HTML tags, just because it's visually less obtrusive
than the double quotes. And I got a bug report that said,
could I please change to double quotes
because I've got a tool here at work
which is copying the files around
and it needs to find the style sheets
and it can't find the style sheets
because it only finds style sheets
that have double quotes around the URL.
And I was like, who's writing tools that are parsing HTML
and doesn't know that both styles of quotes are okay?
So I was like, no way.
I am not changing for that.
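For context on that objection: HTML allows attribute values in either single or double quotes, and a real HTML parser treats the two the same way. A minimal sketch with Python's standard html.parser module follows; the StylesheetFinder class and the sample markup are invented for illustration and have nothing to do with the tool from the bug report.

```python
from html.parser import HTMLParser


class StylesheetFinder(HTMLParser):
    """Collect href values from <link rel="stylesheet"> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)  # the parser has already stripped the quotes
        if tag == "link" and attributes.get("rel") == "stylesheet":
            self.hrefs.append(attributes.get("href"))


def find_stylesheets(html_text):
    finder = StylesheetFinder()
    finder.feed(html_text)
    finder.close()
    return finder.hrefs


single = "<link rel='stylesheet' href='style.css'>"
double = '<link rel="stylesheet" href="style.css">'
print(find_stylesheets(single) == find_stylesheets(double))
```

A tool that searches for the literal text href=" instead of parsing the markup would miss the first form, which is presumably what happened here.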
But then I went to PyCon, and at PyCon,
there are sprints after the conference,
and I was there for a day of sprints.
And someone comes up to me and says,
hey, I'm looking for something easy to do. I say, well, there are the issues. And he pulled up
that issue and says, well, I can change all the double quotes to single quotes. I mean, it's the
other way around, single quotes to double quotes. And I thought, do I want to let him do that? Like, this is
just the dumbest change ever. But okay, he's going to do it, he'll feel good about it, whatever. And so
he made the change, and in the change log I wrote
the entry, I said: change the quotes to double quotes to capitulate to the
absurd request from, quote, software engineers who don't know that single quotes exist.
Love it.
So I got a little snarky in the change log. But the change was there and, you know, everyone's happy now.
That's awesome.
That's funny how we can go about such trivial things, like such small,
nitpicky, you know.
Yeah, I know. Why did I care? Like, okay, double quotes, what's it to me?
Right. It's because it's for such a wrong reason.
Yeah, exactly. It is the principle of the thing, not the style.
The principle of the thing. That's right.
Do you have any tips or tricks that you've learned over the years that make your life
easier as a maintainer, or maybe like TextExpander snippets or scripts you use or anything like that
you can share?
So I haven't, well, I do use GitHub pull requests, uh, issue templates. So
there, if you go to write an issue on
coverage.py, it'll offer you
either this is a bug report or a feature request
and then it prompts you for what to
fill in there. I'm not sure it's making
a huge difference in
the quality of the bug reports,
but it seems like a good first step.
GitHub's doing a lot of
good little things like that
that should make open source work better.
Again, from my point of view,
my main tip is to really think about the person
on the other side of that issue or pull request
and try to be good to that person,
whether that means using more words
when you tell them why you're not going to take their pull request
or answering them quickly, even if it's to say, thanks, but I can't get to this for two weeks,
which, again, I'm not doing that well myself, but I'm trying.
I feel like I've been saying the word people more than I've been saying code during this podcast,
and I think that's for a reason.
I think the whole point is people, ultimately,
so the more you can think about it as a people effort
than a code effort, I think the better off it'll go.
Absolutely.
Speaking of people, are there any people out there
that are maintainers, or they provide you tools
or services, that you admire or appreciate, you want to give them a shout-out, say thanks,
or maybe even point somebody towards a tool that you use
and helps you in your day-to-day maintenance?
Yeah, yeah, sure.
So one tool that I haven't been able to use on Coverage.py,
but I have used on other projects, is called Hypothesis.
And it's maintained by David MacIver.
I'm not sure that I'm pronouncing his name right.
And it is a property-based testing tool,
which takes a little getting used to,
but when you get to the point of knowing
how to make it work for your code,
it can do a really great job.
Instead of writing explicit tests of this is the input
and this is the output I expect,
you write code that expresses what properties you expect in the results.
And it tries to generate input test cases that fail those properties.
So is it kind of like a fuzzer?
It's kind of like a fuzzer.
It's a little bit more advanced than that.
So, for instance, you can say I need a list of integers at least 10 long as input to this function,
and it will start generating lists of integers,
and it will start doing things like the list is a million long
or the list is exactly 10 long,
but all of the numbers are bigger than 2 to 64 or whatever.
It just tries to find all those weird edge cases.
And then if it does find a failure,
it tries to walk back to a simpler case that still fails
to try to get at sort of exactly where that line is
between what works and what doesn't work.
So it's the same idea as a fuzzer,
which is put some intelligence into the randomization of the inputs
and then detect whether something failed.
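The generate-then-walk-back loop Ned describes can be sketched with nothing but the standard library. This is a toy illustration of the idea and not how Hypothesis is implemented. The deliberately buggy largest_magnitude function, the property in holds, and the helper names are all invented for the example.

```python
import random


def largest_magnitude(numbers):
    """Intended to return the largest absolute value. Bug: ignores the sign."""
    return max(numbers)


def holds(numbers):
    """Property: the result is at least as big as every element's magnitude."""
    result = largest_magnitude(numbers)
    return all(result >= abs(n) for n in numbers)


def smaller_versions(numbers, min_size):
    """Yield simpler candidates: shorter lists, then values nearer to zero."""
    if len(numbers) > min_size:
        for i in range(len(numbers)):
            yield numbers[:i] + numbers[i + 1:]
    for i, n in enumerate(numbers):
        if n != 0:
            yield numbers[:i] + [0] + numbers[i + 1:]
            half = n // 2 if n > 0 else -(-n // 2)  # halve toward zero
            yield numbers[:i] + [half] + numbers[i + 1:]


def shrink(numbers, min_size):
    """Walk back to a simpler input that still breaks the property."""
    current = numbers
    improved = True
    while improved:
        improved = False
        for candidate in smaller_versions(current, min_size):
            if not holds(candidate):
                current = candidate
                improved = True
                break
    return current


def find_counterexample(trials=200, min_size=10, seed=0):
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    for _ in range(trials):
        size = rng.randint(min_size, min_size + 10)
        numbers = [rng.randint(-1000, 1000) for _ in range(size)]
        if not holds(numbers):
            return shrink(numbers, min_size)
    return None


counterexample = find_counterexample()
print(counterexample)  # ten items: nine zeros and a single -1
```

With Hypothesis itself, the "list of integers at least 10 long" input would be declared with its given decorator and the strategy lists(integers(), min_size=10), and the library handles generation and shrinking far more cleverly than this loop does.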
That's cool.
Yeah, it's very cool.
It's very cool.
And I've used it on other projects to good effect.
I haven't been able to use it for coverage.py yet.
Now, if we could just hypothesize
on the actual code required to pass the test,
then my job here would be done.
No, no, you've still got to record podcasts.
Oh, that's true. Anybody else, maintainers you admire, appreciate? Maybe some sort of effort that you've
seen put together, a maintainer does this thing that I really liked and I stole it and I do it
as well, or anything like that?
So another name that's in my head, I've never met him, Daniel
Hahler, I think. I don't know how to pronounce his last
name. His GitHub handle is blueyed. He just seems to pop up on a lot of projects. He's been
helpful on coverage, not in quite as large a way as the other people I mentioned, but he's been
sort of a consistent presence. And I find when I go and look at other projects, I see his name
in their pull requests too.
So I think he deserves a shout out because he seems to be doing a good job at spreading his efforts around to a lot of projects and just improving things all over the place.
And Julian Berman is like that too.
Awesome.
I keep running into Julian.
I had dinner with him.
He was in Boston.
We got together and that was really cool.
But I've known him online as a faceless maintainer of code for a long time,
and it's good to see his name pop up in various places.
Isn't that fun when you know somebody online for years,
and you've never actually met them,
and then you finally get to meet them in the flesh?
It's always so interesting.
Yeah, well, the real trick is, do you call him by his real-world name
or by his online nick? Because...
You tell me. What do you do?
Well, it feels weird to call someone,
you know, Daniel, if you've only known him as blueyed, but you're not going to call him blueyed
when you're sitting across the table in a restaurant. So you just got to get used to that cognitive shift
between the online world and the real world.
Or just the social awkwardness of calling him blueyed
and dealing with the consequences.
Yeah, you got to hope he doesn't have a weird,
too weird in there.
Exactly.
Well, Ned, this has been lots of fun.
I love the two perspectives that you bring
with Coverage.py and with Open edX.
Any final words to maintainers out there
or the open source community writ large?
If you had a call to action
or anything you'd like to say
before we call it a show?
Yeah, keep up the good work.
Stay optimistic.
Don't let the bad stuff get you down,
whether that's people yelling at you
at your issues
or feeling like someone should be contributing when they're not contributing,
in whatever form that contribution might take.
Open source started from a really, really good impulse,
and it's those good impulses that's going to keep it going.
Awesome, Ned. What's the best way people can reach you online?
All right, so I'm on Twitter as nedbat.
It's the first three letters of my first and last names,
Ned Batchelder.
Coverage.py is on GitHub as nedbat slash coveragepy.
I have a blog that I've been running, again, for way too long
on nedbatchelder.com where I write about open source.
One of my recent pieces was about me getting over that feeling
that big corporations should be doing more to help open source,
or at least understanding more about that dynamic.
So you might want to read about that.
Those are good ways to stay in touch.
Awesome. Well, listeners, as you know, links in the show notes,
all the ways you can get in touch with Ned will be there, as well as links to all things discussed and to the people who shouted out. So hit up your show notes for those things. Ned, this has been a lot of fun. Thanks so much for coming on the show.
Thank you, Jared. It's been fun. tuning into this episode of the change log guess what we have comments on every single podcast episode head to changelog.com find this episode and you can discuss with the community huge thanks
to tidal lift for their support of our maintainer spotlight series and of course thanks to fastly
roll bar and leno for making everything we do possible our music is produced by the one and
only break master cylinder if you want to hear more episodes like this, subscribe to our master feed at changelog.com slash master
or go into your podcast app and search for Changelog Master.
You'll find it.
Subscribe, get all of our shows as well as some extras.
Only on the master feed.
It's one feed to rule them all.
Again, changelog.com slash master.
Thanks for tuning in.
We'll see you next week. Bye.