The Changelog: Software Development, Open Source - Best Practices Badge from Core Infrastructure Initiative (Interview)
Episode Date: August 12, 2016. David A. Wheeler, from the Core Infrastructure Initiative, joined the show to talk about the CII Best Practices Badge program.
Transcript
I'm David A. Wheeler, and you are listening to The Changelog.
Welcome back, everyone. This is The Changelog, and I'm your host, Adam Stacoviak. This
is episode 215, and today on the show, Jared and I are talking to David A. Wheeler. He's from the
Core Infrastructure Initiative, and specifically, we talked about the CII Best Practices Badge Program. We talked about what this program is, where it came from,
who thought of it, who developed it, why Heartbleed inspired it. We also talked about
why you should get certified and what certification means. We had two sponsors today on the show,
Linode and Toptal. First sponsor of the show is our friends at Linode, cloud server of choice here at Changelog. Head to linode.com/changelog, use our promo code CHANGELOG20 for a $20 credit. All you gotta do is pick a plan, pick a distro, and pick a location, and get an SSD server running in seconds. Plans start at just $10 a month. Head to linode.com/changelog. And now, on to the show.
All right, Jerod, we're here with David A. Wheeler. Now, the A in the middle there
is pretty important, because if you search for David Wheeler, what do you find?
Probably a whole bunch of folks.
A whole bunch of folks.
So I'm happy to talk to people as David.
It's just I use my middle initial so people can find me later.
Gotcha.
You know, like most good shows for us came from a ping,
and this is actually from David himself.
So give us a breakdown of what this ping was all about. Well, actually, I've been listening to the
changelog for some time. So, you know, I'm working on this project called the Best Practices
Badging Project for the CII. I'm sure we'll talk about that in a moment. And so, hey,
who would be interested in this? And oh, man, I bet a lot of people listening to the changelog
would want to hear about this.
So I contacted you guys and here we are.
You got lots of energy, David.
I like that, man.
You come on a show like this after pinging us, like, who better to tell my story with than the Changelog?
That's awesome.
Oh, hey, you're welcome.
I enjoy listening to you guys.
So thanks for having me on.
I love it, too.
In the pre-show, David mentioned that he listens to our podcast,
but he listens to us at 2.4X, which is just crazy fast.
And so I'm having a hard time keeping up with you here, David.
I'm going to have to slow you down a little bit.
I don't know.
Maybe that's why he's got so much energy because he's a 2.4 listener.
He wants to get a word in.
That's kind of funny.
It's more because there's a lot of good stuff out there
and it's kind of a fire hose to keep up.
So I do what I can.
I try to speed up my reading,
speed up my listening so I can keep up.
Well, from our perspective, we saw that ping
and we'll tell you, David,
not very often do people ping us about themselves
and get on the show.
I think you're number two.
The first one, I believe, was Evan You with Vue.js.
He pinged us asking to be on, and his ping was quite impressive.
And so we said, yeah, come on on.
Lots of times people are pinging about other projects.
In your case, I wasn't quite sure about the CII best practices badge as a topic until
I started hearing our friend Daniel Stenberg
blog about it on a repeated basis and trying to get Curl certified.
So that certified this topic to me.
And we're really glad to have you on the show.
Oh, thanks.
And he was very, very instrumental.
He was one of the early people who reviewed it and provided a whole lot of comments.
So and Curl does have the badge.
I'm sure we'll talk more in a sec.
So we really appreciate that.
Very cool.
Well, as you know, as a listener, before we dive into the topics,
we like to dive into the history of our guests a little bit
and just hear where you're coming from.
So if you had an origin story to tell, could you share it with us?
Sure. Although my origin story is a little odd in some points. My first computer was actually
in my middle school. They had an ancient PDP-8, which had six kibibytes total memory, and a literal front panel.
But as soon as I got to use that thing, I was hooked.
I loved computers and have loved them ever since.
A little later on, I ended up with an Apple II and just studied the heck out of it.
I think there was probably a time when I could have rebuilt it from transistors because I thought it was incredibly amazing that you could do this thing called programming.
And ever since, I've been working very much on anything relating to computers.
How can we make software?
How can we make software better?
I've been doing really, what, since the 90s, a lot of work relating specifically to either
to open source software.
And I've been doing security even before that.
So I'm really, really interested in open source software.
I'm really interested in security.
And that kind of brings, I think, you up to date kind of where my interests are.
One of the interesting bits I pulled out of your bio, which caught my eye, was this line about the scepter of Goth.
So you said that in the 80s, you were the maintainer of the scepter of Goth, which is the first commercial multiplayer role playing game in the US and perhaps in the entire world.
Can you unpack that for us and give us some details?
Sure.
You're pulling out the Wayback Machine there. Yeah, way, way back. I don't know if you remember Adventure and Zork and that sort of thing, but they were these text-based games. Not back and forth to each other, but you know, type in: get thing, drop thing, kill troll. And basically some folks had the idea of, well, this would be cool as a multiplayer game. This is back when modems were just becoming available and that sort of thing. And so I was part of a company which basically ran, as a franchise, this Scepter of Goth thing, where basically people would log in with their modems and they could work with other people, choose various characters. If you're familiar with D&D, you've got the right idea. So you know, choose your character; over time you get experience, you level up. A whole lot of mechanisms that now look kind of normal and everyday, and lots of systems use them, but it was kind of challenging the first time.
Gee, nobody's ever done a multiplayer
role-playing game before.
How do you do this?
We had all sorts of weird
problems making that work,
but
it was a lot of fun.
Wow, actually, interesting.
Just reading it, I assumed it's mid-80s
and it's a multiplayer.
So it must have been a card game or one of those book games where it reads to you, you know, reads out the scenarios.
But this is actually this is actually a digital online experience for people.
Oh, absolutely. Go talk to the bartender and that sort of thing.
Now, the computers that we had at the time were really pathetic. We were running these off an 8086 at 4.77 megahertz, running 16 users.
So we had to do a lot of tricks.
But one of the big tricks, which we didn't always tell people, was that some people, the dungeon masters, could quietly show up and pretend to be some of the characters.
And all of a sudden, that bartender had amazing AI.
And it only took a few times for people to be very impressed by that.
But even when they weren't on, you could run around, go get the monsters,
go try to find the sharkies to buy your stuff, who were always getting shut down and moving somewhere else. And, you know, people had a lot of fun with that. People still contact me about that. You know, of course, it's long since obsolete, people moved on, but it was a kind of cool experience at the time.
I'm curious about the maintainer side of that, you know? I mean, that's part of your origin story, at least. It's not what you said;
it's something Jerod brought up in your bio, but it's in there for a reason, right?
What's it like to maintain that?
I remember one night where I was drinking, I think it was my second Big Gulp of Jolt, at three in the morning, trying to fix some nasty, horrible bug.
It was all in C, heavily optimized with all sorts of special optimizations to try to coax out of these really slow, low-memory machines
the kind of performance necessary.
And I remember spending days optimizing one particular command, the follow command, but it was the way that you managed to get groups together.
So it was really important to get right.
And the number of edge cases were ridiculous.
Things like, well, wait a minute, you may be following someone who's invisible, so other people can't see him.
And then a monster may be following you.
Just all sorts of crazy edge cases that you had to deal with.
I think that Jolt is still in your system right now, because three Big Gulp Jolts... those things were actually outlawed, I believe, in certain states,
because they had so much caffeine.
Oh, I remember my hands vibrating on the keyboard after one of those.
So that's a fun story. Thanks.
Thanks. So, you know, we had fun. There are other stories we could tell sometime, but after that, I said, you know, maybe I'd do something else.
Yeah.
So, up to modern day, where you're involved with this Best Practices Badge program, which
is a Core Infrastructure Initiative project, which is part of the Linux Foundation.
So a couple of, you know, moving objects here we'd like to kind of define and nail down,
and especially your relationship with these organizations,
whether you're gainfully employed, or running it, or it's a volunteer thing.
Give us the rundown on kind of the players involved in this situation.
Sure. Okay. So let me pull out the baseball cards here so we can identify who's who.
I think a lot of your listeners were probably already familiar with the Linux Foundation.
They employ somebody called Linus Torvalds.
You may have heard of him, some other folks.
They actually run a whole lot of projects, including the Linux kernel as far as funding it and so on.
But a couple of years ago, two years ago, Heartbleed came out.
Big vulnerability in OpenSSL
and
the Linux Foundation looked and said, wow,
that's,
A, that's a problem, but B, that's a
symptom of a bigger problem. What can
we do to fix that problem?
So they established this thing called the
Core Infrastructure Initiative
and yeah, it's not a very clear name, but the idea behind it is actually very clear.
It's basically: can we identify the software that's important, and find ways to improve things, so that the software that we all depend on is more secure, in better shape, and that sort of thing.
And they've actually funded some specific projects.
For example, they've actually put money into OpenSSL.
They've put money into several other projects,
basically trying to identify some of the key software,
really important.
We need to make sure that stuff is more solid than it is
in cases where there's an issue.
But one thing that immediately became clear is there is no way they can fund everything.
So they're also interested in some projects that can kind of raise all boats, as it were.
And that's where this best practices badge comes in.
The idea is, hey, there are clearly some practices that are generally accepted as these are things you should be doing,
but it doesn't mean everybody's actually doing them.
So can we come up with a list of here's the criteria that's generally accepted?
This is what open source software projects should do.
And then if you actually are doing them, you get a little badge.
And that, of course, helps users figure out, hey, is my project okay or not?
Is this project I'm planning to depend on?
Is it in decent shape or kind of doing the basics or not?
And it also helps projects because most people involved in projects, they want to do the right stuff.
But it's not always obvious when you're trying to fix some specific bug. Oh, wait a minute.
You've got a basic problem here with your project.
So it kind of helps them also figure out, well, what are the basics that need doing?
When we talk about the core infrastructure initiative, you said that they raise funds.
And if you look at the homepage, there's quite a big, quite a list of tech companies
that are providing funding for this.
Amazon, Google, Facebook, all the big players, IBM, Microsoft,
so on and so forth. And a lot of industry leading security experts as well. You have Bruce Schneier,
Dan Kaminsky, Alan Cox, and so on and so forth. So are these people paid as advisors? Are they
like employees of this? I just like to know the kind of how these things fit together.
Yeah.
Some of the stuff, there's probably other people who might be able to better answer
all of that than I would.
Because I focus more on the badging and census work.
But that said, basically what happened is that each of those companies that you mentioned
have kicked in funds either to the Linux Foundation or, you know, if you're looking at the CII list,
that's all the folks who have kicked in money specifically for the Core Infrastructure Initiative.
Hey, we all depend on these programs, and we want to make sure that they're healthy.
If we put money together, then, you know, by collaborating together with the funding,
we can help make those projects more healthy, better, and so on, and everybody else reaps the benefits.
As far as who gets paid and so on... I mean, let's see, the Linux Foundation itself
is something called a 501(c)(6), which is basically a nonprofit
industry consortium. So some of the people that you mentioned there, they're actually employees
of other companies, and they basically provide some time;
they're funded by those companies to help oversee, to make sure the Linux Foundation
and the CII are on track, doing the best
they can. And they certainly do direct, but they also
provide great advice, because lots of those
people, of course, have been around in the industry for a long time, and help make sure that we get good
things going. But then the Linux Foundation and CII themselves have employees. I'm actually not an
employee of the Linux Foundation. If you want to follow the money, I'm actually an employee
of a different nonprofit company contracted to the Linux Foundation.
The Linux Foundation actually reached out to me because I've been interested in open source security really since the early 90s.
So I've been doing this stuff for a long time.
And when they said, hey, who do we know that's really interested and has done a lot of background work on open source and security?
Apparently, I was on their short list.
They reached out and said, oh, man, this would be awesome.
Let's make it happen.
And so that's what we did.
So what do you do from day to day then?
Kind of give us a lay of the land on what's a typical day of open source security role for you.
Oh, it kind of depends on what I'm doing on the particular projects I'm working on.
But since we're talking about it,
let me talk about the two projects
for the CII I've been working on.
One was the census project.
One of the first things they need to do
is figure out, well, wait a minute,
who should we send money to?
And so I actually whipped up relatively quickly
an effort to quantitatively evaluate projects.
I'm sure you can appreciate that that's really hard to do.
And it's particularly hard with all the different programming languages that exist out there.
So we basically identified some metrics that we could use that would at least give us some indications of risk, scored a whole bunch of projects, and helped identify some of the projects that
were really important and had real concerns.
And I don't think it'll be terribly surprising that some of the ones we identified include
things like OpenSSL, the Network Time Protocol daemon, and various other things that everybody
depends on.
They're really important, but for various reasons there were concerns about the projects.
So, and in fact, since that time, they took that data.
Now, that wasn't supposed to just give them the answer.
It was supposed to help them make a decision,
and that's what they did.
And that seemed to have been really, really helpful,
and I'm hoping to, I'm planning to go back
and do a round two of that soon.
For the badging project, again, it's the, hey, we've got this idea, can we identify
the criteria and help projects determine if they meet them or not, if so, they get a badge.
And so I'm actually the project lead.
So I'm basically the guy who grabbed information from all over,
talked to everybody, came up with draft criteria. I should note that the badging project and actually
the census project themselves are both open source software projects. So, you know, we've got mailing
lists, we've got a GitHub location, all the code's available, MIT-licensed in both cases.
So basically, you know, we came up with drafts and then begged for feedback from lots of
folks.
And you mentioned curl. One thing: I probably should give shout-outs to more people than
I can easily remember,
so my apologies to all I missed.
But, you know, Greg KH from the Linux kernel and lots of other folks actually provided
some really great feedback.
And I should also quickly note Karl Fogel, who wrote the book Producing Open Source Software.
A lot of the criteria actually derived from his book, and he actually reviewed them and gave us some great feedback.
So basically, we did our best to gather the information and then put it out to the community to review, comment on, critique and improve.
Yeah, very cool. We'll cut you off there for a split second, David, for a break.
We do actually have a little bit of a cross-reference there. You mentioned Karl Fogel.
Fogel, he is our very first guest on our brand-new show, which just launched with Nadia Eghbal and Mikeal Rogers.
It's called Request for Commits
or RFC for short. He's actually the guest on the first two episodes of that show, all about
sustainability, community, the business side of open source, all those cool things. So if you're
listening and you find that interesting, check out rfc.fm and we'll take a quick break. David,
on the other side, I do have a quick question for you since you've been around for so long in the open source community.
You have this term FLOSS, and then we have this other term, OSS.
And it seems like depending on how long people have been around, they may use one, they may use the other.
I'd like to get your take on that.
But we're going to take a quick break, and we'll talk about that as well as all the details on the badging program after this.
This message is for all those team leaders out there that are looking to easily add new developers and new designers to their team, easily scale up when you need to. You got a big push coming,
you got a new area of the product you've got to go into, and you've got more need than you
thought you could. You've got to go through all this hassle of putting a job out there and hiring people to find the right people. Well,
that's a bunch of hard stuff that you don't need to even deal with. Call upon my friends at Toptal.
That's T-O-P-T-A-L.com. The cool thing about Toptal is you can hire the top 3% of freelance
software developers and designers. And what that means is they've got
a rigorous screening process to identify the best. So when you call upon them to help you
place the right kind of people into your team, then you know you're calling upon the best people
out there. Once again, go to toptal.com. That's T-O-P-T-A-L.com. Or if you'd like a personal
introduction, email me, adam@changelog.com.
And now back to the show.
All right.
We are back with David A. Wheeler, talking about all things best practices, open source, free/libre open source software.
David, I have a question for you about these terms.
So it kind of seems like it depends on when you come into the open source ecosystem.
It's kind of a smell or a tell, depending on if you say FLOSS, or if you say OSS, or open source.
And I've noticed on your bio and stuff you have the FLOSS.
And you also mentioned that you've been into it
since the 90s.
So curious your take on that
and the change in terms and the acronyms
and does it really matter?
And what's it all mean, David?
What's it all mean?
Well, I'm not sure I can completely answer
what's it all mean.
Come on.
Yeah, so I mean, this really comes back to a split a long time back.
The term free software was used for quite some time to describe software where, you know, you can use it, you can modify it.
You can use it for any purpose.
You can modify it.
You can redistribute it, modified or unmodified, without constraints like royalties.
We're talking about the Free Software Foundation in that case,
right? Richard Stallman and the Free Software Foundation.
That's right. The Free Software Foundation,
for example, I think they were established
in 1984, I believe, and that's
the terminology that they used.
This kind of software existed even before that.
It didn't always have a name. The problem with
the phrase free software is that what they meant
was freedom, but nobody hears
freedom; they assume
free software means no price. So a number of people many years ago declared, hey, why don't
we use this term? Why don't we create a new term? And they came up with open source software. So
that was a second term. The problem is that most people, I think, use the phrase open source
software, but not everybody. There's a number of folks who insist on using free software, and typically they're emphasizing a difference in motivation, where they're emphasizing the purpose of making the software for ethical reasons, not just engineering reasons.
That's not always true.
Some people use the phrase open source software with an ethical undertone to it. Sometimes people use free software in its original meaning.
Most of the times when I'm writing or talking, I'm not usually emphasizing the motivations.
I'm emphasizing the rights that you have to use the software. So then you have the problem of,
well, one group calls it X, open source software, and another calls it free software. And in fact, there's another group that wanted to call it
libre software. So gee, what do I do? So when I started writing about this stuff, in order to
try to cover everybody, I started to use the phrase free/libre/open source software, which is
FLOSS. There doesn't seem to be any way to make absolutely
everybody happy anyway. But that's been my attempt at trying to cover... you know, hey,
for a lot of this stuff, it doesn't matter what your motivations are. In fact, people's motivations
differ depending on projects, and even over time. So that phrase is often used to try to cover a waterfront of reasons
and motivations.
That's it.
I'm happy to use the phrase open source software.
I'm happy to use the phrase free/libre open source software.
In all cases, we're talking about the same set of rights, though people have different
motivations for why they do it.
Yeah.
Seems like we have kind of a standard case of naming things is hard, plus operator overloading.
And, you know, it's similar to the problems we run into when we're actually writing the software: when we're talking about things, names mean different things to different people at different times.
And so you have kind of this slew of different words and terms that we use.
Right.
And of course, it's perfectly OK for people to say, here's our particular motivation and
here's why.
That's fine.
But it makes life complicated when you're trying to talk about something when
the motivations behind it aren't what you're focusing on at the moment.
Maybe for some other things, but a lot of times what I'm writing about is not the motivations,
but the results. In fact, there's even a recent movement to introduce a new nomenclature, because the open source versus free software split is so troubled.
And it reminds me of that xkcd, the one about... is it protocols?
Where there are too many protocols?
Yes, we're going to add one other one, yes.
Yeah, we're going to introduce one more.
There are now 15 protocols
and there were 14 or something like that.
Exactly, like let's create one to rule them all
and then you just added another one to the mix.
Yep, I remember the cartoon, yes.
I'm glad we asked this question because I don't know
if it requires the deep dive, but
we're camping on it for a second at least.
It always feels to me like maybe FLOSS, and I really hate to say it like this, but it just kind of feels old hat and uncool, whereas open source software, OSS, feels like new hat, cool, the new hotness kind of thing.
And it almost is a divide of, like, old school open source and new school open source. And that's to me as, you know, somebody who's just an observer, obviously, over all these years.
It seems like that's the term that divides.
Yeah.
I don't think that's a good way to look at it, actually,
because frankly FLOSS and open source software are actually from the same
time period.
And I do want to respect the folks who have a very, very specific agenda. I don't necessarily
agree with it, but I don't want to downgrade it, or make it sound as if I'm disrespecting their
goals. Not at all.
That's not what I'm trying to do at all by saying that. You know, I'm not saying that's the truth; I'm saying that seems like an observation of how it's perceived.
Right.
And I think one challenge is that the phrase free is... I mean, I actually
complained to Richard Stallman back in the '80s, you know, it's a stupid word, because everyone
knows what free means: it means no price. And he insisted on it anyway.
And I think all the confusion that came later was caused by that. You know, I think you noted earlier, naming's hard.
Right.
Totally agree.
But it's also important because you only have so many words.
And you've got to try to do the best you can to make things clear,
and I don't think that word free has actually helped. I think it's actually impeded communication.
Yeah, that's unfortunate. At the same time, you know, other words are also problematic. Like open...
open is another word, right? Especially when we get into products,
and Android is open, you know, iOS is closed, and it's like, what does that mean? And what's open to you and what's open to me is different.
So yeah, these are the things that, you know, we need to be talking about and coming to,
as much as we can, you know, where we can understand what each other means, and not just
arguing about the words, but trying to overcome that subjectiveness.
So it's interesting, for sure.
And I think Adam's point, perhaps cast a little bit
differently, is not that people who say FLOSS are old school and lame, or... I don't know what you said
exactly, Adam.
I didn't say lame.
Okay, I threw that one in there.
So it's like people who are coming to open source software more recently, they don't necessarily have the history.
The term FLOSS is less used nowadays, not because it's lame or old, but I think because even the scenario that you just laid out for us
may have never been laid out, even on the Changelog. And so that's just a lack of
historical knowledge of the terms and their use.
Well, it's a general problem. I mean, people
aren't aware of the history of a lot of this stuff.
And I actually am interested in history, both computer history and general history.
And there are reasons things happen in a certain way.
And I think it's often helpful to know why that is, because frankly, it makes it a lot easier to understand the now when you understand where it came from.
And there's that old phrase, you don't know history, you're doomed to repeat it.
My gosh, how many times have people repeated the same stupid mistakes in computers because they aren't aware that, yeah, that's been done before; here's why that didn't work.
The podcast Curious Minds recently did a two-part series about the Free Software Foundation, the Open Source Initiative.
He actually got Richard Stallman to interview,
and at first I was mad, because we had never had him on,
but then I listened to it and heard all the stuff he had to go through
to get him to agree.
And I realized we're never going to have him on the show.
It's a really good two-part series.
It talks all about the words and the divide,
the ideologies and all those things.
So I would submit that to the listeners. Check out Curious Minds and just look for the open source
ones. But let's get back to you, David, and your initiative with the best practices badge. You've
given the overview. Let's get back to that initial reason behind the core infrastructure initiative,
the heartbleed, the security problems,
and just reiterate for us and tell us maybe exactly the genesis story of the Best Practices badge program.
Well, probably as good a place as any is the Heartbleed vulnerability, which is a vulnerability in OpenSSL.
And initially, one of the big problems was that it was a really bad vulnerability in OpenSSL, and OpenSSL is used all over.
One of the side problems is a lot of people weren't even realizing that they had OpenSSL in there.
So it was a big effort.
Once the vulnerability was found, there was this huge effort to figure out, oh, wait a minute, what do I update?
Well, everything.
Oh, what?
So it really was a bad vulnerability, big impact.
But then people started drilling in a little further. Lots of programs, even projects that are well run, with very, very conscientious people, lots of people, lots of resources,
lots of everything, doing everything right,
can still make a mistake.
But the problem with OpenSSL was that it wasn't just this one vulnerability.
It suddenly cast a light on:
this is really an important project,
but in fact, there's only two people
working on it part-time.
There are a lot of things they aren't doing
that they really should be doing.
And it's actually surprising that more vulnerabilities hadn't slipped out.
And when people started to investigate further: this is a problem. And in fact,
you can look around and find other programs that don't do things where, if you ask them, well, gee, shouldn't you do that?
Well, yeah, I haven't gotten around to that yet.
So basically, that's kind of one of the genesis of this badge program is, hey, what can we do to kind of raise the boats for lots of projects and identifying those?
We came up with a number of different criteria.
Let's see... there are actually 66 criteria, basically, after looking at what people
do.
And it turns out that the OpenSSL folks weren't doing about a third of them.
You know, kind of the basic stuff that you're supposed to be doing, they weren't doing.
And that led to, well, frankly, a lot of problems.
Give us a for instance, like give us like top five.
What are some things, easy ones?
Yeah, you know what?
Let me... I'll tell you what. Instead of just that, what I can do is: if I go to the bestpractices.coreinfrastructure.org site,
that's basically the web application that has the badging and so on.
And hey, go there and get yourself a badge.
If you do go to the projects page and look up OpenSSL, you'll find there's actually two entries for it.
There's current OpenSSL, and I'm happy to say that they actually have a badge now.
But we went back and said, hey, what was the status of them?
And one of the members of the OpenSSL team went back and tried to fill in,
what were they not doing?
And so basically, they didn't have very clear information
on how to contribute to the software on the site.
They didn't have information on what to contribute, what are the requirements for contributions.
They weren't putting out the intermediate
forms to the public for people to review before it
became the official version. They didn't have an officially
published, here is how you report vulnerabilities.
They had a general bug report, but it wasn't immediately obvious if that's how you were
supposed to send in vulnerability reports or not.
The, you know, they didn't in general add new tests when they added new functionality.
You'd think that would be a, hey, I added a new crypto algorithm.
Let's make sure that we add, you know, tests for that.
Well, sometimes. Not so much. Not so much.
And, you know, they weren't enabling compiler warning flags and other things, basically using lots of tools to find problems before it got out the door. And that's just a few of the problems.
And that's a, you know, sure, even a well-run project,
you can make a mistake that gets out.
But these are the kinds of things where, you know, no.
When you, you not only should have a test suite,
but you should be improving it as you add new functionality.
You should tell people how to report vulnerabilities and that sort of thing.
So that's kind of the, and that's kind of the level of, you know, you get the question, gee, what do these criteria look like?
It's those kinds of things.
You know, it's the, you know, where's your repo?
Where's your project page?
Which could be the same.
You know, do you have version control?
Do you have an issue tracker?
And you'd be shocked to know that there are open source projects that people depend on that don't have these kinds of basics that help them, you know, keep their project under control and help them focus on the problems and fix things before the users have to suffer with them.
Maybe I should take a moment and talk about the badge itself.
Sure.
I can't remember if we've given that context or not,
but I want to go the whole show without saying,
we're talking about an actual badge
in terms of something that you'd put on your GitHub readme
or on your project website,
like a little image that's just like,
what is it, shields.io or badges.io?
That's what we're talking about.
That's exactly right.
I think Shields merged with Badges, but yeah, I think you're right, Seth.
Yeah, and ours is actually from Shields.io in terms of the look and so on.
And basically, if you stick that on, it'll say CII Best Practices and either "in progress" with the percentage, or if you get 100 percent, you get "passing."
Congratulations.
Right. And we have a scoring mechanism. There's a couple of
criteria which are not actually, strictly speaking, required. You know, there's "shoulds," which you can choose not to do, but you've got to justify it, and "suggested," where, okay, you don't have to do it, but we want you to think about it and make sure you tell us whether or not you actually do those.
And then basically we score out all your musts
and shoulds and suggesteds.
And if you get 100%, congratulations, you've got a badge.
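For reference, embedding the badge in a GitHub README is a single line of Markdown; the numeric project ID below is a placeholder you'd replace with your own project's ID from the badge site:

```markdown
[![CII Best Practices](https://bestpractices.coreinfrastructure.org/projects/1/badge)](https://bestpractices.coreinfrastructure.org/projects/1)
```

The image updates automatically as the project's entry changes, so the percentage shown always reflects the current form.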
Have you found the badge?
So developers love badges.
I remember CoderWall was very popular.
People like to have little things that show off what they've done.
But have you found that to be a significant enough motivation to have people submitting their projects to get a badge?
Yes.
People have made changes to their projects in order to get a badge.
And what's sad is that some of the things that people are doing are, well, it's not sad.
It's kind of the point.
Are the kinds of things you think about, well, wait a minute, shouldn't you have done that already?
Well, yes, but here it is now.
People have created test suites.
People have found ways to implement HTTPS. People have reported, hey, here's how
to report vulnerabilities. By the way, I should note that these are some of the more common
problems in getting a badge. They may tell you how to
report bugs, but it's not obvious when you want to report a vulnerability if you're supposed
to use the same process or not. It's fine if you want to use
the same process, just make it really clear.
That's particularly a problem if you're on GitHub,
which a lot of folks are.
Currently, there's no way to have a private bug report
to a public repo.
Something that's got sensitive information involved,
like a vulnerability you might want to actually pass
to the maintainer in a secret manner
so that it doesn't get public and they can actually fix it before it becomes a deeper vulnerability.
What are the workarounds for that? Have a different bug tracker for security vulnerability?
There's actually a thousand ways to do it. We don't even care which way.
That's what the badge is for.
Well, I know, but it seems like there should be one true way. It should be easier than...
One more protocol.
Yeah.
I'm not a big fan of the one true way.
I think there are things where we can step back and ask what is actually required.
There are projects, by the way, Cygwin has this interesting policy where they forbid private discussions of any kind.
That includes vulnerabilities.
If you're going to report something, it must be public.
And there it is.
Now, I'm not necessarily a fan of that,
but they are sure clear about it.
And, okay.
Never going to get a best practices badge.
Well, you know what? For the best practices badge, we don't say it has to be private.
You just have to have a way to report it.
Oh.
But the problem, I think
most people, and I think reasonably so, certainly
on all my projects, I prefer that you, you know, send stuff to me privately, but then
you got to tell people how to do that.
And that's okay.
Um, I think one true way was probably the wrong way to phrase it.
What I meant to say is there, there, there should be a happy path.
Like there should be an easy path for everybody to get there.
And it seems like saying, well, there's 10,000 different ways you can get this done, is like, well, you know, which way should I do it?
Well, now, of course, I guess I'm going to reveal the grand secret, which isn't actually secret at all, which is, in the process of doing this, we've actually been contacting other repos. For example, GitHub doesn't support private reports on public repos.
So we've actually contacted them and specifically asked them
and put on their own issue tracker,
hey, could you please add this functionality?
Obviously, GitHub isn't required to do that,
but we're making sure that they're aware of that.
Savannah, which is where a lot
of the GNU-based
projects, if you pull up
a Linux distro, it's going to have a lot
of projects that are actually run
off Savannah. Savannah
has HTTPS on the project
pages, but not on the repos.
What?
So we're talking with them right now. So I agree with you that, for some of this stuff, we clearly need to fix it once, but we're actually already pursuing that as well, in the process of making these badges, the badge and the criteria. We were actually alerted to that by projects themselves: hey, we want to do this. So there are ways
to do it now, and we're working with projects to make things
better for everybody. But you can
figure out a way to get people to send you private messages. Here's an email.
Here's a little website just for this purpose. Whatever.
But there are ways you can do it now.
And we're working on making it better for everybody.
Just to answer my own question a little bit.
And we've covered a few of these.
But the Linux kernel, as you said, is badged up.
Node.js, curl, as we mentioned in the intro, GitLab, and of course, OpenSSL.
One thing that was interesting, you can see the entire list of projects on the website,
which we'll have linked up in the show notes.
There are 182 projects in the index,
but only 22 of those are passing.
So that tells me it either takes a while or it's hard,
or maybe you can tell us why so many
are still not quite there yet.
Well, the criteria we created, as I said,
we talked to a lot of folks. And so it's basically what do most projects do for each one? But here's
the challenge. If you identify a bunch of criteria that each of which most projects do, and then you
say, hey, you've got to do them all. Oftentimes what people find is they do almost all of them except for these
few.
And we've actually been tracking those.
I actually recently posted an analysis of the projects,
which are close,
but not quite making it to figure out what were the ones that were most
missed.
So let me list those that are kind of the most missed ones.
And, you know,
we take feedback and we basically plan to
update the criteria every year. And we actually have an intent to add higher levels in the future.
But right now we're just, you know, that basic, what we call the passing level. The most missed
in terms of the criteria were tests are added, which is basically as you add functionality, you add tests.
The second most missed was HTTPS.
And there were some others about crypto certs,
vulnerability reporting, basically tell us how
to report vulnerabilities.
And for the "tests are added" criterion,
could we reduce it?
We could, but should we really,
if you add new functionality,
shouldn't you be adding new tests?
We're not mandating 100% coverage or anything.
We just, you know, keep improving.
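As a sketch of what "tests are added" means in practice, here's a hypothetical Ruby change (the helper function and its behavior are invented for illustration): the new functionality lands together with a Minitest case covering it.

```ruby
require "minitest/autorun"

# New functionality being added in this (hypothetical) change:
# a tiny helper that capitalizes each space-separated word.
def titlecase(s)
  s.split(" ").map(&:capitalize).join(" ")
end

# The test added in the same change, per the "tests are added" criterion.
class TestTitlecase < Minitest::Test
  def test_basic
    assert_equal "Open Source", titlecase("open source")
  end

  def test_empty_string
    assert_equal "", titlecase("")
  end
end
```

The point isn't the coverage number; it's the habit of never shipping a new function without at least one test exercising it.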
For HTTPS, if folks want HTTPS,
go to Let's Encrypt.
They'll give you a cert for free.
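As an operational sketch, assuming a Debian-style server running Nginx (the package names, plugin, and domain below are placeholders for your own setup), getting a free Let's Encrypt certificate with the standard certbot client looks roughly like:

```shell
# Install certbot and its nginx plugin (package names vary by distro).
sudo apt-get install certbot python3-certbot-nginx

# Obtain and install a certificate; example.com is a placeholder domain.
sudo certbot --nginx -d example.com

# Let's Encrypt certs are short-lived, so confirm auto-renewal works.
sudo certbot renew --dry-run
```

Other web servers have equivalent plugins, and hosted platforms often handle the certificate for you.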
For vulnerability reporting, that's one sentence on a readme on your project page.
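That one sentence can literally be a short section in the README; the wording and the email address below are invented placeholders:

```markdown
## Reporting vulnerabilities

Please do not use the public issue tracker for security problems.
Instead, email security@example.org with the details; we aim to
acknowledge reports within a few days.
```

Whatever channel you pick, the criterion only asks that it be stated clearly somewhere reporters will look.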
And another one that's common is knowing just the basics about secure design and common errors. And that's really just knowing kind of the basic principles, such as from Saltzer and Schroeder, and knowing things like the OWASP Top 10, what they are, how to counter them.
They aren't hard to do, but they're the sort of thing where it's, oh, wait, you know.
And while we could back off on those things, no one is actually suggesting that we should.
It's just that there's a number of projects that don't meet those sorts of things.
And so what we're trying to do is help, instead of changing the criteria, we're trying to help the projects actually meet them, which is going to be good for everybody.
All right, best to take a quick break here real quick.
When we come back, we have a couple of questions, I guess, mainly around not so much just the motivations, but also maybe how they maintain.
You know, for example, if someone gets to 100 percent and they get the badge and all that good stuff and they prove they're certified, they follow these best practices.
I'm curious on the the follow up, the, you know, kind of the checks and balances over the years, how that and how that works out.
But we'll take this break and we come back to that.
Every Saturday morning,
we ship an email called Change Law Weekly.
It's our editorialized take on what happened this week
in open source and software development.
It's not generated by a machine.
There's no algorithms involved.
It's me, it's Jared,
hand curating this email,
keeping up to date with the latest headlines,
links, videos, projects, and repos.
And to get this awesome email in your inbox
every single week,
head to changelog.com slash weekly and subscribe.
So we're back with David A. Wheeler
and we're talking about this great badge initiative
to show off the best
practices of core infrastructure out there. Obviously, as an industry, we were blindsided
by Heartbleed, so something had to be done, and this is obviously a great initiative.
But David, in the side chat here we had before in our break, I'm kind of curious about
the motivations, right? So if someone's trying to do this
with the best practices,
they're not just trying to get a badge.
What's the motivation for this?
What are they trying to show off
that they actually follow the best practices?
Can you help break that down a little bit more clearly?
Sure.
I think really with the badge is all about
helping projects identify
what are those key best practices
that are going to help them be
successful, produce good results, and also for the potential users of that software,
help them figure out, gee, which projects are doing well versus the ones that are kind of in
trouble or kind of dodgy. So really, I would strongly encourage any open source software
project, go to the bestpractices.coreinfrastructure.org
site, click on Get a Badge, and get a badge for your open source project. This is for everyone,
big or small, everyone. Because the whole point is, hey,
there's just somebody probably depends on the software that your project develops.
So for somebody, you're important.
And in fact, for most projects, people often have no idea how many other people really depend on that software.
And I think most, almost everybody, you're involved in an open source project.
You're not there to produce crap.
You're there to try to make something that's useful and helpful. And you want to do that by doing the right things. Well, what are
those right things? What are the things that are more likely to make your project successful?
So I've emphasized the badge because it's a convenient shorthand, but really the goal isn't
get a badge. The goal is do things that are going to help you succeed. And by talking to
everybody that we can, getting that experience from projects that are both old and new,
people who've studied it, what are the things that are really kind of those fundamental things?
And then from there, we've distilled it down to a set of, these are the things you should be doing. And, you know, by getting a badge, not only are you showing your users, hey, we're on track, but in fact you're helping make your project better for the future.
And on the consumer side of that, the benefit is, once these badges, you know, get to be in such numbers that you come to expect them, at least on certain projects,
you can use that as an indicator of,
if not the quality of the project,
because there's other things.
You can look at the code coverage,
what's the code scoring system
where they have A pluses and B minuses and whatnot.
Well, there's several of them.
The code coverage with statement or branch coverage
is a pretty common measure.
Yeah, exactly.
There's all sorts of measures.
That's what these badges are kind of for,
is to give a high-level view of what's going on,
all the dependencies out of date.
Ways that you can proxy an idea about quality.
I think with this one,
maybe you can't tell the quality of the project,
but you can at least tell how serious they are: they're not just trying to apply best practices, they want to have a badge that shows off that they're going after best practices.
So hopefully we get to a point where it's something that we can look at and say, OK, this is a plus one for this project.
Right. And a lot of these criteria really are about, you know, helping you go in the right direction.
So, you know, the challenge and I'm a big fan of static analyzers and code coverage and so on.
They can only tell you the current state.
That's not a problem.
That's a good thing.
But it doesn't mean, for example, I actually talked to an open source project.
Oh, I'm not sure I should pull them out like this.
But it's widely used.
There's no issue tracking.
They have no idea what problems they need to work with, because they have to keep hunting through the old emails on the mailing list, trying to figure out what to do now. That's just sad. You know, we've got version control systems, we've got issue trackers, we've got all these tools. Please go use them. Your life will be better for it. And it's that sort of thing where, yes, just doing these certain things doesn't make your code into magic gold.
But you can at least avoid some of the crazy problems and help set it on a good path.
Heartbleed, and obviously creating a list of best practices and providing a way for open source
projects to self-initiate and go and volunteer to follow them, get a badge and then get to passing
or at least their progress level into passing. I'm curious about the lifespan of this. Is there
a committee? Is there people who are keeping these projects in check? How do you know once
they've achieved a certain passing level,
they actually maintain the best practices?
How does that work?
Well,
there's actually several things.
And probably before going into that,
probably should talk a little bit about how you get a badge in the first
place.
Cause I think that'll,
that'll help kind of level set stuff.
So to get a badge,
basically somebody from the project clicks on get a badge and they
fill in basically a form. Now it's basically one of these little, you know, click on, you know,
did I meet, did I not meet? For some of them you need to, for almost all of them you can justify
and some you have to justify. As much as we can, we want to automate this. We've already automated a number of things, because there's actually a lot of these questions you can't answer automatically in many cases. But particularly if you're on GitHub, we can tell certain things right away. We can look at the repo and fill in some information. For some things, it's just, you know, gee, our AI isn't quite up to the task of handling that yet. But with the things that we have automated now, we can quickly determine, hey, you didn't meet this, and kind of go from there.
The current plan is to do an update with these criteria every year.
We're targeting January.
So basically, each year we'll have some adjustments,
and that means that you'll need to go back
and at least update your entry every year.
And in that process, that will force the automatic evaluation.
Technically, I guess the badge is good for a year,
but you don't have to redo the work.
It's not a lot of work.
It only takes about an hour on average
to get the information.
And that's assumed that your project's already in order.
Obviously, if you're not doing any testing,
the problem isn't that,
gee, I have to click on unmet for testing.
The problem is you need tests.
That one's a little harder to go get.
Yeah, you know what?
We actually, people are kind of surprised with it.
We actually don't mandate a coverage level.
I was just going to ask that.
Yeah.
Instead, we'll be focused on, do you have a test framework, and are you working on getting better?
For some projects, actually, test coverage is kind of tricky.
Greg KH and I had a lot of interesting conversations. The Linux kernel folks, for example, have the interesting problem that they have a lot of drivers for which practically nobody has the hardware. So it's really, really hard to do coverage testing with real hardware when you don't have the real hardware. And yeah, you can do simulations, but that tests the simulators, not the hardware. So instead, we're really focused on, you know, have you started, and are you going in the right direction? Now, as I mentioned earlier, we do actually hope to have higher levels of badges, and then I think we are almost certainly going to have a coverage requirement. But to be honest, we were kind of a little surprised. There's so many projects which aren't really doing the fundamentals that right now we're much more focused on getting people to the point where you have tests, you have a test framework, you're adding tests as you add functionality, you have HTTPS, you know about designing secure software, you know about the common kinds of mistakes
that people make and how to counter them.
And right now, that's kind of been our focus
until people are kind of more set.
And then we can work on those higher levels.
Two thoughts there.
The first one is the easiest way to get 100% code coverage
is just to have a single test that asserts true.
So obviously you can fake that pretty easily.
Thing two is, when you get to a certain level of coverage, it's harder to determine what is and what is not a best practice, because there's way more dissension in what different developers think is appropriate test coverage. So I assume that as you get to that phase, you'll have a lot more argumentation or debate
about what should and should not be required.
Yeah, I mean, it also depends on how critical their software is.
I mean, I'm certainly not against code coverage.
The badge app itself, as I mentioned, it's open source.
It's a project.
Oh, I probably should mention right off hand.
Yes, we get our own badge.
You got to get your own badge, right?
You don't got your own badge.
It's like, go on home.
Just go home.
Exactly.
I think it would be ridiculously hypocritical if we don't get our own badge.
But we do a whole lot of practices.
For example, we have code coverage.
I think last I checked, it was something like 98%. We use CircleCI to check our builds and so on, run the automated tests.
So I'm certainly not opposed to coverage testing.
But I think you're right.
I think 80% or 90% for most software is, you know, kind of what you should be at least getting.
Whether or not you run all the way to 100%, I think that there's nothing terribly wrong with getting 100%.
But oftentimes those last tests aren't necessarily worth the effort because the code coverage can hide some other problems.
Just because you ran the test doesn't mean that you're really in a good situation.
There are other kinds of testing
that you should do.
Give us a quick, while we're here,
let's talk about the application itself.
Give us a quick technical breakdown
of what it is, how it works,
maybe the technologies involved
and who helped you build it.
It's basically fill in,
the basic notion is fill in a form.
So it's a web application, you fill in a form.
And so we really were trying to make things as simple and as straightforward as possible.
Now, it's not quite fill in a form because once you give us the project URL and repo URL,
we actually go out and try to fill in some of the form automatically.
And even to the point where if we can determine with high probability that in fact something
isn't true, we don't care what the human says, it's not true. So in some cases we will override what the human claims. It's fundamentally a form entry, so we're using Ruby on Rails, which is bog standard and a pretty darn common way to implement an application with forms and databases
and that sort of thing. Probably one of the more interesting, we have
of course automated tests. As I mentioned, it's
near 100% coverage.
In fact, one of the more interesting things from my vantage point is of course
we want to make sure that ours is secure.
So we actually have on the website, on our page, a little description about how we make sure that the thing is secure.
So, for example, we make sure we try not to store anything that's not public anyway.
You can't reveal what you don't have.
There is some slightly sensitive information. We do have email addresses of people, and if they're not using a GitHub login, we do have some passwords. But we hash the passwords, using iterated salted hashes, so even if you get our database, you don't get the actual straight-up password. We just try to apply various rational things. Ruby is, of course, memory-safe, so we don't have
those kinds of problems.
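As a sketch of what "iterated salted hashes" means, here's a minimal Ruby example using PBKDF2-HMAC-SHA256 from the standard OpenSSL library. This is not the badge app's actual code, and the iteration count and key length are illustrative parameters.

```ruby
require "openssl"
require "securerandom"

# Illustrative parameters: many iterations make each password guess
# expensive; a random per-user salt defeats precomputed tables.
ITERATIONS = 100_000
KEY_LENGTH = 32

def hash_password(password, salt = SecureRandom.random_bytes(16))
  digest = OpenSSL::PKCS5.pbkdf2_hmac(
    password, salt, ITERATIONS, KEY_LENGTH, OpenSSL::Digest.new("SHA256")
  )
  { salt: salt, digest: digest }
end

def password_matches?(password, stored)
  candidate = OpenSSL::PKCS5.pbkdf2_hmac(
    password, stored[:salt], ITERATIONS, KEY_LENGTH, OpenSSL::Digest.new("SHA256")
  )
  return false unless candidate.bytesize == stored[:digest].bytesize
  # Constant-time comparison to avoid timing side channels.
  candidate.bytes.zip(stored[:digest].bytes).map { |a, b| a ^ b }.sum.zero?
end
```

So a leaked database yields only salts and digests, and an attacker has to grind through all the iterations for every single guess.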
We apply the typical security
recommendations and various kinds of hardening things.
We use four different static analyzers.
Check the Ruby and the JavaScript also, because there's some JavaScript on it,
including Brakeman, which is a very nice little static analyzer
if you're doing Ruby on Rails.
And so basically, we're not really depending on any one thing.
We're actually using a suite of tools and approaches.
Any one of which helps and the combination together makes it much more likely that when we put something out, it's much more likely to just work.
Very good.
Well, any final thoughts or closing words on the best Practices badge program before we get into our closing questions?
I think the main thing that if you don't remember anything else from this conversation, what I would say is, please, if you're involved in an open source project, please pop over to bestpractices.coreinfrastructure.org. Go click on get your badge now and go get yourself a badge. It doesn't take
that long. It doesn't cost anything. And it basically will help you figure out, hey, is your
project in good shape? And if it's not in good shape, it'll help you identify exactly what needs
fixing. And then you can go work in fixing it. And once you've done that, you can get yourself
a badge.
As we talked about earlier, the badge is a nice shorthand,
but really the goal isn't the badge.
The goal is to get projects in good shape.
We don't want more heartbleeds.
Mistakes are going to happen,
but we want those mistakes to be unusual leakages after doing all the right things,
not, well, there were some basic things I should have been doing.
And that's really what we want: to get to the point where projects are in great shape.
They're ready to go.
They're firing on all cylinders. And that's what I'd love to see out of this.
And for those out there who are a little afraid of forms,
like I am,
I sometimes don't like to fill them out.
I like to peek behind them if I can, and when I can't, I just get a little scared.
Well, if you want to peek behind, the code for everything is on GitHub, so there is no secret there on what the form is. We actually have a separate page on GitHub with just the criteria, if you want to see kind of what you need to fill in.
But you know what?
A lot of people just, gee, I don't know what the criteria is.
I'll just click on get my badge now and get started.
And you don't have to do it instantly.
You can fill a little bit and say, oh, man, I don't have any tests.
That's the point I was trying to get to was being able to see the criteria.
And you have a great
doc in the GitHub repo that, you know, you can read. It's like a blog post, if you just want to know what it takes before you even fill out the form, or get started, you know, just kind of seeing behind the veil, I guess, so to say. So I wanted to plug your criteria.markdown file
because it's extensive.
It's got a lot of great information in there.
It's very exhaustive, and it's also obviously in Git,
so you can contribute back if there's a misspelling or a typo or whatever.
You can easily see this information.
Yeah, we take pull requests.
In fact, we've got an issue tracker.
We take pull requests.
People have proposed all sorts of, in fact, their criteria themselves.
You know, it's not just me.
We've gotten comments back from all sorts of folks.
And I think at this point, generally when people have issues, it's not that they think
the criteria are wrong.
It's that, oh, I'm not doing it. Well, that's actually, in a sense, something that's sad, of course, you don't have an issue tracker, what's wrong with you? But in a sense it's good, because it means that the criteria are doing their job. They're helping people identify those, you know, those basics that people are generally doing, but maybe you're not.
We may have covered it, but for those out there who are thinking,
this is great, I didn't know about this,
and they want to be involved in some way, shape, or form,
whether it's, you know, the obvious one
is that they're involved in a project,
submitting to get a badge,
but let's say they want to support this.
What are the best ways for the open source community
to step in and support you and support the Linux Foundation in this initiative?
I think the most obvious one really is if you're involved, as I mentioned earlier, if you're involved in a project, please go work and get the badge.
If there's a project you're depending on that you're not currently involved in, but they're missing some criteria, go help them.
I'm sure there are projects that you're depending on that could really need your help.
If they don't have a test suite, you know what?
Help them make one.
I didn't mention different projects have different problems.
Older projects often have the problems that they don't have test suites.
Newer projects, which think they're open source, often tend to not have a license, which means they're not open source at all.
And so, you know, help them identify and fix those problems.
I guess a third way would be with the project, with the whole badger project itself.
We would love to get feedback, improvements, suggestions. You know, I don't, we don't want
to change the criteria willy nilly because people, you know, spend time answering those,
but certainly if they need to be clarified, that's great. If they want
to actually change them, add new ones or delete ones, that's fine. Although we want to do that
much more slowly, particularly adding new criteria. We don't want to do that more than annually.
But that's what we can and we expect to. But we're going to need people's help because we
want to make sure that we have everybody's viewpoints, not just one person's.
So hopefully that'll give you at least a,
there is room for lots of people to contribute in a lot of different ways.
Yeah.
We'll definitely link up the criteria markdown file
because that's interesting to me to just be able to breeze that on your own
just to look at it.
And I like how you said to,
if you're not involved in
a project and you depend on a project that hasn't applied for a badge or doesn't have the badge or
doesn't have tests, then obviously step in or reach out to the maintainer and ask them how you
can help to spread the word about this initiative. So that makes a lot of sense to me. But one
question since, David, since you're a listener of the show, you may know that
when we tail off the show, we like to ask about somebody who's influenced you. And we often call
it the programming hero or just hero in general. And I'm kind of curious who might be your hero
because you're a child of the 80s. You were doing lots of cool stuff way back when, so you've got an expansive history of who may have influenced you over the years. So if you had to narrow it down to one, though, who might be your hero?
I've got several I could point to, but I guess, if I only am allowed to use one, I would point out Robert Dewar, who some people may not know. Unfortunately, he died not that long ago.
But he's done all sorts of cool things.
I mean, he was an academic.
He did a lot of advocacy for open source software.
He started an open source company, which is still thriving.
But the area that I remember him specifically as a compiler author, he wrote several interesting compilers.
Way back when, he actually wrote GNAT, the GCC Ada compiler.
And at the time he wrote it, there was this sort of set of, here is how you do this sort of compiler.
Gee, this sort of compilation work takes a while, so you've got to have all these complicated caches.
And after he talked to a lot of folks, he worked on a system
where he kind of blew away the conventional wisdom.
Instead of having this complicated caching system
that required really a whole lot of complicated error-prone code to keep straight,
he basically threw away all of that and instead worked
very carefully on a hand-optimized lexer.
It was a pain to make that hand-optimized lexer, but it was a little tiny
piece of the compiler. And by optimizing one little piece, he managed
to eliminate a huge raft of code, and the whole compiler was
much, much, much faster than anything that had been around before. So basically, by looking carefully at the problem, he figured out, oh, here's a much better way of doing the trade-offs than had been done before. And, you know,
he ended up with something that was tons faster, much smaller.
What's not to like?
Oh, it's more reliable, too.
Always going to love that.
But that's good stuff there.
So, David, it's absolutely been a pleasure to have you on the show.
I know that having a listener on the show is a bonus for sure.
And then having not only a listener on the show, but someone who shared a ping and shared their story with us on there.
Obviously, we track that quite well.
So listeners out there, if you're listening to this and you're thinking,
man, I love this show, I want to suggest a topic or maybe even suggest myself to come on,
go to github.com slash thechangelog slash ping.
There's issues there.
Submit one.
Look over some.
Help us out to say hello to people or give feedback on different ideas.
We love, obviously, we love that.
But, David, this core infrastructure initiative
is a great thing.
I'm glad that the Linux Foundation
and the foundation you work with are doing this,
and this is great work to be doing
for the open source community.
But that is it for this week.
Thanks, Dave, for coming on the show
and listeners for tuning in.
But let's call this one done and say goodbye.
Goodbye.
Bye. Thanks. We'll see you next time.