Embedded - 460: I Don’t Care What Your Math Says

Episode Date: September 28, 2023

Author, engineer, manager, and professor Dr. Greg Wilson joined Elecia to talk about teaching, science in computer science, ethics, and policy. The request for curriculum that started the conversation was the Cost of Change, part of It Will Never Work in Theory, which summarizes scientific literature about software development. Greg is the founder of Software Carpentry, a site that creates curriculum for teaching software concepts (including data and library science). It is a great site if you are teaching, trying to get someone else to teach, learning, or looking for guidance on how to do any of the above. Check out their reading list. Greg's site is The Third Bit, where you can find full copies of several of his books, including The Architecture of Open Source Applications, Teaching Tech Together, and most recently Software Design by Example.

Transcript
Starting point is 00:00:00 Welcome to Embedded. I am Elecia White, on my own this week. So my thoughts will be drifting around like dandelion seeds. But our guest this week is Greg Wilson. And we'll be talking about teaching and learning and technology, maybe. Greg, thank you for being on the show with us, with me, with us. Thank you very much for having me, whether you are one or many. Could you tell us about yourself as if we met at an Embedded Systems Conference?
Starting point is 00:00:42 Sure. I actually started my career as an electrical engineer back in the early 1980s, and I still have a scar on my right hand from picking up a soldering iron the wrong way around twice in one afternoon. And the second time I had my hand under the cold water tap, the lab tech came over and put his arm around my shoulders and actually did say, son, have you considered a career in software? I think it was self-interest on his part, because the very next semester, I would have had to do the course on power transmission lines with 50,000 volt transformers and so forth. And I honestly think he encouraged me to go into software to save himself the paperwork. And has that worked out for you?
Starting point is 00:01:32 I think it's too early to tell. It's only been 40 years. I've worked as a programmer at everything from startups to IBM and Hewlett-Packard and the U.S. Department of Energy. I've exited academia three times. I was a professor for three and a half years. I build software for fun, but mostly for my own use. I think it's worked out, but I still don't know if I've had a career in the traditional sense. And I'm still trying to figure out what theme, if any, binds together all of the things
Starting point is 00:02:19 I've been doing, other than the fact that apparently I really do like the sound of my own voice. That actually leads us to the first of the lightning round questions where I'll ask you short questions and I want short answers. And if I'm behaving myself, I won't ask for more detail until later. Are you ready? I am ready. Author, manager, engineer, professor, or teacher? I don't think you can be an engineer or a manager unless you're a teacher. That's still not an answer. All of the above, but the one I like most is the teaching. Reading or writing? Reading, because I don't actually enjoy writing i like having written which is a very different
Starting point is 00:03:07 yes yes oh yes articles or books um either or different audiences different purposes academia or industry industry favorite way to learn new things? Lectures, reading, trying? Working with somebody who is also trying to learn it at the same time. Complete one project or start a dozen? That's the response of someone who sits start a dozen. That's always the response. Yeah, I actually have a to-don't list. response of someone who's at start it doesn't that's always the response yeah um i actually have a to don't list one of the things dave dockendorf taught me in my second industry job was as well as having a list of things you're supposed to do you actually write down the list
Starting point is 00:03:59 of things that you would like to do that would be useful that would contribute to society and that you're not going to start because life is short and you've got too many things on the go. I actually have a written to-don't list, and I recommend everybody else get one too. I could see you putting some stuff on a to-don't list. Do you have a favorite fictional robot? That's an interesting question. None comes to mind all right uh it's funny because one of our guests uh answered the next one with an important thing you've already discussed but do you have a tip everyone should know other than the to don't list i mean mean, that's a good one. When you're being interviewed, always have a cup of tea or a glass of water or something handy.
Starting point is 00:04:50 And whenever you're asked a complicated question, pause and take a sip. Those three seconds might save your career. That's a good one. My wife would tell you that if I had learned to pause before answering questions, I would have been married several years sooner than I was. But we don't have time to go into that in the podcast. One of our past guests said, never catch a falling soldering iron. I wish I had known that 42 years ago.
Starting point is 00:05:18 Okay. So, I have this outline. I have all these questions for you. You've done amazing things, written several books, co-written several books, and you won an Influential Educator Award. I have a bunch of questions about that. But then you sent me this link about a project. It was sort of a request that somebody do this. Could you talk to me about the cost of change project? Sure. For about 11 or 12 years now, sporadically, I've been helping out with a project called It Will Never Work in Theory. The name comes from the saying that it will work in practice, but it will never work in theory. Where did that come from? I mean, it's hilarious. Engineers have been saying this to mathematicians for as long as there have been engineers and mathematicians, right? I don't care what your math says. I've actually built it and we're riding in it right now. So, fix your theory. And historically historically it's interesting that in almost all cases practice
Starting point is 00:06:28 precedes theory the the number of cases in which mathematicians and scientists come up with something or prove something that's then translated into practice is surprisingly small in most cases from radio to aircraft to, we find something that works and then we go and figure out after the fact why it works. But not wanting to get off track here, like most of your listeners, I learned software engineering as a craft by sitting beside other people who showed me what they did, or at least didn't stop me from watching, which is not quite the same thing. And I picked it up in bits and pieces in the way that stonemasons learned how to make blocks in Egyptian times or Roman times. And it wasn't until the late 90s that I discovered that there were actually people studying programs and programmers in empirical ways. Steve McConnell's book, Rapid Development, Bob Glass's Facts and Fallacies of Software Engineering, pointed me at a literature where people had actually gone in, watched programmers, and recorded what they did and seen what worked and what didn't.
Starting point is 00:07:52 Open source, and particularly things like SourceForge and later GitHub and Stack Overflow, led to an explosion in that work because so much more data suddenly became available. So by the early 2010s, we knew a whole lot about what works and what doesn't. For example, you'll find people in industry who have very strong opinions about whether strong typing and programming language is worthwhile or not. All right, how do you tell? Well, one experiment that was done was to go back and look at several hundred bugs in medium-sized JavaScript projects on GitHub and go back and see if you added the strong typing of TypeScript, how many of those bugs would it have caught? The answer is about 15%. All right. 15% is one in seven. That doesn't sound like a lot, but on the other hand, 15% is sales tax in Ontario. And if you told businesses
Starting point is 00:08:54 here they could stop paying sales tax, they would think you were a genius. What it does is give us a handle on the kind of scale that we can expect from adopting strong typing. And is it a definitive experiment? Absolutely not. Is it proof that we can go and find things out by applying the scientific method to software development? Absolutely. Catherine Hicks and her group at Pluralsight have been looking at what actually affects developer thriving, what actually makes developers productive and secure and happy, and what doesn't. And it turns out there's a lot of myths out there that are pretty easy to disprove.
Starting point is 00:09:40 The problem is most programmers have never done science. When I was teaching computer science at the University of Toronto, I did a bit of digging. At the time, for an undergraduate biology program to be certified, students had to spend an average of six hours a week in the lab over four years. They have to learn how to actually do science, set up the experiment, collect the data, do the analysis, and so forth. So by the end of their degree, depending on what path they've taken, they've done anywhere from 20 to 40 or even 50 real experiments so that they learn how this works, how to do the science. The average undergrad in computer science did one
Starting point is 00:10:26 experiment in four years, and that was if they took the human-computer interaction course. So they come out of undergrad not knowing what science looks like. Yeah, they did science courses in high school, but they've never actually set up and run an experiment, collected data, analyzed it. So they don't expect to have any of that when they go to industry. So they're not looking to the research to say, what do we actually know? And this leads us down into a lot of quagmires. You will find a lot of people in industry who swear by test-driven development. They think it makes them more productive and that it leads to better code. Every well-done study of TDD has found no effect.
Starting point is 00:11:14 It doesn't make a difference. But? The best one that I've seen was Davide Fuce and colleagues published in 2016. It was a replication study of something that had been done in simulabs. They're using professional developers. They're looking at it over the course of months. So all of the spurious arguments that programmers come up with, where they say, well, of course, if you're only looking at students for an afternoon, you're
Starting point is 00:11:38 not going to see a signal. None of those apply. And Davide is meticulous about his statistics. There's no signal there. One hypothesis, which has not yet been tested by a follow-up study, is that, yeah, there are differences in productivity. There are actually very significant differences in productivity. But what it seems to depend on is how quickly the programmer alternates between coding and testing, not the order in which they do those two things. And in the early 2000s, as Agile started to spread, people switched to a more rapid iteration, a more rapid cycle, 10 minutes of coding, 10 minutes of testing, rather than two days of coding, two days of testing, and at the same time adopted TDD, and then attributed whatever increase in productivity they felt they were seeing to the wrong thing.
Starting point is 00:12:41 So, I mean, I've always thought the benefit of TDD was more in the, how do I test this? How do I prove that it is correct? Right. More than in the I'm going to mechanically write a test that fails and then write code that makes it succeed. I'd be happy to believe that was true. I used to preach the gospel of TDD myself. The fun thing, and I use the word fun very loosely here, is that you and I are both smart people. The people we work with are smart people, and none of us are saying, you know, I shouldn't be making a claim like that without at least the same kind of evidence that I would
Starting point is 00:13:19 expect a pharmaceutical company to have if they're selling a new skin cream to treat poison ivy rash. We don't expect in our field that people will back up claims with evidence. We want anecdotes and we want a tall man with a deep voice and a confident manner on the stage at a TED Talk. And that is what we take as proof. So I think this is fixable. I think our field would start to advance more rapidly if we who practice were paying attention to the research, and if the researchers were paying more attention to what we already know and to the problems we have. Because I'm critiquing the practitioners right now. You got to understand, I'm equally harsh on the researchers. Most of what they study is completely irrelevant to practicing programmers.
Starting point is 00:14:18 Yes. So, yes. Absolutely. But we throw out the good with the bad. What about the studies that have been done, like the mythical man month? So that wasn't the study. Fred Brooks did not collect any data to back up his claim that adding more programmers to a project makes it late. It was plausible. It's simple to understand. But that doesn't mean it's right. And as my father pointed out to me in a slightly different context many years ago, you go back two generations and you will find a lot of people who could rather indulgently explain to you why women were simply not suited for positions of power because they're too temperamental.
Starting point is 00:15:03 Right? suited for positions of power because they're too temperamental right i like we're supposed to collect evidence and base our decisions on the evidence right not on what everybody knows or what everybody is repeating because they heard it from somebody else who also didn't have evidence i think we advance more rapidly and we can shed a lot of damaging misconceptions if we actually turn the scientific method on what we're doing. And so the proposal I sent to you was, let's get rid of that undergrad course where the students work in teams and pretend to be developing a product. Because I know from firsthand experience that it doesn't really make much of a difference to them.
Starting point is 00:15:49 They learn some tools. They might, for example, learn how to use a continuous integration tool. But a lot of them are picking that up on internships these days. They don't learn how to elicit requirements or how to run meetings or how to do the human side of software development because it's the blind leading the blind. The instructor can't be in all of their team meetings and show them how to actually do that side of their job. So let's take that course and put it away and instead give them the opportunity to go out and collect some data,
Starting point is 00:16:22 go and mine GitHub and look at, for example, are long methods in JavaScript and Python more or less likely to be buggy than short methods? Okay, in order to answer that question, you're going to have to build some sort of a statistical model. You're going to have to clear up the question itself. Are we asking more buggy per line, more buggy per statement? How are we going to relate bugs clear up the question itself. Are we asking more buggy per line, more buggy per statement? How are we going to relate bugs back to particular methods? All of a sudden, the students will have to come to grips with the fact that the answer you get depends on the precise question that you ask. But you can still ask meaningful questions and get useful answers. Have them go and collect some performance data
Starting point is 00:17:05 from the university website and see what kinds of patterns they're seeing there, so that when they go out into industry, they can start to apply those techniques to the problems they themselves are solving, but also engage in conversations with the researchers who might be able to answer other bigger questions for them. This is how other sciences have progressed. And I have made the suggestion for this course multiple times in multiple venues over multiple years and been met with either blank stares or the sort of embarrassed shuffling of the feet that one gets when one suggests that a child should eat more broccoli. So it's a dream.
Starting point is 00:17:53 I don't think the course would be very hard to put together. I think that programmers who've gone through that would be better positioned to deal with the kinds of systems and issues that are coming up today that go beyond simply writing code. But I don't have any evidence for that. I just have a strong personal belief. I wish I had the chance to try that experiment. And so the goal would be for the students to design an experiment related to software development. Absolutely. And collect and analyze the data and then figure outgrads with the resources they've is, have them do code reviews, and then collate those and see how well did they agree and where did they differ and why.
Starting point is 00:19:14 My experience, not just in my present job but previously, is that three different people will provide three very different code reviews, some of which are more useful than others. I'll provide three different code reviews depending on what you ask me to look for. Absolutely. Now, in the early 1980s, there was an experiment done, I want to say Johns Hopkins, but I'm probably wrong, where they took x-ray images, chest x-rays, and put them in front of different radiologists and said, do you see a tumor or not? And the answer was basically a coin toss. There was almost no consistency. The only thing that seemed to matter was which school the radiologist had
Starting point is 00:19:58 gone to, because different schools and different courses were training them in different ways. The response of the radiology community was to say, let's put together a canonical set of examples and come to some sort of consensus about what you are supposed to see here. It doesn't mean this is the right answer. It means that now we've got something we can drive. Now we've got enough consistency that we can start to steer this car if everybody's all over the map we are quite literally herding cats or in this case herding radiologists right um from your work in embedded systems i expect the devices that i release into the wild to meet certain standards we can argue about whether those are the right
Starting point is 00:20:47 standards, but we've already accepted that there are standards and that there are standards bodies and somebody is checking this so that if we want to make things better, we've got a knob that we can go and try to turn. That seems like a better world. I mean, sure, there are FCC standards and power standards, and there are security standards that seem to always be coming soon instead of being here. Yep. Medical, I mean, has its own special set of standards as do other safety-critical systems. But when you're talking about consumers and privacy, that's all kind of left up to the developer. At the moment, yes. I think we would all of us be happier if Western world decided that there were actually going to be standards for medicine. You could sell whatever you wanted and label it however you wanted. That was going badly enough that against fierce opposition and not just from the pharmaceutical industry, we decided as a society that there ought to be some rules here.
Starting point is 00:22:40 Can I ask you, do you know why mechanical engineering exists as a discipline? I assume so bridges won't fall down. In the 1870s, the steam boilers on locomotives in the United States were blowing up with such regularity that people were not riding trains. And so, at the time, millionaires today, we would call them billionaires or oligarchs who owned and operated the rail networks, went and said, we are going to create a standard college program to train people who will then be the only ones allowed to design these boilers so that they stop blowing up, so that people will ride our trains again. And 30 years later in Germany, they invented the profession of chemical engineer by explicitly emulating the invention of mechanical engineering. They said, it's in our own interests to have a floor, to have a lower bar that everybody has to get over and to do it in a public fashion so that people have confidence in what we're producing but does this standard with software is it mizra is it compiler warnings i don't know i know i triple e had this exam that they wanted to push on everyone
Starting point is 00:24:06 yep um i don't know if you've ever been active in politics i have not because i hate them with a deep abiding passion okay well i've been involved in several political campaigns over the years and there's a saying that i think goes all the way back to sol alinsky which is that nothing gets done in the united states until you've got a busload of dead children right there has to be like the plane has to fall out of the sky before there will be the public will to enforce safety standards. I'm too young to remember Ralph Nader meeting bodyguards when he started to complain about the fact that some automobiles made by U.S. manufacturers would blow up if they got rear-ended, right? But these days, we've won
Starting point is 00:25:02 that fight. We no longer fight about whether there should be safety standards for automobiles. We fight about what they should be. I don't think I know enough, I don't think anybody knows enough yet to decide exactly what the standards should be. I think people like Lawrence Lessig have made some very interesting proposals around actionable standards for data privacy. I think Cory Doctorow's recent talk, he talks about the institution of the internet, the way that things start so well and always seem to go downhill from there.
Starting point is 00:25:35 He points at Google Search, for example. He recently gave a talk in which he put forward some reasonable proposals to try to stop that from happening and try to roll it back. I think that we're going to need more, I hate to say this, I hate to be cynical about human nature, but I think we need disasters closer to home. It's clear, for example, at this point that Facebook played a major role in the genocidal slaughter of the Rohingya in Myanmar, that they could have stepped in to squelch the sites that were stirring up that kind of hate, and they chose not to because engagement numbers. And this isn't
Starting point is 00:26:20 a personal opinion. There have been lots of hearings about this. The company has sort of kind of acknowledged complicity in this. But Myanmar is a long way away, at least from me and probably from you. If something like that happens in the 2024 election cycle, if we see something like January 6th, but 10 times or 100 times as large, and it's very clearly been fueled by irresponsible behavior by social media. Or if we see an enormous data breach that directly affects the 1% and our elected representatives. I suspect we will suddenly start to see these laws being made. I suspect a lot of them will be bad laws because they will have been made in haste. And by people who don't understand the underlying technology and its constraints. Absolutely. Or as we're seeing with Sam Altman and co. going and testifying in front of that with pharma companies trying to influence safety
Starting point is 00:27:47 legislation, with oil companies essentially writing the bills around environmental protection. I'd hate to see in a field like AI, a handful of large companies hoping to make trillions of dollars be the ones to dictate to us what they're allowed to mine and how they will be checked for bias, what they have to tell us about the data they're collecting and so forth. We've wandered a long way away from the idea of teaching empirical methods to undergrads in computer science, and I apologize for that. But there's an ex-KCD cartoon from many years ago of somebody up on stage giving a speech
Starting point is 00:28:28 and in the audience somebody's holding up a sign saying citation please you know the Wikipedia protester I want more of that in our industry I want more people saying what is the evidence how did you collect it how reliable is it why are you trying to convince me that this is true? About the processes, about how we learn things, how we teach things? Mm-hmm. About how we develop software? About... Oh, about everything. I don't have time to read all of those. I beg your pardon? I don't have time to read all of those.
Starting point is 00:29:06 I beg your pardon? I don't have time to read all of those. I mean, I read more technical books than most people I know. I know only maybe two or three people who read more technical books, or I don't tend to read journals at all. And I know people who do. Absolutely. So I try to stay up to date in my field. I try to learn new things.
Starting point is 00:29:35 Often it's along my interests, like learning more about Python. And it's not in the area of what I would call squishier things. Okay. Things that could easily revert to norm. Mm-hmm. Experimental things that have low R values. Okay. So very few doctors read the research literature. What they do read is the
Starting point is 00:30:12 summaries that are put together for them by the Canadian Medical Association. I'm here in Toronto, by equivalent organizations in other countries to say, here's what we've learned in the last 12 months about hypertension that you need to know. And here are the citations. If you're curious, if you don't believe this, or if you think you've got an oddball case and you really want to follow it up, or if you have somebody who's an outlier that you think might be worth further study, here's a point in the paper and reach out to the scientist. Whatever it is, there's less of a gap in some fields between the people who are studying the thing and the people who are building than there is in our field. And I think that's what I'd like to see.
Starting point is 00:31:04 Is that because our field is so large? I mean, I can barely talk to cloud people sometimes. I am willing to bet that the medical profession is both larger and more diverse than the software profession. I agree. Right. And absolutely, there's a lot of tension within that and varying standards. The people who do cardiology are usually not polite about the people doing psychotherapy. And absolutely, different standards of evidence because you're dealing with an entirely different category of problem. But that doesn't mean that progress is impossible. I think we know a lot more about public health than we did 50 years ago.
Starting point is 00:31:54 I think we know a lot more about genetic influences on disease than we did 50 years ago. I know that we know a lot more about batteries and solar cells than we did even 20 years ago. Do you think our understanding of how people build software has progressed to anything like the same extent in your working lifetime? It has improved. It has improved significantly. I mean, version control, my W uses version control and he likes it. And bug tracking. And maybe we haven't proven that those are important, but let me tell you, I am never living with that version control again. Do you think that programmers today are able to produce more or better code in a fixed unit of time than they were 40 years ago? I would argue not. Do you think we're any better at eliciting requirements and making sure that the thing we're building is the thing we were supposed to build, I would argue not. In some industries, I would argue yes for both of those. I mean, there are huge libraries of code that are accepted as good and you can build upon those. You don't have to reinvent the wheel. Excellent.
Starting point is 00:33:21 There are ways of wireframing your application that are so much simpler than going and encoding them. And those are accepted processes that are part of building things. This is wonderful. You and I disagree in an empirically testable way. Yes. Now somebody get on testing it. And that's what I would like. But our profession doesn't take that next step. If I tell you that high salt intake increases the likelihood of heart attack in men over 65, and you say, no, it doesn't, we both accept that there is an answer to that question. The answer is probably it's more complicated than that, but there is an answer to the question, and that answer is findable.
Starting point is 00:34:08 I'd like us to be in that place. I'd like to be wrong less often, or at least I'd like to be wrong about more interesting things. journals that and the ones that i know about seem very biased towards making their own profit and the ones that i do see often i don't want to spend 40 bucks if the article is going to be stupid oh oh yeah we started never work in theory.org to try to close this gap, to try to provide the kind of potted summary of recent research that a practitioner would actually read. And it mostly hasn't worked. There's not a large enough receptive audience of practitioners who believe that the research might be useful. And equally, most people doing the research are pursuing a problem that grew out of a problem that grew out of a problem that grew out of a problem that was interesting to somebody three academic generations ago. Or, as we've seen over the last 10 months, they're chasing taillights.
Starting point is 00:35:27 A good third of the papers that are getting posted to preprint servers like archive.org right now in software engineering are basically, I threw chat GPT at the wall and some of it stuck. Now, on the one hand, I believe people like John Udell, Simon Willis, and Ian Bicking, when they say that this is going to change programming at least as much as Stack Overflow did, that it's going to be a game changer. On the other hand, I don't think that grabbing a bunch of source code, running it through one of these tools and doing a t-test is moving anybody forward. And it's an embarrassing distraction. That seems testable. And I know which way I would bet. Well,
Starting point is 00:36:26 that's what we need. What we need is really a gambling system. I say, I say we should use version control. Nobody's going to give me bad odds on that, but somebody will go out, somebody will say, okay,
Starting point is 00:36:38 I disagree and they'll go out and they'll test it and we'll figure it out. And maybe, maybe some of the other things we disagree about, we'll give odds and people will put on which side they're on. And, you know, even that voting, interested voting, leads to some interesting data about how things work now versus how they should work. So you made many interesting comments. One of them is the idea that nobody's going to argue against version control. I've actually proposed that we should go and run a large-scale controlled
Starting point is 00:37:22 study to see whether version control is worthwhile or not. For the same reason that years ago when I was an engineer, the first thing I would do with a voltmeter is hook it up to a known voltage source and make sure that the voltmeter was properly calibrated. If my study method comes back and says there's no benefit to version control, I am going to believe that my study method is wrong. Right. I need to validate my instruments because I can't apply them to more complicated things like
Starting point is 00:37:52 test room development or a long list of other things until I've got confidence in my voltmeter and my thermometer and my other tools. So I've tried three times in two different countries to get funding to go and do a study to see whether or not version control actually makes things better. And three different research agencies have said it's not worth doing. Well, I mean, I agree it's worth doing because you have to calibrate your questions. Like when you say, is software development better? That's not an answerable question. Are there fewer bugs? Are there fewer late worked hours? What do we mean by better?
Starting point is 00:38:39 Yes. And this is it. How do you operationalize that vague question, right? And so studying version control isn't about testing version control, it's about testing your methodology. Right. And it's good practice for seeing how you translate a vague but comprehensible question, like, does version control make things better, into a specific experiment that is probably answering something much more precise and much less general. Does having friends make you live longer? You know, there's this claim these days that having lots of friends increases lifespan. It's plausible. It's being challenged. And it turns out that one of the reasons there's disagreement is that people don't agree on what exactly do you mean by friend.
Starting point is 00:39:30 Yes, yes. Right? Okay, so if I'm measuring that differently, of course I'm going to get different answers. So now let's go and do what scientists and engineers have been doing for centuries, which is refine the question, refine the instruments, learn more about what it was we were asking in the first place. Measure it in the volume of tears from the developers. Yeah. Sure. You've seen the XKCD code review cartoon, right? WTF per minute. Yes. Right. Okay. Sure. I think that's valid. I think it's a valid measure of code quality. How quickly can the next person get to the point where they can make a change? If I say that as a sentence, it makes sense. Now you think about
Starting point is 00:40:22 the hundred things we would have to do to actually do that study and the hundred other questions that we wouldn't be answering because we narrowed our focus. So, I don't think this is ever going to happen, at least not in my working lifetime. There's no appetite for doing this, even among the researchers who are doing this kind of research. I know because I've asked. There is certainly no interest in this in companies, I know because I've asked. You know, we're at the state that medicine was in prior to the 1920s, where, I don't know if you know this, but in 1920, none of the entrants to medical school at Harvard had a science background. Medicine was something that gentlemen did. It wasn't seen as
Starting point is 00:41:05 applied science. And it took a generation to change that. Lewis Thomas has a really good book called The Youngest Science, in which he talks about how it is that medicine came to see itself as a branch of applied science. I think we're all better off for it. But I'm 60 years old. I don't have a generation to wait. So I do apologize if I've been going on too far and too long about this one. Oh, we're totally off topic and we're going to have to do like a speed run through your career, but that's okay. Okay. Let's do the speed run. Well, first, a little bit more about It Will Never Work in Theory. There are talks from April
Starting point is 00:41:46 2023, Emotion Awareness in Software Engineering. Is that emotion of the person, emotion of the computer, or emotion of the developer? The first and the third. Not the emotion of the computer.
Starting point is 00:42:04 You asked me earlier if I had a favorite robot, and I just don't think I'm there yet. There's also How Novice Testers Perceive and Perform Unit Testing. Yep. And how that differs from the way that professionals in industry do it. This is, I think, a useful prequel to: how do we close that gap? How do we get young programmers to think the way we want them to about the actual purpose of testing and how to go about doing testing well? And there's a talk here, Crafting Strong Identifier Naming Practices. And this is going to be science-based? Sure, because it's actually pretty straightforward to study how recognizable different naming conventions in code are.
Starting point is 00:42:50 For example, camel case is harder for people to read if English is not their first language. Okay. Think, for example, of somebody coming from a non-alphabetic language like Chinese: the notion that the capitalization of a letter matters, and then we hit acronyms, and then we hit all of the other cases. There is some evidence that pothole case is actually easier for non-native speakers of English to read because it doesn't require as much implicit knowledge. Now, what about short identifiers versus long identifiers? Well, it turns out that that relates to the scope of the variable. It's perfectly okay to say for i equals zero, i less than n in a three-line loop. It becomes more problematic as the number of lines over which that variable is in scope increases,
Starting point is 00:43:45 which might seem obvious in retrospect, but as far as I know, nobody writing programming books had actually said that before this team in Ireland did the study. And I'd have to go and look up the names and dates for the study. I apologize. I don't have it at my fingertips. But when you think about your favorite linting tools, they're not as nuanced as that. So we can go back and treat, I think code is a user interface. I think everything we know about HCI can be applied to source code.
Starting point is 00:44:17 And everything we know about the readability of text can be applied to source code. And we actually know a lot about that. Could we use that to design a more readable programming language? Andy Stefik and his colleagues have a lot of evidence to show that we can. They actually A-B test every new feature of Quorum, the syntax of every new feature, before adding it to the language, and they compare different possible syntaxes to see which is going to be most comprehensible to novice programmers. Don't you think your life would be better if that kind of empirical HCI-style approach had been used for C++?
Starting point is 00:44:56 The question, of course, is unanswerable, because if we'd done that, we wouldn't have built C++. All right. Yes. All right. And there are years of these lectures. Yep. And they go back. We've only done the live ones for, oh, a year and a half. We did three sets. So, spring 2022, fall 2022, which was co-located with Strange Loop, and then spring 2023. Before that, we were just doing written paper reviews. Negotiation and Padding in Software Project Estimates, and that's going to be science-based. Absolutely, because you can go and you can interview the developers and you can say, okay, what's your real estimate? Now, what did you put in front of management?
Starting point is 00:45:46 Now talk to me about those differences. Well, you see, I multiplied by four and then I added two days because I knew whatever it was. Ah, so what that team, that's a Brazilian team. And what they found was that there were a significant number of cases where the developers would deliberately lowball their estimate.
Starting point is 00:46:04 What? Because if I tell you how long it's going to take, you're going to say no. No, yeah. If I tell you it's only two days, and then I'm stuck in, and I look at you with puppy dog eyes and say, well, we can't quit now, right? If I know it's going to be a month, and it's critical refactoring, and I know that business isn't going to let me take a month, I'm going to lie for the good of the company and the good of my soul. Please don't tell my boss I've done this recently. Wow. Well, but how many times have you said to
Starting point is 00:46:38 a child, we're almost there? Never. I don't have kids. Okay, I understand. I do understand the idea, right? I'm okay lying to other people's children as well as my own, so maybe I'm an outlier, right? But this is fascinating, right? And here's where we get into the fact that not all rigorous empirical research has to be statistical. Qualitative methods can get us answers that we can't get through mere numbers if it's done right, right? The case study approach can, in fact, uncover a lot of really interesting, actionable insight as long as it's done carefully. And I think one of the reasons that engineers like me and people in related disciplines don't believe that is that we've never seen it done properly. Oh, no. Have you worked with product managers?
Starting point is 00:47:46 Yes. Okay. So do you believe that they are actually getting the right answers, good insights some of the time? Can you take my silence as a lie for yes? Okay. I'm working right now with an absolutely outstanding product manager here in Toronto. And she goes and she interviews the biologists working in the lab. And then she goes and consolidates what she's found and takes it back to them. And after a couple rounds of that, she can say, here are the problems we actually need to solve. And here are the things that would constitute solutions. It's more of a case study style than collecting statistics on
Starting point is 00:48:31 how many visitors did the website have, how long did they stay on which pages, and so forth. Both are meaningful. I think one of them is closer to the kind of training I've had and therefore more easily recognized and more likely to be accepted. And again, one of the reasons I would like to have training like this for young software developers
Starting point is 00:48:54 is that stats isn't the only way of knowing. And I think we throw away or disregard a lot of really valuable insight by insisting that if it can't be quantified, it's not real. I mean, case studies are incredibly important. Even though they can be a small amount of actual data, approaching anecdote, at least they are well-documented anecdotes with their precedents made explicit. As I said, it's not the sort of bulk collection of data followed by a t-test that I'm most comfortable with, but I should not discount somebody else's insights because they're using methods that I haven't yet learned. And I wish I had had the humility to understand that 20 or 30 years ago. Where does it end?
Starting point is 00:50:14 Where do we start with the t-tests? Where do we go from there? I have no idea. I truly believe that source control is important, and the same goes for test-driven development, and whether or not linters are truly helpful, and all of these things. It's hard for me to know where we go or where we would stop when we haven't even really started. I believe that empirical study has served engineering and medicine very well. I would like the construction of software to be at least as rigorous as the construction of a highway or a footbridge. How much is it my employer's job to give me time and money to learn? And how much is it my responsibility? So that question comes up in pretty much every other profession as well. Again, I'll come back to medicine. What we don't have, that the medical profession has, and the legal profession
Starting point is 00:51:38 and accounting and many others: we don't have professional associations that are worth a damn. The ACM and the IEEE, yeah, right. I heard you snicker, and I agree. The reason that there is space carved out for paramedics to maintain and improve their skills is because they've got a professional body that goes and negotiates in bulk rather than trying to do it case by case. The reason that lawyers, for example, have time carved out but also have a requirement to recertify is because there's a professional body that has teeth, that says: we as a profession have privileges that are not granted to the average citizen. The societal bargain for that is we try to ensure a certain minimum standard. If I go and get an engineer, for example, to do the drawings to replace the wall at the back of our house,
Starting point is 00:52:42 if that engineer does drawings that turn out to be faulty, and we build a wall and it collapses, I can sue the engineer. And I have a reasonable expectation of getting my money back and damages and so forth. The same is not yet true of software. And I think that's largely by design. I think programmers have worked very,
Starting point is 00:53:10 very hard to make sure that they can't ever be blamed for anything in any meaningful way. Right. And I think we've outgrown that. Can I give you an example? Sure. Okay. Do you remember Waze? It was an early piece of wayfinding software, predecessor to Google Maps. Yes. Okay. So Mike Hoye, who used to be at Mozilla, uses this as an example. For the first couple of years that Waze was out there and in use, you could ask it to find a route from A to B that avoided police checkpoints. Okay.
Starting point is 00:53:41 Do you think the person who implemented that feature had ever lost a loved one to a drunk driver? No. Okay. Do you think the person who implemented that, or the person who deployed it, or the person who authorized construction of that feature, that somebody, maybe the company as a whole, should have been liable for deaths that resulted from drunk drivers avoiding police checkpoints? I absolutely do. I'm not sure who and at what level. Is it the individual programmer? Is it the company as a whole? Is it both of them? If it was a piece of hardware going into a car that proved to be faulty and led to a crash, we have the legal and societal structures to say here's who you can sue, here's whether they go to jail or just give you money.
Starting point is 00:54:34 It's not a perfect system by any means, but at least there's a system for doing it, and as a result, I think your car is probably safer than a lot of the software that you use. It's a measure of harm. I'm not going to take the Waze example because that one's really good. But the reason that cars get more checks is because you can hurt people. The reason airplanes get more checks than cars is because you can hurt a lot of people. The reason medical devices get checked, get certified, is because you can hurt lots of individuals.
Starting point is 00:55:17 And so we have built some of these checks? We have built social media that explicitly and very effectively encourages racism, homophobia, and misogyny. I totally agree, and I don't know how to solve that one because you can't, there isn't one, how do you separate the content from the content provider? Okay. How do you do that with traditional publishing? How do you decide when a Ford Mustang bursts into flames, whether it's the mechanic, the engineer, the company executive, or the company? Right? That's hard. By the fifth one, you figure it out.
Starting point is 00:56:03 Right. Right. And we are now a day late and a dollar short on figuring this out. Like, there's no question at all that the Internet has been a great force for good and also a great force for harm. There's no question at all that our entire industry has done harm as well as good. You may know that there are now dozens of cases in the United States of people being arrested because of facial identification software misidentifying them. And that in almost every case, the person who's misidentified is an adult black male. Right. Okay. Somebody should be sued for that. And there are cases working their way through the courts.
Starting point is 00:56:44 And the legislation needs to catch up, but also our expectations. As I said, the software industry has trained us to believe that we just shrug and go, okay, I guess you leaked all my personal information again. Well, and as developers, we know we can't do it all. We don't have enough time. We don't have enough resources. We don't have enough input data to know for sure that what we're doing is...
Starting point is 00:57:14 The same is true of the engineers who designed the car that I think is probably in your driveway right now. It takes literally tens of thousands of people to design and build and deliver an automobile. And that's why they're more expensive than a Fitbit. Absolutely. But we have decided with automobiles, who's to blame for what? There's a book by Amy Gajda called Seek and Hide: The Tangled History of the Right to Privacy. It goes on a little bit. It could have been half the length. But it's a history of how this notion that we have a right to privacy in the United States
Starting point is 00:58:00 came to exist. Because the founders absolutely had very different ideas. People as recently as the 1920s and 1930s had very different ideas about what kind of things you as an individual had a right to keep private than we do today. We negotiated that right over the course of generations through one case after another, through basically trial and error. And I do not mean to pun, but we take cases to trial and we see does this make society better? What are the unintended consequences? And so forth. I think we are starting to see some of that now. The lawsuits that have been launched against various AI companies over
Starting point is 00:58:45 copyright infringement by people like George R.R. Martin. You scraped my books. You didn't have my consent. That's copyright infringement if you use it to train the model. Okay, that's interesting because there's very clearly not going to be a simple yes or no answer. We're going to negotiate a border, and that border will shift over time. As I said, I would like to see more of that in our profession. I'm on the yes. And how are we going to find the money to provide time for the software engineers to possibly get certification? Because as you mentioned, there were other professions like doctor and lawyer, and those are both certified professions with longer degree paths than most software engineers. I believe that Elon Musk could pay for every programmer working in North America today
Starting point is 00:59:45 to go and get this certification and still be one of the richest men in the world. But we don't get to control him. We only get to control our actions. We as voters can pass laws, elect people who pass laws to change the rules of the game. In the Gilded Age, there was no sense at all that people should pay taxes proportional to their wealth. In my parents' time, there was absolutely no sense that you should not end your years in poverty unless you had been lucky enough to have a middle-class job and squirrel money away. Here in Canada, there is the very strong sense that you should not go bankrupt just because you have cancer or a broken back.
Starting point is 01:00:33 All of those decisions we made as a society, and we enforce them on people who are very, very strongly opposed. You're in the area of things where I don't know how to effect a change. I mean, we're back in politics, where I just sit there and go, this is nice, I'm going to go build something that makes somebody smile, because I can do that. Mm-hmm. And then you go and you vote. And you vote for somebody who is, for example, going to introduce real liability legislation for banks and other companies that lose personally identifying information. Right now, the fines for that are a slap on the wrist. They're an operational expense. I vote, and I feel like it's thrown away each and every time, but I do it because I do believe it's important. But it's just... I mean, I live in California. Your vote probably has mattered quite a bit.
Starting point is 01:01:47 I think that at the state level, your vote matters even more. I think we need to research on this. I beg your pardon? I think we need research on this. We do. Sorry. No, no. And there has been a lot. In the United States and Canada, and to a lesser extent Australia,
Starting point is 01:02:09 the state or province level is often used as a laboratory for trying out ideas that then move up to the federal level, whether it's around environmental protection or anti-discrimination laws or tax laws. Proposition 13 in California, which crippled the state's finances. But that was a tryout. That was an explicit tryout by right-wing libertarian politicians for legislation that they then wanted to move up to the national level. Or take legislation to prevent discrimination in hiring practices. Some states, with electors elected by people like you, take the lead on that,
Starting point is 01:02:55 implement legislation, and the rest of the country looks and says, oh, okay, that looks better than what we're doing right now. For example, I was amazed and very pleased when the law changed in California so that companies are posting salary bans with jobs. Yeah. Right. Okay. Of course, the ban I saw recently was 20K to 300K. So that ban is not as useful as it might have been. Won the point. Yeah, and that was an outlier.
Starting point is 01:03:28 Of the many I was looking at, that one stood out because it was like, oh, come on, play by the rules. Right, right. But here's the thing. We've now won the point. We've won the principle. Everything from here on is about the details. How wide is the band allowed to be?
Starting point is 01:03:45 How closely does it have to reflect the pay of current employees or the last five? Now we get into policy wonk country, but we've won the principle. But I just want to learn a new microcontroller. I want to play with a robot. So have you ever read the English author Terry Pratchett? Yes. Okay. There's a scene in the book Night Watch where Captain Vimes goes out and sits in front, sets up a table and chair in front of the mob. And somebody says, well, you know, we want truth, justice, freedom. And
Starting point is 01:04:27 Vimes says, you know what I want? I want a cup of tea, piece of toast, and a hard-boiled egg. Do you remember that scene? Not off the top of my head, but sort of. I don't remember what comes next. Because he then goes on to say, I want that, but I want to have that every morning when I wake up. And I can't be sure of having my cup of tea, my piece of toast, and my hard-boiled egg unless we take care of all this other stuff. I don't want to have to worry that I'm not going to have that because there's no food in the shops. I don't want to have to worry that I can't have that because somebody kicked in my door in the middle of the night and dragged me off.
Starting point is 01:05:11 But not everybody can spend all of their time worrying about the politics of everything. We don't have to. I mean, for me, it's just too stressful. If I spend too much time thinking about politics, I'm out. I just can't do it. Absolutely. And if I spend too much time thinking about software deployment, I have to go sit in a corner. Right?
Starting point is 01:05:35 There are some people, as with everything else, there are some people who thrive on it. For the rest of us, we volunteer a little bit here and there. We go out and vote. Maybe we donate some money. Maybe we don't pay as much attention as that guy on the bus who really, really wants us to understand what's going on in the Greek parliamentary elections this year, right? Sorry, that was today's experience. I know nothing about Greek politics and I don't want to. But the same is true of how does the food get on the shelves in the supermarket? How does the sewage system work? There are people who take care of that. The rest of us just have to support them to make sure that
Starting point is 01:06:11 they can do their jobs. I was appalled at how many Americans chose not to vote in the last midterms. We know what's at stake. We know what's at stake, and we've seen our elections messed with, and we know how difficult it is for some people to vote by design. Yep. It doesn't really surprise me. There's a sense not of apathy but of depression. Yep.
Starting point is 01:06:59 And I can only imagine how the generation before us felt around race rights, around women's rights, around LGBT rights. They felt the same sense of weariness and hopelessness. And here we are today. Better is possible. I realize we've wandered a long way away from embedded controllers, but I think one of the things that I want is for people to truly believe that better is possible, and that it's not really that much hard work as long as we do the work together, and that doing that work together is a lot of fun. The thing I remember
Starting point is 01:07:36 most about teaching with Software Carpentry, which is a nonprofit that teaches coding skills to researchers who didn't get it earlier in their career, was the feeling of teaching with other people, of building lessons with other people. Yeah, interacting with the learners is really fun. Seeing that light bulb come on, that's magic. But being in the room with somebody else who realizes that you've got the wrong slide deck up, and you look and you lock eyes and it's like, okay, it's a Tuesday. We'll get through this. Right? That feels pretty good, too. I totally agree with you.
Starting point is 01:08:17 We can make it better. And I think making it better is about doing things each day and to take it back to software. Oh, is that what we're talking about? This never work in theory. There are all of these well-researched, well-articulated ideas about software engineering. And some of it is the idea of software engineering. Some of it's nearly the typing part of software engineering, the tactical parts.
Starting point is 01:08:52 Sure. And you don't have to do it all in one day. You can be just a little better every day or every week. Absolutely. Watch a video, take a little time to think about it. And if it doesn't apply to you, go on. That's fine. You don't have to tackle everything at one time.
Starting point is 01:09:12 Absolutely. I think anybody who comes into a large code base and says, we're just going to rewrite this from scratch. Oh, that person. Yeah. Right. That person. So here's a funny thing. I learned this when I was doing a dive into the teaching literature. Have you ever seen yourself on video?
Starting point is 01:09:52 Only in the last few years, and I hate it so much. Okay. So here's the funny thing. We can make you watch a video of yourself, and we can measure stress levels. You can put the electrodes on you, eye response, pupil response, things like that. Then we can get an actor and train them to imitate everything you're doing in that video. And then you get to watch them, and your stress levels will be far lower. Oh yeah. Right. You're far more critical of your own tics and so forth because you've had practice your whole life ignoring other people's foibles because it's not socially useful to notice when people stumble or say um and er or tug their earlobe. But unless you spend a lot of time in front of a mirror, you've no practice at all subtracting out that static from you.
Starting point is 01:10:47 And now you're confronted with it on video, which is something that evolution did not prepare us for. And I think the same is true of code. If I get a chunk of, okay, here's another experiment. If I get a chunk of code written by one of my colleagues, and I do a review, I'll probably say, yeah, there's a couple of things you can clean up, but basically this is okay. If you were to have me look at exactly the same code and I was the author, I would say, all right, rm -rf, no, wait, reformat the hard drive. Can we destroy all record of this?
Starting point is 01:11:24 I'm going to react very, very differently. And I... That's funny. I wouldn't. I wouldn't. I mean, looking at my own code, I mean, it depends on which code. But for the last few years, so much of my code is written almost for demonstration purposes. Clients that I work with who, yes, they want me to implement a bunch of stuff, but at the end of the day, I know I'm handing it off and I'm handing it off to a junior engineer
Starting point is 01:12:01 and I want the junior engineer to be well prepared. And so when I look at my code, I want it to be something that is almost amusing to read. If not amusing, then engaging or something. Like when I write a book, it's not just here are all the facts. It's here are all the facts in a way that I think will make it easier for you to understand them. And so, no, I don't have as much trouble with my own code. I can't see my own typos, which is kind of a problem. But since I started writing code to be either public or to be for other people, I'm not nearly as critical about it. The video thing, yes. And by the way, I have a podcast, if you don't know, and I don't like my own voice. I don't like the way I have to
Starting point is 01:12:55 slur sentences together so that I don't stutter. And then people don't seem to notice that I stutter, but I do a lot. And then the whole slurry, mushy thing is terrible. How can you even listen? So, I realize we're coming up on time, but another thing I came across when I was learning about teaching, we've seen steady improvement in athletic performance throughout my entire lifetime. Records keep getting broken over and over again. And the question is why? Some of it, I'm sure, is better pharmaceuticals.
Starting point is 01:13:34 Better teaching. What in specific when you say better teaching? Because there is an answer to this question. The first person learns that if you start with your dominant foot forward and launch from there, you go a little faster. Second person doesn't have to learn that. They get taught that. Now they get to learn that if you put your ankle out just a little bit, that you can go faster. And now the third person starts with both of those and has time to learn the additional stuff on their own.
Starting point is 01:14:06 And I agree with all of that. There's also one other thing that, depending on who you trust, might account for as much as one quarter or more of the improvement in athletic performance in the last 45 years, and that is video. Yeah, right. Prior to the early 1980s, you had to be pretty far along to ever see yourself play the sport. And even then, there would be a delay of at least a day between you're on the field and you get to watch the movie, right? Today, every kid doing athletics can watch themselves on their phone as soon as they're done performing. Here, can you record me doing this? We've made that feedback loop so tight and so ubiquitous that every serious athlete can watch themselves in the same way that every serious musician since the 1970s at least has
Starting point is 01:15:06 been able to get a tape recorder and listen to themselves. It was physically impossible to actually listen to your own performance prior to the Edison phonograph. It was impossibly expensive until the invention of tape recording in the 1920s or 30s, depending on which version you care about. And it was impractically expensive until at least the late 1960s for anybody except professionals. And then all of a sudden, in the 1970s, a 10-year-old kid could have a tape recorder radio at home, and they could play the saxophone and then listen to themselves. And if you think that hasn't had an impact, boy, it has. And I bring this up because we've seen something similar with performing arts. You might feel uncomfortable with the
Starting point is 01:16:01 sound of your own voice, but every actor, every TV presenter, radio presenter, of course, they're listening to themselves and watching themselves now in the same way that athletes are. And there are companies like Athena that build products to help teachers do this. And yes, absolutely. The first two or three times you see yourself in front of a class teaching, you are going to want to go and find some deep, dark cave to hide in for the rest of your life. Right? But then you get used to it in the same way that athletes and actors and others get used to it. That initial discomfort goes away, and it just becomes another Tuesday. And at that point, your performance accelerates because now you can view yourself as if you were a stranger and give yourself the feedback you would give the stranger.
Starting point is 01:16:52 And you know what the value of getting code reviews and other kinds of reviews is. We can do it for ourselves now, thanks to technology. And it took me a while to get used to it. But, you know, if the jocks can do it, I can do it. Well, Greg, we are out of time. Do you have any thoughts you'd like to leave us with? I think I've shared far too many in the last 90 minutes, and I do apologize for that. No apology needed. It's been fun.
Starting point is 01:17:27 I'm very grateful to you for inviting me on the show. Well, we didn't talk about many other things, so expect another invitation. I'd be happy to come back. Our guest has been Dr. Greg Wilson, a programmer, author, and educator based in Toronto. He co-founded and was the first executive director of Software Carpentry, which has taught basic software skills to tens of thousands of researchers worldwide. Dr. Wilson has also authored or edited over a dozen books, including Beautiful Code, which was awesome. The Architecture of Open Source Applications, which was awesome.
Starting point is 01:18:07 Teaching Tech Together, which I haven't read. And most recently, Software Design by Example, which he's got part of a Python version online, and that was pretty awesome too. Greg is a member of the Python Software Foundation and a recipient of ACM SIGSOFT's Influential Educator of the Year award, to which he wrote a rebuttal instead of an acceptance. And he currently works as a software engineering manager at Deep Genomics. As you can tell, Greg has been around and done a lot.
Starting point is 01:18:43 He's got the bite marks on his butt to prove it. Thank you so much for being with us. Thank you very much, Alicia. Thank you to Christopher for producing and co-hosting. He would have broken in and said, lead in paint and lead in gas is what has been the big improvement in the last 40 to 60 years. But since he's not here, he doesn't get to say that.
Starting point is 01:19:09 Thank you to our Patreon listener Slack group for some of their help with preparing this. And of course, thank you for listening. You can always contact us at show at embedded.fm or hit the contact link on Embedded FM. And now a quote to leave you with. I think we'll just go with Aristotle. It's a classic. Those who know, do; those who understand, teach.
