The Changelog: Software Development, Open Source - Thinking outside the box of code (Interview)

Episode Date: August 9, 2023

Leslie Lamport is a computer scientist & mathematician who won ACM's Turing Award in 2013 for his fundamental contributions to the theory and practice of distributed and concurrent systems. He also created LaTeX and TLA+, a high-level language for "writing down the ideas that go into the program before you do any coding."

Transcript
This week on The Changelog, I'm joined by Leslie Lamport, a computer scientist and mathematician who won ACM's prestigious Turing Award in 2013 for his fundamental contributions to the theory and practice of distributed and concurrent systems. Leslie also created LaTeX and TLA+, a high-level language for writing down the ideas that go into the program before you do any coding. Quick thank you to our partners at Fastly for shipping all of our shows super fast all around the world. Check them out at fastly.com.
And to our friends at Fly. Deploy your app servers and database close to your users. No ops required. Find out how at FLY.io. What's up, friends? I'm here with Jeremy Tanner, dev rel at Tailscale. And I've been using Tailscale for, I want to say, the last year. And I consider it critical home lab infrastructure for me.
As you may know, I'm a home labber, so that means I take pride in a solid home network, Wi-Fi, services that make the home run, Home Assistant for automation, lighting. And of course, I run Pi-hole to block ads at the network level and to protect my family from ad tracking. And Pi-hole has this web UI that you can access from the LAN. But for a while there, I just didn't have access to this service or other services like Portainer or just anything else that had a web UI or something I would access via IP on the local area network. But Jeremy, tell me about a world where via Tailscale, you can access Pi-hole, Portainer, or any of these other services you might run in your home lab
and access it externally from a local area network. Yeah. You hit it when you said anything, anything and from anywhere. And so Pi-hole being a DNS black hole, if you notice the difference between the comfy experience that you had at home and when you were out of your house, whether it was for work, a coffee shop, or even on your mobile devices, being able to set up Tailscale on your mobile device and have access to the same Pi-hole and filtering rules that you do at home wherever you have an internet connection is a thing. Same thing when talking about streaming.
Instead of exposing your network storage to the internet and getting that ransomware, keeping that on your private network, but extending the private network to any device that can run the client. And so having anything that can run the client includes your Steam Deck. That's just a Linux box with the controllers grafted onto the sides of it. And so if you're somewhere that has an internet connection and want to watch some of your video, plug it up to a large TV. And if you're in a hotel room, use that as your media center, playing the shows that you're trying to catch up on from home instead of whatever the hotel TV has on it. Instead of buying a service or paying a small monthly subscription, you have compute
and storage at home. It's just an easy way to get back to those instead of usually inventing a bunch of things that you'd love to do. But those sorts of things become accessible to you on your phone while you're outside the house instead of something that's neat to use, but then you miss while you're away. Yeah, for sure. You know, I run Plex, so I'm in big agreement with that, of course. But just being able to do what you can do with Tailscale is so cool without having to learn all the VPN things. It's just such an easy tool to use, easy tool to install on anything really, Linux, Windows, Mac. So easy to use. Big, big fan of Tailscale. And for my home lab users out there, you can totally use Tailscale 100% for free. You get three users and up to 100 devices, which is like way more devices than I think I would ever need to use Tailscale on. The free forever tier is amazing and a great place to start, especially to use it for your home lab needs alone. Check them out at changelog.com/tailscale. The free forever tier is waiting for you to try it out. Again, changelog.com/tailscale, and make sure you tell them your friends at Changelog sent you. All right, I'm here with Leslie Lamport, a distinguished scientist at Microsoft Research, creator of LaTeX, not the rubbery substance, but the documentation system that you've probably heard of or used.
Starting point is 00:05:03 It's used in academia. It's used in scientific circles, math circles, and a Turing Award winner. Leslie, thanks so much for coming on the show. Thank you. You might want to try that again and pronounce it either latech or latech. Ah, I was reading about this controversy, the pronunciation controversy. So as a layman, I see L-A-T-E-X, and I do think of that rubbery substance. But the initial author was a tech, right? It was supposed to be T-E-C-H was the sound. And your take on that is La Tech. That's how I should say it? Or La Tech. All right. Either one.
Starting point is 00:05:39 Now, where do we get tech out of T-E-X. That's the curious bit. Oh, that's because LaTeX or LaTeX is built on top of Don Knuth's tech typesetting program. Okay. And Knuth called it tech, even though it was T-E-X. Is that right? Yeah. He says it's the Greek pronunciation of the original word. Not speaking Greek, I wouldn't know. Ah. Well, you seem to have some Greek influence though, right? Paxos, a lot of Greek mythology going on. Is that an interest of yours? No, it was just, well, for the Paxos paper, it was just something that seems a good idea at the time.
Starting point is 00:06:17 But before I get away from that, I should say that Knuth did all the heavy lifting and building tech. You know, that was a real remarkable thing that he did. What I did was, you know, in comparison, quite simple, just building something, a macro package on top of it. I see. So Knuth did all the heavy lifting. However, in terms of the tooling, like it's still used.
Starting point is 00:06:40 It's still prominent. Is it still the majority tool that is used in those communities for for specifications and writing or is there something that's usurped it since no i think it's still uh math and physics at least it's still the the standard i don't i don't know about other fields some fields it is some some fields you know i'm sure chemists have never heard of it right well as a working software developer i say i definitely heard of it. Right. Well, as a working software developer, I say I definitely heard of it. I don't know it well enough to pronounce it correctly, as you can tell. But I hear it talked about often.
Starting point is 00:07:13 Some people love it. Other people use it in anger. But everyone seems to use it. It still seems to be the best. And to me, what speaks to me about it is the separation of what you're saying versus how it looks. That's a key aspect, right? Oh, yeah. That was why tech needed something on top of it.
Starting point is 00:07:34 And Knuth intended for things to be built on top of it, but I'm not sure he expected so much of the underlying tech to be hidden from the user. By the way, I should say that the whole idea of basically separating the ideas from the typesetting, that was done by Brian Reed and Scribe System. I used Scribe, and it seemed perfectly obvious to me that that was the thing to do. One of the failings, one of my failings in my career is building on things that seem natural, some seem obvious to me, and without realizing that there were new ideas and that they were significant ideas. I think a prime example of that is in my Time Clocks paper, where it was inspired by a paper from Paul Johnson and Robert Thomas, where they had an algorithm for something, and I realized that their algorithm
Starting point is 00:08:46 wasn't quite the right one because it basically violated causality. You could have something that caused something else, but the things were ordered as if they happened in the opposite direction. And so, one of the two significant contributions of that paper was pointing out this notion of causality, which again, seemed very straight, very obvious to me. But what they had done, they had used timestamps, added timestamps to the messages. And I just assumed that that was a well-known idea that people did that and sort of not even worth mentioning. But as it turns out, I think they were actually the first people who did that because their paper is not as well-known and mine was. People have credited me with inventing adding timestamps to messages.
Starting point is 00:09:46 So a couple of themes here. The first one being is you building on the work of other people who've come before and maybe getting, in some sense, some of the credit on accident, deflecting as I can tell. But also this idea of your ideas, which you've put into papers and have been read now and implemented around the world, seeming obvious to you, but not necessarily to the rest of us, right? You find it hard to explain your ideas to people or to bring them along with you? Yeah, because I just expect more of them than it's reasonable to expect of most people. Fortunately, I've had some really smart colleagues who are smarter than me, in fact. And I think if I've been more successful than them, I think it's because of my different way of looking at things, not by being brighter than them. So, you know, I'm not, you know, I'm not a genius. You know, I'm not in the, you know, in the league with, you know,
Starting point is 00:10:53 people like Einstein or Feynman or, you know, I can name, you know, any number of people. If we were to focus in on the looking at things differently than other people, can you describe that? Like what your angle is or how you see the world that's askew or that's tangential to maybe the way your colleagues were looking at things? What's that different perspective? people is because I think like a mathematician and I approach things mathematically, not computationally. Can you give a for instance, like maybe in a basic case of a situation where maybe me as a programmer would approach something imperatively or computationally, step by step, whatever it is, and you as a mathematician,
Starting point is 00:11:47 maybe we would look at it differently. Because I can't look at it your way, I want to. And so, how do you look at things? When you say, I think about it in math, to me, I'm just like, you're going to add numbers together? Like, what do you mean exactly? Yeah, it's funny. I don't have a ready example, except something that happened fairly recently with a colleague who asked me how to define something. And he said it was really hard to write this definition. And he was defining some property of sequences, actually infinite sequences. And it seemed really easy to me.
Starting point is 00:12:24 So I wrote the definition. And what blew his mind was that it was like a recursive definition, just like he was trying to do. That is, you define it for one and then n equals one. And then so we get for n equals one, you define it for n equals n plus one. But the value that you assign to a given n, the value assigned to n plus one depends on the complete rest of the sequences, infinitely far past n plus one. And that's perfectly obvious to a mathematician. But because he's what I call trapped inside the box of computational thinking, that never occurred to him because you can't compute something by looking infinitely far ahead to find the next value. Right. That's where I'm sitting right now. But he wasn't trying to compute it. He was trying to write a definition. A definition of the problem or of the solution? No, just a definition of some property of infinite sequences that he wanted to define.
Starting point is 00:13:29 I see. I mean, you know, this is, you know, he was a theoretician, you know, working as, you know, this is theoretical computer science in some sense. Right. It's talking about, if you like, properties of programs rather than how to write programs. So you represent the execution of a sequence as an infinite sequence of either states or steps, depending on how you want to do it. And then you can talk about programs scientifically by talking about those sequences mathematically. Gotcha. So this kind of is a pattern or at least echoes what we talked about earlier with Knuth's tech and subsequently LaTeX was the separation of concerns, right? In
Starting point is 00:14:14 that case, it's from the ideas and then the display of the ideas, the typesetting. You're talking about separation of concerns, right? You're talking about specifying the ideas as separate from the implementation, right? When you say programming, I know in the Quanta video, which we'll link up in the show notes as well, that featured you, you made a distinction between coding and programming.
Starting point is 00:14:38 This idea that, okay, there's the idea that you're trying to express because what good is a program if it doesn't do something or express something or change the world somehow or provide some sort of value? So you have the idea and then you have the implementation. And many of us, and I hold myself in this category,
Starting point is 00:14:55 they're kind of munched together and kind of like grow out of the same milieu. But the way you look at it, the idea, the program is a separate thing altogether from the implementation. Is that correct? Am I thinking about it the way that you think about it? Well, yes, exactly. I mean, I wouldn't say it's separate.
Starting point is 00:15:13 Well, one informs the other. Well, the thing I'd like to say is coding is to programming what typing is to writing. I mean, obviously, it's somewhat hyperbolic. There's a lot more intellectual effort goes into coding than typing. But I think actually, maybe in the days of, before the days of computers, the analogy would be writing and typography. Typography is an art. A lot of effort goes into designing a book, but that's completely separate from the writing of the book, the ideas, what the book is trying to express. But it's not completely separate because good typography will reinforce the ideas. So in the coding programming aspect of it, you can't think about what the program is going to do without an understanding of whether you can code it or not to understand what you can do with code. Actually, there are two things that you should do before you start writing code.
Starting point is 00:16:36 The first one is figure out, is decide what the program is supposed to do. And, again, you can think of that in terms of what you have to tell the user of the program the program is supposed to do. And again, you can think of that in terms of what you have to tell the user of the program the program is supposed to do. Now, that program very often is something that's code to be used by somebody else in his program or her program. And I have a very simple criterion for whether you've done this properly is that you should write this description of what the program is supposed to do in such a way that someone can use it without having to look at the code. And the idea that I can be able to give you something that's a bunch of code and you should
Starting point is 00:17:22 be able to use it without having to look at the code is foreign to programmers. It's obvious. It's completely obvious to me. And I think it's completely obvious to a lot of software engineers, but it doesn't seem to be obvious to programmers because they're not programmers, they're coders. And if all you think you're doing is writing code, well, of course, somebody has to read your code to understand what the code does. But that's at the heart of so much of what's wrong with software being built today. So you're referring to like a formal specification. It doesn't have to be formal. Well, it has to be usable without the code. So how you can do that informally? Well, by formally, I mean, you know, writing it down in mathematics or in language like TLA plus.
Starting point is 00:18:10 Yeah, that would be formal in my mind. I mean, I do this whenever I, you know, if I'm built, have a coding, you know, a class and, you know, there's some method that in this class, I will write a description of what that method does. And in 95% of the cases, it's just English, just pure prose. Maybe a formula or two tossed in. Occasionally pseudocode, usually not, but sometimes perhaps pseudocode, whatever works. But it has to be precise enough that somebody can use the code without having to read it. And if what it does is you should be able to see whether what you write, you know, as you write it, you'll know whether what you're writing is making sense. If you want to understand something, you've got to write it. Because there's this cartoonist, Gwynn, that has this wonderful caption of a cartoon, which is, writing is nature's way of showing you how fuzzy your thinking is.
Starting point is 00:19:23 I like that. I had this one example. There's one certain aspect of writing a parser for a TLA plus. Somebody was doing it, and he asked, how do you compute this little condition of some check of correctness of the specification? And I thought, oh, well, this is quite simple. I would write down a piece of usual informal English, a little math. And the more I started writing, the more tricky I realized it was. And I decided that I needed to write a TLA plus specification of it. Fortunately, the person who was building the compiler knew TLA plus. So anyway, he didn't have to explain the TLA plus for him. And he coded it from the TLA plus. And I actually sort of asked him about that. TLA plus is a lot different from
Starting point is 00:20:20 Java when he was coding it. And how much effort was that? And he said, oh no, it was really straightforward. And actually, at some point later, I modified the parser and looked at the code and had to modify the code. And it was clear. It was just completely obvious that you could look at this TLA plus specification and the relation to the code was completely obvious. It made it very easy to modify the code. Then the wonderful thing about that code is about once or twice a year for the next several years after he built that parser, I thought, oh my God, here's a special case of this. Oh, I bet the parser doesn't get it right. And every time I looked, checked it, the parser did the right thing. You know, that stuff really works.
Starting point is 00:21:11 Okay. So this process that you're describing, I guess it kind of matters if we're talking about it in the small, like at a function level or down in the details versus like large systems thinking. And we can, we can discuss that, but it's certainly counter-cultural to the way that software generally speaking is developed in the world today. I mean, we have agile methodologies, we have move fast and break things. We have the only constant is change, you know, head west, young man. I mean, there's lots of like, let's get coding. Coding is thinking. A lot of these axioms that describe how we go about building software, in many cases today, are like exactly opposite of what you're describing.
Starting point is 00:21:57 We don't say you can't think about it, but we say kind of you think as you go. You figure things out. The only thing we know now is that we're going to know more later. We know the least right now. Specs change. Requirements change. The world changes. And so it is quite a different way.
Starting point is 00:22:15 Well, yeah. I mean, first of all, you're right. The world's changed. If you build a program and it's really used a lot, You're going to want to modify it at some point. And the real world is you're not going to go back and start from scratch and redo everything. You're going to start patching it. And we've seen that happen. And as it goes on, that program gets harder and harder to deal with, harder and harder to modify.
Starting point is 00:22:43 And eventually, either the code is going to be thrown away, or it's going to be rewritten from scratch. Entropy, that's law. So, you start patching things, it gets sloppy. But if you don't do this design right from the beginning, then every piece of the code you write is a patch, and you start out from day one with a mess. I love this quote of Eisenhower, who said, no battle was ever won according to plan, but no battle was ever won without one. I like that. Okay, that's compelling.
Starting point is 00:23:17 You start off with a mess. Every line of code you write is a patch. I think there's wisdom in there. You might want to go back to what you said about the small versus the large. Every line of code you write is a patch. I think that that's, I think there's wisdom in there. You might want to go back to what you said about, you know, in the small versus in the large. Yeah, let's do that. The only difference in the small versus the large is that in some sense, bigger something is, the more formal you have to be because it's going to be more complicated. So a lot of the users today are people in Azure, basically the web services of Microsoft and Amazon Web Services that's also a big user. They understand how hard it is to get something right because it's big and also
Starting point is 00:24:01 because it involves concurrency. Once you're dealing with something big, it involves concurrency. And that, it's a whole enormous complexity to things. But it also works at the small level. But the people who write the small pieces, they don't understand that yet. The people who write the web services understand it. My colleague, Marcus Cooper, who's the one who is currently basically maintaining the TLA plus tools, he has this wonderful talk, which is titled, How to Save Two Hours of TLA with Two Weeks of Debugging. People will just say, oh, using this formal stuff is just so hard. It's just so much work. You know, I don't need to do that. And it's not a good attitude, especially if you're dealing with concurrency. And if there's concurrency involved, you just can't fly by the seat of your pants. You know, you're not going to get it right. I guess one of the perhaps tragic, sad but true things is that oftentimes, based on this description, we don't know if we need these tools. We don't know if we need this upfront design, the formalized spec, the math stuff, until we're already feeling the pain down the road. Maybe it's two weeks.
Starting point is 00:25:20 Maybe it took us half my career to realize where are the source of all these bugs are right we're because i think as uh industry you know as uh pragmatic workers like bugs are just part of the game i mean we're just like we've kind of resigned ourselves to the fact that our programs are going to be wrong aren't haven't we but with the your tools sounds like you can actually prove that they're going to be right before you start to write. Well, there are two kinds of specifications you write. One is what the program is supposed to do. That's not telling you anything about coding it.
Starting point is 00:25:56 Then the other spec is how the program does it, which is a high-level view of how the code works. Now, the way I like to think about it is a high-level specification. Think of it as an algorithm and think of the code as an implementation of that algorithm. Then the only difference is that you think things as algorithms because you read them in algorithms books because they're algorithms that are useful for lots of things. But you're arriving at an algorithm that's just good for that particular problem you're solving. But getting the algorithm right doesn't imply that you're not going to make coding bugs. But if the algorithm is wrong, doesn't matter how good your code is, the algorithm is wrong. So it's eliminating one class of bugs. But it's a class of bugs, and especially with concurrency, that testing is very
Starting point is 00:26:55 poor at finding because the bug can occur just in terms of how relative speeds of which two things are happening. And so you could have some system that you've tested the hell out of and it's been running great for five years. And then you do something that changes some piece of hardware that makes it run at a different speed than some other piece of hardware and the bug will appear. But also there are a lot of debugging tools for finding bugs in code. They're not good at finding bugs in algorithms. So, it's not that I'm saying, oh, coding is no problem. They're not going to be bugs. I'm saying, get the algorithm right before you code it. Yeah.
Starting point is 00:27:40 And for that, there are tools that are just as formal as code and ways of testing it that are just as good as those tools that work at fighting bugs in code. of the plan. And so your plan is your algorithm, right? There's your thing that you're going to make sure you have right in the first place. And just because you got the plan right, doesn't mean you're going to execute it right. You can still make mistakes later, but if the plan is wrong from the get-go, well, the entire endeavor is a waste. Yeah. And when you do find, I mean, if you do, you know, in testing find that it's wrong, it's a hell of a lot more work to fix it that it would have been had you found it when you were writing the algorithm. Yeah, the sooner, the faster you can find those problems, the cheaper they are
Starting point is 00:28:33 to route around. So this whole thinking in algorithms, this is something you say is very few programmers think in terms of algorithms. And I think you're right about that. But my question to you, for me, is how would one begin to change this predilection? Like, how would you, you know, take me and help me start thinking in algorithms before I'm thinking in programs? Well, what I do to, you know, for engineers is I force them to think mathematically. I'm not the only person who talks about things about the need for specifications, but I think just about everybody else wants to make writing the specifications easy for the programmers. So he writes their specifications in what looks like a programming language. And I'm very hard-nosed in this.
Starting point is 00:29:27 You don't want to do that. I want to get people to think outside the box. And people say that all the time, but they never say what the box is and everything. What is the box? But here I can tell you exactly what the box is. The box is thinking programmatically as expressed in what I'm now calling coding languages, instead of thinking outside of that box. And the way to think outside of that box is to think mathematically. So a TLA plus spec is a formula, a real mathematical formula. It may be hundreds or even thousands of lines long,
Starting point is 00:30:05 but it's a formula. And you say something to people, oh, a thousand line formula, how could I possibly understand that? You're giving me a thousand line program. Oh, that's trivial. Well, the math that's being used is simpler than programming languages. If people give a semantics to a programming language, they translated it to math. And that math is pretty complicated. But the language TLA plus is really simple in the sense that it is math and math is really simple. But math is incredibly powerful.
Starting point is 00:30:40 And you have to think outside the box. You have to really learn to use that power. here in these United States, as I was, and go into the industry and begin our work. Math is part of it, but it's not like a main part of it. It's like it's there, you know, you take calculus, et cetera. But, you know, we don't have a lot of math. I don't have a lot of math. And I've been a working developer for 20 plus years. And I don't have that much math.
Starting point is 00:31:22 And also, the math that you learned hasn't been taught very well. Well, a lot of the stuff I learned wasn't taught very well, but that depends on your school, I guess. No, simply because you've learned about sets and functions, right? Everything I write in a spec is sets and functions plus arithmetic. I presume you do pretty well with arithmetic. I got that part down, yeah. You've learned what sets and functions were, but you haven't learned how you would use them to describe what your program is supposed to do. And that's the part that you have to learn. That's the part that you're forced to learn. You start writing in TLA, TLA plus. Now I do what I can to educate, but as I've said, I'm not that good as a teacher. So, you know, and I have sort of a video course on TLA Plus,
Starting point is 00:32:08 and some people like it. I don't know. I probably only hear from the people who like it. But, you know, one way you learn is by just reading other people's specs. Is there open source TLA Plus specs out there that people can read? Oh, yeah. Go to the TLA Plus website. There's a GitHub repository with a few dozen specs there. One of the things I hope to do before I retire is, after my current project, which is writing a book, but after I finish that, I hope to start organizing a set of examples that will help people learn. Whether I
Starting point is 00:32:57 will do that again, whether I'll finish the book, those are unknown questions. And as I like to say, I'm no better at predicting the future than anyone else, even if it's my future. Well, if I might ask you to predict the future a little bit, I just noticed, I mean, you've been in industry, you've been at Microsoft Research, of course. So in the academic circles, in the industry circles, for many, many years, we've had recent shows.
Starting point is 00:33:23 We interviewed, we spoke with Kelsey Hightower. He just retired from Google at 42. Doesn't mean he's retired like from anything, but he's just retired from Google at 42. Steve Yegge just unretired at 54. So he retired at 52, unretired, went back into the workforce. And you're at 82. You've had a long, illustrious career. I mean, lots of productive work. Do you have plans on hanging it up? Are you going to work until your time on Earth is ended? What are your thoughts on that?
Starting point is 00:33:54 Because you have a book in the works and you want to do something after that. No, it's not going to be too long. Not many years. I'm not ready to announce a retirement date, but it's not going to be too far off. I'm here with Lazar Nikolov, developer advocate at Sentry. Okay, let's talk about your live streams.
Starting point is 00:34:27 You're constantly building something and live streaming it. Give us a peek behind the curtain. What do you do in your live streams? What can we expect if you watch? Yeah, so at Sentry, that's even how I started. I started to build a mobile application or tracking expenses in four different frameworks because I wanted to you know explore basically the DX of the four most popular frameworks SwiftUI, Jetpack Compose, React
Starting point is 00:34:52 Native and Flutter. Then I moved on during October of course we did the Oktoberfest where we tried to fix a bug in the React Native SDK and that was really cool. And what else for example right now I'm streaming on and I'm usually streaming on YouTube. I started building a really cool project that I want to call the Errorpedia. So it's like a Wikipedia of errors. So my idea was to just build a website that enumerates famous frameworks, like used frameworks, and what errors can be encountered within those frameworks with a little bit of explanation why they happen and also how to fix them. I had an interesting choice of technology so like Astra for the whole website because of its ability to embed react components or view components or solid svelte components and these are frameworks that I want
Starting point is 00:35:42 to cover the errors from so like the whole website the whole doc site would be just astro and markdown but when the interactive example needs to happen I'm just going to export that from a package in the monorepo so that was interesting and I started building that and it put me into this mode of thinking about errors and I was like okay we can do these errors and we can do these errors I started to compile a list of errors that we can document and I started thinking about you know what about errors that don't necessarily break the page yeah I mean they break the page but they don't produce errors in the console right there could be responsiveness errors right uh like mobile or
Starting point is 00:36:20 tablets something like that something gets pushed off the screen there's like an overflow hidden on the body you can't really access that you know gets pushed off the screen. There's like an overflow hidden on the body. You can't really access that, you know? So it breaks the flow, the operation flow for the user, but it doesn't produce anything in the logs, right? Maybe there's, maybe we're talking about, I don't know, Firefox or Safari CSS issues. Cause we know, especially Safari,
Starting point is 00:36:39 when you're building something for Safari, those who do front end, things usually break, but they don't produce an error. So I was thinking about that and I was like, OK, we have all the tools in Sentry. So, so, yeah, that's what I'm doing right now. I'm streaming, building that widget that lets you, you know, start the recording and send it to Sentry. OK, if you want to follow along with Lazar, you can follow him at YouTube dotcom slash Nikolov Lazar. We'll put the link in the show notes, but it is youtube.com slash N-I-K-O-L-O-V-L-A-Z-A-R. Lots of cool live streams, lots of cool videos.
Starting point is 00:37:15 Check them out. Again, the link is in the show notes. Also, check out our friends at Sentry, sentry.io slash changelog. Use our coupon code to get six months of the team plan for free. That's so awesome. Six months free. Use the code changelogmedia again. Sentry.io.
Starting point is 00:37:35 That's S-E-N-T-R-Y.io. What's changed? When you first got into the game, what's changed and what stayed the same? You've been in the industry a long time. Certain things change. As the song goes, the more they change, the more they stay the same. What's different today, 2023 versus when you were first in computer science and then what's still the same or at least echoes? Well, the big change since I started and I started programming in 1957 is that now nobody can understand programs. Okay.
Starting point is 00:38:27 I mean, you just have to build things on top of stuff that you don't understand. Things have just gotten so complicated because, well, I like to say, things have advanced by evolution and not by intelligent design. Ah. And that works great if you've got a million years to do your designing. But, you know, when you're doing it over a matter of decades, it doesn't work too well. It's certainly been compressed. I agree with that. I'm sort of amazed that there hasn't been a... Extinction level event? down. They're a really serious one. I suppose they're, well, I guess you sort of see some of
Starting point is 00:39:06 them. You read about the state of California. I forget what it was, whether it was for their motor vehicles, armored motor vehicles or not, something, $100 million project that just failed and it was just abandoned after a few years. So, you know, it happens on that scale. On the other side of the coin, you know, we have young people today who can pick up a computer and a programming language like JavaScript or like Swift, if you're talking on iOS. And in a matter of days or weeks, you know, they can go from never having written any software to deploying a game or an idea of theirs into the world and that's kind of cool i mean i agree with like every layer of the stack has made it more complicated and more fraught and we've understand less and less
Starting point is 00:39:58 but you can do a lot with a little, even though you're actually standing on a lot, and that is shaky ground. But you can get a lot finished. Yeah, well, I mean, coding languages have improved a lot since I started. Not difficult since I started in machine language. But yeah, there's a lot of good things in programming languages. But TLA+, that's the main thing that you're here to talk about, the main thing that you want to pass on to our listeners. Well, not just, I would say not TLA+,
Starting point is 00:40:33 I would say, you know, thinking outside the box of code. TLA+, is one way of doing it. You know, it has its domain of applicability. The place where it shines is when you're faced with problems with concurrency. But that's not the only way of thinking above the level of code. Well, at the top, I mentioned this Turing Award, which you won in 2013 for your work with distributed systems. I want to ask you about the bakery algorithm. That seems to be both your favorite and probably the one that you can explain to our listeners without, you know, multiple hours. I'm not going to ask you to go through the Paxos paper,
Starting point is 00:41:13 for instance. But when it comes to distributed systems, you know, I've been thinking about this a little bit. And I'm wondering what your take is on the thought of when a system becomes distributed. Like if you were to draw a line in a system going from like a very simple processor to all the way up to like the World Wide Web, somewhere in there it got distributed. And I'm curious where you think that threshold is of when a system becomes a distributed system. I think I would probably say that distribution is in the mind of the beholder. Okay. I mean, this laptop that I have in front of me, I view it as just a single thing, system. Yeah, it's all right there. But to the hardware designer, this is the distributed system. You had all these chips talking to one another,
Starting point is 00:42:02 and even inside the processor chip, I mean, that processor chip might have been designed with somebody who was using TLA+. People at Intel used to use TLA+, and they're designing their chips, and I don't know if they still do. But in particular, the memory of the chip is a distributed system. I mean, it really is messages going back from one part of the chip to another part of the chip. There's two levels of caches and the main memory, and that's a distributed system. The distinction I make is between traditional programs and concurrent ones. Traditional programs, you have one process specifically that's just doing something, computing of an answer and stops. And in a concurrent system, you've got multiple independent processors acting and they're not supposed to stop. They're just supposed to keep doing
Starting point is 00:42:59 something. And so how you describe what they're supposed to do is very different from how you can describe what a traditional program that just takes some input and computes an answer does. That's very simple. You can just describe that as a function. But I spent a lot of time thinking about how you describe what a concurrent system is supposed to do. And that's built into TLA Plus. I mean, not built into it, but in the sense that there's no language for distribution. There's nothing distributed about it. You look at the, it's just math, but the math was designed in such a way to make it possible and make it easy, as easy as possible to say what a concurrent system is supposed to do and at a high level what it does.
Starting point is 00:43:52 Yeah. An apt metaphor might be for concurrency might be the idea of movies. So you have a movie that has a series of events and then it starts, things happen, right? 24 frames per second, however many seconds, and then it ends. And that's usually pretty normal, unless, of course, you're Quentin Tarantino, now we're going to go out of order and all that kind of stuff. But that's pretty simple to think about. But then you have like...
Starting point is 00:44:16 That's actually, that's beautiful. That's an example I love, because that's the way described systems. Basically, think of them as a movie. And what distinguishes a digital system from an automobile is that you can't think of an automobile as operating in distinct steps. It's going continuously from one place to another. But you can think of a program or a distributed system or the internet as proceeding in discrete steps. And the way physics, science describes cars or the universe is in terms of its state. And that's the same way I describe a distributed system in terms of its state. You can think of the state of a program as the value of its variables, but there's also a lot else in its state that you don't usually think of as part of the state, but is really
Starting point is 00:45:15 there. And so you just think of that as a movie. But the other important thing and the thing that a lot of people didn't used to do, I don't know what they're doing now. I've stopped paying attention to what theoreticians or people who are designing specification language are doing, but they would think of the movie as a movie of this particular device. And then they have to worry about, well, how do you put these movies together? But what I say is, no, this is a movie of the entire universe. And you describe the system behaving correctly is a property of just one particular part of that universe. And there's one thing about, and this is a technical point, but it's a way that I think
Starting point is 00:46:07 differently from most people, is that if you take a movie and you add, you repeat frames, since the other thing about these, to move the frames, is they don't represent any particular time scale because you might be able to describe a clock, an hour minute clock by a frame, which shows one frame per minute. But then if you also want to describe a clock with a second hand, if you've set yourself to say, oh, this is one frame per minute, then you can't describe it. You can't combine it. But what I say is that the frames don't met any particular time scale so that a frame, if a system, if you can describe this frame, a system with a certain movie, then if you add frames that simply repeat what the previous frames have done, you haven't changed the description. Your system should still satisfy
Starting point is 00:47:11 that. And that basically solves the whole problem of how you've written a specification. Well, like most people, they write a specification of an hour and clock with an hour and minute hand. And if they add a second hand to it, it no longer satisfies that specification. But when I write the specification, it satisfies it because my specification of the hour and minute clock allows frames where the same minute keeps being repeated. So if you add a second hand, it still satisfies that specification. I realized I had that insight in 1983. And to my knowledge, TLA plus is still, well, TLA that TLA plus is based on is still the only formalism that I know of that people use that has that property, that if you write a specification of an hour and a minute hand, an hour and a minute
Starting point is 00:48:11 second clock satisfies it. And what's the name of that property? What would you call that property? Well, I call it stuttering insensitivity. I think of these as stuttering steps. That's what I call them, stuttering insensitivity. It's insensitive to the stuttering steps. That's what I call them. Stuttering insensitivity. It's insensitive to the stuttering steps. And the reason is because, I'm asking you, I'm not telling you, the reason why it's insensitive to the stuttering is because the way that you think about it is about the order. It's about the order of events, not as much as the actual distinct time stamps. Is that right?
Starting point is 00:48:43 You're thinking about order. You're not thinking about the absolute time distinct timestamps. Is that right? Like you're thinking about order. You're not thinking about the absolute time or the absolute. Right. And if you want to talk about absolute time, you don't build it into your movie camera. You simply have a clock in your movie. Right. And it doesn't make any difference to that clock,
Starting point is 00:49:02 whether there is one step between one second and the next or a thousand steps between one second or the next. And what that does is it makes the math so much simpler. And simplicity is the key to being able to deal with complexity. You have to have a really simple way of, if you're dealing with complex things and you're using a complex language to describe it, boy, you're in trouble. And if you're building a language where you have to do something special when you've written a specification of an hour minute clock, simplicity means that should also work be a specification of an hour minute second clock if it's not you've got this extra complexity to deal with was this idea in the bakery algorithm it sounds like it was no no no it's no separate from that it's separate for that
Starting point is 00:49:58 that seems to if we real quick describe your bakery algorithm we'll i'll try to contextualize it but it seems like this at least time and sensitivity or i don't know explain it first and i'll explain why i'm trying to tie it together i'll say this problem is stuttering and sensitivity it doesn't you don't need it when you're just looking at a single system so i'm if i if i'm only going to be looking at an hour minute clock i don't have to worry to worry about adding steps in which the minute doesn't change. It's only when I hike this hour and minute clock and somehow I want to compare with something else that things become complicated if I don't use stuttering and sensitivity. So for the bakery algorithm, if we're just looking at the bakery algorithm by itself, stuttering insensitivity is irrelevant.
Starting point is 00:50:46 Because of the algorithm or because of the circumstances it's trying to solve? Because of the circumstances, just looking at this. If I wanted to combine the bakery algorithm with something else, then stuttering insensitivity, then I want to make sure that I describe my algorithm the way that's stuttering and sensitive. Now, the thing about TLA is you don't have to think about it. That is, you don't have to write your... I'll write it this way. If I have a thousand line specification of a system, to make it stuttering and sensitive,
Starting point is 00:51:20 I add about seven characters to the specification. Right. Doesn't the specification. Right. It doesn't take much. Right. So when you're writing the specification, you're not thinking about stuttering and sensitivity. You're thinking about the system you're describing. If I'm thinking about an hour and minute clock,
Starting point is 00:51:36 all I'm thinking about is hour and minutes. I don't have to worry about anything else. But when you write it in TLA, TLA+, it's just going to automatically be satisfied by an hour, minute, second clock. So when I write, describe the... I'm happy to describe the bakery algorithm pseudocode, but I know that when I write, describe that pseudocode mathematically, it's going to be stuttering and sensitive, but that has nothing to do with how I write, even think about that algorithm.
Starting point is 00:52:07 Right. It's a property of the implementation of the system that TLA provides for you. And as long as you're thinking about a singular system, then I guess in that case, it's irrelevant. So let's look at this bakery algorithm. This is one of the six bullet points of which they say the reason for your Turing Award. This was, talk about thinking differently. This is the one, you talked about it briefly at the beginning. We didn't name it back then, where there's a paper by two men, Paul, is it Paul Johnson? Help wasn't the bakery. They didn't influence the bakery. That was the Time Clocks paper. My bad. No, the bakery algorithm happened when... Actually, that was the first concurrent algorithm that I wrote. There was a paper... I became a member of the ACM and started getting the ACM communications of the ACM. It's the first computer science journal that I ever looked at. Actually, I wasn't even aware that there was such a thing as computer science journal that I ever looked at. Actually, I wasn't even aware that there was such a thing as computer science because that was in 1972 and there really wasn't computer science in
Starting point is 00:53:13 those days. News to you. Yeah. But at any rate, so they had this algorithm for solving mutual exclusion. Mutual exclusion is the problem that if you have a bunch of processes running and say some process wants to use the printer. Well, if you had two processes trying to print at the same time, and you get the output interleaved on the printer, it's not going to work. So you have to ensure that the processes, some way of ensuring that only one process at a time is using the printer. And that's called mutual exclusion.
Starting point is 00:53:49 And the problem was recognized as being important and introduced by Edsford Dykstra, I think it was in 1965. At any rate, it led to a whole bunch of algorithms, mutual exclusion algorithms, which had more and more different, better and better properties in some sense. And so I read this paper of this algorithm. It seemed awfully complicated. And I said, it shouldn't be this complicated. And I whipped up this little algorithm for two processes and send it to the editor. And a couple of weeks later, I get back a letter from the editor. Here is the bug in your algorithm. That had two effects. One is that it made me realize that concurrent algorithms are hard to get right and they really need a rigorous proof. And the second one is, I'm going to solve that damn problem.
Starting point is 00:54:46 A challenge. And I came up with the bakery algorithm. And I don't think I'll bother describing it here. But when I had written it, when I wrote the correctness proof, I realized that that algorithm had a remarkable property. All these algorithms work by processes of communicating by shared memory. One process would set a value of the variable, another process read the value of that variable. But how do you implement that reading and writing? Well, you need mutual exclusion because if somebody tries to read and write while somebody else is trying to read, who knows what's going to happen? In fact, some computer scientists said that that's a necessity.
Starting point is 00:55:33 You have to have mutual exclusion somewhere to solve the mutual exclusion problem. But I realized that the bakery algorithm works. It doesn't require mutual exclusion on the reading and writing of variables. Each variable is written by a single process and it can be read by multiple processes. So you don't have the problem of writers writing at the same time to worry about, just readers writing at the same time as the write is being done. And the beautiful thing about that algorithm is that if a reader reads while the value is being written, it doesn't matter what value that read gets. So if I'm changing the value of the variable from 99 to 100 and you come and read it, you might get 4,933 or zero. And the algorithm still works.
Starting point is 00:56:28 Did you know that at the time or you discovered this later? No, I discovered it when I wrote the proof. You did. Okay. That's one example of being a mathematician. That's why proofs are good, right? Well, and actually I went on to formalize the way I was writing those proofs in a way that I think is really nice and gives you a nice way of thinking about concurrency, but it's not a practical way of writing the kinds of specifications that I was talking about in the real world. Bakery algorithm is a nice, simple algorithm, and you can reason about it that way. But in fact, if I had been reading about it,
Starting point is 00:57:13 if I had been writing it in TLA, I probably wouldn't have discovered that it had this property. But because there was no way developed for reasoning about program back in current algorithms. I guess I had to be really careful to make it rigorous. I had to be really careful about the assumptions that are being made. And so I didn't make any assumptions, think about making assumptions about atomic actions that happen atomically. So two actions can't happen at the same time. I had to figure out, well, actions can happen. Reads can happen at the same time as writing. So my reasoning method
Starting point is 00:57:53 allowed that possibility. And that's why I was able to discover that it had that property. Nowadays, we write specifications, I think of it as pseudocode in a way where we assume that reads and writes are separate the actions they can't happen at the same time as one another and so we don't worry about it and we leave it to the implementation figure out how to do that right so i heard you describe that particular algorithm as A, your favorite, but I also heard you describe it as one that you said you discovered versus not the inventor, but the discoverer, which I thought was kind of interesting and somewhat poetic. And I'm just curious, not necessarily to distinguish the two, but what did it feel like? What were you thinking? How did you go about the process? If you can go back to that day, that day of discovery of that algorithm, or did it take longer than a day? Describe to me what you were doing, how it came to you. Were you in the shower and it popped into your head?
Starting point is 00:58:55 Were you scribbling furiously on a notebook? What was it like coming up with something like that? No, I cannot tell you what happens with the microsecond where something happens. I want to know about the moment of inspiration. Well, it wasn't a moment in a sense. The reason it's called the bakery algorithm is when I was a kid, it was a bakery. And these days it would be called the deli algorithm, where a bunch of people want to get served.
Starting point is 00:59:23 And you have this machine that gives out tickets, it gives out numbers. And so you just take the number and people get served in order. And so I thought about that. And of course, the problem with that as a solution to the problem is that you have no ticket distributor. And so the idea struck me as, well, what happens if a people chose their own numbers and they did it by looking at everybody else's number and picking a bigger number? And that was the basic idea. And from that idea, I would say everything was fairly straightforward. You first get the obvious problem, well, what happens if two people chose the same number? And then in that case, you serve people in the order of the alphabetic order of their names. So that's a problem that's easy to solve. And so I wrote this algorithm and the obvious way of writing it gave a bug, and I saw that bug pretty quickly. And then I figured, well, here's a simple way of fixing that bug. And this very rare event, usually you see a bug in a concurrent algorithm, and you can fix that bug, and that's going to introduce another bug, and you fix that bug, and that's going to introduce two more bugs.
Starting point is 01:00:41 Right. But fixing that bug, and it worked. So that's how it got discovered. There was no moment other than perhaps that moment that brought back the memory of that bakery. That was the only store in the neighborhood that used this way of giving out tickets. Yeah, we have one of those still here in our local town where you get a ticket and they, but they have, I mean, it's parallel. They got some currency going on
Starting point is 01:01:10 because they have multiple strings of numbers going on. And because we're always sitting there trying to, and it's a very popular place. So like I'm 275 because you're wondering like, when's my order coming up, you know? And most places you just wait for 274
Starting point is 01:01:22 and then you're like, hey kids, we're next. But this place has multiple strands of numbers. And so like they wait for 274, and then you're like, hey, kids, we're next. But this place has multiple strands of numbers, and so they'll go 274, 163, 375, and we never know when we're going to get our meal. But they still get it to us, and seemingly in the correct order and mutually exclusive. They don't deliver ours to everybody who's sitting there waiting, so they've got it figured out as well. But that's what I was thinking of when I read your bakery algorithm. I thought, okay, I can see where maybe he hearkened to his youth
Starting point is 01:01:53 of sitting in line waiting for that deli sandwich or something. So when you got the solution all figured out, did you then take it back to that editor, the one who had sent you the bugs? Well, I resubmitted it to ACM. That had to feel good. What happened next? Was it like immediately? Went through the reviewing process and got published.
Starting point is 01:02:12 Simple as that. But the thing is there are various, you know, studying that algorithm, but also, well, both the other method of reasoning that I developed based on it. And I'd say the next three years or so, three or four years of what I did was all based on that algorithm. I'm studying that algorithm. For example, if you read my number while I'm writing it and I'm writing it from nine to 10 and you get a million, which is possible, and then you write that. Well, numbers might start increasing very fast, so you don't want that. So, what I did is I devised an algorithm that if I'm changing it from nine to 10, it will guarantee that the number you read will not be any bigger than 10. That was a non-trivial algorithm.
Starting point is 01:03:12 And so that's an example of the things that- Refinements, that's for performance reasons. Yeah, but in developing those other algorithms, I developed my skill as thinking about concurrent algorithms and thinking about how to reason about them. Well, that certainly wasn't your end game. That was very much, you know, getting you into the game. You went on from there to publish many things. We talked about that Paxos paper. A lot of your work is out there driving Azure, driving, you know, way down in there in the algorithms of AWS, of all these distributed systems. So lots of cool stuff came out of those efforts. I'm curious, as we close up here, Leslie, appreciate all your time today. Don't want to take too much of it. I'm curious what
Starting point is 01:03:56 you're up to these days. You got this book in the works. And what does your work look like when you're not? I mean, obviously, if you're authoring a book, I know what that looks like. But like, do you have daily work that you're doing inside Microsoft, pushing research forward or working on algorithms for them? Obviously, you don't have to divulge any trade secrets, but like, what does a work day look like for you at this stage in your career? Well, my work day seems to involve answering a lot of email and a lot of things that take up time. But other than writing the book, I'm doing various things that involve, what I would say is that sort of management type things involving TLA Plus. I told you this colleague, Marcus, who is doing a lot of heavy lifting these days, I quote, managing him. I'm not his official manager, but we work together, keep tabs on what he's doing and learn from him what's going on in the world of TLA users.
Starting point is 01:05:04 He's the one who these days interacts with users. I very rarely do that. And basically Microsoft gives me the freedom to be able to devote my time primarily to the book. Awesome. Definitely looking forward to that book. Well, Leslie, I will let you go, like I said before, honored to have you on the show. Appreciate you spending some time with me today. To our listener, we do have the links to all of the things mentioned here today in your show notes. Of course, check out TLA Plus. Check out the resources. There's some tooling there and some videos as well. If you want to learn from Leslie directly a little bit on how to think algorithmically with TLA+. Any final words from you, Leslie, before I let you go?
Starting point is 01:05:55 Anything you want to say to our audience? Well, I guess the advice I give is think. And really thinking means writing, so write. We have some awesome episodes coming down the pipeline. Kaizen 11 with Gerhard Lazu drops on Friday, followed by 30 Years of Debian with the project lead, Jonathan Carter, next Wednesday. We've got Justin Searles and Landon Gray coming on to help us see through the lens of intergenerational conflict among software devs.
Starting point is 01:06:39 We've got Andreas Kling from SerenityOS and the Ladybird Browser Project. We've got Stephen O'Grady joining us to discuss the importance of open source licensing in the wake of Meta's Llama 2 announcement, and it just keeps rolling from there. I guess what I'm trying to say is, subscribe to the pod, why don't you? And if you're a longtime listener, do us a solid by telling your friends about the changelog. Word of mouth is still the number one way people find us. Thanks once again to our partners, Fastly.com, Fly is still the number one way people find us. Thanks once again
Starting point is 01:07:05 to our partners, Fastly.com, Fly.io, and Typesense.org. And to our beatmaster in residence, the mysterious Breakmaster Cylinder.
Starting point is 01:07:15 That's it. This one's done. But we'll talk to you on Changelog and Friends on Friday. 🎵
