Future of Coding - The Case for Formal Methods: Hillel Wayne
Episode Date: April 11, 2019. Hillel Wayne is a technical writer and consultant on a variety of formal methods, including TLA+ and Alloy. In this episode, Hillel gives a whirlwind tour of the 4 main flavors of formal methods, and explains which are practical today and which we may have to wait patiently for. The episode begins with a very silly joke from Steve (about a radioactive Leslie Lamport) and if you make it to the end you're in store for a few fun tales from Twitter. https://futureofcoding.org/episodes/038 Support us on Patreon: https://www.patreon.com/futureofcoding See omnystudio.com/listener for privacy information.
Transcript
Hello, and welcome to the Future of Coding. This is Steve Krouse.
So today we have a guest on the podcast that, if you've been listening carefully to the other interviews,
has actually been mentioned at least two, maybe more times by other guests.
Hillel Wayne is best known for his work trying to explain and promote TLA+ to a broader audience of more practical engineers,
people who might not think that what we call, quote, formal methods would apply to, you know,
building products, building web applications, building technologies for the startup they work for.
If you aren't familiar with the term formal methods, this is, I think, a really great podcast to get your foot in the door.
We start by contextualizing what formal methods are.
We break up the field into four quadrants,
and we go kind of quadrant by quadrant,
and think about what each of the different techniques is used for
and the practicality of it.
Well, I think I might be overselling how ordered this conversation is.
We kind of... Hillel will explain something, and I'll think I understand it, and then maybe 10 minutes later I'll be like, wait a second, how is that different from that other thing we were talking about? And he kind of has to backtrack and clarify for me. But I think you'll be able to follow, and I think you'll get a greater sense for this small but active community of research that has a lot to offer to the future of what software engineering could look like.
So before I bring you Hillel, a quick message from our sponsor Replit.
Replit is an online REPL for over 30 languages.
It started out as a code playground, but now it scales up to a full development environment
where you can do everything from deploying web servers to training ML models, all driven by the REPL. They're a small startup in San Francisco, but they reach millions of
programmers, students, and teachers. They're looking for hackers interested in the future
of coding and making software tools more accessible and enjoyable. So email jobs at Replit if you're interested in learning more. And without any further ado, I bring you Hillel Wayne.
Welcome, Hillel. Thanks for having me on. Yeah, it's really
great to have you. I'm excited for this conversation. So I think I originally heard of
you as the TLA+ guy. Potentially, I think I may have heard of you for the first time via another interview or two on this podcast. I think we have a few mutual friends in common.
Yeah, I think it was a couple of people. I think it was Kevin and James Koppel. I think they were both interviewed by you, and we're sort of in the same circle, so I imagine it's one of those two.
Yeah, and I think it may have actually been both of them.
Okay, so I really am popular. Yeah, I think you are.
At least with the people I talk to.
You have the illusion of popularity, given who I talk to.
So given that I know you as the TLA+ guy, I'd be curious to hear about your origin story as this TLA+ superhero. Were you originally bitten by a radioactive Leslie Lamport, or did it happen some other way?
God, you sort of put me on the spot there, because now I've got to think
of like a really clever comeback to that, but I can't. So actually it wasn't really anything that
interesting. I was doing some work at a web development company and ran into like a really
complicated distributed systems problem with their product. And what happened is I was looking for ways to make it a little bit more manageable.
I stumbled on TLA Plus and it worked out really well in my favor. And I'm like, hey, this is
really great. Like this is incredibly useful for my problem. Nobody would expect it to be that way.
Why is there so little documentation? So I figured I'd write some documentation for it. And then I wrote documentation.
Then I figured I'd give a talk on it.
And then I figured I'd write a book on it.
And then it just kept going from there.
Yeah.
So, okay.
So you have, I think the first thing I saw was learntla.com.
Was that the documentation you're talking about?
Yeah.
It was supposed to be a tutorial because there weren't any like easy tutorials online that I could find.
And then it just kept going from there.
But that was the first thing.
That was like early 2017.
Yeah. And then I saw some talks, and then you also have a workshop too.
Yeah. I also published a book on it, actually, just a few months back: Practical TLA+, with Apress.
Cool. And then I saw the consulting. That's, like, TLA+-specific? You do trainings?
Well, it's a lot of TLA+, but I also do a few other things. I do Alloy; I've done some consulting in MiniZinc, constraint optimization. Essentially, whatever I feel comfortable with in the formal methods space, whatever I feel like I can really teach well. It's a lot of that stuff, too.
too yeah I just find it fascinating um that find it fascinating that you came across formal methods in your startup work.
And it seems like you just like, well, now I just want to do this and only this for my life or my career.
Is that kind of how it happened?
You fell in love with this topic?
Well, sort of.
Because, I mean, it's definitely a really interesting topic, and obviously I'm really into it, but I think what I really enjoy is technical writing and technical communication. If you see a lot of the writing I do, a lot of it is on formal methods, because that's, I think, what I'm best at, but some of it's on accident analysis, and lightweight specification, and the history of programming. I just really like communicating and teaching ideas. And I think formal methods at this stage has the highest strength-to-obscurity ratio: it's way more useful than the number of people who know about it would suggest. And that's what I focus on teaching, for that reason.
Got it, got it.
That makes a lot of sense.
Yeah, I can tell from the way you write
that you enjoy it,
or at least it feels that way that you enjoy writing.
I enjoy your writing and I think it comes across.
Nobody actually enjoys writing.
It's more of a compulsive thing for most people.
For most writers.
Well, I liked one of your essays.
You talked about how you asked your editor
to be needlessly cruel
and that he gets extra points if he makes you cry.
Did he make you cry?
No, he didn't. I won.
You won.
Yeah, I feel empathy for you because a few days ago,
maybe we'll talk about it later,
but I think you'd have interesting things to say about this.
But I wrote... I spent a lot of time writing an essay this past week, and I got some terrible reviews that almost, but didn't quite, bring me to tears. So it is hard.
Yeah. Getting cruel feedback. Or not cruel, but just harsh feedback. Yeah.
Okay. So let's get into the formal methods stuff. I thought it would be useful to start by situating ourselves
and defining terms.
I think your recent essay,
"Why Don't People Use Formal Methods?",
that was on the front page of Hacker News,
I think you did a really wonderful job.
Thank you.
Because there are a lot of separate topics,
and I think the words that correspond to those topics are pretty good.
And I'm glad that they're like separate words for all these things.
So can you roll off the top of your head?
Or I wrote some of them down if you want me to kind of tee them up.
So I can talk a bit more about this.
So one thing to keep in mind is that like any sort of field in like programming, formal methods is a big field. So saying I do formal
methods is kind of like saying I do web. It kind of gives somebody an impression, but there's a lot
of nuance there. But also given that is that a lot of fields of programming are sort of very big.
So for example, with web, I mean, there's probably more people who do web development than, I don't know, live in New York State. But formal methods is extremely small. And it's also very fractured, because everybody who's in it often knows one or two things, but doesn't really know the whole space of it.
And the consequence of that is that there's a lot of ideas in there and some of the ideas overlap, but the people who overlap with their ideas don't necessarily share the same terms. So I ended up inventing a lot of new
terms for that essay, not necessarily because I think these are better terms, but just because
since, again, not everybody sort of shares the same terminology, it was easier for me to talk to
a public audience about it by just inventing terms of being clear that
they were terms that I just invented on the spot to talk about differences.
Oh, okay. Yeah, that's a good clarification.
Yeah. So yeah, I divided it into two categories. Again, these are very, very fuzzy categories.
There's a lot of overlap and there's like things that don't belong to either category,
things that belong to both categories of thinking about code and thinking about designs.
And then we divide each of those into two separate categories of how do we specify and how do we verify.
Specifying being how do we describe what we want to be the case,
and verification being how we show that what we want to be the case is the case.
And that's pretty much all formal methods is specification and verification to one or two degrees.
Got it.
Okay, so just to recap.
So we have, is it like four? Should we imagine it as four quadrants? Specifying and verifying on one axis, and code and design on the other?
Yeah.
So there are like four, yeah?
Again, yeah.
Very, very, very, very like broad, probably wrong, but wrong in a very useful way.
Okay. So yeah, do you want to talk about each of the four quadrants a bit? Or what are the next important distinctions to make?
Yeah, so actually, I should probably just mention this right now, because I realized I didn't actually define the term. Formal methods is sort of the study of how we can show that things are correct in ways that are sort of irrefutable.
So, for example, you might be familiar with, say, testing, right?
So testing works, but it only shows very limited amounts of verification.
Like, if you prove your thing works for inputs 1 through 100, maybe it fails for input 101.
So formal verification is a way of sort of
saying, okay, we're going to test every possible thing and we're going to show that no matter what
you put in, it will always give what we expect.
Yeah, well, so I think you used a few terms there that are interesting. In particular, you used "irrefutable."
Yeah, I think it's an interesting term, which is also incredibly misleading.
Okay. Well, I'll let you give the high level, and then we'll drill down into the specifics.
Yeah, so should I drill down into "irrefutable," or give the high level of the rest of the space first? Which would you prefer?
Yeah, sorry, let's give the high level, and then we'll drill down into some of the specifics later.
Okay, so in those four quadrant-ish things, and again, this is sort of a formal methods thing of just like always qualifying all of my statements.
Qualifying again, like this is more of just like a very rough model than anything else.
So for like code specification, you have a few different things.
You have external theorems, which is essentially writing your code, then writing in a separate file, essentially, here are the properties of this code. That's very similar to what we call testing, but more rigorous. We have really strong type systems, like dependent types or refinement types. Again, similar to static types, but harder to check, but more comprehensive. And then we have this thing called logics and conditions, originally called Hoare logic, but now there's a bunch of different branches, where you essentially say, in a function, given these inputs, these properties should be true of the outputs. And this corresponds to something called contracts in programming, which is a very powerful verification technique, but of the three ways that we verify code informally, it's the most obscure by far.
Essentially the easiest way to describe it is assertions.
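The contracts idea Hillel describes can be sketched as plain assertions in Python. This is a toy, hypothetical function, not from the episode: the first assertion is a precondition on the inputs, the second a postcondition on the output.

```python
def divide(numerator: float, denominator: float) -> float:
    # Precondition: what must hold about the inputs before the call.
    assert denominator != 0, "precondition: denominator must be nonzero"
    result = numerator / denominator
    # Postcondition: what must hold about the output after the call.
    assert abs(result * denominator - numerator) < 1e-9, "postcondition violated"
    return result

print(divide(6.0, 3.0))  # prints 2.0
```

Dedicated contract systems (like Eiffel's, or Python's `icontract` library) automate this pattern, but the underlying idea is exactly these two assertions.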
Yep, yep, okay. And so, this question probably should have been asked earlier, but you used, just in the last description, informally versus formally. And so the way you're using those terms, informal means kind of like eyeballing it, and formal means a computer is checking something?
Here, what I'm using informal to mean is that it can be automated. In fact, it usually should be automated, but it's automated in a way that doesn't give you complete confidence. Essentially, informal verification is still automated verification. It's just done in a way that's, one, much easier, and two, not as comprehensive as formal verification.
Okay.
So I guess it's a spectrum-y thing?
Yes, it's a spectrum-y thing.
Okay.
So formal, is that like 100% or it's just like past 50%?
It's basically 100%. In fact, that's like,
I think one of the things that people miss
is that for the most part,
formal verification is,
while the most powerful sort of way
of sort of verifying that stuff is correct,
probably not the most productive in most cases.
Because to get to 100%,
you have to go much, much, much harder
than it takes to go to like 99%.
But 99% is informal.
Yes-ish.
I mean, it's really hard to sort of put these things on a spectrum because like what does
it even mean to be 99% correct versus 98% correct, right?
Yes, I see.
Well, I guess I'm just trying to figure out what the whole field is because, you know,
when you talk about formal methods versus informal, I don't know. But maybe this was a bad digression. I'll let you get back to the high level.
No, we should probably drill into that digression at some point. So in any case, the thing is that all these ways that we can specify, we can test both formally and informally. For example, if I write, say, this is the specification of my function, I could test it using formal verification, or I could write, like, a million manual tests, or put it through a really intense code review. But if I want to formally verify it, what I could do is write, for example, what's called a proof, which is essentially a mathematical statement showing, from our basic premises, how we can conclude that this is going to be correct, in a way that a machine can check.
Often these days, that's considered like really, really hard. So when we can, we usually shell out
to a solver, for example, a SAT solver or what's called an SMT solver to automate some steps for us.
That's pretty much the main way that we verify code is correct formally.
And it works, but it's also really intensive and labor intensive.
I think the fastest anybody's ever done it was four lines a day, using cutting-edge tools and all the resources of Microsoft combined.
And that's one of the reasons why for the foreseeable future, code verification is probably going to remain in the realm of experts.
Code specification is really powerful, and I think could be more widely used, but code verification, at least formally, isn't really on the horizon for mainstream use.
Got it. Okay.
So just to recap, because I feel like that got a little bit messy there.
Yeah, no, it's a mess. It's a messy topic.
Yeah, and I interrupted with all sorts of things. I thought maybe let's just start over, with, like, a picture in my head. Even just for me: the field is called formal methods.
Yes.
And then that field has four subcategories, code verification, design verification, code specification, design specification.
Is that correct, or am I off?
Four ways of thinking about it.
Usually people do either code or design.
So they do both code verification specification or design verification specification.
I mostly make that division to make it a bit clearer what we're talking about when we talk about, sort of, proving correct versus specifying correct versus whatnot.
Okay, so what you're saying is, usually any given person only deals with code or design?
Usually. And I'm waving my hands very dramatically here when I say usually.
Okay.
So, and does that hold for you too?
I mostly do design verification and specification. I mostly do design work. So when I say I do design... okay, this is weird. Usually, if a person says that they primarily work with specification languages, that they mostly do formal specification, they usually mean that they do design specification and design verification.
Yep.
Okay.
Which might surprise you, because if they say formal specification, why are they doing verification? But, you know, again: fuzzy terms, small field, fractured field, lots of different niches to it.
Okay, one
question that I have here in my notes I wrote down was what's the difference
between verification and validation? So verification is basically taking a description of how the code should be and proving the
code matches that.
So, for example, let's say I say that this code should always sort a list in ascending order. The specification would be that, for every two indices, if one index is greater than the other, then the element at it is greater than or equal to the element at the other. And then verification is basically showing how that's always going to be true, right?
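The sortedness specification described here can be written directly as a checkable predicate. A minimal Python sketch (illustrative, not the syntax of any particular formal-methods tool): for every pair of indices, the element at the larger index is at least as large.

```python
def is_sorted_ascending(xs):
    # Specification: for all i < j, xs[i] <= xs[j].
    return all(xs[i] <= xs[j]
               for i in range(len(xs))
               for j in range(i + 1, len(xs)))

assert is_sorted_ascending([1, 2, 2, 5])
assert not is_sorted_ascending([3, 1, 2])
```

A verifier's job is then to show this predicate holds for every possible output of the sorting function, not just for the examples checked above.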
Validation is when you say,
wait, did we actually need a sorting function?
Maybe we actually needed like a maximum function.
Maybe we're doing the wrong thing entirely.
It's sort of at the level of
what are the human requirements we need
and how do we show we match
the human requirements of the total system?
So yeah, and that's usually outside the scope of formal methods, because that deals a lot more with sort of social systems and, like, understanding customer requirements.
Oh, okay. So it sounds like, on one level, you have reality, and you're trying to match the specification to reality. And then below that, you have the code, and you're trying to match the code to the specification.
Yeah. That's, I think, a really good way of putting it. And then the former would be validation.
The second would be verification.
Okay. Got it. Yeah. So we're trying to validate our business specification with the market
and then we're trying to verify that our code meets the business specification.
Exactly.
Got it. And so that involves both code verification and design verification? Because in order to validate that the specification meets the business needs, we'll do the design verification stuff. And then once the design is verified, then we'll write the code and try to verify that the code meets the specification. So then you can do all of them.
In an ideal world with an infinite amount of resources, yes, that's how it would look.
Got it.
Okay.
And so today, what you were alluding to is that, you know, at the fastest, you could do four lines of code a day. Potentially in 100 years we'll have some improvements in our theoretical understanding of this stuff, or machines will just be a lot faster, and potentially we'll be able to do all this stuff for every line of code, and it'll be just as fast as writing code today. I don't know.
I'm hesitant to sort of make predictions about like 10 years from now, because again, remember 50 years ago,
we first started like doing formal verification. We're like, oh yeah, in 10 years, we're going to
have everything verified. And so we're like, no, that's crazy. In like 20 years, we'll have
everything verified. And now it's been 50 years and nothing's verified. So I mean, it's really
hard. I mean, look, proofs are hard. Like, validation is hard. We often don't really know how to represent specs.
These are all really difficult topics.
It's hard to sort of, like, make predictions of how things will go out.
I think that there's definitely going to be an expansion of design verification, just because right now we've seen it be really successful.
But code verification, I think, for the foreseeable future, will remain sort of a niche topic that's done in special cases by experts.
And I don't know if or when it will ever be a thing that everybody's doing.
Hmm. Okay. Interesting.
Yeah.
Well, I guess to drill down a bit. You talked about irrefutable proofs,
like things that are checked by the computer.
Yeah.
One of the things I found in your writing that relates to this
is that you were talking about how 20% of published mathematical proofs
aren't actually correct.
There's like an error that the person who wrote it missed
and the reviewers missed.
At least according to that one reference I found.
It could be that reference is wrong
or it could be that reference is understating things.
Yes. Well, yeah, I agree.
It's irrefutable assuming that the prover is correct, assuming that any auxiliary tools that you're using with the prover are correct, and, finally, assuming that you have all the requirements, at which point we show it's irrefutable for the specific context you're talking about. So a common thing that's said is that you can never actually prove a thing will always work, because for all you know, you're going to start the computation and somebody's going to hit the server with a baseball bat.
Yes, of course.
And so I guess that's I think the word irrefutable is interesting because you could say the same thing about mathematics.
Like a proof is irrefutable if the people who reviewed it for the journal didn't make any mistakes.
Yeah.
If you assume that to be true.
Because that's essentially what you're doing when you're assuming that the code verifier
has no bugs.
Yeah.
Yeah.
So I think, I believe that some of them, for example, I believe the core of the Coq prover, and don't quote me on this, has been sort of proven by hand to be correct. So we know that the core is essentially irrefutable. But all the auxiliary tooling is sort of hodgepodged together as academic projects.
So that's less trustworthy.
Well, so I'm struggling.
Why is Coq irrefutably correct?
Because it was done by hand?
So here's the thing. This is what I've heard, and this is what I've heard from people who've worked on it.
I cannot say exactly how that's the case, and if you basically put a gun to my head and said, is this true? I'll say maybe. I don't know enough about the topic. Again, I do have to clarify here that one of the things that has been affected by me talking a lot about formal methods and working with it is that I'm not really comfortable saying things I don't know are absolutely the case. So given that I've not worked directly with Coq, and I haven't looked at the papers, I don't know enough about how they verified it to tell you how it's been verified.
I see, I see. So I guess what I'm driving towards is that the main difference between formal methods, like, computer methods, and mathematical proofs, is whether a human or a computer is doing the checking.
Mathematical proofs tend to be less rigorous
than formal methods.
Yeah, why is that?
Because most mathematical proofs aren't automated.
So the thing is that if we have sort of a computer checking the things,
assuming that we built this all correctly, assuming, assuming, assuming,
we can essentially say whether every step is a correct or incorrect inference,
given that we have to break down the level of the computer.
But with mathematics, often the purpose of mathematical proof is to convince people
not to like 100% prove something's the case. So there'll be things like this one step, we can sort of show what this heuristic
argument, and everybody looks at the argument and goes like, yeah, that makes sense. They can sort
of skip that one step of it, of the mathematical proof. And that's often done because, you know,
you don't want to sort of sit down and sort of say, okay, in this context, we can prove
that addition is associative.
In this context, we're going to prove that we can use induction
and that induction is actually a reasonable
theorem to have in the first place. Whereas
in the computer, you have to do all that stuff. So
it reduces the chance that you will accidentally
assume something is possible or easy
when it turns out that in this
one very particular instance,
you can't do it.
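To make that concrete: in a proof assistant, even a step a mathematician would wave through, like reassociating an addition, has to be discharged by invoking an explicit lemma. A sketch in Lean 4 syntax, assuming the standard library's `Nat.add_assoc`:

```lean
-- Every inference, even associativity of addition, must be justified explicitly.
example (a b c : Nat) : a + (b + c) = (a + b) + c :=
  (Nat.add_assoc a b c).symm
```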
I see. That's quite a claim. I think I've heard it before, but I just want to repeat what you said that mathematics is about convincing and explaining to other humans. It's not about
making sure that you're not fooling yourself? Or is that kind of what you're getting at?
I mean, you're trying to convince other humans who are very, very invested in not fooling themselves.
I see.
Yeah, I think one good example of what I mean, and this was something I read, I think by Terence Tao, is that one difference between the recent proof of the abc conjecture and the proof of Fermat's Last Theorem is that, from the first five pages of the proof of Fermat's Last Theorem, people were getting interesting results from it. So even though it was a really, really, really huge proof, very early on in the proof, people were saying, oh, this is interesting.
This gives us some like really cool new machinery to work with.
This is already being useful to us.
And that convinces people that there's something empirically interesting here.
Whereas with the ABC proof that I think has recently been claimed to be like invalid,
you had to read the entire thousand page document to get any value out of it whatsoever.
So that made people less convinced, mathematicians
less convinced that it was
all correct and that it was a useful document.
Does that make sense?
I understand the story.
I'm not exactly sure how it relates,
what I was supposed to get out of the story.
Yeah, basically just the idea that mathematics, like anything else we do, in addition to being a technical institution, is also a social institution. It's all about how mathematicians interact and how we all do things as a group. And similarly, formal methods is a social institution as well as a technical institution. And one of the consequences of this is that, with mathematics as a social institution, some amount of mathematics is in the social act of convincing and rhetoric, which is how it should be, given that we're not machines. Whereas with formal verification, often the only thing that we care about is making something pass the formal verification tooling, which means that it's almost entirely, in that one context, about making sure that every single thing is correct.
Got it.
Okay, well, I think this is a good point to transition to the difference between theorem
provers, which is, I think, most of what we've been talking about, and model checkers.
So essentially, there's two ways to sort of show something's correct.
You can either sort of like construct a rigorous argument showing it's correct, or you can
sort of show how it's impossible to be incorrect by brute-forcing the entire space of possibilities.
So I guess here's a simple example of what I'm talking about.
Let's say you have sort of something that works over 32-bit floats, right?
Like you have some function that takes two 32-bit floats and returns a float, right?
Sure.
So there's only about, like, two, three billion 32-bit floats, right? So you could literally just go and check every single one of those combinations. And if you do that, you can actually brute force and make sure that every single thing does what you expect it to. And that does sound like a bit of a burden,
but proving stuff is kind of also a giant cluster.
So does that make sense?
Yeah, that makes a lot of sense.
Yeah.
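That brute-force strategy can be sketched at toy scale in Python. This is an illustrative saturating-add over 8-bit values, not an example from the episode, and real model checkers like TLC explore state spaces far more cleverly; but the principle of exhaustively checking every input is the same.

```python
def saturating_add(a: int, b: int) -> int:
    # Clamp the sum to the 8-bit range [0, 255].
    return min(a + b, 255)

# Exhaustive check: only 256 * 256 = 65,536 input pairs, so try them all.
for a in range(256):
    for b in range(256):
        r = saturating_add(a, b)
        assert 0 <= r <= 255, (a, b, r)          # result stays in range
        assert r >= max(a, b), (a, b, r)         # result never below either operand
print("all 65536 cases pass")
```

Unlike a handful of unit tests, passing this loop rules out a counterexample anywhere in the input space, which is exactly the guarantee model checking gives over a (finite) state space.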
So and then the last kind of thing I'll ask in this abstract thing before we get concrete
is I think a very common programmer question,
or I don't even have to speak for other programmers,
I could just speak very personally.
When I hear about specification and verification,
I really want these things to be tied to my code.
I don't like having to duplicate the effort of specifying
and then writing code and then having to eyeball or verify.
Basically, I wonder if I could just write the specification and then the compiler writes the code for me
or the specification is the code.
So I feel like you have terms for that.
Yeah, so the term for sort of taking a specification and generating the code from that is called synthesis.
This is not something that we can do mainstream.
It's still like a very niche academic topic and one that a lot of people are obviously working really hard on.
But it turns out that generating code is really hard.
I can actually link some stuff, because Nadia Polikarpova is one of the big people doing a lot of the really cool work in this space. And she recently did a talk at Strange Loop about some of the work she did.
And it's all really cool stuff, but she's also very clear this is not going to be mainstream
in the next 10 years.
It's not going to be close to mainstream in the next 10 years.
So yeah, it turns out that that's a lot harder to do than just sitting down and showing that
code matches the specification, which is itself a lot harder to do than showing that code
almost most likely matches the specification informally.
So yeah, there's all these tiers of difficulty,
and I think one of the things that happens is that people get fixated
on sort of the golden mean, the sort of end state,
but to the point where they sort of like ignore like all the really big benefits
that we can get in between.
Okay.
Yeah, that's, I, yeah, I know what you mean.
And so I think let's start, let's start talking about some of those benefits you can get even
now, even in the not end perfect state, in the like imperfect world we live in, you have been singing from the rooftops
the virtues of using TLA Plus for design verification.
So let's hear more about that.
Yes.
So one thing I do want to say first really quickly,
we do also get benefits in the short term with code verification. For example, you get things that get integrated into programming languages, like a lot of type systems do some partial verification. Rust has a borrow checker, and that basically does a lot of verification automatically. So we are making a lot of steps to make certain aspects of verifying code more accessible. And we've seen a lot of success with a lot of those steps. So it's not just working with designs where we see immediate benefits.
Got it.
So you're saying that, for both designs and code, even though we're
not at the end stage, intermediate or 80/20 versions of these formal methods are
useful.
Yes.
But I think what happens is that a lot of the code verification stuff has
basically been tied to a language, which is really good. But design verification has not been tied to
languages, in that you don't have to use a particular language in your code base to get the benefits of
design verification. That's one of the reasons I think it's so valuable, and one of the reasons I
think things like TLA+ and Alloy have a lot of really good uses even today.
Oh, interesting. So if you're using Agda, you can get those benefits, but
if you're not, then you're kind of screwed. And that's the appeal of something like TLA+: you can use it with any language.
Part of the appeal, anyway.
I see. Yeah, that definitely makes a lot of sense. And I guess it's similar to test-driven development:
unit tests can be done in any language.
Exactly. Same idea.
Or Agile. Agile can be done in any language.
Yeah. And that ends up being really important for the development, socially, of a lot of these ideas.
Because if you can start getting the benefits without having to change your entire code base,
you're more likely to do it than if you have to rewrite everything from scratch to get some value out of it.
Yeah, of course.
Okay, so let's finally dig into it.
TLA+. So for those who aren't familiar,
could you do your two-minute spiel of what TLA+ is:
the motivations behind it, how it came about, all that jazz.
Sure. So I really have got to work on this; it's actually one of those interesting challenges:
how do you explain this without demos? I found that the easiest way to describe
it is to show people demos, but obviously we
can't do that on a podcast.
So, okay.
So obviously when we sort of are designing, when we're sort of building systems that involve,
say multiple actors or multiple programs or like client servers, we have the code, right?
That actually embeds all these things, but the code is simply the implementation.
It doesn't sort of show our high level understanding of what should be going on and what is going on.
For example, imagine that you sort of even have something as simple as, say, like a web app
that has both a front-end and a back-end, and then servers in a deployment system.
You're sort of looking at a space that can't really be encoded in just a single code base.
You're at the very least looking at multiple code bases all interacting with each other, right?
Yep.
So essentially, none of the code that you've written really
expresses, or is aware of, the full design of your system.
And because of that, it can't really help you with verifying the design itself.
So people sort of implicitly understand this. That's why people do things like whiteboarding
or draw UML diagrams or sort of like talk about like doing acceptance-driven development.
It's sort of this like initial understanding of, hey, there's this like broader design that
has its own challenges beyond just how each line of code is working or not working.
But if we have this idea that we have a larger scale design that we care about, why not specify
it?
And then why not test that specification for issues?
And that's sort of a lot of the motivation behind TLA Plus, which is by Leslie Lamport,
the same guy who did LaTeX and basically half of distributed computing.
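The idea of specifying a larger-scale design and then testing that specification for issues can be sketched without TLA+ itself. Below is a minimal Python sketch, with all names mine rather than from the episode: it exhaustively explores every interleaving of a tiny two-process design (two processes each read a shared counter, then write read+1) and checks a design-level invariant, which is roughly what the TLC model checker does for a TLA+ spec, just at toy scale.

```python
from itertools import permutations

def run(schedule):
    """Execute one interleaving of the design's atomic steps."""
    counter = 0
    local = {}          # each process's locally-read value
    for op, pid in schedule:
        if op == "read":
            local[pid] = counter
        else:           # "write"
            counter = local[pid] + 1
    return counter

# Each process contributes two atomic steps: read, then write.
steps = [("read", 0), ("write", 0), ("read", 1), ("write", 1)]

# Explore every interleaving where each process reads before it writes,
# and collect final states that violate the invariant "counter == 2".
violations = set()
for schedule in permutations(steps):
    if schedule.index(("read", 0)) < schedule.index(("write", 0)) and \
       schedule.index(("read", 1)) < schedule.index(("write", 1)):
        result = run(schedule)
        if result != 2:
            violations.add(result)

print(violations)  # prints {1}: the lost-update bug found by exhaustive search
```

The point is the shape of the activity: a model checker takes the design's possible behaviors, explores them all, and reports states that break an invariant, something no single code base's unit tests would surface.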
So it sounds like you can almost think about TLA+ as a direct replacement for documentation
or whiteboarding or UML diagrams.
Augmentation, not replacement.
Still write your documents, please document stuff.
Oh, okay. So because the specification isn't understandable? You can't just read a specification and understand a system,
so you still need documentation?
Oh, you can totally read specifications. Often you can read
a specification and it can give you a lot of insight. But it's like that thing where, I think
it was David R. MacIver who said:
sure, caffeine can help you replace sleep,
but caffeine isn't sleep.
Things like tests and specifications can help you understand
a system, but they're not documentation.
Documentation exists at a human level,
like even higher than any specifications
you can write. Still, write your documentation,
write your requirements analysis,
and then write your specifications.
Yeah, well... I'm not really selling you on using TLA+, am I?
I'm basically just going: no, it's not that great.
It's not that great.
Well, I guess what I'm reacting to is that it seems
like we just keep layering things on. You know, we have our code, and then,
okay, well, actually you have to write these tests for your code.
Oh, well, I have to do all this Agile stuff to write the code in the right way.
And then, oh, actually, you also need these integration tests.
And, oh, actually, now you need to document your code.
And then, also, now there's a TLA+ specification for your code.
So it would be nice if one of these things could replace some of the other ones, so we could simplify.
It feels like we're just going to keep layering on things, and eventually we'll all be stuck writing four lines of code a day.
I mean, there's a reason we're paid a lot of money as engineers.
This is hard stuff.
I mean, this is really fundamentally hard stuff.
And there's a reason we're paid a lot of money to do software, right?
Well, I think that's an interesting claim. Just to clarify, you're sort of asking: it seems like we have to do all this extra stuff,
is it worth the effort?
You're talking about it from a productivity perspective.
You're worried about it slowing everything down, right?
It's not so much whether it's worth it. It's more that each thing we add on — unit tests, integration
tests, formal specification, documentation — feels a bit ad hoc. Each solves one part of the problem,
and it's not like a unified solution. Does that make sense at all?
Okay, so my thought there is that a unified solution would be nice.
It sort of solves everything for us.
Historically and empirically, almost all the ones we tried have not worked out.
It turns out that complicated problems often do unfortunately require complicated solutions.
Yeah, well, actually, just hearing myself talk and hearing what you just said, it reminds me of the No Silver Bullet essay, which most people misunderstand.
But the central metaphor of it that I remember is medicine: how before germ theory, we thought there'd be some simple magical cure to diseases. But then once we finally accepted germ theory,
we realized that there would be no one big solution.
It'd be like a lot of tiny little solutions that'd be hard to find.
And so I guess that's kind of what you're saying with software.
There's going to be no unified one solution.
It's going to be a bunch of little add-on things that we'll have to keep
adding on to software to make it better incrementally over time.
Just like we have to take a flu shot every year, and we also have to take a tetanus shot, and we also have to take a polio vaccine.
There's no one magical shot that'll do all of those vaccines.
We have to take them all.
Yeah, and I think that's true of almost every sort of system you're going to look
at, whether it's software engineering or medicine, or — I'm assuming, and I could be
wrong, as we all know we don't know other fields very well — other kinds of
engineering, and also things like law. There are really complicated
problems with a million small solutions; no one ever finds one magical thing that just fixes everything.
Yeah, so I hear that for sure. And then I feel like, on the other hand, there are times when you have the geocentric
theory, and you add epicycles and epicycles and epicycles.
And then all of a sudden you realize:
oh wait, if we just make it a heliocentric theory,
we get rid of all those epicycles and everything's more elegant.
We've replaced these ad hoc things with a new elegant foundation.
So that happens too sometimes.
And then you have to start figuring out the precession of stuff.
And then you realize you have to add in general relativity and special relativity to adjust for
other things, which are verified but also incredibly complicated.
Yeah. Well, I guess that's kind of the Beginning of Infinity thing: we'll never quite be able to explain
everything. But, to go back to my original
skepticism, it's really a skepticism of ad-hoc-ness.
Yeah, but ad-hoc-ness is a hard thing to define. But I guess what I'm
getting at is: if I asked you to list all of the practices you would
recommend for an engineering team — like unit tests,
writing code, version control — what would they all be?
Maybe just list some of the things you would recommend.
I guess some of the ones that I think would be recommended
would be...
So formal specification, I think obviously I've got to sort of say that.
Obviously writing code, you have to do writing code.
Version control is important.
Code review is extremely important.
It's one of the few things that we are empirically sure with multiple studies is a great idea.
Code review, probably?
Yeah, sorry, code review. Did I say something else?
Oh, I thought maybe you said... sorry, never mind.
Yeah, my mistake. So, basically: code review, really, really good. Taking time to do stuff,
adequate sleep, exercise, good relationships with clients, constant feedback, really careful post-mortem
system analysis, really careful pre-mortems.
I realize a lot of this isn't actually at the code level.
Do you want what I think would be effective things for coding?
Oh, well, yes.
Yeah, go for it.
For coding: probably a unified style, randomized testing.
Although that's interesting, because we're not quite sure what the best tests to write are.
I think a lot of people are really fond of unit tests.
I think those are great, and you can write them really fast. But there are also other things that are really powerful, like large-scale testing, and probably some measure
of observability. I'm not sure if this is really supporting your point or mine here.
Yeah, I'm not sure either, but I kind of like where this is going. I think
neither of us really knows where it's going. Yeah, I guess maybe this is just how it is. To pick another example, if you were to say,
if you want to be the best tennis player in the world, what are all the things you have to do?
I guess the list would kind of be long and complicated. And then someone would be like,
no, actually, you have to add this thing too. And now that we know
this feature about rackets is important,
you have to worry about that too.
And actually, we didn't realize that gluten was bad,
or gluten is good, or whatever it is.
I guess... Oh, I do see one simplification that I think you would find
interesting.
I think a lot of unit tests and integration tests,
not all of them, but a lot of them, can be folded into a combination of property tests and contracts.
Okay, tell me about that.
Yeah, tell me about that.
Sorry, are you familiar with contracts?
Yes, but let's assume not.
Okay, so essentially a contract is an assertion that you make as either usually a precondition or post-condition for your function. So say if you have something that takes the tail of a list,
you can make the post-condition saying that it will have one less element than the original list.
And also if you append the head of the list to the output, you'll get the same thing.
So these are essentially specifications that live in the code itself.
And they can be used for formal verification, but they can also be, and are more commonly
used for runtime verification.
Every time you call the function, you just check the preconditions and post conditions.
And if they're wrong, you just sort of stop execution and raise an error.
And it turns out that if you do this, one, it's really effective, but two, you can now
start to test by just pumping randomized inputs through the system.
If you have a bug, the appropriate contract will stop and raise the issue.
So you start to get really simple integration tests from that.
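A rough sketch of the tail-of-a-list contract described here, with plain Python asserts standing in for a real contracts library, plus the randomized-input testing that contracts enable:

```python
import random

def tail(lst):
    """Return the list minus its first element, with contracts checked at runtime."""
    # Precondition: the input must be non-empty.
    assert len(lst) > 0, "precondition: tail needs a non-empty list"
    result = lst[1:]
    # Postconditions from the discussion: one fewer element, and
    # re-attaching the head gives back the original list.
    assert len(result) == len(lst) - 1, "postcondition: length decreased by one"
    assert [lst[0]] + result == lst, "postcondition: head + tail == original"
    return result

# With contracts in place, testing can be just randomized inputs:
# if any contract is violated, the assertion stops execution and
# points at the failure, which is the "simple integration tests" effect.
for _ in range(1000):
    xs = [random.randint(-50, 50) for _ in range(random.randint(1, 10))]
    tail(xs)
```

Calling `tail([])` trips the precondition immediately, while the randomized loop exercises the postconditions across many inputs without any hand-written expected outputs.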
Oh, okay. Yeah, that's very interesting.
So, the Chaos Monkey approach.
Well, Chaos Monkey, I guess, is more about doing that to servers and stuff.
But essentially, yes: randomized testing with fine-grained responses.
Well, the randomized testing reminds me of Haskell's QuickCheck, where you can generate tests based on types. And these
runtime assertions, I guess, in the type world, would be kind of like dependent types.
Sort of.
So the thing is — I guess, a quick comparison between contracts and
types — there's a lot of overlap between the two ideas.
The main difference is that types aim for what's called legibility.
They aim to be really easy to analyze statically, while contracts aim for expressivity.
They aim for the ability to encode arbitrary assertions.
So, for example, if you really wanted to, you could write a contract that says this function is only called by functions that have palindromic names.
I see.
Which is probably not something you want to do, but you can easily do things like, say, refinement typing.
Like, this should only be called with values
that are greater than zero, and it will always return values that are less than zero.
That's more of a type level thing, but you're saying that contracts are much more expressive.
Yeah. It sounds like contracts have access to not only the static AST, but also the actual code, so you can get the name of the function,
and they also have access to runtime information. Maybe they even have access to past runs.
If I've been tested three times before, then fail?
I think that's probably not something you want to be doing. I guess here's a more reasonable thing:
after this is run, this mutation should happen in this class, that kind of thing.
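The "palindromic caller" example is deliberately silly, but it does illustrate the expressivity point: a contract can assert things no mainstream type system can. A toy sketch in Python, using the standard `inspect` module to read the caller's name (the decorator and function names here are mine):

```python
import inspect

def palindromic_caller_only(fn):
    """A deliberately silly contract: fn may only be called from
    functions whose names are palindromes."""
    def wrapped(*args, **kwargs):
        caller = inspect.stack()[1].function  # name of the calling function
        assert caller == caller[::-1], \
            f"contract violation: caller {caller!r} is not palindromic"
        return fn(*args, **kwargs)
    return wrapped

@palindromic_caller_only
def guarded():
    return "ok"

def noon():   # "noon" is a palindrome: the contract passes
    return guarded()

def main():   # "main" is not: the contract raises at runtime
    return guarded()

print(noon())  # prints "ok"
```

You would never want this particular contract, but the same mechanism handles the useful cases mentioned, like refinement-style "inputs must be greater than zero" checks.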
Yeah. Okay, got it. So, well, I agree that that's something that excites me. I like the idea of simplifying. It's like a mathematical idea:
being able to describe the same amount of things, or more things,
with fewer words.
But then, yeah, I also understand the no-silver-bullet view:
it's a complicated thing,
and we just have to keep layering on improvements
over time. And maybe at one point we'll get a paradigm shift and we can have a new
foundation.
Yeah. And also, the thing is, you might end up in a
case where some things are simplified, but on the whole things are more complicated. It might turn out
that we can fold five ideas into one, but also that we need six ideas total. I'm not doing a great job explaining
this. I'm sorry.
No, no.
Well, I guess there's a reason I'm much better at writing than public speaking.
Yeah, well, you are quite a good writer. So I guess let's focus on the
discussiony kinds of things, because of your essays. And I've done this a lot in other podcasts too:
when I interview people who are good writers,
I try to start with their writing as a foundation
and then go where I find it interesting.
And hopefully it's where other people who read your things
wish they could have asked you this question.
Oh, I see. Okay. So in that case, let me try changing my answer to that.
Sure.
Okay, so I'm going to take this question of why we have to keep on layering stuff on, and shift it in a slightly
different direction, something that I think might be more interesting for discussion. Are you familiar with systems theory?
Let's drill into it, regardless.
Okay, so it's
a mixture of really interesting ideas and
really cultish ideas that sort of started
forming around early last century
and sort of developing, which is the idea that
often we
approach systems, we approach problems
and think, oh, this really complicated problem,
there's actually a simplification to it that we can make that makes it easy and we can sort of
abstract it out. And then systems theory is the group of people going like, wait a minute,
what if that's actually not always the right approach? Maybe there's a better approach, which
is what if we sort of look at this complicated problem and say, hey, this complicated problem
is actually still a complicated problem.
And what we should be doing instead is trying to find all the patterns inside the complicated problem that make it complicated, and then trying to think of the complicated solutions,
or the simple solutions, that address all these interconnected issues. So the idea here is
that instead of saying, oh, there's an easy answer here, we say,
no, there's no easy answer.
And now that we accept that there's no easy answer, how does that revelation help us?
Does that make sense?
Yeah.
Maybe can you walk us through a concrete example?
Okay.
Sure.
Here's a concrete example. The area I've mostly been interested in applying this to is system safety,
which is how properties of safety and security arise in a system emergently.
Actually, I think I might have a better idea for an example here, which is
feature interaction, which I think is a little bit easier to talk about.
So imagine you have a system where you've got the following.
People can sort of sign up and register for an account and they have to validate their
email address, right? Yep. So they have to validate their email address to basically do stuff.
And you also have a check in the registration form that they cannot validate
an email address if it belongs to, say, a spammer email
service. Right? You know, those services like Teleworm and the like, which let you
get a bunch of free email addresses in bulk.
Yep, yep.
Okay. So now imagine you have that requirement, and now you also have the requirement that if they get an error — if basically the
email gets lost — they can click a button that resends the validation email, right?
Okay.
And now let's imagine that we finally have a requirement that you can change your email address.
Do you suddenly see the bug that we now have?
You can give them a good email address and then change it afterwards?
Yes.
And not only that: they can register
with a good email address, change it to a bad email address, and then resend the validation.
And this is actually a bug I've encountered in the wild.
And then the question is, though, is this a bug?
Is this actually a bug?
Would you say it is or would you say it isn't?
If they don't resend the validation?
If they can do this, where they can now validate a bad email address by resending a... By basically switching to a bad email address, then hitting the resend button.
It sounds like a bug.
Okay, so where's the buggy code?
Oh, yeah, well, the bug is in the design.
Yeah, but like, here's the thing, seriously, all the code is correct.
All the code is satisfying all of its requirements. But there's an emergent bug — one
we're not even sure is a bug or not until we
figure out the broader goals of the system. This is what we call feature interaction. It's the
idea that multiple working things interacting can lead to a global
bug that's not a local bug. That's why I was talking about systems theory, this idea that
we have to be thinking about all of the complexity that emerges from what we do in order to figure
out whether things are correct or not. Does that give you a better intuition of what's
going on here?
Yeah, a little bit. Maybe do you want to try one more example or move on?
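The email example can be made concrete with a toy model. The sketch below is mine, not from the episode: three features — signup with a spam-domain check, changing your address, and resending validation — each locally correct, which together validate a blocked address. The domain names and class are hypothetical.

```python
# Hypothetical blocklist of bulk/disposable email domains.
BAD_DOMAINS = {"spammy.example"}

class Account:
    def __init__(self, email):
        # Feature 1: registration rejects known spam domains.
        assert email.split("@")[1] not in BAD_DOMAINS, "rejected at signup"
        self.email = email
        self.validated = False

    def change_email(self, new_email):
        # Feature 3: you can change your address (no spam check here —
        # this requirement said nothing about blocklists).
        self.email = new_email
        self.validated = False

    def resend_validation(self):
        # Feature 2: resend simply validates whatever the current address is.
        self.validated = True

acct = Account("alice@good.example")      # passes the signup check
acct.change_email("bot@spammy.example")   # each step is locally fine...
acct.resend_validation()
# ...yet now a blocked address is validated: the emergent, global bug.
```

There is no buggy line to point at; every method satisfies its own requirement. The defect only exists in the interaction, which is exactly what design-level specification is meant to catch.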
I can try one more example. Give me a second to sort of think of a good example here.
So, and this is more...
Sorry, before you do, one thing that might be misleading me:
Nicky Case talks about systems — he talks about cyclical systems.
He has these diagrams where you have rabbits and foxes, and you can model them.
I imagine that his systems are maybe the simplified versions, and you're saying
we should think about things in a more complicated way? Or are his systems kind of what I should be thinking about?
Yeah, they're similar.
So what he's talking about is stocks and flows, right?
And that's a particular way that people analyze dynamical systems.
So it's a way that we essentially try to sort of find models of complicated systems
that helps capture some properties of them.
It's one of the many tools that systems theorists do have.
Another one that's really close to home for us
is that state machines
are used for systems analysis outside of programming.
Okay, got it.
So state machines are, I think, a really good example.
State machines are used for systems analysis.
Yes.
Got it.
Okay, so I think I was just a little bit maybe confused
when you mentioned that.
Because you said that
systems theory was kind of a reaction to oversimplifying.
Yes.
State machines seem like an oversimplification,
but maybe I just haven't seen complicated enough state
machines.
Right, right.
But the idea of the state machine is that it's a simplification,
but one that explicitly doesn't capture all of the logic of a system,
and it tries to also capture some of its complexity, if that makes sense.
It's a way of representing a complicated system in a simplified way, but doing so without assuming that the system actually is the simplification.
The map is not the territory.
Oh, okay. So that's the key insight, that the map is not the territory.
Yeah.
Okay. Sorry, I feel like I lost the stack. Why were you going into systems theory?
Because one of the ideas is that if things are this complicated, then even if we can find these simplifying paradigms for things like programming, or really anything, innately the fact that we're trying to build a system will make it complicated.
There is no way to easily simplify the entire battery of connections, all the
various aspects of what it means to validate and verify, if we want to actually match what
the global system is doing to what we need it to do.
So what this suggests is that even if we do find ways to replace
geocentrism and epicycles with heliocentrism, we're still going
to have to deal with a lot of complicatedness in the new model. Maybe less than in the old one, but
still a lot.
Got it. Okay, I see. Let's go back. I enjoyed this
conversation; I just want to go back to some of the more code-related stuff.
Yeah.
So you alluded to earlier in this conversation that even though we don't have maybe the tools we'd want for formal methods,
we have good enough tools for verifying the important parts of your system.
So it sounds like an 80-20 rule kind of thing.
And so I'd be curious to hear what your heuristic is:
what properties of code or of design kind of beg for formal methods,
and when you'd say to stay away. Is it a function of, you know,
how well funded you are, how big your company is,
how long the code has been around?
How do you think about when to apply formal methods and when not to?
Okay. So this is going to be a lot of talking out of my butt right now. Because, first of all, there's a difference
between formal design and formal coding. And there's also the fact that
I'm going to be majorly biased here, on account of the fact that, again, this is my job; this
is what I do for a living. But essentially, there's risk
of failure and cost of failure, right? For any given system, there's going to be a chance
that it's going to go wrong, and a probability that when it goes wrong, it'll have certain effects. So the common example everybody uses
to defend formal methods, which is sort of an incorrect example for a lot of reasons, is a
pacemaker. If a pacemaker goes wrong, someone dies, right? So you'd want to formally verify it. Turns out there's almost no formal
verification on medical devices; it's not done. So that's actually not the greatest example,
for lots of reasons, but mostly because it's an industry that isn't well regulated in general
when it comes to the software aspects, and it's kind of lagging behind. Also, there's
no standardization among medical devices, and there are so many
different kinds. It's weird and messy. Most industries that you'd think
would use it don't use it.
That's scary.
Yeah. But in any case, I remember there was
actually an FDA advisory that came out a year or two ago saying half a million pacemaker devices could be hijacked if you're close enough to them and you've got a radio.
So, yeah.
Try to exercise.
Exercise.
Exercise is good.
We know that works.
And if it keeps you from needing a pacemaker, it reduces your chance of getting hacked.
Wow.
Yeah.
But in any case, so.
But there are also lower-level costs. For example,
it could turn out that if your system fails, your clients will be really angry
at you, and they might threaten to pull your contract. Or if the system fails,
you're going to have to scramble and spend two weeks fixing it, trying to figure out
where it failed and what the bug is. Why is it crashing all of a sudden? Right?
So, yeah.
So I think my general stance is that with design specification
and verification of designs, we've often found that it is cost effective for reducing the risk
of angry clients and frantically spending time fixing fires in production. That's my personal
experience. That's the experience of most people I've talked to. It saves a lot of time there.
So that's why I think that that is more broadly useful.
Specification of code,
just the specification without the verification,
is a really good way of thinking about the code
and figuring out what you want to test informally.
And then the verification of code, I think,
is challenging enough, at least formally,
that it's not something most people should be doing.
So I'm realizing here that I don't really know what the specification of code is about.
Because I know what verifying code is about. It's showing that the code meets the spec. I know
what verifying the design is and specifying the design is. What's specifying the code?
Let's find a good example here. The example I commonly use is sorting.
Like, okay, so how would you test whether something's sorted?
Well, I guess, so you gave the example of you pick two indices.
Oh, wow. Yeah. I was hoping you'd have forgotten that by this point.
Okay. So maybe another example.
You've got a sequence of numbers, and you need to find the maximum of it, right?
Okay.
What's the maximum?
How would you test that?
How would you test that I actually got the maximum?
Yeah.
I guess it's the AND of
less-than over all the things.
I don't know
what
words I'm allowed
to use in order to specify something.
Exactly, yes.
I'm realizing that might also... wow, it's really hard to
come up with a good example. I've already burned my sorted one.
So, yeah.
You'd basically say it's going to be the one that all the other ones are less than, right?
Sure.
Yeah. So you have this description of how it works, of what it means to be the maximum of a list.
Now, most people, their solution is going to be to write tests, right? Like unit tests
saying: okay, the max of this list should be this, the max of this list should be this,
the max of this list should be this, right?
Sure.
And that essentially creates an ad hoc specification.
And the specification of the function is now, for this input, it gives this output.
For this input, it gives this output.
For this input, it gives this output. But that doesn't really describe what it means to be the maximum of a list.
It's what you actually said, that it's the thing such that everything else is less than it.
Yep.
So essentially, the specification is a way of describing
what we actually expect it to do,
at a level that totally encompasses what we want out of it.
Okay, so I see.
And then how is that different than the design specification?
Design specification is happening at a much higher level,
where you hand-wave things away. The example I use here is: if you're trying to write something that
figures out whether something's a park or a bird, it's kind of hard to specify that at the code
level, because you have to figure out how to objectively describe, as a specification,
what a park or a bird is. But at the design level, you're basically just hand-waving, saying:
okay, let's assume this can definitely figure out what parks or birds are.
How does it work when we try to combine it with our message queuing system?
How does it work when the user is constantly trying to cancel stuff? It works at a higher level, with
much more hand-waving, looking at the larger-scale interactions between
things, versus each line of code and how it affects things locally.
So maybe another way of thinking about it is local design versus global design.
I see. Yeah, okay, got it. Thanks for the reminder; it's hard to keep all this
stuff straight.
Yeah. And again, the reason I'm using a lot of specific terminology here is because a lot of times people have
implicit assumptions about how these things work, and I often find
there's confusion when talking with people because they're not sure:
wait, is this formal verification?
I thought this was formal verification.
So I try to very clearly split it into all those different
subcategories I mentioned.
Yeah.
Yeah.
And then another term, another word that comes up here that I'm wondering if you could tease out for us
is the difference between TLA plus and Alloy.
Are they used for the same things?
There's several specification languages.
So just as there's multiple programming languages, there's multiple design languages.
And usually if somebody says they're working in a specification language, they almost always
mean a design language, um, that like is used to write designs.
Then you can sort of like do things called refinement to go from design to code.
But again, that's a more complicated topic.
Um, but in any case, so TLA+ is one of the big ones these days, and it's the one that I started really working on. Alloy is another one. It uses some different base assumptions on what we want to be specifying
and how we do it. And that leads to different design decisions. And it's also really, really
good. I'm actually on the board for that. And we're working on writing more documentation for
that too. Then there's other ones too. Those aren't the only two. But those are the two that I personally get the most excited about.
Got it.
Well, so I feel like I can't resist.
It's just like such –
Go for it.
I feel like it's a light and I just keep bumping into it even though you've already kind of told me.
Go for it.
I mean, this is a discussion, right?
The idea of specification code just keeps attracting me, and so I'm going to do this thing again that you said is kind of not a waste of time, but maybe a little distracting right now. So to me, the end state of where we want to go is that the abstract specification stuff seems to be the right level of talking about things.
And the essential versus incidental complexity distinction, I think that's a good frame. Maybe
it's not precise enough. But anyways, it feels like specifications are almost by definition
essential complexity. And so if we're trying to get rid of all the incidental complexity,
a design language that's based on math seems like a really great way to go.
And then, of course, there might be reasons why our specification language
or the way in which we mathematically model something isn't computable
or isn't efficiently computable. And so I feel like the dream is that somehow you have an interactive compiler experience, or interactive refinement experience, where it, you know, aids you in taking your design and making code that's somehow efficient. It exposes why the specification you wrote isn't efficient, and then it gives you ideas and you can work with it.
I have three things to say to that. First of all, I want to say that the idea of talking about essential versus incidental complexity, I never considered it in those terms. That's a really good way of thinking about it. I really like that, thank you.
Oh, sorry, you aren't familiar with the Fred Brooks book?
No, no, I'm familiar with Brooks. I love The Mythical Man-Month; I've read it multiple times. I just, for some reason, never connected the idea of trying to think about specifications as essential versus incidental complexity. It was just never a connection I made in my head. So that's why I'm like, hey, thank you for that. That really works out as a great metaphor and a great model for understanding things.
Oh, okay.
Yeah.
Hey, I don't know a whole lot.
So then second thing is, and this is just my opinion.
I'm not talking about any party line.
This is just my own philosophical ramblings.
I think that has been a thing that we've been doing for a while. C is a higher level specification of assembly,
and then Python is a higher level specification of C. And then Prolog can be a higher level
specification of all of that. So I think that essentially as we get better and better at
sort of synthesizing specification into code, we start to call them new programming languages
and higher level ones. Yeah, well, so in that line of thinking is, but TLA plus doesn't compile down to...
Yeah.
Right?
So it's kind of broken that line from itself kind of on purpose.
It's saying like, my job isn't to compile down to Python or whatever.
My job is just...
Yeah.
And the reason we do that, and again, TLA Plus isn't the only thing that does it, and
it's not even the first thing that does it.
But the reason we do that is so that way we can talk about specification at any arbitrary
level of complexity.
I can write a specification where I'm talking about, okay, the actors in the system: one of them is our server cluster.
Another one is our giant ETL pipeline.
The third is our data warehouse.
The fourth is all of our users.
And the fifth is an adversary.
And then the sixth is, I don't know, an earthquake.
Like that ability to sort of jump between different concepts of like what an actor is and what a system is makes it really easy to specify really complicated properties very concisely.
But the consequence is how do you sort of convert your adversary or an earthquake into code?
I see.
I see.
Yeah.
So I guess one way to think about it is, if we have this progression of getting more and more abstract, high-level, specification-y languages that compile down. So we were kind of plodding along, getting more and more abstract, and then Leslie Lamport was like, you know what,
like, you guys keep doing that and eventually you'll get to something like TLA+, but I'm just going to go ahead and jump all the way over to TLA+ and make this, and it won't connect to that pipeline, but it'll just be in its own little world to help you reason about those other things?
I'd say yes, with two caveats. Leslie Lamport wasn't the first person to do this. I believe,
and I can't remember who did this, but one that was definitely much earlier is this one
called Z. And actually, TLA Plus was in part a way of trying to write a specification
language that was simpler than something like Z, as was Alloy.
And two, I sort of name dropped this a couple of times, but there's this concept called refinement, which basically involves taking a spec and then writing a more fleshed out,
more complicated spec that matches it.
And then you keep doing that until you get to a point where like you're basically at
code and then you just translate that to code.
That's a concept called refinement.
And it's also something you can do.
But going from a really high-level spec, refining it all the way down to, like, implemented code is really, really hard.
It's basically just as hard as, like, proving code correct in the first place.
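A loose way to picture the refinement idea just described, in plain Python. The names here are invented for the sketch, and this is only a toy analogy, not how refinement actually works mechanically in TLA+: a high-level spec says what the result must be, a refined version commits to how, and the refined version can be checked against the level above it.

```python
def dedup_spec(xs):
    """High-level spec: the result is exactly the set of distinct elements.
    Stated declaratively; order is left unconstrained."""
    return set(xs)

def dedup_impl(xs):
    """Refined, code-level version: a single pass that preserves
    first-seen order, using an auxiliary set."""
    seen, out = set(), []
    for x in xs:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out

def refines(xs):
    """Check that the implementation's behavior matches the spec
    on one concrete input."""
    return set(dedup_impl(xs)) == dedup_spec(xs)
```

Real refinement proves the correspondence for all inputs, which is what makes going from a high-level spec all the way down to code so hard; this sketch only spot-checks it.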
I see.
Yeah.
Okay.
Cool.
Well, thanks for doing it again, even though we kind of already talked about that.
I felt like I got more out of it the second time.
I mean, I will be honest, the way I'm sort of seeing it and like us talking about this multiple times is every single time we talk about this, it is evidence to me that I'm not explaining it well enough.
Like, so one of the really fundamental challenges about this space and the reason I write so much on it is that we've not done a really good job of both explaining what it is to people and teaching it to people.
Like, have you tried looking at any of the documentation about formal methods?
Yeah, that's what I spent today doing to prepare.
Yeah.
It's not that accessible outside of this discipline, right?
Your stuff was pretty good, actually.
Yeah.
First of all, thank you a lot.
Second, that's sort of why I write so much, because I think that it's both we don't do a good job explaining it, but also it's a hard thing to explain. So that's why I sort of care so much about explaining it well. And that's why every time like we come back to this in this podcast of like, you have like a question like, wait, you don't understand this one thing. I'm just taking notes here being like, okay, how do I explain that thing better? How do I make this more intuitive? How do I be clear about the difference between A and B? So it's not, you should not be the one apologizing or I should be the one
apologizing for explaining it poorly in the first place. Well, anyways, thanks for that. Yeah,
I appreciate that you're so concerned with the reader. It shows you have a lot of empathy for your students. You're a true teacher.
Okay, so
there's one topic that's kind of a meta topic that I want
to switch to. Go for it.
Is there anything else you want to say about TLA Plus or formal methods
before we jump there?
It's cool. I think it's really cool. It's not a panacea. Nothing is, but it's cool. I think it's interesting. I think it's useful. I think it's powerful. Most people will probably get as much benefit out of exercise and getting adequate sleep as out of using any sort of tool, no matter what it is, in programming. That's my personal stance.
Wait, so you're saying that if you don't exercise and get good enough sleep, and you don't know formal methods, you should first get exercising and enough sleep, and then do formal methods?
Yeah.
Okay, good. Just making sure we have our priorities
straight. I say this as a person who's permanently sleep deprived. I do not, I do not practice what
I preach here.
Oh, okay. That's good to know too. Well, I guess, speaking of practicing and preaching, the topic that I want to talk about next is related to a really fun Twitter exchange you and I had recently.
You know where I'm going?
I think I know where you're going about this.
Do you want to explain?
I think you can explain.
You can explain.
I think it's better than I could.
Well, I'll explain my part, and then you'll explain your part. I wrote this 19-tweet rant about the comprehensibility of software
and how when you have millions of lines of code,
nobody can understand all the code,
particularly because of the way code is written, and it's scary.
That was kind of the thesis.
And in there, like maybe number 12 or something of my tweets,
I made an analogy to how this is much worse
than in other kinds of engineering,
such as building bridges,
where there aren't 12 million lines of code,
and so other people could understand
via the specification how things work.
And so then Hillel saw this,
and that one tweet in particular,
the comparison to architecture, you kind of ran with it.
So maybe you take off the story from there.
So I'll explain why that was the one that I reacted to.
But basically, I'm like, wait a minute.
So you basically had this tweet about how, say, in bridge building, if every single person left, they'd still have all these blueprints and all these construction plans, so a new group could come in and basically start where the old group left off. And what I ended up writing was, I was alluding to how in software,
the word legacy is like this bad word. Nobody likes legacy code. Nobody likes to interact with
old code that nobody else can explain to them. Basically, I was arguing that in software,
most of the knowledge of how something works is stored in the people's heads, and it's not
actually in the code. And so if all the people die or leave, and a new team is forced to understand a million lines of code, it's going to take forever. They almost have to start over from scratch, or nobody's going to want to touch it because it's just so scary. And so I thought, or I asserted, but I think you helped show me I was wrong, that it's better in other kinds of engineering.
Right. So basically what happened was I responded to that.
And I was like, I'm not sure this is the case.
I don't, like, I'm not an engineer.
I'm not like an architect.
I don't know how it is in other engineering fields.
So, and then I'm like, what it looks like to me is that it's probably very similar to
how it is in programming where so much is in people's heads.
And what's not is basically thousands of pages of documentation.
And I'm like, does anybody else know this, who's done both? And if there's one thing I'm really good at, it's getting people who are really cool to follow me on Twitter. That's my main skill, more than anything else, that I think is valuable to me. I've just got a lot of really great Twitter followers, and a lot of them, it turns out, have been people who have done both programming and architecture, like professional architects and professional software engineers. And they're all like, yeah, they're about the same, they're both equally terrible at this. And then it basically got into this thing where a bunch of engineers from different fields were just chatting about how terrible documentation and legacy projects were in their fields.
Yep.
Well, so the conclusion that I took, which, again, this could be wrong,
but the conclusion I took was that potentially the reason why it appears to me that software is so much worse than other fields,
or, well, like you, I don't know anything about architecture,
but why software seems so bad just in general to me, knowing nothing about other fields, I think is an uncanny valley problem.
And the uncanny valley is that when you have graphics that aren't supposed to look like reality, there's an uncanny valley where, if you get graphics that look almost lifelike, they look really bad, because your point of reference is the difference, as opposed to not having a point of reference at all. So my new thesis is that programming is so good, or not quite so good, or has the potential to be so good, that all the places in which it's
not good are glaringly obvious. And so, just to spell that out, one of your amazing followers commented about how in architecture, observability is way worse than in software. For example, we take pipes and we put them behind walls where you can't see what's going on. If you want to see, you have to break down a wall, and then you have to do all this work to repair it.
And in software, Brett Victor will complain about how we don't have observability
and you have to run the code with console.log statements.
And that's a million times better than having to break down a wall.
So my thought is that...
But then, of course, we can do way better than console.log statements.
And it's kind of obvious how we can do better than console.log statements.
The imagination is so much easier.
When how to do better than pipes in the walls, I guess it's like putting sensors on all the pipes.
It's less obvious.
So maybe the reason why I complain about why software is so bad is because it seems like the improvements are easier.
You can see how it can be better.
Yeah, exactly.
Does that resonate with you more now, or do you still want to test that?
Yeah, I mean, I do want to test that more, but I also tend to agree. I tend to also get really frustrated all the time about how software could be better than it currently is. So I'm really biased here in your favor.
I do want to go to the Great Theorem Prover Showdown, because I think that also has a lot
of these meta concerns.
So I think, yeah, you'd probably tee that up better than I would.
I just want to start by saying I loved what you did there.
Thank you.
A fascinating kind of challenge.
So I'm actually writing a talk on this because somebody asked me to give a talk on this and
I think it would be a lot of fun.
But generally, so my general stance is as follows.
Let me just sort of tell you my stance that sort of justified this and like why I'm where
I'm coming across on this.
But um, so my general stance is that, well, first of all, at a formal verification level,
nobody's really sure what's best.
Like some people do the stuff with like highly procedural stuff for various reasons.
Some people do like functional stuff.
I know, for example, Liquid Haskell, like one of its major decisions that it made was it actually reduces the amount of stuff you can say and express in it to get deterministic solver times, which everybody's like, oh, that's a great idea.
Why didn't we think of that?
But so basically that's it's sort of like very fuzzy. And then my opinion is that outside of
like formal verification, it's also really fuzzy. And I think that what's the best language is like
sort of a, like sort of a straw man or a side topic for correctness, because I think as,
as I made the very strong claim multiple times, um, which language you use probably has less effect on like
your ability to write correct code than how much sleep you've gotten, which is a claim I will bet money on happily. For people listening to this podcast, we had a long discussion, that I think was cut out, about betting money on what you believe in. But in any case, one thing that I see happen a lot is that people sort of have this de facto position that functional is better than imperative.
Static types is better than dynamic types when it comes to correctness.
And I disagree with that for, well, I don't disagree with that core idea.
Like I don't know whether it's correct or not.
I tend to work primarily in imperative languages and slightly more often dynamics.
So obviously I'm biased towards those two.
But I firmly believe that we don't necessarily know which is quote unquote better for like
bugs simply because we haven't done really careful analysis of what we actually mean
by that and really careful like studying of that.
And most of the studies that we have done are inconclusive.
So my position is we
don't yet have enough information to really make a decision if a decision is even makeable.
Yet at the same time, I see a lot of people making these claims as in like, obviously true,
like that you're immoral if you use sort of like dynamic typing, or that functional is obviously
going to be more correct by definition than imperative. So my sort of, and I've often gone
to these arguments where I'm like, look at all this amazing stuff people have done with imperative or like dynamic typing.
That's correct.
And the response is usually, well, it doesn't count for reasons X, Y, and Z.
The most common one I hear is that, and the one that kind of triggered this was an argument
where I'm like, look at all this amazing stuff people have done with formally verified imperative
code.
And the response was that if they had done functional programming, they wouldn't have needed any of that verification; it would have been obviously correct.
So that kind of got my hackles going a bit.
And I decided to after this and a few other similar arguments, some very amiable, some
more like aggressive, I decided to basically say, okay, enough was enough, I want to actually
have everybody put this to the test.
And I wrote the theorem prover showdown, which was literally just saying, hey, it wasn't
even supposed to be a theorem prover showdown.
It was just, like, hey, if you really believe functional is going to be more obviously correct
than like imperative, write these three functions I
did imperatively, but also I formally proved them correct with these specific specifications.
You have to prove the exact same specifications in your functional code. And it was three
relatively simple things. One was left pad. One was getting the unique elements of a list.
And one was doing what's called the fulcrum, which is finding the point where you can cut a list in half to minimize the difference between the sum of the left half and the sum of the right half.
Minimize the absolute difference, not the raw difference.
Yep.
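For concreteness, here is what the fulcrum problem just described looks like in plain, unverified Python. This is a hedged sketch, not the Dafny version from the actual challenge, and `fulcrum` is an invented name for this illustration.

```python
def fulcrum(xs):
    """Return a cut index i (0 <= i <= len(xs)) minimizing
    |sum(xs[:i]) - sum(xs[i:])| -- the absolute difference, not the raw one."""
    total = sum(xs)
    left = 0                           # running sum of xs[:i]
    best_i, best_diff = 0, abs(total)  # cut at i = 0: left sum is 0
    for i in range(len(xs) + 1):
        diff = abs(left - (total - left))
        if diff < best_diff:
            best_i, best_diff = i, diff
        if i < len(xs):
            left += xs[i]              # advance the running sum to the next cut
    return best_i

# Brute-force check of the specification on one input: no other cut does better.
xs = [1, 2, 3, 3]
i = fulcrum(xs)
assert all(abs(sum(xs[:i]) - sum(xs[i:])) <= abs(sum(xs[:j]) - sum(xs[j:]))
           for j in range(len(xs) + 1))
```

The running-sum trick keeps it O(n); the point of the challenge was proving a specification like the final assertion for all inputs, not just spot-checking one.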
Which is itself an interesting question. So those are the things I posted. I did them in this language called Dafny, which is essentially designed to allow for separation logic in imperative coding. Very C#-like. And it basically
got like a bunch of different people coming in.
Like some people were sort of trying to argue that while they couldn't do it, here's arguments
anyway for why like functional is better than like imperative.
Some people were like, hey, let's try it, this is going to be easy to do. And then they're like, wait, this is actually really hard to do. Some people were like, it doesn't matter, I don't even want to bother doing this, it's too easy to bother doing. And by far the most interesting people, in my opinion, were the people who did: hey, I did it, here's my solution, here are my thoughts on this entire thing. Which was really cool, because a lot of people had really interesting insights. And we also got,
and this is when it started
to turn into a theorem
prover showdown
when actual serious luminaries
like Ranjit Jhala
and Edwin Brady
got on board
and started writing
explanations and proofs
in their own proving languages
they invented.
And that was really cool to see.
It was a lot of fun.
I think everybody
sort of had a blast with it.
I got banned from Twitter like five times, I think for unrelated reasons, but I don't know.
and I ended up doing a write-up
which got pretty popular
and it also led to this thing
that we should probably link called
Let's Prove LeftPad
which is a way that people can compare, for their own education, different theorem proving languages, where we just have a bunch of proofs of LeftPad in different styles. So it's the proofs that you're comparing. It's almost like TodoMVC, but for theorem provers.
Well, so the original challenge was really, can you do this? It didn't matter who you were, it didn't matter how long it took you. It was just, can you do this in a functional language? And then Let's Prove LeftPad was one of the things that came out of that, as a way of comparing and contrasting the different theorem prover languages. So out of my attempts to be a
huge jerk on the internet, something useful came out of it, which is kind of impossible on the
internet, but happened once.
Yeah, it is kind of cool, on a meta note, that part of what's so fun about this whole story is that in a series of tweets, you spawned all this wonderful internet collaboration about coding. But part of what I want to talk about is that I found the criticisms you made, about people who say that functional programming and purity is just obviously better, to be describing me to some extent. So I spent maybe an hour today trying to think through whether I'm being ideological, or in which ways I'm being ideological, and how I can do better. Like, is it a question of I'm just not explaining myself well enough, or being too dogmatic? Or is it that my beliefs actually aren't justified, and so I have to go and do some soul searching and think about what truly is reality versus, you know, religion? So anyways...
I guess one thing I want to clarify, it's fine to...
I can pause there and say thank you, because that...
Oh, welcome.
So it's fine to have opinions, it's fine to believe something. Like, I'm not saying it's
wrong to believe that, say, functional is better than imperative. My main criticism was more just,
people who say that as fact, as obviously fact, and get mad at people who don't believe the same thing. I guess it was more of an attack on dogmatism, or rabid dogmatism, than on beliefs. I mean, I have beliefs about programming too. I'm not going to force them on other people, but I recognize that I've got beliefs too.
Yeah, well, I guess part of why this really struck a chord for me, or maybe struck a nerve would probably be the right way to phrase it,
is I think I alluded earlier in this conversation to this essay that I wrote, which really was trying to be not quite a manifesto,
but an essay that's trying to explain the benefits of functional programming to people who already kind of like functional programming. This essay was trying to like show them how it could be,
you know, it has much wider reach than they thought, or it has much more generality than
they thought. So that was kind of the purpose of the essay. And I got back, basically,
the opposite of what you're talking about. Basically, everyone was like, wow, you're taking all these FP things to be true when they're clearly not. So anyway, it just struck me, the dogmatism, or basically the communication gulf between people, and religion and dogmatism. I'm not exactly sure where I land at this point, but I just know that it's on my mind.
That's actually a really interesting question.
And I'm going to once again refer to history here, if you don't mind.
Please.
So have you ever looked at the C2 wiki?
Yeah.
So one of the interesting things here is that there were similar, equally fervent language wars on the C2 wiki. It happened from around 1990 to, I think, 2010, which is when it was most updated. But they were about things like object-oriented versus procedural, or Smalltalk versus not Smalltalk, or Lisp versus not Lisp.
So it's really interesting because it seems like almost every time you get a community
that's different from the mainstream in some way, you get really fervent arguments about
which is better.
So I don't think it's anything particularly special about imperative versus functional.
It's more like between people.
Yeah, well, I would agree with that, that it's definitely not specific. I think it might just be that in the air today, functional programming is kind of having its heyday, but not everybody's on board with it, so that's kind of on everybody's mind. But in the past, yeah, it was object-oriented versus whatever else it had to compete with.
Yeah.
So I think one of the things you also see is that once you get a smart community, it tends to be a common thing where people only remember the assholes in the community. So when I think about imperative versus functional, I'm not thinking about the person who's going, well, I think this is better for coding for these specific reasons, and here are all these contexts where I think you're going to have this thing, with very careful compare and contrast, talking about different aspects of functional programming compared to different aspects of imperative. I'm going to remember the guy who's like, imperative programming is immoral, and if you do it, you're actually a bad person. That's actually one of the reasons I tend to be so very, very careful about what I say on formal methods, because I don't want to be the asshole who convinces everybody that formal methods is full of assholes.
Ah, okay.
Well, yeah, that makes sense that you feel like you're kind of a representative for a community.
And so you want to be extra careful about how you come across.
Yeah. And I think what happens is that functional versus imperative is so heated, not because it's actually that heated of an argument, as much as that there tend to be people on both sides who are just huge assholes about it.
Well, yeah, that's fair. I think one of my hunches is that it's related to what you were saying about a small community.
When you have a community of people who all agree with each other, the conversations within the community are very high-bandwidth conversations, because we don't have to explain why we believe a certain thing.
We all just have common terms and beliefs.
And so when someone who exists in this wonderful community leaves and has to talk with someone who's not in that community, I feel like that's where you get into trouble.
And the internet is ripe for that because it both fosters tiny communities and also is totally open and anyone can just stumble into a place where they don't know what the words mean. Yeah. And that's also the reason I tend to obsess over like really, really strong and very precise, like clear arguments, because like, I feel like those
are the arguments that are best at convincing people. Like it's easier to sort of explain to
a person why I say recursion is good if you talk about walking a file system than if you talk about
Fibonacci.
Yeah. Well, I think you had a really great post on your blog about different kinds of examples: examples where you're trying to explain things versus convince people of things.
Yeah, I should probably update that post.
I think that's a really good post for me, actually,
because I was writing a post to explain a concept,
and everyone who read it thought I was trying to convince them. And they were like, this article does not convince me. I am very much not convinced. I was like, this, like,
I don't know how to like, what should I put in the header to be like, don't read this if you already
disagree with these things. Read this only if you already agree and want to like understand something
new. Honestly, just that. That's, I'm actually working on this piece about adversarial modeling in formal methods.
And it's pretty much going, it's like,
hey, this piece assumes you already know TLA plus
pretty well.
It's not gonna try to convince you that it's a good thing.
It's talking to people who already wanna do it.
So just letting you know.
Yeah, okay.
So yeah, I think then, there you go. I just need a...
Disclaimer.
Yeah, disclaimer at the top.
Yeah, like you're welcome to read it either way, but...
Yeah, yeah, that's good.
Okay, well, this was very fun.
I like at the end to give my guests an opportunity to pitch or list any ways they want to be interacted with on the internet,
any things they're working on they want help with, anything like that that you want to
share now is the time.
So I guess the first thing is that the easiest way to contact me is I'm on Twitter, @hillelogram.
We'll probably include that link.
And I also got a website, hillelwayne.com. I'm really lazy about thinking about names. I'm just like, you know what? That's my name, Hillel Wayne, done. I got the website.
So yeah, either of those are great ways
to like find or contact me.
If you work for a company that's interested in this stuff,
I do consulting.
Okay, actually, here's one thing to pitch.
So one thing I do really enjoy learning about
is both really weird software niches
and really weird like
and like really interesting aspects of software history so um if you end up having one of those
two that you're just burning to talk about feel free to dm me on twitter i love getting this stuff
Great. Well, thanks so much for taking the time to do this. It was a lot of fun.
Oh yeah, thank you so much for having me on.
All right. Bye.
Bye.
If you enjoy these
conversations, I bet you'd fit right into our online community. You can join our Slack group
at futureofcoding.org slash Slack, where we chat about these topics, share links, feedback,
and organize in-person meetups in various cities, New York, London, now San Francisco, Boston,
and also Kitchener-Waterloo in Canada.
If you'd like to support these efforts, there are a few ways you can be helpful.
I would really appreciate any reviews you could leave for the podcast, wherever it is that you
listen. If any episode in particular speaks to you, I would encourage you to share it with your
friends or on social media. Please tag me if you're sharing it on Twitter, @stevekrouse, so I can join the conversation. I also accept support directly on Patreon at patreon.com/stevekrouse. And as always, I am so thankful for the ultimate form of support, your constructive
criticism. So please don't be shy with your advice. Thanks again for listening and I will
catch you on the next episode.