The Changelog: Software Development, Open Source - Thinking outside the box of code (Interview)
Episode Date: August 9, 2023

Leslie Lamport is a computer scientist & mathematician who won ACM's Turing Award in 2013 for his fundamental contributions to the theory and practice of distributed and concurrent systems. He also created LaTeX and TLA+, a high-level language for "writing down the ideas that go into the program before you do any coding."
Transcript
This week on The Changelog, I'm joined by Leslie Lamport, a computer scientist and mathematician
who won ACM's prestigious Turing Award in 2013 for his fundamental contributions to
the theory and practice of distributed and concurrent systems. Leslie also created LaTeX and TLA+,
a high-level language for writing down the ideas
that go into the program before you do any coding.
Quick thank you to our partners at Fastly
for shipping all of our shows super fast all around the world.
Check them out at fastly.com.
And to our friends at Fly.
Deploy your app servers and database close to your users.
No ops required.
Find out how at FLY.io.
What's up, friends?
I'm here with Jeremy Tanner, dev rel at Tailscale.
And I've been using Tailscale for the past, I want to say the last year.
And I consider it critical home lab infrastructure for me.
As you may know, I'm a home labber, so that means I take pride in a solid home network,
Wi-Fi, services that make the home run, home assistant for automation, lighting.
And of course, I run PiHole to block ads at the network level and to protect my family
from ad tracking. And PiHole has this web UI that you can access from the LAN. But for a while
there, I just didn't have access to this service or other services like Portainer or just anything
else that had a web UI or something I would access via IP on the local area network. But
Jeremy, tell me about a world where via Tailscale,
you can access PiHole, Portainer, or any of these other services you might run in your home lab,
and access them externally, outside your local area network.
Yeah. You hit it when you said anything,
anything and from anywhere. And so PiHole, being a DNS black hole, if you notice the difference
between the comfy experience that you had at home
and when you were out of your house, whether it was for work, a coffee shop, or even on your
mobile devices, being able to set up Tailscale on your mobile device and have access to the
same PiHole and filtering rules that you do at home, wherever you have an internet connection, is a thing.
Same thing when talking about streaming.
Instead of exposing your network storage to the internet and getting that ransomware,
keeping that on your private network, but extending the private network to any device that can run the client.
And so having anything that can run the client includes your Steam Deck. That's just a Linux box with the controllers grafted onto the sides of it.
And so if you're somewhere that has an internet connection and you want to watch some of your video,
plug it up to a large TV. And if you're in a hotel room, use that as your media center,
that's playing the shows that you're trying to catch up on from home instead of
whatever hotel TV has on it.
Instead of buying a service or paying a small monthly subscription, you have compute
and storage at home. It's just an easy way to get back to those instead of usually inventing a bunch
of things that you'd love to do. But those sorts of things become accessible to you on your phone
while you're outside the house instead of something that's neat to use, but then you miss while you're away.
Yeah, for sure. You know, I run Plex, so I'm in big agreement with that, of course.
But just being able to do what you can do with Tailscale is so cool without having to
learn all the VPN things. It's just such an easy tool to use, easy tool to install
on anything really: Linux, Windows, Mac. So easy to use.
Big, big fan of Tailscale. And for my home lab users out there, you can totally use Tailscale 100%
for free. You get three users and up to 100 devices, which is like way more devices than I think I
would ever need to use Tailscale on. The free forever tier is amazing and a great place to start,
especially to use it for your home lab needs alone. Check them out at changelog.com/tailscale.
The free forever tier is waiting for you to try it out. Again, changelog.com/tailscale,
and make sure you tell them your friends at Changelog sent you.

All right, I'm here with Leslie Lamport,
a distinguished scientist at Microsoft Research,
creator of LaTeX, not the rubbery substance,
but the documentation system that you've probably heard of or used.
It's used in academia.
It's used in scientific circles, math circles, and a Turing Award winner. Leslie,
thanks so much for coming on the show.

Thank you. You might want to try that again and pronounce it either lah-tech or lay-tech.

Ah, I was reading about this controversy, the pronunciation controversy. So as a layman, I see L-A-T-E-X, and I do think of that rubbery substance. But the initial part was TeX, right? It was supposed to be "tech" as the sound. And your take on that is lah-tech. That's how I should say it?
Or lay-tech.
All right.
Either one.
Now, where do we get "tech" out of T-E-X? That's the curious bit.

Oh, that's because LaTeX is built on top of Don Knuth's TeX typesetting program. Okay. And Knuth called it "tech," even though it was T-E-X.
Is that right? Yeah. He says it's the Greek pronunciation of the original word. Not speaking
Greek, I wouldn't know. Ah. Well, you seem to have some Greek influence though, right?
Paxos, a lot of Greek mythology going on.
Is that an interest of yours?
No, it was just, well, for the Paxos paper, it was just something that seemed like a good idea
at the time.
But before I get away from that, I should say that Knuth did all the heavy lifting in
building TeX.
You know, that was a real remarkable thing that he did.
What I did was, you know, in comparison, quite simple,
just building something, a macro package on top of it.
I see.
So Knuth did all the heavy lifting.
However, in terms of the tooling, like it's still used.
It's still prominent.
Is it still the majority tool that is used in those communities
for specifications and writing, or is there something that's usurped it since?

No, I think it's still, in math and physics at least, it's still the standard. I don't know about other fields. Some fields it is; some fields, you know, I'm sure chemists have never heard of it.

Right. Well, as a working software developer, I say I definitely heard of it.
I don't know it well enough to pronounce it correctly, as you can tell.
But I hear it talked about often.
Some people love it.
Other people use it in anger.
But everyone seems to use it.
It still seems to be the best.
And to me, what speaks to me about it is the separation of what you're saying versus how it looks.
That's a key aspect, right?
Oh, yeah.
That was why tech needed something on top of it.
And Knuth intended for things to be built on top of it, but I'm not sure he expected
so much of the underlying tech to be hidden from the user.
By the way, I should say that the whole idea of basically separating the ideas from the typesetting, that was done by Brian Reid in the Scribe system.
I used Scribe, and it seemed perfectly obvious to me that that was
the thing to do. One of my failings in my career is building on things that
seemed natural, seemed obvious to me, without realizing that there were new ideas and that they were significant ideas. I think a prime example of
that is in my Time Clocks paper, where it was inspired by a paper from Paul Johnson and Robert
Thomas, where they had an algorithm for something, and I realized that their algorithm
wasn't quite the right one because it basically violated causality. You could have something that
caused something else, but the things were ordered as if they happened in the opposite direction. And so, one of the two significant contributions of that paper
was pointing out this notion of causality, which again, seemed very straight, very obvious to me.
But what they had done, they had used timestamps, added timestamps to the messages. And I just
assumed that that was a well-known idea that people did that and sort of
not even worth mentioning. But as it turns out, I think they were actually the first people who did
that because their paper is not as well-known and mine was. People have credited me with inventing
adding timestamps to messages.
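For readers who want the concrete rule from that paper, here's a minimal sketch of a Lamport logical clock in Python. The class and method names are our own invention; the paper's rules are just: increment the counter on each local event, stamp outgoing messages with it, and on receipt advance the counter past the incoming timestamp so a receive is always ordered after the send that caused it.

```python
# A minimal sketch of the logical-clock rule from the "Time, Clocks" paper.
# Names here (LamportClock, local_event, etc.) are our own, not the paper's.

class LamportClock:
    def __init__(self):
        self.time = 0

    def local_event(self):
        # Every local event advances the clock.
        self.time += 1
        return self.time

    def send(self):
        # Sending is an event; the message carries the new timestamp.
        self.time += 1
        return self.time

    def receive(self, msg_timestamp):
        # Advance past the message's timestamp, so this receive event
        # is ordered after the send event that caused it.
        self.time = max(self.time, msg_timestamp) + 1
        return self.time

a, b = LamportClock(), LamportClock()
b.local_event(); b.local_event()   # b has already had two local events
t_send = a.send()                  # a sends a message stamped 1
t_recv = b.receive(t_send)         # b's clock jumps to 3
assert t_send < t_recv             # causality respected: send < receive
```

The point of the `max` in `receive` is exactly the causality fix described above: without it, a receive could carry a smaller timestamp than its send, ordering the effect before its cause.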
So a couple of themes here.
The first one being you building on the work of other people who've come before, and maybe getting, in some sense, some of the credit by accident, which you deflect, as I can tell.
But also this idea of your ideas, which you've put into papers and have been read now and implemented around the world, seeming obvious to you, but not necessarily to the rest of us, right?
You find it hard to explain your ideas to people or to bring them along with you?
Yeah, because I just expect more of them than it's reasonable to expect of most people. Fortunately, I've had some really
smart colleagues who are smarter than me, in fact. And I think if I've been more successful than them,
I think it's because of my different way of looking at things, not by being brighter than them. So, you know, I'm not, you know, I'm not a genius.
You know, I'm not in the, you know, in the league with, you know,
people like Einstein or Feynman or, you know, I can name, you know,
any number of people.
If we were to focus in on the looking at things differently than other people,
can you describe that?
Like what your angle is or how you see the world that's askew or that's tangential to maybe the way your colleagues were looking at things? What's that different perspective?

I think the reason I look at things differently from other people is because I think like a mathematician and I approach things mathematically, not computationally.
Can you give a for instance, like maybe in a basic case of a situation where maybe me
as a programmer would approach something imperatively or computationally, step by step, whatever
it is, and you as a mathematician,
maybe we would look at it differently. Because I can't look at it your way, I want to. And so,
how do you look at things? When you say, I think about it in math, to me, I'm just like,
you're going to add numbers together? Like, what do you mean exactly?
Yeah, it's funny. I don't have a ready example, except something that happened fairly recently with a colleague
who asked me how to define something.
And he said it was really hard to write this definition.
And he was defining some property of sequences, actually infinite sequences.
And it seemed really easy to me.
So I wrote the definition. And what blew his mind was that it was like a recursive definition, just like he was trying to do. That is, you define it for n equals one, and then, given the value for n, you define it for n plus one. But the value that you assign to n plus one depends on the complete rest of the sequence,
infinitely far past n plus one. And that's perfectly obvious to a mathematician.
But because he's what I call trapped inside the box of computational thinking, that never occurred to him because you can't compute something by looking infinitely far ahead to find the next value.
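As a hypothetical illustration (not the colleague's actual definition from the anecdote), here is what such a definition can look like: it has the shape of a recursion on n, yet the value at n + 1 depends on the entire infinite tail of the sequence x.

```latex
f(1) = x_1, \qquad
f(n+1) =
  \begin{cases}
    f(n) + 1, & \text{if } x_m = 0 \text{ for infinitely many } m > n+1,\\
    f(n),     & \text{otherwise.}
  \end{cases}
```

This is a perfectly good mathematical definition, but no program could compute f(n+1) this way, since the condition quantifies over infinitely many future values.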
Right. That's where I'm sitting right now.
But he wasn't trying to compute it. He was trying to write a definition.
A definition of the problem or of the solution?
No, just a definition of some property of infinite sequences that he wanted to define.
I see.
I mean, you know, this is, you know, he was a theoretician, you know, working as, you know, this is theoretical computer science in some sense.
Right.
It's talking about, if you like, properties of programs rather than how to write programs. So you represent the execution of a program as an
infinite sequence of either states or steps, depending on how you want to do it.
And then you can talk about programs scientifically by talking about those
sequences mathematically. Gotcha. So this kind of is a pattern or at least echoes what we talked
about earlier with Knuth's TeX and subsequently LaTeX, which was the separation of concerns, right? In
that case, it's from the ideas and then the display of the ideas, the typesetting. You're
talking about separation of concerns, right? You're talking about specifying the ideas
as separate from the implementation, right?
When you say programming,
I know in the Quanta video,
which we'll link up in the show notes as well,
that featured you,
you made a distinction between coding and programming.
This idea that, okay, there's the idea
that you're trying to express
because what good is a program
if it doesn't do something
or express something or change the world somehow
or provide some sort of value?
So you have the idea and then you have the implementation.
And many of us, and I hold myself in this category,
they're kind of munched together
and kind of like grow out of the same milieu.
But the way you look at it,
the idea, the program is a separate thing altogether from the implementation.
Is that correct?
Am I thinking about it the way that you think about it?
Well, yes, exactly.
I mean, I wouldn't say it's separate.
Well, one informs the other.
Well, the thing I'd like to say is coding is to programming what typing is to writing.
I mean, obviously, it's somewhat hyperbolic. There's a lot more
intellectual effort that goes into coding than into typing. But I think, actually,
before the days of computers, the analogy would be writing and typography. Typography is an art. A lot of effort goes into designing a book,
but that's completely separate from the writing of the book, the ideas, what the book is trying
to express. But it's not completely separate, because good typography will reinforce the ideas.

So in the coding and programming aspect of it, you can't think about what the program is going to do without an understanding of whether you can code it, right? You have to understand what you can do with code.
Actually, there are two things that you should do before you start writing code.
The first one is to figure out, to decide what the program is supposed to do.

And, again, you can think of that in terms of what you have to tell the user of the program what the program is supposed to do. Now, that program very often is something
that's code to be used by somebody else in his program or her program. And I have a very simple
criterion for whether you've done this properly: you should
write this description of what the program is supposed to do in such a way that someone
can use it without having to look at the code.
And the idea that I should be able to give you something that's a bunch of code and you should
be able to use it without having to look at the code is foreign to programmers. It's obvious. It's completely obvious to me. And I think it's
completely obvious to a lot of software engineers, but it doesn't seem to be obvious to programmers
because they're not programmers, they're coders. And if all you think you're doing is writing code,
well, of course, somebody has to read your code to understand what the code does. But that's at the heart of so much of what's wrong with software being built today.
So you're referring to like a formal specification.
It doesn't have to be formal.
Well, it has to be usable without the code, and you can do that informally. By formally, I mean, you know, writing it down in mathematics or in a language like
TLA+.
Yeah, that would be formal in my mind.
I mean, I do this whenever, you know, if I'm building a class
and, you know, there's some method in this class, I will write a description of
what that method does. And in 95% of the cases,
it's just English, just pure prose. Maybe a formula or two tossed in. Occasionally pseudocode,
usually not, but sometimes perhaps pseudocode, whatever works. But it has to be precise enough that somebody can use the code without having to read it. And as you write it, you'll know whether what you're writing is making sense.
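As a sketch of that practice (the function is our own invented example, not one of Lamport's), the description can live in a plain-prose docstring that is precise enough for a caller to use the method without ever reading its body:

```python
# Illustrative only: the docstring is the "description of what the method
# does" -- plain English with a formula tossed in, usable without the body.

def running_max(xs):
    """Return a list ys of the same length as xs such that
    ys[i] == max(xs[0], ..., xs[i]) for every index i.

    Requires: xs is a non-empty sequence of mutually comparable values.
    The input is not modified.
    """
    ys = []
    best = xs[0]
    for x in xs:
        best = max(best, x)   # best == max of the prefix seen so far
        ys.append(best)
    return ys

assert running_max([3, 1, 4, 1, 5]) == [3, 3, 4, 4, 5]
```

Note that the docstring states a property of the result, not a recipe for computing it; that is the spec-versus-code separation being described.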
If you want to understand something, you've got to write it.
Because there's this cartoonist, Guindon, who has this wonderful caption of a cartoon, which is: writing is nature's way of showing you how fuzzy your thinking is.
I like that. I had this one example.
There's one certain aspect of writing a parser for TLA+.
Somebody was doing it, and he asked, how do you compute this little condition of some check of correctness of the specification?
And I thought, oh, well, this is quite simple. I would
write down a piece of usual informal English, a little math. And the more I started writing,
the more tricky I realized it was. And I decided that I needed to write a TLA+ specification
of it. Fortunately, the person who was building the parser knew TLA+, so I didn't have to explain the TLA+ to him. And he coded it
from the TLA+. And I actually sort of asked him about that, since TLA+ is a lot different from
the Java he was coding in: how much effort was that? And he said, oh no, it was really
straightforward. And actually, at some point later, I had to modify the parser, so I looked at the code.
And it was clear. It was just completely obvious: you could look
at this TLA+ specification, and the relation to the code was completely obvious. It made it
very easy to modify the code. And the wonderful thing about that code is, about once or twice a
year for the next several years after he built that parser, I thought, oh my God, here's a special
case of this. Oh, I bet the parser doesn't get it right. And every time I
checked it, the parser did the right thing. You know, that stuff really works.
Okay. So this process that you're describing, I guess it kind of matters if we're talking about
it in the small, like at a function level or down in the details versus like large systems thinking. And we can, we can discuss that,
but it's certainly counter-cultural to the way that software generally speaking is developed
in the world today. I mean, we have agile methodologies, we have move fast and break
things. We have the only constant is change, you know, head west, young man. I mean, there's lots of like, let's get coding.
Coding is thinking.
A lot of these axioms that describe how we go about building software, in many cases
today, are like exactly opposite of what you're describing.
We don't say you can't think about it, but we say kind of you think as you go.
You figure things out.
The only thing we know now is that we're going to know more later.
We know the least right now.
Specs change.
Requirements change.
The world changes.
And so it is quite a different way.
Well, yeah.
I mean, first of all, you're right.
The world's changed.
If you build a program and it's really used a lot, you're going to want to modify it at some point.
And the real world is you're not going to go back and start from scratch and redo everything.
You're going to start patching it.
And we've seen that happen.
And as it goes on, that program gets harder and harder to deal with, harder and harder to modify.
And eventually, either the code is going to be thrown away,
or it's going to be rewritten from scratch. Entropy, that's the law. So, you start patching things,
it gets sloppy. But if you don't do this design right from the beginning, then every piece of
the code you write is a patch, and you start out from day one with a mess.
I love this quote of Eisenhower, who said,
no battle was ever won according to plan, but no battle was ever won without one.
I like that.
Okay, that's compelling.
You start off with a mess.
Every line of code you write is a patch.
I think there's wisdom in there.
You might want to go back to what you said about, you know, in the small versus in the large.
Yeah, let's do that.

The only difference in the small versus the large is that, in some sense, the bigger something is, the more formal you have to be, because it's going to be more complicated. So a lot of the users today are
people in Azure, basically the web services of Microsoft, and Amazon Web Services, that's
also a big user. They understand how hard it is to get something right, because it's big and also
because it involves concurrency. Once you're dealing with something big, it involves concurrency. And that adds a whole enormous complexity to things.
But it also works at the small level. But the people who write the small pieces,
they don't understand that yet. The people who write the web services understand it. My colleague, Markus Kuppe, who's the one who is currently basically
maintaining the TLA+ tools, he has this wonderful talk, which is titled,
Weeks of Debugging Can Save You Hours of TLA+. People will just say, oh,
using this formal stuff is just so hard. It's just so much work. You know, I don't need to do that. And it's not a good attitude, especially if you're dealing with concurrency. If there's concurrency involved, you just can't fly by the seat of your pants. You know, you're not going to get it right.

I guess one of the perhaps tragic, sad but true things is that oftentimes, based on this description, we don't know if we need these tools.
We don't know if we need this upfront design, the formalized spec, the math stuff, until we're already feeling the pain down the road.
Maybe it's two weeks.
Maybe it took us half my career to realize where the source of all these bugs
is, right? Because I think as an industry, you know, as pragmatic workers, like, bugs are
just part of the game. I mean, we've kind of resigned ourselves to the fact that our
programs are going to be wrong, haven't we? But with your tools, it sounds like you can
actually prove that they're going to be right before you start to write.
Well, there are two kinds of specifications you write.
One is what the program is supposed to do.
That's not telling you anything about coding it.
Then the other spec is how the program does it, which is a high-level view of how the code works. Now, the way I like to think about it is
a high-level specification. Think of it as an algorithm and think of the code as an implementation
of that algorithm. The only difference is that you think of things as algorithms because you
read them in algorithms books, because they're algorithms that are useful for lots of things.
But here you're arriving at an algorithm that's just good for that particular problem you're solving.
But getting the algorithm right doesn't imply that you're not going to make coding bugs.
But if the algorithm is wrong, doesn't matter how good your code is, the algorithm is wrong. So it's eliminating
one class of bugs. But it's a class of bugs, and especially with concurrency, that testing is very
poor at finding, because the bug can occur depending on the relative speeds at which two things are happening.
And so you could have some system that you've tested the hell out of and it's been running
great for five years. And then you do something that changes some piece of hardware that makes
it run at a different speed than some other piece of hardware and the bug will appear.
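A tiny sketch of why such bugs evade testing: with a non-atomic read-then-write counter, only some interleavings of two incrementing threads lose an update. The toy scheduler below is entirely our own construction; it enumerates every interleaving, the way a model checker would, instead of hoping a timing-dependent test happens to hit the bad one.

```python
# Two "threads" each increment a shared counter non-atomically:
# read the counter, then write read_value + 1. We exhaustively run
# every legal interleaving of their steps.

from itertools import permutations

def run(schedule):
    counter = 0
    local = {}                      # each thread's privately read value
    for (tid, op) in schedule:
        if op == "read":
            local[tid] = counter
        else:                       # "write"
            counter = local[tid] + 1
    return counter

steps = [(0, "read"), (0, "write"), (1, "read"), (1, "write")]

def valid(s):
    # A thread must read before it writes.
    return all(s.index((t, "read")) < s.index((t, "write")) for t in (0, 1))

results = {run(s) for s in permutations(steps) if valid(s)}
print(results)  # both outcomes are reachable: {1, 2}
```

The sequential schedules give 2, but the schedule where both threads read before either writes gives 1, a lost update that a test suite may never reproduce because it depends only on relative timing.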
But also there are a lot of debugging tools for finding bugs in code.
They're not good at finding bugs in algorithms. So it's not that I'm saying, oh, coding is no
problem, there are not going to be bugs. I'm saying, get the algorithm right before you code it.
Yeah.
And for that, there are tools that are just as formal as code, and ways of testing it that are just as good as those tools at finding bugs in code.

That's like what you said earlier about the plan. And so your plan is your algorithm, right? There's your thing that you're going to
make sure you have right in the first place. And just because you got the plan right, doesn't mean
you're going to execute it right. You can still make mistakes later, but if the plan is wrong
from the get-go, well, the entire endeavor is a waste. Yeah. And when you do find, I mean,
if you do, you know, in testing, find that it's wrong, it's a hell of a lot more work to fix it
than it would have been
had you found it when you were writing the algorithm.
Yeah, the sooner, the faster you can find those problems, the cheaper they are
to route around. So this whole thinking in algorithms, this is something you say very
few programmers do: think in terms of algorithms. And I think you're right about that. But my question to
you, for me, is how would one begin to change this predilection? Like, how would you, you know,
take me and help me start thinking in algorithms before I'm thinking in programs?
Well, what I do, you know, for engineers, is I force them to think mathematically. I'm not the only person who
talks about the need for specifications, but I think just about everybody
else wants to make writing the specifications easy for the programmers. So they write their
specifications in what looks like a programming language. And I'm very hard-nosed in this.
You don't want to do that.
I want to get people to think outside the box.
And people say that all the time, but they never say what the box is and everything.
What is the box?
But here I can tell you exactly what the box is. The box is thinking programmatically as expressed in what I'm now
calling coding languages, instead of thinking outside of that box. And the way to think outside
of that box is to think mathematically. So a TLA+ spec is a formula, a real mathematical
formula. It may be hundreds or even thousands of lines long,
but it's a formula. And you say something to people, oh, a thousand line formula,
how could I possibly understand that? You're giving me a thousand line program. Oh, that's
trivial. Well, the math that's being used is simpler than programming languages. When people
give a semantics to a programming language, they translate it into math.
And that math is pretty complicated.
But the language TLA+ is really simple in the sense that it is math, and math is really
simple.
But math is incredibly powerful.
And you have to think outside the box.
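For a feel of what such a formula looks like, here is a short hour-clock specification along the lines of the one in Lamport's book Specifying Systems (reproduced from memory, so treat the details as approximate). The entire spec is the single temporal formula HC:

```tla
---------------------------- MODULE HourClock ----------------------------
EXTENDS Naturals
VARIABLE hr                                    \* the hour display

HCini == hr \in (1 .. 12)                      \* initial condition
HCnxt == hr' = IF hr # 12 THEN hr + 1 ELSE 1   \* next-state relation
HC    == HCini /\ [][HCnxt]_hr                 \* the spec: one formula
==========================================================================
```

HC says that hr starts in 1..12 and that every step either advances the clock or leaves hr unchanged. That's the whole behavior of the system, written as one mathematical formula.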
You have to really learn to use that power.

Many of us get educated here in these United States, as I was, and go into the industry and begin our work.
Math is part of it, but it's not like a main part of it.
It's like it's there, you know, you take calculus, et cetera.
But, you know, we don't have a lot of math.
I don't have a lot of math.
And I've been a working developer for 20 plus years.
And I don't have that much math.
And also, the math that you learned hasn't been taught very well.
Well, a lot of the stuff I learned wasn't taught very well, but that depends on your school, I guess.

No, I say that simply because you've learned about sets and functions, right? Everything I write in a spec is sets and
functions, plus arithmetic. I presume you do pretty well with arithmetic.
I got that part down, yeah. You've learned what sets and functions were,
but you haven't learned how you would use
them to describe what your program is supposed to do. And that's the part that you have to learn.
That's the part that you're forced to learn when you start writing in TLA+. Now, I do what I
can to educate, but as I've said, I'm not that good as a teacher. You know, I have sort of a video course on TLA+,
and some people like it.
I don't know.
I probably only hear from the people who like it.
But, you know, one way you learn is by just reading other people's specs.
Are there open source TLA+ specs out
there that people can read?

Oh, yeah. Go to the TLA+ website. There's a GitHub repository with
a few dozen specs there. One of the things I hope to do before I retire, after my current project, which is writing a book, is to
start organizing a set of examples that will help people learn. Whether I
will do that, or again, whether I'll finish the book, those are unknown questions. And as I like to say,
I'm no better at predicting the future than anyone else,
even if it's my future.
Well, if I might ask you to predict the future a little bit,
I just noticed, I mean, you've been in industry,
you've been at Microsoft Research, of course.
So in the academic circles, in the industry circles,
for many, many years, we've had recent shows.
We interviewed, we spoke with Kelsey
Hightower. He just retired from Google at 42. Doesn't mean he's retired like from anything,
but he's just retired from Google at 42. Steve Yegge just unretired at 54. So he retired at 52,
unretired, went back into the workforce. And you're at 82. You've had a long, illustrious career.
I mean, lots of productive work.
Do you have plans on hanging it up?
Are you going to work until your time on Earth is ended?
What are your thoughts on that?
Because you have a book in the works
and you want to do something after that.
No, it's not going to be too long.
Not many years.
I'm not ready to announce a retirement date,
but it's not going to be too far off.
I'm here with Lazar Nikolov, developer advocate at Sentry.
Okay, let's talk about your live streams.
You're constantly building something and live streaming it.
Give us a peek behind the curtain.
What do you do in your live streams?
What can we expect if you watch?
Yeah, so at Sentry, that's even how I started.
I started to build a mobile application for tracking expenses in four different frameworks,
because I wanted to, you know, explore
basically the DX of the four most popular frameworks: SwiftUI, Jetpack Compose, React
Native, and Flutter. Then I moved on. During October, of course, we did Hacktoberfest, where we
tried to fix a bug in the React Native SDK, and that was really cool. And what else? For example,
right now I'm streaming, and I'm usually streaming on YouTube. I started building a really cool project that I want to call
the Errorpedia. So it's like a Wikipedia of errors. So my idea was to just build a website that
enumerates famous frameworks, commonly used frameworks, and what errors can be encountered within those frameworks,
with a little bit of explanation of why they happen and also how to fix them. I had an interesting
choice of technology: Astro for the whole website, because of its ability to embed React
components or Vue components or Solid or Svelte components, and these are frameworks that I want
to cover the errors from. So, like, the whole website,
the whole doc site, would be just Astro and Markdown, but when an interactive example needs to happen,
I'm just going to export that from a package in the monorepo. So that was interesting. And I started
building that, and it put me into this mode of thinking about errors. I was like, okay, we can
do these errors, and we can do these errors. I started to compile a
list of errors that we can document, and I started thinking, you know, what about errors that
don't necessarily break the page? Yeah, I mean, they break the page, but they don't produce
errors in the console, right? There could be responsiveness errors, right? Like mobile or
tablets, something like that. Something gets pushed off the screen. There's like an overflow hidden on the body.
You can't really access that, you know?
So it breaks the flow, the operation flow for the user,
but it doesn't produce anything in the logs, right?
Maybe there's, maybe we're talking about, I don't know,
Firefox or Safari CSS issues.
Because we know, especially Safari,
when you're building something for Safari,
those who do front end, things usually break,
but they don't produce an error.
So I was thinking about that, and I was like, okay, we have all the tools in Sentry.
So, yeah, that's what I'm doing right now.
I'm streaming, building that widget that lets you, you know, start the recording and send it to Sentry.
Okay, if you want to follow along with Lazar, you can follow him at youtube.com/nikolovlazar. We'll put the link in the show notes, but it is youtube.com slash N-I-K-O-L-O-V-L-A-Z-A-R.
Lots of cool live streams, lots of cool videos.
Check them out.
Again, the link is in the show notes.
Also, check out our friends at Sentry, sentry.io slash changelog.
Use our coupon code to get six months of the team plan for free.
That's so awesome.
Six months free.
Use the code changelogmedia again.
Sentry.io.
That's S-E-N-T-R-Y.io.
What's changed?
When you first got into the game, what's changed and what stayed the same?
You've been in the industry a long time.
Certain things change.
As the song goes, the more they change, the more they stay the same. What's different today, 2023 versus when you were first in computer science and then what's still the same or at
least echoes? Well, the big change since I started and I started programming in 1957
is that now nobody can understand programs.
Okay.
I mean, you just have to build things on top of stuff that you don't understand.
Things have just gotten so complicated because, well, I like to say,
things have advanced by evolution and not by intelligent design.
Ah.
And that works great if you've got a million years to do your designing. But, you know, when you're doing it over a matter of decades, it doesn't work too well.
It's certainly been compressed. I agree with that.
I'm sort of amazed that there hasn't been a...
Extinction-level event?
A really serious one. I suppose there are, well, I guess you sort of see some of them. You read about the state of California. I forget what it was, whether it was for their Department of Motor Vehicles or something, a $100 million project that just failed
and was just abandoned after a few years. So, you know, it happens on that scale.
On the other side of the coin, you know, we have young people today who can pick up a computer and a programming language like JavaScript or like Swift, if you're talking on iOS.
And in a matter of days or weeks, you know, they can go from never having written any software
to deploying a game or an
idea of theirs into the world, and that's kind of cool. I mean, I agree that every layer of the
stack has made it more complicated and more fraught, and we understand less and less.
but you can do a lot with a little, even though you're actually standing on a lot, and that is shaky ground.
But you can get a lot finished.
Yeah, well, I mean, coding languages have improved a lot since I started.
That's not difficult, since I started in machine language.
But yeah, there's a lot of good things in programming languages.
But TLA+, that's the main thing that you're here to talk about,
the main thing that you want to pass on to our listeners.
Well, not just, I would say not TLA+,
I would say, you know, thinking outside the box of code.
TLA+ is one way of doing it.
You know, it has its domain of applicability.
The place where it shines is when you're faced
with problems with concurrency. But that's not the only way of thinking above the level of code.
Well, at the top, I mentioned this Turing Award, which you won in 2013 for your work
with distributed systems. I want to ask you about the bakery algorithm. That seems to be both your favorite and probably the one that you can explain to our listeners without,
you know, multiple hours. I'm not going to ask you to go through the Paxos paper,
for instance. But when it comes to distributed systems, you know, I've been thinking about this
a little bit. And I'm wondering what your take is on the thought of when a system becomes
distributed. Like if you were to draw a line in a system going from like a very simple processor to all the way up to like the World Wide Web, somewhere in there it got distributed.
And I'm curious where you think that threshold is of when a system becomes a distributed system.
I think I would probably say that distribution is in the mind of the beholder.
Okay. I mean, this laptop that I have in
front of me, I view it as just a single system. Yeah, it's all right there. But to the
hardware designer, this is a distributed system. You have all these chips talking to one another,
and even inside the processor chip, I mean, that processor chip might have been designed with somebody who was using TLA+.
People at Intel used to use TLA+ when designing their chips; I don't know if they still do.
But in particular, the memory of the chip is a distributed system. I mean, it really is messages going back and forth from one part of
the chip to another part of the chip. There are two levels of caches and the main memory, and
that's a distributed system. The distinction I make is between traditional programs and
concurrent ones. In traditional programs, you have one process that's just doing something,
computing an answer, and it stops. And in a concurrent system, you've got multiple independent
processes acting, and they're not supposed to stop. They're just supposed to keep doing
something. And so how you describe what they're supposed to do is very different from how you can describe what a traditional program that just takes some input and computes an answer does.
That's very simple.
You can just describe that as a function.
But I spent a lot of time thinking about how you describe what a concurrent system is supposed to do.
And that's built into TLA+. I mean, not built into it,
but in the sense that there's no language for distribution. There's nothing distributed about
it. You look at it, it's just math, but the math was designed in such a way to make it possible,
and as easy as possible, to say what a concurrent system is supposed to do and, at a high level, what it does.
Yeah.
An apt metaphor for concurrency might be the idea of movies.
So you have a movie that has a series of events and then it starts, things happen, right?
24 frames per second, however many seconds, and then it ends.
And that's usually pretty normal, unless, of course, you're Quentin Tarantino,
now we're going to go out of order and all that kind of stuff.
But that's pretty simple to think about.
But then you have like...
That's actually, that's beautiful.
That's an example I love, because that's the way I describe systems. Basically, think of them as a movie. And
what distinguishes a digital system from an automobile is that you can't think of an
automobile as operating in distinct steps. It's going continuously from one place to another.
But you can think of a program or a distributed system or the internet as proceeding in discrete steps.
And the way physics, science describes cars or the universe is in terms of its state.
And that's the same way I describe a distributed system in terms of its state.
You can think of the state of a program as the value of its variables, but there's also a lot else in its state that you don't usually think of as part of the state, but is really
there.
And so you just think of that as a movie.
But the other important thing, and the thing that a lot of people didn't use to do, I don't know
what they're doing now, I've stopped paying attention to what theoreticians or people who
are designing specification languages are doing, but they would think of the movie as a movie of
this particular device. And then they have to worry about, well, how do you put these movies together? But what I say is, no, this is a movie of the entire universe. And you describe the system
behaving correctly as a property of just one particular part of that universe. And there's
one thing about, and this is a technical point, but it's a way that I think
differently from most people, is what happens if you take a movie and you repeat frames.
The thing about these movie frames is they don't represent any particular time scale, because you might be able to describe
a clock, an hour-minute clock, by a movie which shows one frame per minute. But then if you also
want to describe a clock with a second hand, if you've set yourself to say, oh, this is one frame per minute, then you can't describe it.
You can't combine it.
But what I say is that the frames don't represent any particular time scale, so that
if you can describe a system with a certain movie,
then if you add frames that simply repeat what the previous frames have done, you haven't changed the description. Your system should still satisfy
that. And that basically solves the whole problem of how you've written a specification.
Well, like most people, they write a specification of a clock with an hour and minute hand. And if they add a second hand to it, it no longer satisfies that specification.
But when I write the specification, it satisfies it because my specification of the hour and minute
clock allows frames where the same minute keeps being repeated. So if you add a second hand,
it still satisfies that specification. I realized I had that insight in 1983. And to my knowledge,
TLA+ is still, well, the TLA that TLA+ is based on is still the only formalism that I know
of that people use that has that
property, that if you write a specification of a clock with an hour and minute hand, an hour-minute-second clock satisfies it.
And what's the name of that property? What would you call that property?
Well, I call it stuttering insensitivity. I think of these as stuttering steps. That's what I call
them, stuttering insensitivity.
It's insensitive to the stuttering steps.
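The clock example can be made concrete. Here's a minimal sketch in Python (not TLA+; the state encoding and function names are invented for illustration): model a behavior as a sequence of states, and the hour-minute spec as a step predicate allowing either a one-minute tick or a stuttering step. A behavior generated by an hour-minute-second clock then satisfies the same spec.

```python
# Toy illustration of stuttering insensitivity (not TLA+; names invented).
# A behavior is a sequence of states; each state is a dict of variable values.

def hm_step_ok(s, t):
    """Allowed steps of the hour-minute clock spec: either the clock ticks
    one minute, or (hour, minute) is unchanged (a 'stuttering' step)."""
    stutter = (t["h"], t["m"]) == (s["h"], s["m"])
    tick = (t["h"], t["m"]) == ((s["h"] + (s["m"] + 1) // 60) % 24,
                                (s["m"] + 1) % 60)
    return stutter or tick

def satisfies_hm_spec(behavior):
    return all(hm_step_ok(s, t) for s, t in zip(behavior, behavior[1:]))

# A behavior produced by an hour-minute-SECOND clock: the second hand
# advances 60 times per minute, so (h, m) repeats -- stuttering steps.
hms_behavior = []
h, m = 0, 0
for _ in range(3):            # three minutes of ticking
    for sec in range(60):
        hms_behavior.append({"h": h, "m": m, "sec": sec})
    m += 1

print(satisfies_hm_spec(hms_behavior))  # True
```

The finer-grained clock's extra frames all look like stuttering steps to the hour-minute spec, so nothing special had to be written to "combine" the two clocks.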
And the reason is because, I'm asking you, I'm not telling you,
the reason why it's insensitive to the stuttering is because the way that you think about it is about the order.
It's about the order of events, not as much as the actual distinct timestamps.
Is that right? Like, you're thinking about order.
You're not thinking about the absolute time.
Right.
And if you want to talk about absolute time,
you don't build it into your movie camera.
You simply have a clock in your movie.
Right.
And it doesn't make any difference to that clock,
whether there is one step between one second and the next or a thousand steps between one second and the next.
And what that does is it makes the math so much simpler.
And simplicity is the key to being able to deal with complexity. You have to have a really simple way of, if you're dealing with complex
things and you're using a complex language to describe it, boy, you're in trouble. And if
you're building a language where you have to do something special when you've written a
specification of an hour-minute clock, simplicity means that should also work as a specification of an hour-minute-second clock. If it's not, you've got this extra complexity to deal with.
Was this idea in the bakery algorithm? It sounds like it was.
No, no, no. It's separate from that.
If we real quick describe your bakery algorithm, I'll try to contextualize it, but it seems like this at
least ties to stuttering insensitivity. Or, I don't know, explain it first and I'll explain why I'm trying to tie it together.
I'll say this: the thing about stuttering insensitivity is, you don't need it
when you're just looking at a single system. So if I'm only going to be looking at
an hour-minute clock, I don't have to worry about adding steps in which the minute doesn't change.
It's only when I take this hour-minute clock and somehow I want to compare it with something else that things become complicated if I don't use stuttering insensitivity.
So for the bakery algorithm, if we're just looking at the bakery algorithm by itself, stuttering insensitivity is irrelevant.
Because of the algorithm or because of the circumstances it's trying to solve?
Because of the circumstances, just looking at this.
If I wanted to combine the bakery algorithm with something else,
then I want to make sure that I describe my algorithm in a way that's stuttering insensitive.
Now, the thing about TLA is you don't have to think about it.
That is, you don't have to write your...
I'll put it this way.
If I have a thousand-line specification of a system, to make it stuttering insensitive,
I add about seven characters to the specification.
Right.
It doesn't take much.
Right.
So when you're writing the specification,
you're not thinking about stuttering insensitivity.
You're thinking about the system you're describing.
If I'm thinking about an hour and minute clock,
all I'm thinking about is hour and minutes.
I don't have to worry about anything else.
But when you write it in TLA, TLA+,
it's just going to automatically be satisfied
by an hour-minute-second clock. So when I describe the... I'm happy to describe
the bakery algorithm in pseudocode, but I know that when I describe that pseudocode
mathematically, it's going to be stuttering insensitive, but that has nothing to do
with how I write, or even think about, that algorithm.
Right. It's a property of the implementation of the system that TLA provides for you.
And as long as you're thinking about a singular system, then I guess in that case, it's irrelevant.
So let's look at this bakery algorithm. This is one of the six bullet points of which they say the reason for your Turing Award. This was, talk about thinking differently. This is the one, you talked about it briefly at the beginning. We didn't name it back then, where there's a paper by two men, Paul, is it Paul Johnson? Help wasn't the bakery. They didn't influence the bakery. That was the Time Clocks
paper. My bad. No, the bakery algorithm happened when... Actually, that was the first concurrent
So let's look at this bakery algorithm. This is one of the six bullet points they cite as the reason for your Turing Award. This was, talk about thinking differently. This is the one you talked about briefly at the beginning. We didn't name it back then, where there's a paper by two men, Paul, is it Paul Johnson?
That wasn't the bakery. They didn't influence the bakery. That was the Time, Clocks paper.
My bad.
No, the bakery algorithm happened when... Actually, that was the first concurrent algorithm that I wrote. There was a paper... I became a member of the ACM and started getting Communications of the ACM. It's the first computer science journal that I ever looked at. Actually, I wasn't even aware that there was such a thing as computer science, because that was in 1972, and there really wasn't computer science in those days.
News to you.
Yeah.
But at any rate, so they had this algorithm for solving mutual exclusion. Mutual exclusion is
the problem that if you have a bunch of processes running and say
some process wants to use the printer.
Well, if you had two processes trying to print at the same time, and you get the output interleaved
on the printer, it's not going to work.
So you have to have some way of ensuring that only one process at a time is using the printer.
And that's called mutual exclusion.
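The printer scenario can be sketched with an ordinary lock (a minimal illustration using Python's standard threading module; the job names and structure here are invented):

```python
import threading

printer_lock = threading.Lock()
printed = []  # stands in for the printer's output tape

def print_job(name, pages):
    # Mutual exclusion: only one thread at a time may hold the lock,
    # so the pages of different jobs are never interleaved.
    with printer_lock:
        for page in pages:
            printed.append((name, page))

jobs = [threading.Thread(target=print_job, args=(f"job{i}", [1, 2, 3]))
        for i in range(4)]
for j in jobs:
    j.start()
for j in jobs:
    j.join()

# Each job's three pages come out consecutively.
for k in range(0, len(printed), 3):
    assert len({name for name, _ in printed[k:k + 3]}) == 1
```

Remove the `with printer_lock:` line and pages from different jobs could interleave, which is exactly the failure mutual exclusion prevents.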
And the problem was recognized as being important and introduced by Edsger Dijkstra, I think it was in 1965.
At any rate, it led to a whole bunch of mutual exclusion algorithms, which had more and more different, better and better properties in some sense. And so I read this paper of this
algorithm. It seemed awfully complicated. And I said, it shouldn't be this complicated.
And I whipped up this little algorithm for two processes and sent it to the editor.
And a couple of weeks later, I get back a letter from the
editor. Here is the bug in your algorithm. That had two effects. One is that it made me realize
that concurrent algorithms are hard to get right and they really need a rigorous proof.
And the second one is, I'm going to solve that damn problem.
A challenge.
And I came up with the bakery algorithm. And I don't think I'll bother describing it here.
But when I had written it, when I wrote the correctness proof, I realized that that algorithm
had a remarkable property. All these algorithms work by processes communicating
through shared memory. One process would set the value of a variable, another process would read the value
of that variable. But how do you implement that reading and writing? Well, you need mutual
exclusion, because if somebody tries to write while somebody else is trying to read, who knows what's going to happen?
In fact, some computer scientists said that that's a necessity.
You have to have mutual exclusion somewhere to solve the mutual exclusion problem.
But I realized that the bakery algorithm works.
It doesn't require mutual exclusion on the reading and
writing of variables. Each variable is written by a single process and it can be read by multiple
processes. So you don't have the problem of writers writing at the same time to worry about,
just readers reading at the same time as the write is being done. And the beautiful thing about that algorithm
is that if a reader reads while the value is being written, it doesn't matter what value that read
gets. So if I'm changing the value of the variable from 99 to 100 and you come and read it, you might get 4,933 or zero. And the algorithm still works.
Did you know that at the time or you discovered this later?
No, I discovered it when I wrote the proof.
You did. Okay.
That's one example of being a mathematician.
That's why proofs are good, right?
Well, and actually I went on to formalize the way I was writing those proofs in a way that I think is really nice and gives you a nice way of thinking about concurrency, but it's not a practical way of writing the kinds of specifications that I was talking about in the real world. Bakery algorithm is a nice, simple algorithm,
and you can reason about it that way.
But in fact, if I had been writing it in TLA,
I probably wouldn't have discovered that it had this property.
But because there was no way developed back then
for reasoning about concurrent algorithms, I guess I had
to be really careful to make it rigorous. I had to be really careful about the assumptions that are
being made. And so I didn't make any assumptions about atomic
actions, actions that happen atomically, so that two actions can't happen at the same time. I had to figure
out, well, actions can overlap. Reads can happen at the same time as writing. So my reasoning method
allowed that possibility. And that's why I was able to discover that it had that property.
Nowadays, we write specifications, I think of it as pseudocode in a way, where we assume that reads
and writes are separate actions, they can't happen at the same time as one another, and so
we don't worry about it, and we leave it to the implementation to figure out how to do that.
Right. So I heard you describe that particular algorithm as, A, your favorite, but I also heard you describe it as one that you discovered, not the inventor, but the discoverer, which I thought was kind of interesting and somewhat poetic.
And I'm just curious, not necessarily to distinguish the two, but what did it feel like? What were you thinking? How did you go about the process? If you can go back to that day, that day of discovery of that algorithm, or did it take longer than a day?
Describe to me what you were doing, how it came to you.
Were you in the shower and it popped into your head?
Were you scribbling furiously on a notebook?
What was it like coming up with something like that?
No, I cannot tell you what happens with the microsecond where something happens.
I want to know about the moment of inspiration.
Well, it wasn't a moment in a sense.
The reason it's called the bakery algorithm is that when I was a kid, there was a bakery.
And these days it would be called the deli algorithm, where a bunch of people want to
get served.
And you have this machine that gives out tickets, it gives out numbers. And so you just take a number and
people get served in order. And so I thought about that. And of course, the problem with that
as a solution to the problem is that you have no ticket distributor. And so the idea struck me as, well, what happens if people chose their own numbers, and they did it by looking at everybody else's number and picking a bigger number? And that was the basic idea. And from that idea, I would say everything was fairly straightforward. You first get the obvious problem, well, what happens if two people
chose the same number? And then in that case, you serve people in the order of the alphabetic
order of their names. So that's a problem that's easy to solve. And so I wrote this algorithm and
the obvious way of writing it gave a bug, and I saw that bug pretty quickly.
And then I figured, well, here's a simple way of fixing that bug.
And this very rare event, usually you see a bug in a concurrent algorithm, and you can fix that bug, and that's going to introduce another bug, and you fix that bug, and that's going to introduce two more bugs.
Right.
But fixing that bug, and it worked.
So that's how it got discovered.
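The idea described above, pick a number bigger than everyone else's, break ties by name, is the heart of the bakery algorithm, and it's short enough to sketch. Here is a minimal Python rendition, illustrative only: CPython's GIL gives these plain list reads and writes sequential consistency, which is much stronger than the "safe" registers the real algorithm famously tolerates, and the iteration counts are invented for the demo.

```python
import sys
import threading

sys.setswitchinterval(1e-4)   # switch threads often so busy-waits stay cheap

N = 3
ITERS = 50
choosing = [False] * N
number = [0] * N
counter = {"value": 0}        # shared state guarded only by the bakery protocol

def lock(i):
    choosing[i] = True
    number[i] = 1 + max(number)              # take a ticket bigger than all seen
    choosing[i] = False
    for j in range(N):
        if j == i:
            continue
        while choosing[j]:                   # wait until j finishes choosing
            pass
        # defer to j while it holds a lower ticket (ties broken by thread id)
        while number[j] != 0 and (number[j], j) < (number[i], i):
            pass

def unlock(i):
    number[i] = 0

def worker(i):
    for _ in range(ITERS):
        lock(i)
        counter["value"] += 1                # critical section
        unlock(i)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter["value"])  # N * ITERS when mutual exclusion holds
```

Note that each `number[i]` is written only by thread `i` and merely read by the others, which is the single-writer property Lamport discusses next.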
There was no moment other than perhaps that moment that brought back the memory of that bakery.
That was the only store in the neighborhood that used this way of giving out tickets. Yeah, we have one of those still here in our local town where you get a ticket
and they, but they have, I mean,
it's parallel.
They've got some concurrency going on
because they have multiple strings
of numbers going on.
And because we're always sitting there
trying to, and it's a very popular place.
So like, I'm 275,
and you're wondering, like,
when's my order coming up, you know?
And most places you just wait for 274,
and then you're like, hey kids, we're next.
But this place has multiple strands of numbers, and so they'll go 274, 163, 375,
and we never know when we're going to get our meal.
But they still get it to us, and seemingly in the correct order and mutually exclusive.
They don't deliver ours to everybody who's sitting there waiting,
so they've got it figured out as well.
But that's what I was thinking of when I read your bakery algorithm.
I thought, okay, I can see where maybe he hearkened to his youth
of sitting in line waiting for that deli sandwich or something.
So when you got the solution all figured out,
did you then take it back to that editor, the one who had sent you the bugs?
Well, I resubmitted it to ACM.
That had to feel good.
What happened next?
Was it like immediately?
Went through the reviewing process and got published.
Simple as that.
But the thing is, there were various things, you know, studying that algorithm, but also the method of reasoning that I developed based on it. And I'd say the next three years or so, three or four
years of what I did was all based on that algorithm, studying that algorithm. For example,
if you read my number while I'm writing it and I'm writing it from nine to 10 and you get a million,
which is possible, and then you write that. Well, numbers might start
increasing very fast, so you don't want that. So, what I did is I devised an algorithm that
if I'm changing it from nine to 10, it will guarantee that the number you read will not be any bigger than 10.
That was a non-trivial algorithm.
And so that's an example of the things that...
Refinements, that's for performance reasons.
Yeah, but in developing those other algorithms, I developed my skill at thinking about concurrent algorithms and thinking about how to
reason about them. Well, that certainly wasn't your end game. That was very much, you know,
getting you into the game. You went on from there to publish many things. We talked about that
Paxos paper. A lot of your work is out there driving Azure, driving, you know, way down in
there in the algorithms of AWS, of all these distributed
systems. So lots of cool stuff came out of those efforts. I'm curious, as we close up here,
Leslie, appreciate all your time today. Don't want to take too much of it. I'm curious what
you're up to these days. You got this book in the works. And what does your work look like
when you're not? I mean, obviously, if you're authoring a book, I know what that looks like.
But like, do you have daily work that you're doing inside Microsoft, pushing research forward or working on algorithms for them?
Obviously, you don't have to divulge any trade secrets, but like, what does a work day look like for you at this stage in your career?
Well, my work day seems to involve answering a lot of email and a lot of things that take up time.
But other than writing the book, I'm doing various things that involve, what I would say is that
sort of management-type things involving TLA+. I told you about this colleague, Markus, who is doing a lot of the heavy lifting these days. I'm, quote, managing him.
I'm not his official manager, but we work together, keep tabs on what he's doing, and learn from him what's going on in the world of TLA+ users.
He's the one who these days interacts with users.
I very rarely do that.
And basically Microsoft gives me the freedom
to be able to devote my time primarily to the book.
Awesome. Definitely looking forward to that book.
Well, Leslie, I will let you go. Like I said before, honored to have you on the show. Appreciate you spending some time with me today. To our listener, we do have the links to all of the things mentioned here today in your show notes. Of course, check out TLA+. Check out the resources. There's some tooling there and some videos as well.
If you want to learn from Leslie directly a little bit on how to think algorithmically with TLA+.
Any final words from you, Leslie, before I let you go?
Anything you want to say to our audience?
Well, I guess the advice I give is think.
And really thinking means writing, so write.
We have some awesome episodes coming down the pipeline. Kaizen 11 with Gerhard Lazu drops on Friday,
followed by 30 Years of Debian
with the project lead, Jonathan Carter, next Wednesday.
We've got Justin Searles and Landon Gray
coming on to help us see through the lens
of intergenerational conflict among software devs.
We've got Andreas Kling from SerenityOS
and the Ladybird Browser Project.
We've got Stephen O'Grady joining us to discuss the importance of open source licensing
in the wake of Meta's Llama 2 announcement, and it just keeps rolling from there.
I guess what I'm trying to say is, subscribe to the pod, why don't you?
And if you're a longtime listener, do us a solid by telling your friends about the changelog.
Word of mouth is still the number one way people find us.
Thanks once again to our partners, Fastly.com, Fly.io, and Typesense.org.
And to our beatmaster
in residence,
the mysterious
Breakmaster Cylinder.
That's it.
This one's done.
But we'll talk to you
on Changelog and Friends
on Friday. 🎵