CoRecursive: Coding Stories - Tech Talk: Software in Context with Zach Tellman

Starting point is 00:00:00 You know, if we say something is over-engineered, what we mean is it's too complex, or it's too robust, or it handles a bunch of situations or scenarios that are not relevant to how we're using it. It's okay for us to create narrow things. It's okay for us to create PowerShells instead of Bash sort of environments, because that narrowness gives us the ability to go and do things we might not otherwise be able to do. Hello, this is Adam Gordon Bell. Join me as I learn about building software. This is CoRecursive. Zach Tellman wrote a book about designing software called Elements of Clojure. I love the interview I did with Runar, episode 27, on abstraction. Zach also has a lot of things to say about abstraction, but they're very different. They're not incompatible with Runar's thoughts,

Starting point is 00:00:50 just totally different. One big takeaway I have from talking to Zach is that we need to consider the context in which software is built. Something can be over-engineered or it can be a hacky solution only in the context in which it's built. And in fact, it

Starting point is 00:01:05 can be both over-engineered and hacky in different dimensions and in different contexts. Also, Zach just looks at software as being closer to literature, which is a foreign perspective for me. It's not how I think about things. It's a very interesting talk. I hope you enjoy it. Yeah, so thanks for joining me, Zach. I bought your book, Elements of Clojure. And I don't know a lot about Clojure, actually. But the kind of latter parts of the book I felt were kind of more generally about building software. And I was curious why you decided to write this book. I called it Elements of Clojure because it was meant to be this callback to Strunk and White, The Elements of Style, which is a flawed book in a number of ways, but still, I think, served its purpose, right? It helped people sort of stop overthinking things and allowed them to

Starting point is 00:02:01 just sort of get into the process of writing. And with that goal in mind, I sort of started out writing a book about names and sort of what was a good policy for naming. And I very quickly realized that I, the writer, could not go and give hard and fast rules for that because that required some understanding of the domain that people were trying to find names for. But I did, around that same time that I finished the first chapter, quit my job so I could focus on it full time.

Starting point is 00:02:30 And I thought that was going to be a three month project, right? You know, if I just put my head down and really worked at it. And then I was going to go and work on a whole bunch of other projects that I've been thinking about. And it turned out that it was a 16 month process of writing the book because I kept on trying to do more. Right. Or I kept on coming up against this thing where I felt like I understood what I would do in a certain situation, but I didn't understand it well enough to explain why. And there is a concept that I actually encountered while I was doing research for the book called tacit knowledge, which was first coined by this guy, Michael Polanyi, who was a chemist, and then later a philosopher and historian of science. And he talked about sort of why it was

Starting point is 00:03:12 that hypotheses turn out to be right so often in science, right? I mean, there are infinitely many hypotheses, and you know, most of them are wrong in practice, but we get it right a lot more often than we should. And so it's clear that upstream of what you would kind of call the scientific process, right, where you have a hypothesis and you test it, and then you try to invalidate it, and if you can't, then it gets adopted, there is this much more ineffable kind of thing going on there, which is a thing that we know, but can't explain. You know, we name things constantly, but we don't ever actually study the act of naming. We constantly draw lines between things, but we don't actually focus on that process. This is this thing that we kind of assume that if we just do it long enough, we'll stop being bad at it. Right. And it's a little peculiar, right? We regularly say,

Starting point is 00:03:58 or possibly joke that naming is one of the two hard problems in software. And nowhere in my undergrad program did anyone actually talk about how to name things. I think the only guidance I ever got was like, you know, single letter names are not good. So that seemed strange, right? It seems like if this is a hard problem, then we ought to go and actually talk about it rather than just kind of treat it as this thing, which is sort of beyond explanation or beyond refinement. And so that was kind of where I ended up, right? It was trying to go and put words to this intuition that I had. And the book that I created, I think, does an okay job of it. I know that this is probably going to take more than a lifetime to do sort of in a way that I would actually find satisfying. But, you know, it was a good first

Starting point is 00:04:43 cut. And I don't know, I mean, it was a very fulfilling process, I think, because I do feel like I'm better able to explain why I feel like something is good or something is bad. Yeah, there is a lot of things that we have strong opinions about, but can't explain, I guess. It's interesting. I think that starting with names, maybe that seemed easy to you before you got into it. It's like the hard area. So what did you find? Or what are your recommendations now that you've spent this time on names? So I mean, if we were to just kind of jump into it, I guess, you know, there are these very closely related concepts, which we often use the words for, but I think very rarely bother to define which are indirection and abstraction. And just as a disclaimer here, I'm going to define these

Starting point is 00:05:26 words, not in a way that I think everyone listening to this will like immediately agree with. I think that they represent sort of the plurality of usage in the industry, but I think that there are also cases where they're used in ways that I find to be sort of misleading. So make of that what you will, I guess. But for me, indirection, I think is best described as the separation between what we're doing and how we're doing it. And so, you know, the simplest explanation or sort of example of this would be the power button, right? We press the power button, something happens and like our computer turns on or some other piece of electronics turns on. And the key thing there is that if we press the button and nothing happens or the wrong

Starting point is 00:06:09 thing happens, we don't understand what is going on underneath the covers well enough to be able to actually debug that, right? This is why we constantly turn things off and turn them on again, right? We're just reverting to the one thing that we know how to do and hope that if we do it repeatedly, like things will finally like align with our expected outcome. And so, you know, this is both the power and the sort of danger of indirection, right? We are purposefully blinding ourselves to something because, you know, we either don't want to or simply can't understand what's going on underneath there. And that gives us a lot of power, but it makes our sort of ability to understand the world

Starting point is 00:06:45 and to interact with it more fragile. So I think of indirection, I mean, the first thing that comes to mind is a magician, like kind of doing something over here while they do something that you're not supposed to see on the other side. Well, I guess typically you'd call that misdirection. Oh, misdirection. That's trying to actively mislead somebody. I think indirection is just saying, if you want to go and understand this, you're going to have to go and reach this point, this sort of line that we've drawn, and you're going to have to move past it. Right? So the power button is not trying to, at least in the ideal case, completely make opaque what's going on under the covers. It's just an opportunity to stop asking, and then what? Right? It's a chance for us to be incurious.

Starting point is 00:07:28 And that's not to say that people won't. And, you know, often do, I think, have sort of a tendency to say, and then what, or why, or, you know, to actually try to go and peel back that layer. But you want to go and have that sort of invitation be there, because some people want to stop digging down, and they want to start building up, right? And you need to have the ability to go and stop because even in a fairly short amount of time, I think that we've built enough from the sort of foundations of like the computer hardware that you could spend a lifetime asking, and then what, before you ever get a chance to actually start building up in the other direction. Yeah, it's like a black box or an abstraction barrier

Starting point is 00:08:05 or something. Well, so yeah, so let's talk about abstraction, right? Because I think that these terms are often used interchangeably. And I think that there is a subtle difference there. So let's consider another example. Let's say that you have the code print line, hello world, right? So there is indirection going on here because we are telling the system, please print this text. And we are unaware of how that actually occurs. We're not even sure where it's being written to, right? It might be getting written to a console or to a log file or to a browser or like, you know, anywhere that our text could possibly be streamed. It might be getting streamed there or to all of them simultaneously. And so there is a separation between the what and the how, right? And for us, all of those different mechanisms are

Starting point is 00:08:50 fundamentally collapsed into something which is the same, right? And that is how I define abstraction. Abstraction is treating things which are different as if they're the same. And it's critical to recognize that this is happening on both sides, right? Because this thing that is actually doing the printing, it is treating all possible text as interchangeable, right? We do not have a special hello world printing handler. We go and we take the text and we're able to go and we're able to deal with this. And so there is this generalization sort of on either side of this interface, on the one hand, with respect to implementation, on the other hand, with respect to the sort of entire input domain. And the reason that it's important to distinguish between indirection and abstraction is that abstraction occurs via indirection, right? But

Starting point is 00:09:35 it also occurs in a lot of other situations that have nothing to do with indirection, or at least not something that we would sort of recognize naturally as indirection. So in semiotics, there's this idea of semiosis, which is basically collapsing something down to a representation, and then sort of resolving that representation back into something in the real world. And this is a thing that we do unconsciously, right? This is, many people argue, what it is to be intelligent, is to be able to reason via representation. I think everyone who is a successful software engineer has a powerful, innate sensibility about abstraction. The people who have succeeded in this industry have that, which is not to say that people who haven't succeeded don't have this. I think that, you know,

Starting point is 00:10:20 if we were to go and talk about software, the hard problems of software being about sort of modeling and reasoning by analogy and all those other things, I think that that would open it up to a much wider group of people. But there's an unspoken kind of shared sensibility that I think that many of the people that we work with on our sort of day to day all share. And so when we look at something, we say, yeah, that's not a very good abstraction. There's something going on there. There's a shared understanding that's going on there. And I think that this is in some way innate to almost everyone, and it's just sort of hyper-developed in a lot of people who have sort of found their way to success in our industry. I'm having a hard time connecting it back to software development. So if I were to pick an abstraction,

Starting point is 00:11:02 Okay, let's say Unix pipes, right? So I can take like several programs and take the output of one and like pipe it to the other. Is that an indirection? Is it an indirection and an abstraction? Is it an effective one? How does it fit in your various definitions? Sure. I mean, you know, this is a classic one.

Starting point is 00:11:22 I mean, it is clearly indirection, right? We have this channel by which we're going and we're passing bytes, which are by convention presumed to be like ASCII text in most cases, but not all. And, you know, obviously there is abstraction going on there, right? Because cat doesn't go and presume to know what's sort of upstream of it, right? Nor does grep, nor do any of these other sorts of things. They have a certain assumption there. And in the case of grep, it is something that is, you know, greppable, but there is this sort of mix and match quality there. And so there is both, there is abstraction by indirection going on there. But there are some assumptions that are kind of going into the domain, right? Notably that the input is largely unstructured. I mean, there are tools like jq and other sorts of things that sort of temporarily lift the input

Starting point is 00:12:08 from this sort of unstructured text domain into this sort of JSON domain, but that doesn't sort of follow along the other way, right? Sort of inside of it, it will go and do the decoding and then the operation and then the encoding, because everything sort of drops down to this sort of lowest common denominator of data representation, which is this sort of unstructured text.
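As a rough illustration of the decode, operate, re-encode pattern described here, the following is a minimal Python sketch. It assumes a pipeline modeled as lists of text lines; `grep` and `select_field` are hypothetical stand-ins written for this example, not the real `grep` or `jq` tools, and the sample records are invented.

```python
import json

def grep(pattern, lines):
    """Plain-text stage: matches raw characters, with no notion of fields."""
    return [line for line in lines if pattern in line]

def select_field(field, lines):
    """jq-like stage (hypothetical): decode each line, operate, re-encode."""
    out = []
    for line in lines:
        record = json.loads(line)              # lift text into structured data
        out.append(json.dumps(record[field]))  # drop back down to plain text
    return out

# Invented sample input: one JSON document per line of text.
lines = [
    '{"name": "a.txt", "size": 120}',
    '{"name": "b.log", "size": 4096}',
]

# The text stage cannot restrict a match to one field: "a" also occurs
# inside the key "name", so both lines match.
loose = grep("a", lines)

# The structure only exists inside select_field; its output is text again,
# so the next stage would have to re-parse it.
names = select_field("name", grep("a.txt", lines))
```

In this sketch, `names` is again just a list of strings, which is the sense in which the structure does not follow along to the next stage.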

Starting point is 00:12:28 In that sort of case, right, we're treating all things as the same. And we're saying, and, you know, the sort of lowest common denominator that allows us to go and make that assumption is that there is no structured data encoding that is sort of common across all these things. And if you go and look at something like PowerShell, which is a Microsoft sort of .NET version, there's an assumption that everything is fundamentally a .NET runtime, which is going and is emitting sort of very specific sort of encoding, which is a far less general assumption. It's a much narrower domain, but you get something out of that, right? By going and not trying to be all things to all processes, you get a certain amount of power because you can now begin to sort of reason about a much more limited set of cases that you need to handle. Pluses and minuses of both sides, right? I mean, maybe not the .NET runtime, but actually, if each process in a pipe needs to re-serialize and de-serialize it, there's a certain cost to

Starting point is 00:13:31 that, right? Well, there's a certain cost, and there's also certain things that are not representable, right? I mean, in a very sort of trivial and not very interesting way, you know, JSON is representable as a flat stream of characters. But there might be things that are not representable inside of .NET, notably, like if I have objects that have been encoded via some other sort of standard, like that's just like a totally foreign language for this. Like I said, there's a generality here, but it means that like, fundamentally, we are reduced to this sort of pidgin language when we're dealing with Bash utilities often, which is, well, we have lines of text and it's like, well, are they structured? Well, maybe, but not in a

Starting point is 00:14:09 way that we can really deal with. Can we do greps that allow us to go and sort of say, only in these substructures should we go and do this match? Well, no, not at all. Unless we go and use some sort of special tool like jq, which is itself only composable with other jq things, right? This is not something where we can do this. And I mean, you know, notably what the Bash utilities don't have, which is maybe not related to pipes, but like there is no higher-order Bash operator. You know, ls doesn't allow us to pass in a different Bash process that does its sorting. So this is why you have this kind of explosion of parameters for a

Starting point is 00:14:46 lot of these things, because there's no ability to go and say, go and do this. And I want to go and sort by time. Like that's not a general purpose operator that Bash gives us because again, there's no sort of concept of structure or a particular field or other sort of thing like that. And so every process has to kind of exist within its own self-enclosed universe where it gets unstructured text. It interprets that text. It applies like semantics to that text and is able to do operations on the basis of the semantics that it's described. And then once it's out the other side, that's gone, right? That's not passed along. That interpretation is only understandable in terms of like the raw text that it emitted. And so there

Starting point is 00:15:25 is a loss there, right? I mean, there's a reason that we don't go and write large complex applications using just shell scripts. I never thought about higher order functions. So yeah, being able to, instead of adding a flag, you should be able to say, no, I'm going to give you the thing that you would use to sort this. But yeah, like how would that work, right? Because again, the only contract we have between them is I'm going to give you some text and you give me back some text. So how do we go and say, I want the text that is the time or something like that? Like you can imagine something that would do that,

Starting point is 00:15:56 but then you have a much more kind of narrow and complex set of data descriptions that are going on there. And you end up in something that is much closer to, I think, PowerShell or any other modern programming language where we actually have data types. So are abstractions good or bad? Which is better, Bash or PowerShell? It depends. I mean, it's the total cop-out answer, I guess, right? But I mean, something that I talk about in my book is, what does it mean for something to be over-engineered? You know, if we say something is over-engineered,

Starting point is 00:16:29 what we mean is, it's too complex, or it's too robust, or it handles a bunch of situations or scenarios that are not relevant to how we're using it. So someone gives you a watch, and they say, it can tell perfect time down to 1000 meters. And you're like, great, it's very heavy and I don't plan on, you know, doing any deep dives anytime soon. So I probably won't use it, right? This is, this is an over-engineered watch. And, you know, unless we really enjoy that aesthetic, it's probably less useful than a watch, which is more tied to the situations in which we plan to wear it. So when we talk about like, is an abstraction good, or, you know, something that I find even more kind of problematic, which is like,

Starting point is 00:17:10 is this abstraction correct? Right? I mean, what does that mean? Right? I mean, we can say that the code that we've written is self consistent, right? It has a set of invariants that it's defined, and those invariants are maintained in a sort of closed over all the operations we can do. But to say that an abstraction is correct is to imply that for all possible scenarios, this is the right abstraction. And I think that that is impossible because we're ignoring things, right? We are going and we are collapsing distinctions into something which is the same, and we can easily imagine a situation in which those distinctions matter, right? Now, those scenarios, right? For instance, we might go and

Starting point is 00:17:51 say, hey, well, you know, this clock that you've come up with. So an example I give in the book is I'm talking about how long it took for people to make maritime clocks, right? Clocks which keep accurate time on a ship. This is actually a longstanding prize that was sort of outstanding from the British empire because they really needed to be able to keep accurate time on ships, even though ships sort of pitched around a lot, there was very wildly varying temperatures and like gravity varies by about half a percent between the equator and the pole. And so there's a massive number of things that you need to sort of shield the inner workings of the clock from in order for it to keep time, you know, there. And it turns out that like, this is possible. But if someone gave you a maritime clock, and all you did is put it on your mantelpiece, it's not very useful. And so when you're kind of considering whether or not something is useful, you have to first say, well, how do I plan to use it? So when you say an abstraction is correct, you're asserting this is useful no matter what, which I think is only true if you have failed

Starting point is 00:18:55 to consider all the possible scenarios that are there, or you are more commonly asserting a bunch of implicit assumptions along with what you're saying in terms of how you expect it to be used, right? This clock I've made is a very good clock. I assume that you will never bother to take it onto a boat, right? But that part is often left unsaid because oftentimes people don't have a rich enough understanding of the world or of their users or of like the sheer possibilities that exist out there to have anticipated the ways in which people might knowingly or unknowingly misuse the thing that they just created. And I think that there's also a bit of a stigma attached to this idea that you built this thing, this open source library, this framework, whatever, and I try to use it for something and it didn't work very well.

Starting point is 00:19:40 And I think that there's a conversation that happens a lot on the internet where people are like, I tried it, I used it for whatever it is that I've been working on lately, it didn't work very well, it's a failure, right? That's a bad abstraction. And, you know, you can say you did a poor job of explaining how it's meant to be used, but I think that you can with a very straight face say, like, you're holding the abstraction wrong, right? Like you're bringing it into a situation which it was not meant to be used in. And that's okay. It's okay for us to create narrow things. It's okay for us to create PowerShells instead of Bash sort of environments, because that narrowness gives us the ability to go and do things we might not otherwise be able to do. So can you think of an example of a time that something was like

Starting point is 00:20:25 misapplied? So that something had a certain abstraction that fit a use case and then, like in terms of software, was used somewhere where the abstraction didn't fit? Well, I don't know. I tend to always kind of focus on sort of the most trivial example I can think of because otherwise you just spend all this time explaining all the context. But like, let's talk about the SQL concatenation thing I was talking about. If we have something as an internal tool, right? Where we're just trying to go and create like a little bit of sort of macro compression of common manual tasks that people are doing. It's meant for people who are already privileged users of the system and everything like that. Trying to harden the system

Starting point is 00:21:05 against malicious input is probably a waste of our time, right? If they want to be malicious, they have other sort of means to do so. Who cares if we're doing things in kind of a dumb, fragile way? Now that's perfectly valid up until the point where someone's like, oh, well, this actually is pretty useful. We're going to go and open it up to a bunch of like third parties, right? Or this team that was internal, now we're spinning it out into some sort of contractor team, whatever. And so, you know, very subtly, our threat model has shifted. Very subtly, all these sorts of assumptions that we were making about who was using this have been invalidated. Now, the degree to which we recognize that in the moment depends entirely

Starting point is 00:21:43 on whether or not we were very vocal about the assumptions that we're making early on, right? This is an internal tool. It is fragile. It is insecure. It is broken, except in these very sort of narrow set of circumstances. And so, you know, when I say that, like, it's about us fussing over these nuances, I think that what I really mean by that, I guess, is to say that we are talking about the intended use of this thing that we built, and the intended scope of changes that it's meant to sort of encompass. And, you know, by sort of explicit or implicit, just kind of being absent from how we talked about our tool, the things that it's not meant to deal with, right? The situations that it's not meant to go and be robust in the face of. That's a very good point. Something I don't think

Starting point is 00:22:28 I've heard people talk about explicitly before. That happens constantly, right? I can think of a really old example at a company I worked for and like just a checkbox that was in some sort of UI, not a checkbox, like a dropdown or something. And generally, there was like about five options, but they were configurable. But then some client joined who they had about 3000 options that needed to go in this, right? And all of a sudden, like, they're like, this is crap. This doesn't work. How can we navigate through this list? There's so many of them, I guess, of these kind of assumptions. I mean, it's hard, right? And I don't think that this is a question that has like a closed form answer. And I think that there are frameworks that people have put

Starting point is 00:23:09 around this sort of stuff, right? You know, there's the domain-driven school of this. And so, I mean, I don't know what you did in the case of the dropdown with 3,000 items, but like you could just say, like, you know, oops, our bad, we probably shouldn't have sold this to you in the first place. Like that is the classic sort of startup move there. But I mean, I think that, you know, ideally you can come up with these sorts of metaphors or analogies, which are reasonable, like are well understood by a non-technical or quasi-technical sort of audience where those limitations are, if not obvious, at least not surprising. So you go and you can kind of talk about these things and, you know, you can go and talk about these sorts of use cases that you're imagining. And then hopefully someone who has this crazy complex sort of taxonomy will go and be like, these all seem substantively different than what we do.

Starting point is 00:23:54 And then they can go and they can ask and they can say, well, you know, let's go and talk in more details about what it is that we're trying to do. Is this a good fit? Right. But like that intuition that it's a bad fit has to be coming from the person who is the domain expert, not from you, the engineer who is the system expert. And so you have to come up with something that sort of survives that translation. And so again, this comes down to metaphors and analogies and things where, again, this is this attempt to communicate by this very reductive set of signals that we're getting, right? These very reductive set of analogies or metaphors or very small amounts of texts or something. And you hope that the analogies you've drawn are rich enough that they can be sort of understood. But that also has to go the other way, right? If you have this analogy that you're using to talk to the customer that has to drive the decisions that you're making in your own system so that those two things are aligned and they're not sort of subtly drifting

Starting point is 00:24:47 over time. It makes me think of that. There's some article about like 20 things that people think are true about names that aren't. It's like people's names that are not representable in Unicode or people's names that are things that most systems don't support, right? Right. My wife has an apostrophe in her last name, which for most of the 90s caused people like who try to get her onto a loyalty card system to kind of freak out when their computer crashes. Right. And I mean, and, you know, we have, I think, as an industry become better about this, right? We're not just kind of assuming that 255 characters is good enough for everybody, right? There is a maturity that exists there. But I think that also part of that is we have

Starting point is 00:25:32 people who have run into these problems over and over again. And, you know, in my book, I assert that what makes someone a senior engineer is not their expertise with the technology, it's their ability to predict what the world is going to throw at that technology, right? To be able to easily and upfront at the, you know, cheapest point in the design process say, oh, but what if, you know, this happens? And then people can decide in that moment, whether or not that is a case they want to cover or not, right? Whether or not that sort of falls within or outside the scope of the problem that they're trying to solve. And like, that's the real value, I think, of having someone who wears all those scars from all those mistakes,

Starting point is 00:26:10 all those things where they found very late in the game, oh, actually, people do have apostrophes in their name, or oh, people actually do send emoji, where we thought there was only just kind of plain text or, you know, what have you. What you're saying about senior engineers sounds like the key to building the right abstractions is building up your intuition about like what the assumptions might be. Is that the only way or is there a way to build better abstractions, like even without kind of having an intuitive sense of who your customers might be? Here's what makes an abstraction useful is that its assumptions match the context in which it exists, right? The things that it doesn't represent are things which are not relevant to like the thing that

Starting point is 00:26:51 it's trying to do. So how do you go and build good abstractions? Well, one thing is that you become fairly expert in the environments that like you are kind of writing your software for. And, you know, depending on where you are in the system, that might mean like as a distributed systems engineer, you're expert at all of the weird stuff that can happen in a distributed system, right? It also might mean that if you're writing sort of like if you're a developer productivity engineer or something like that, you're expert at what your customers, right? The other engineers within your organization will and won't take to,

Starting point is 00:27:26 right, which may just mean that like, you know, they're very used to some technology. And so you have to write something that very closely resembles that technology. And then, you know, as you sort of get further and further out towards the periphery of the product, you have to become an expert in your customers. And in terms of what they're trying to do, and what sort of new customers you might want to go and try to target in the future. And is this thing that you're doing, is this only going to work for the current verticals that you're targeting or for the next three verticals you might want to target? And so the point is not like as an engineer, you have to become an expert in all things, but you do need to go and sort of understand a few of those concentric circles

Starting point is 00:28:02 outside of the specific thing that you're working on. Because the specific thing that you're working on can't tell you anything about what's outside of it. Yeah. Postgres has this new feature, computed columns. And I was reading online, people were arguing about it. It seems relevant to this because somebody was like, oh, I just use this to speed up my system because this value that we were, you know, calculating at read time, now the database calculates at every insert. And then somebody else was like, this is a known anti-pattern. You only have one database. Like, clearly you should do this at the level above it where you can kind of scale horizontally. So it strikes me that that's kind of what you're talking about. Those people have two very

Starting point is 00:28:42 different verticals that they're in or set of requirements. And one, maybe the database isn't so overloaded that you need to take everything out of it possible. Right. I mean, really what it comes down to is how many writes do you have and how many reads do you have, right? If you have thousands of times more reads than you ever do writes, going and caching that sort of derived value might make a ton of sense. And on the other side, if you don't, and then that doesn't even get into, well, what if you want to go and change what that sort of derived value looks like, right? Is it easier for you to go and do a database migration or to do these other sorts of things? Or do you want to have that sort of be part of the code? And then

Starting point is 00:29:18 also what teams own the database and what teams own the actual logic there? Are they close to each other or are they only sort of incidentally attached? And so you want to go and make sure that on either side of these interfaces, there is the ability to be mostly incurious, right? As I said, you can't be entirely incurious. Every time that you do something and it doesn't behave as expected, you have to have at least some sort of sense of what to do next, other than just, like, turn it off and turn it on again, right? There's so many different sort of things that come into whether or not something is a quote unquote good decision. And then of course, all those circumstances that informed that decision are subject to change, right? People leave, new people are responsible for them. Institutional knowledge, like, disappears or, you know, bit rots or other sorts of things like that. Suffice it to say, it's, you know, a big trash fire and we're just, like, trying to go and do what we can to keep it contained. But I think that the discourse around, like, that is a good thing or that is an anti-pattern or this is, you know, great or bad or something like this, it's hugely distracting at best. And I think harmful in a

Starting point is 00:30:26 more realistic kind of reading of the whole situation. Yeah, no, I totally agree. It gives a lot of food for thought and discussions, right? If people could be labeled with some sort of metadata, and I could see the guy proposing the one idea, you know, he has a startup and they're selling to, like, small businesses. And the guy who's arguing with him, like, he's targeting, like, Fortune 500 companies, and, like, they're not wrong. They just very much have different worlds. Right. They're just talking past each other. Evan Czaplicki, who created the Elm language, gave an interesting talk a few years back called The Hard Parts of Open Source, where he talked about sort of a similar problem in a slightly different vein, which is people debating, well, in his case, like, you know, the design of the Elm language, but you know,

Starting point is 00:31:08 in any other sort of technical discussion. And he spends a lot of time sort of laying out what he sees as a problem. And towards the end, he made some really interesting proposals about how you might solve this. And one of them is, in addition to writing what you have to write, you have to go and sort of state your assumptions, right? There is actually like a required form there where you're kind of talking about this or saying in the context of talking about this, I really care about the performance of the language, or I really care about the expressivity of the language, whatever that means, or I really care about this particular domain or something like that. And to have that be a required set of,

Starting point is 00:31:42 you know, as you said, metadata around what these people are arguing would be hugely illuminating. And would I think make it easier for people to understand why they're talking past each other, at least be able to, you know, infer the worldview that kind of made someone say this thing that the other person so strongly disagrees with. We don't even consider our assumptions. Like the person who's building very high performant software for very large data use cases, they don't even think about the context of somebody who's building a Rails app or something that's just going to serve like so many people per server. I think, I don't know. I mean, I think that that's true. But then, you know, people can always talk

Starting point is 00:32:20 about the fairly small number of examples of, like, you know, Twitter built on top of Ruby because that was a reasonable thing and then it stopped being the reasonable thing. And again, you have this kind of, I don't know, I call it Hacker News induction, which is like, well, I built this thing and then I built this other thing, which is almost exactly the same thing. And it worked or it didn't work. And therefore, I think that this must generalize across all possible applications of this thing, right? So I tried Rails, and it was great or it was awful, and therefore it is great or awful, you know, in all situations. And then people go and say, okay, well, you know, Twitter used Ruby, and then they stopped using Ruby, and they went on to the JVM. So shouldn't they have started with the JVM, right? Or shouldn't everyone go and do this? Or shouldn't we try to go and make something which is, like, all-singing and all-dancing and it's both expressive and fast

Starting point is 00:33:09 and scalable? It's an understandable impulse, I guess. We are problem solvers and we see a problem and we're like, okay, maybe there's a technological solution to this. I think Rust is a very good example of that sort of possibility. And so I'm not saying that things can't be better. I just think that the idea that we are monotonically sort of approaching this kind of ur-language or ur-technology or ur-approach that is based upon all of the failures and has surpassed them, it's a weird perspective on how progress works, I guess. I actually really, really dislike the, why don't we build software the way that we build bridges, right? The sort of software as civil engineering metaphor, because I think that civil engineering is a static solution to a largely static problem. And so I think that if we're going and looking for a metaphor for software, it's much better to look at something which is a larger scope, which is like a city, right? Cities are constantly changing. And notably, if your city is successful, if it's growing,

Starting point is 00:34:15 people are going to be discontent with you just kind of keeping things the same, right? You know, if you are successful, people will demand more from you. And, you know, the decisions that you made that were appropriate at a certain sort of scale or for a certain kind of, you know, population or whatever will become inappropriate at some point. And you will have to go and deal with that. And that's not an indictment of the past choices that you made. It's just, you have to acknowledge that circumstances have changed and you need to go and understand which of your assumptions have survived that transition and which ones haven't. I've definitely been involved with software that fits the city metaphor because you need a historian to walk you through and say, like, oh, this is the old part

Starting point is 00:34:53 of the city and things here are done a little bit differently. And there was a brief modernist era that swept through this zone and that's why. Right. And that's, you know, maybe that's not desirable, but it's like, you know, if you go and you look at, I mean, you know, as part of the research for my book, because I was quite swept up in this metaphor for a while, like, I've read a bunch about city planning, right? And planned cities are often bad cities. Oh, interesting. Right? Because behind a lot of the planned cities, which were birthed out of this sort of modernist movement, was the sort of idea that, as they said, a house is a machine for living, right?

Starting point is 00:35:31 And so we're just going to make it very efficient. And so you can get these economies of scale if instead of having people live in like smaller buildings, we'll create mega blocks, right? Where just all the people live. And then all the living will happen in this part of the city and all the working will happen in this other part of the city. And then the recreating will be in the, yes, still a third part of the city. And that's not how people live, right? So upfront design, architecture, bad. That's the lesson I'm intuiting here.

Starting point is 00:35:59 So you said before, right, in the Clojure community, there was this idea that there's like a right way to do things. I mean, I think Python takes this very seriously. That seems like counter to what you're saying. That sounds like central planning. It's like, this is the way to work with a list. Well, yeah, that's an interesting thing, because I should be more clear in terms of how I characterize this, because I think that Python is a, well, purports to be a one way to do it language. But I think that that's not really true. I mean, like, you know, you can go and you can do list comprehensions, or you can do the other thing, right? And you can go and say, like, this thing is idiomatic in this circumstance. But that's not a there's one right way to do it,

Starting point is 00:36:37 not there's one obvious way to do it. Right. I think that Clojure does a pretty good job, actually, of sort of putting people into a well-worn sort of groove in a number of different ways. But I think that there are other things that it completely punts on in terms of, like, having sort of normative strategies. Like, I don't think that Clojure is a language that necessarily has such strict guidelines that it's clear what the right thing to do is. Rather, I think that it is a language that puts a lot of importance on having done the right thing. Because I don't know if you've seen Rich Hickey's talks at all, but he's a very thoughtful person. He is a very talented communicator about the various aspects of software design. And I think that people aspire to be as thoughtful as he is. And they

Starting point is 00:37:23 attribute their own failures to have, like, a clear understanding of what goes in the same namespace or doesn't to insufficient thoughtfulness. Yeah, that's an interesting dichotomy. I really like his talks. He clearly spends a lot of time thinking things through in depth. But my outsider perspective of Clojure is it's more... So let me ask you, actually. So is day-to-day Clojure programming closer to Bash or is it closer to, like, some Haskell programming with very rigid constraints on interfaces? There are a lot of ways I could slice that question.

Starting point is 00:37:56 So, I mean, notably, right, Clojure has structured data. Yeah. And it is, or at least was when it first came out, I guess you could also kind of lump Elixir in with this a little bit, but it is a dynamic language that has immutable data structures. And I think that that's a very novel niche, right? Immutable data structures were the domain of Haskell and sort of the Haskell derivatives. And I think that the insight that this is useful, right, that, like, dynamic languages aren't just a sort of, like, Leeroy Jenkins kind of, like, dash into, like,

Starting point is 00:38:31 let's just, like, slap it together until it works, that there is a principled way to use dynamic types was a novel insight on the part of Rich. And so I think that you spend a lot of time in Clojure dealing with things which are structured, but aren't necessarily named. So if I'm taking, like, an input, and I'm trying to, like, build up some sort of derived data representation of that, I'm going and composing together a bunch of these operations, typically using the sort of arrow combinator, which goes and takes, like, a thing, and then instead of having the sort of typical Lisp inside-out structure of the code, which is, I think, for most people very, very difficult to read, you just have a left-to-right thing. Like, I have this, and I want to apply this transformation, and I want to apply the next transformation. And so you can just read it very sort of, you know, left to right or top down in terms of, like, step by step, here's what I'm doing. What you often don't

Starting point is 00:39:25 have, though, is a clear name for the data types that exist at every intermediate point of that composed operation. And there's a looseness there where you're kind of doing the Wile E. Coyote thing, where you're walking off of this cliff, which is a named entity in your code base, and you're dashing across the empty space, hoping that you'll land and get another thing, which is, like, a named part of your code base, like these actual, like, distinct entities, which are well understood. And in between, you're just going and sort of doing a bunch of fancy footwork to get from A to B. And I think that that's actually a really reasonable thing to do. Because the one thing

Starting point is 00:40:03 that I learned about names when I was writing the first chapter was, if you can avoid giving something a name, go for it, right? Naming is hard. And naming is a really easy way to go and miscommunicate about what your code is doing. So if you can just go and say, I don't know, it's like it's a map of this one thing that you have a name for and this other thing you have a name for. Like, the relationship between them is sort of implied or whatever. But really, at the end of the day, if you just go and focus on, like, the terminus or the termini of the two different things, like, you know, where did I come from and where did I eventually end up? And so Clojure is something where you can have a very pluggable degree of specificity. And you can be very loose in places where looseness is valuable because you don't have a good understanding of the domain yet. You're just kind of trying stuff out,

Starting point is 00:40:49 or maybe the domain is something which is highly fluid, right? You know, oftentimes when you're going and you're working on a new product or with like a new kind of customer, there's a lot of premature calcification from my perspective, when you go and you have a much more strictly typed sort of language, because you're often going and having to name these things or provide some sort of rigidity, which is not reflective of like the degree to which you actually understand your domain. And when you find that you've been wrong, you have to go and like unwind that in a way that's much more expensive than when you're just like, I don't know, it's some stuff, right? You know, as I'm leaping from one cliff face to another, there's some things going on there. And if I have to change it,

Starting point is 00:41:31 like so be it. Notably, the job that I recently took is a Scala shop. And so I've been working with Scala, and really more than just kind of, like, you know, having a passing familiarity with how it works. And that is reflective of what I'm saying. I mean, you know, by that same token, if you have a very rigid domain or a domain that is very well understood, the degree of specificity that is possible in a language like Scala far exceeds what Clojure is able to do. Right. Because it's very easy for you to be like, I don't know, some stuff happens, right? Even when you want it to be quite narrow, right? But that's a trade-off there. And I think that it's quite possible that you can't have a single language that spans all of that, right? Is able to be wholly fluid and then completely rigid. There have been some interesting projects on sort of adding gradual typing to Clojure, but I'm unconvinced that that necessarily

Starting point is 00:42:25 goes and, like, really gives you that whole range. I think it might actually be this sort of step function where it's like you have something which is largely dynamic and something which is largely static. And it's very hard to go and find, like, the happiest medium between those two. That's super interesting. So there's, like, the with-open macro, right, in Clojure. And it's pretty understandable, pretty simple, right? It's kind of like, hey, shove some code inside of this try, catch, finally, and then make sure you kind of open and close things. And so, like, there's a Scala equivalent. I mean, there's a couple, like the bracket is kind of something that people use, but I just found one on Google that was called

Starting point is 00:43:02 with resources. So it did almost the exact same thing. And it was much more verbose, mainly because it's not a macro. So it has to talk a lot about types. That's to say, like, okay, I'm going to take in something that can be opened and closed. And then I'm going to take in the return type of whatever. And then I'm going to take in a function that takes that type of whatever and then turns it into that return type. And so the body of it ends up the same. But the types are not something that a beginner Scala person would probably come up with. It's a bit complex. And then there's trade-offs, right? So the Clojure one is very understandable, I think, because you're just like, oh, this code expands to this.

Starting point is 00:43:40 But then like the Scala one can tell you like, hey, you pass something in that can't be closed, for instance. So I think that that is absolutely a trade-off. But I think that whether people prefer one thing versus the other, like obviously there's more to it. And obviously there are like very concrete benefits and drawbacks to both approaches. But I think that where you land in that, because there is such a multifaceted thing, is just like, what is your mental model for this?
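As a rough illustration of the trade-off being discussed here, the following Python sketch shows the typed flavour of an open-then-always-close helper. It is not from the episode and is not the Clojure macro or the Scala function mentioned; the names `with_resource`, `Closeable`, and `FakeConnection` are made up for the example. The body is just a try/finally, and most of the remaining text is type annotations saying that the resource must be closeable and that the result type comes from the function passed in.

```python
from typing import Callable, Protocol, TypeVar


class Closeable(Protocol):
    """Anything with a close() method (hypothetical name for this sketch)."""

    def close(self) -> None: ...


R = TypeVar("R", bound=Closeable)  # the resource type, which must be closeable
T = TypeVar("T")                   # whatever the body returns


def with_resource(open_resource: Callable[[], R], body: Callable[[R], T]) -> T:
    """Open a resource, run body on it, and close it even if body raises."""
    resource = open_resource()
    try:
        return body(resource)
    finally:
        resource.close()


class FakeConnection:
    """Stand-in resource that records whether it has been closed."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True

    def query(self) -> str:
        return "row"


conn = FakeConnection()
result = with_resource(lambda: conn, lambda c: c.query())
```

The annotations only pay off when a static checker is run over the code; at runtime Python will accept anything, which is roughly the dynamic end of the spectrum described above.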

Starting point is 00:44:05 And I've joked for a while that, like, there are two kinds of programmers. People who think that software is fundamentally math and people who think that software is fundamentally literature. Right. Yeah. And I think that if you were to go and sort of draw a line through the sort of axis of, you know, Haskell, Scala versus Clojure, Ruby, what have you, that would be actually probably fairly predictive, but not wholly, right? Because again, it's much more complex than I'm giving it credit for here. So it's an interesting dynamic, right? Yeah, I feel like both are a bit fringe things. So it's like, hey, you guys both say you do functional programming,

Starting point is 00:44:41 you guys should have a lot in common. You're like, well, it's more complicated than that. Sure, right. And I mean, if it were simple, then it probably wouldn't even be worth discussing, right? But it is something where, for all the reasons we discussed earlier, I don't think that the conversations that are being had about this are at all productive, right? And I think that that's kind of the shame that we have here, which is that it becomes this kind of tribal thing rather than an attempt to actually introspect and understand, like, what is it that makes me feel that this is, like, the right way to do this? I find it really exciting when I find something that's there. And to just think that, like, the rest of the world has nothing to teach you because software didn't exist until quite recently is, I don't know, it's a sad way to live. Yeah, I agree with that. I mean, type systems came about before software existed. Interesting fact,

Starting point is 00:45:32 you know, from a sort of logical, philosophical tradition. One question I want to ask you before we wrap up. So what do the kind of math, I assume you put yourself on the literature side, what do the math, type-level Haskell, et cetera, people get wrong or miss about abstraction and design? Well, I mean, I think that on the math side of it, you'll find a great many more people who are willing to go and talk about an abstraction being correct. A thing I talk about in the book and in the talk that you mentioned on abstraction is sort of the distinction between Church numerals and cons cells. So, you know, both of them are fairly similar,

Starting point is 00:46:11 right? They represent sort of the accumulation of a thing by adding a thing that refers to what came before it, right? So Church numerals are just a bunch of functions wrapped around functions, and cons cells are a linked list that you prepend to. And the point that I raise is, like, Church numerals are timeless because math is timeless, right? Or at least it is extremely slow moving. And so something that's useful to construct a proof will likely be so for your lifetime, give or take, right? Whereas cons cells are something which are much more contextual, because, like, notably, between when the first Lisp implementations existed and now, the ratio of clock cycle to memory latency has changed drastically, right? It used to be that you could go and read from main memory

Starting point is 00:46:56 in, like, a clock cycle or two, and now it's hundreds of clock cycles to actually reach all the way out into main memory. And so you want to have more coherency in terms of your data, right? You want to actually not be jumping all around the place. And so, you know, notably, Clojure doesn't really use cons cells for much of anything. It uses these chunked sequences that have, like, 32 contiguous elements before it goes and has, like, a reference to the rest of it. And that's just a reflection of, like, where we are, right? Computers are not super fast moving, right? There's still sort of these, like, von Neumann things and they still work in much the same way

Starting point is 00:47:30 from a sufficiently abstracted point of view. But like, you know, it's changing fast enough that we can't just go and say, there's something timeless about Lisp. Like from a proof perspective, it is, but like to actually have a useful Lisp, like you have to have a bunch of things that are very tangled up in like, when was it made? You know, what was it made for? And there's a lot

Starting point is 00:47:49 of history around this kind of idea of, like, correct abstractions, right? And I mean, the way that I go and I sort of deal with this in the book is to just say, like, don't call them abstractions, like, call it a module, right? And talk about your model having invariants and talk about these sorts of things. If you go back to the original mention of abstraction in the CS literature, this is a paper called Proof of Correctness of Data Representations by Tony Hoare back in '72. And this is what he talks about. He says, look, you can go and you can prove that your data type is correct by saying here are the invariants in the model. And you can say that all the operations on the model maintain that invariant. And therefore, that invariant is closed over all different compositions of those operations, right? Which is great. It's a very

Starting point is 00:48:28 clean, clever way to go and sort of construct a proof, but that doesn't talk about whether or not it's useful, right? And so I think that, after much kind of talking about it, to answer your question simply, I think that there is a focus on this sort of ineffable, kind of always out of reach sense of, like, a correct abstraction, rather than just saying, like, well, is it useful, given this particular set of circumstances? And I think that to the extent that there are these sorts of things, like, you know, even in, say, Haskell, there's a lot of having to hold the compiler right to make sure that it can go and, like, optimize properly, right? To go and do what you sort of expect it to do. And there's a lot of expertise required to do that. And they say, well, okay, but that's not intrinsic to the Haskell language. That's just its implementation.
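To make the invariant argument attributed to Hoare's paper a moment ago concrete, here is a minimal Python sketch. It is an illustration under assumptions, not code from the paper or the book: `SortedBag` and its methods are hypothetical names. The invariant is that the internal list is always sorted. It holds when the object is created, and each operation re-establishes it, so it holds after any sequence of operations.

```python
import bisect
from typing import List


class SortedBag:
    """A multiset of ints; the invariant is that the internal list stays sorted."""

    def __init__(self) -> None:
        self._items: List[int] = []
        self._check_invariant()  # holds trivially for the empty list

    def _check_invariant(self) -> None:
        # Every element is less than or equal to its successor.
        assert all(a <= b for a, b in zip(self._items, self._items[1:]))

    def add(self, value: int) -> None:
        bisect.insort(self._items, value)  # insert at the position that keeps order
        self._check_invariant()

    def remove_min(self) -> int:
        smallest = self._items.pop(0)  # dropping the head leaves the rest sorted
        self._check_invariant()
        return smallest

    def items(self) -> List[int]:
        return list(self._items)  # return a copy so callers cannot break the invariant


bag = SortedBag()
for value in [5, 1, 4, 1]:
    bag.add(value)
first = bag.remove_min()
```

The checks show that the type is correct with respect to its invariant. They say nothing about whether a sorted bag is the right tool for a given situation, which is the separate question of usefulness raised above.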

Starting point is 00:49:14 And if we have the much-vaunted, sufficiently smart compiler, then this won't be a problem. And so this is not a critique of Haskell, just its current imperfect form. And what that presumes is imperfection is fleeting, right? We will escape the imperfection or it will become sort of asymptotically reduced over time, as opposed to we're just going to go and sort of bounce around a sort of, like, pinball machine of circumstances that is constantly changing and constantly making it seem like, a year ago, we were complete fools for having thought that, like, this could never change or whatever. And, like, I lean much more towards that view, right?

Starting point is 00:49:54 That, like, change is constant. Things will fall out from underneath our feet. And the trick is to go and be able to see just far enough into the future that we're not constantly falling flat on our face, you know, but also having a plan for when we do. So, like, it's a much more pessimistic view of, like, what it is that we can control about our lives. And that's not to say that there aren't domains where someone is doing, like, a linear algebra library, right? The last time that the main LAPACK sort of Fortran set of

Starting point is 00:50:26 subroutines changed was when people switched off of, like, the Cray vector supercomputer model onto the von Neumann, like, the more typical kind of, like, tiered cache kind of mechanism that exists today. And that required them to rewrite it. And it's been pretty much constant ever since. There's probably, like, a new set of ones for the GPU now, right? Which is sort of actually closer to the Cray model. But that's a very stable domain and you may work in a very stable domain. And the sort of biases that I mentioned may be completely sound within that domain or sound enough that it doesn't really matter. And so I'm not trying to say that everyone needs to see the world the way that I see it.

Starting point is 00:51:00 I think that you do, however, need to acknowledge that there is no such thing as timeless software in like a general sense, right? All there is is software that hasn't existed long enough to be useless or software that exists within an environment, which we are very carefully shielding it from the changing of the world. You know, there are still mainframes running COBOL, and I guarantee you there are just layers of calcified corporate structure around that, just trying to make that not fall over. This is like the skyscraper that's slowly sinking into the ground, like we can't do anything to it. But we can try to go and do something around it, like we have to try, right. And so I just think that if we go and we kind of take that as a given, right? Sort of, you know, the humanities, right?

Starting point is 00:52:07 Because the humanities is fundamentally focused on these sorts of ideas of context and how it applies and how it makes things sort of fall apart. And we can learn the lessons there and not have this kind of very blinkered sense that like the only problem is that we haven't imbibed enough math or enough like category theory or enough or created a powerful enough type system or something like this. And like those were all very useful.

Starting point is 00:52:29 I don't mean to go and sort of denigrate those. It's just they are a tool. They're not a solution. Yeah, I think that's a great place to end it. Thank you so much, Zach. This has been great. Yeah, thanks for having me. All right, that was the interview.

Starting point is 00:52:42 I hope you enjoyed it. I think Zach put his finger on a lot of problems with our online developer culture. Lots of people talking past each other because they work in different contexts. It was also just a lot of fun to me to talk to somebody from the Clojure community and kind of compare notes about Clojure versus Scala

Starting point is 00:52:59 and just the different ways that he thinks about things. Like he really comes at things from a literature and philosophical perspective. So I hope you enjoyed the episode. Let me know if you didn't or if you did or just that you listened. Reach out, tweet,

Starting point is 00:53:12 join the Slack, etc. Until next time, thank you so much for listening. We'll see you next time.
