CoRecursive: Coding Stories - Abstraction and Learning with Runar Bjarnason

Episode Date: March 15, 2019

What is abstraction?  Can we have a precise definition of abstraction that, once understood, makes writing software simpler?  Runar has thought a lot about abstraction and how we can choose the proper level of abstraction for the software we write.  In this interview, he explains these concepts using examples from the real world, from SQL, from effectful computing and many other areas. We also talk about how to learn and acquire the skills necessary to understand abstract concepts like very polymorphic code and category theory. Runar also explains his latest project, Unison Computing, and how it uses the correct level of abstraction to rethink several foundational ideas in software development.   Links: Constraints Liberate; Maximally Powerful, Minimally Useful; Unison Computing; Webpage for show

Transcript
Starting point is 00:00:00 Hello, this is Adam Gordon Bell. Join me as I learn about building software. This is CoRecursive. Yeah, it's a sort of mind-body integration thing. I think that it's just like learning to dance or play the guitar or whatever. You can't really know how to play the guitar abstractly. No one can really just tell you how to play the guitar. You got to do it. That was Runar Bjarnason. He co-wrote Functional Programming in Scala, a book that was very influential for me and for a number of people I know. But we aren't talking about that today. Actually, go back to episode five for that interview. Today we talk about abstraction and precision and also the relationship between constraints and freedom. My take on Runar's ideas is that when writing
Starting point is 00:00:53 software, if you constrain yourself to the least powerful abstraction that will work, then you will have more freedom to compose things at a higher level. Runar, I interviewed you probably more than a year ago for the podcast, and we talked a lot about the basics of functional programming from your book, Functional Programming in Scala, that I guess you co-wrote. And a little bit at the end, we started talking about abstraction. So thanks for coming back. And I was hoping we could dig in more on that topic. Yeah, my pleasure. Absolutely.
Starting point is 00:01:33 So in a broad sense, what is abstraction? What is abstraction? Abstraction is a sort of mental process that you go through all the time. Like it's the essential function of the human mind. And it's just like taking a bunch of things that we have observed or encountered or experienced in some way and then grouping them by their similarities, but in a way that they're different. Like grouping them by like a specific way in which they differ. Do you have an example?
Starting point is 00:02:02 Well, like for instance, size. We can have an idea of size. It's an abstract idea. So we'll have a bunch of things, and we'll note that, you know, they're all different in some particular way. And we'll call that way size. But with regard to this idea, we're going to regard them as the same in every other respect. So this is a way of constructing sort of a new entity in our minds that we can manipulate and sort of have. And that we can give it a name.
Starting point is 00:02:37 And that refers to this particular difference among things. And then we're going to then basically regard those things as being the same in every other respect. So it's like, if I have like an apple and a car, when I talk about their size abstractly, I'm not worrying about their caloric value, but merely that an apple is smaller than a car. Right. Yeah. So they have different sizes, but in every other respect, with regard to this particular idea of their size, they are incomparable or the same. And, you know, you can start with a really imprecise idea. Like you can just say, well, the apple is small and the car is big. So there are only two sizes, big and small. But then as we get more and more abstract ideas, we can start to be very precise about how we compare, for instance,
Starting point is 00:03:26 sizes or colors or shapes or programs. Yeah, you use the word precision there. Actually, maybe before we get to precision. So how does this apply to the world of actual software development or code? So the way I'm sort of looking at this in my mind is that we're viewing the things that we observe, so out there in the real world. We're sort of looking at them and sort of lining them all up in some way. In such a way that we sort of punch a hole through the whole stack of things that we're integrating under some abstraction. And then we're going to sort of fill that hole with some quantity or quality. Like, for instance, with size, we're going to just line everything up so it has a size-shaped hole in it, and then we're going to fill that with either big, small, or like an integer or square meters or whatever it is. I'm sort of
Starting point is 00:04:19 viewing it as you know like a stack of things that with like a hole punched through it that we can then fill with some other idea. And then the same kind of thing happens in programming when we abstract. That is, we take a program and we want the program to vary in some way. Like, for instance, we might want to take the sum of the numbers 1, 2, and 3, but that's not very abstract. So we might want to be able to take the sum of any numbers. And so we sort of punch a hole in our program and say, well, we're gonna actually take those things as an argument.
Starting point is 00:04:53 We're gonna call the numbers like X, Y, and Z. And so we have number shaped holes in our program that we're then later going to fill with some values. So like variables are a form of abstraction. Is that the example? Yeah. Yeah, exactly. So a variable you'll note can't be absent. You can't ignore the fact that it needs to have some value, but it could be any value. It stands for absolutely any value of a particular type, a value that fits that sort of hole in the shape. This is an interesting example. So like one plus two plus three, or let's just say one plus two is like a concrete addition,
Starting point is 00:05:29 but like X plus Y is abstract. And I think you said before that somehow that's more precise. How is X plus Y more precise? Oh, is that more precise? I think that it's certainly more abstract. Yes, yes. One plus two is already pretty abstract. Because like even the idea of like one and two, these are abstract ideas.
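To make the "number-shaped holes" concrete, here is a small sketch in Python. The language and the function names are choices made for this illustration and do not come from the conversation. Each version punches one more hole in the program that the caller then fills.

```python
def sum_of_one_two_three() -> int:
    # Fully concrete: the numbers are baked into the program.
    return 1 + 2 + 3


def sum_of_three(x: int, y: int, z: int) -> int:
    # Three number-shaped holes: x, y and z must each be given some value,
    # but it can be any value of the right type.
    return x + y + z


def sum_of_any(numbers: list[int]) -> int:
    # One more step: even how many numbers there are is left to the caller.
    total = 0
    for n in numbers:
        total += n
    return total


concrete = sum_of_one_two_three()
filled_in = sum_of_three(1, 2, 3)
general = sum_of_any([1, 2, 3])
```

All three calls compute the same sum; what changes is how much of the program has been left open for the caller to decide.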
Starting point is 00:05:52 Like we're not specifying one of what or two of what. So these are ideas that work universally for anything that we might want to consider one or two of, right? Yeah, like it's just counting. We're not counting a specific thing, but just the number itself. Right, right. So that's an example like of where, so just at that level is an example of where like an abstraction buys us precision. Like out in the world, when we're like looking at the actual universe, it's sort of undifferentiated
Starting point is 00:06:20 and it's just this one big interconnected mess. But we can abstract out like two of something. Like we can notice that there's a pattern, that like, oh, everyone has some configuration of eyes or legs, or like, you know, there might be a pair of sheep in the field and two stars in the sky or something. And they all have this same relationship, which is that there's like, there's two of them, right? So then we have this notion of two. And then, you know, from there, we get all the natural numbers and arithmetic.
Starting point is 00:06:51 And now we can start to be very precise about adding things together and subdividing groups of things. So we can do very precise arithmetic now with these abstract quantities. Whereas like with actual stars and sheep and eyes, we can't really. Yeah. And if we're talking precisely about eyes, that's sort of a symbol, like there is no universal eye, right? We're referring to a whole bunch of different people's eyes as like a combined concept. Right. Yeah, exactly. So even there, that's abstract. So to make it more specific, because I feel like I'm going to end up otherwise in a really philosophical place.
Starting point is 00:07:29 Sure. I mean, maybe we will regardless. You had this example of actually writing printer code and how abstraction was lacking or something. I was wondering if you could share that. Oh, yeah. So in the talk I was talking about this time, this is during the 90s sometime, when somebody hired me to write a program to make sales reports. And they had to be printed on this dot matrix printer that they had in their office. And they had their data in an SQL table. And so I just took their Paradox tables or whatever it was. And I essentially just dumped them out onto the printer using very specialized code, using like the printer control codes. I wasn't using any kind of abstraction or anything like that. So talking very concretely to
Starting point is 00:08:18 this particular printer. And then when the client comes back and says, it says like, okay, well, you know, I want to manipulate these reports. You know, like, can I move this thing to over here, for instance? Or like, you know, can we take two reports and put them side by side on a page or something? That becomes basically impossible without rewriting my whole program. Because the program isn't able to talk about the location of things on the page or whatever. It can only talk about like how the printer is being controlled in this particular instance.
Starting point is 00:08:49 So I was missing this sort of abstract layer. It should have been talking about the reports sort of in the abstract layout and other things like that. Like even if I had known that PostScript existed, which I didn't. So that would have afforded me the ability to talk precisely about where things are on the page and things like that. So yeah, PostScript or PDFs is like an abstraction that lets you specify like where things would go on a page without being specific to a printer.
Starting point is 00:09:19 Right. And so yeah, the point I was trying to make with that one was one of compositionality. So, you know, I didn't have a compositional program, so I wasn't able to deconstruct the program that I had written into, you know, smaller elements that I could then rearrange. You know, it was all just one monolithic block of code. So how does abstraction relate to composition? If you had written more abstractly to PostScript, how is that more compositional,
Starting point is 00:09:46 or is it not? Yeah, it certainly would have been more compositional, or like any kind of sort of algebraic abstraction, like where you can take things apart and then recombine them, would have been better. I mean, it would have allowed me to do what I wanted to do. And I mean, the way that relates is that you can, you know, once things are very abstract, you can sort of manipulate them symbolically according to some rules. So there would be some rules of composition or combination. For instance, like if you have some kind of layout language, you might be able to say like this thing appears above this other thing, or these two things appear side by side.
Starting point is 00:10:24 Like that might be a rule of composition. But then the actual rendering of one element and the rendering of another element, if you combine those renderings together, you should get the same thing as if you were rendering the combined elements. Does that make sense? Yeah, it's almost like an intermediate representation and then some sort of printer-specific interpreter or something. Yeah. The thing that the abstraction buys you is to allow, again, this sort of some but any
Starting point is 00:10:56 idea that you can talk about. You have some printer, but it could be any printer, as long as it has a mapping from your abstraction down to how you actually control the printer. Yeah. I have this quote from you. It says, we should work symbolically and only go to a final representation at the latest possible time. So this, I suppose, is an example of that. We would map things to this small language for representing layout,
Starting point is 00:11:22 and then at the latest, like right when we, before we send it to the printer, we kind of interpret that into the printer codes. Right, yeah. Because if you kind of think about it, there'll be lots of different ways that you can combine things that are sort of in a very large language.
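The idea of a small layout language with rules of composition, interpreted only at the last step, can be sketched as follows. This is an illustration under stated assumptions: the names `Text`, `Above`, `Beside` and `render` are invented here, and plain text lines stand in for whatever a real printer would need.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Text:
    content: str


@dataclass(frozen=True)
class Above:
    top: "Layout"
    bottom: "Layout"


@dataclass(frozen=True)
class Beside:
    left: "Layout"
    right: "Layout"


# The whole (hypothetical) layout language: text, stacking, and side by side.
Layout = Union[Text, Above, Beside]


def render(layout: Layout) -> list[str]:
    """Interpret a layout as lines of plain text (a stand-in for printer codes)."""
    if isinstance(layout, Text):
        return [layout.content]
    if isinstance(layout, Above):
        # Rendering the combination equals combining the renderings.
        return render(layout.top) + render(layout.bottom)
    # Beside: pad both blocks to the same height and the left block to one width.
    left, right = render(layout.left), render(layout.right)
    height = max(len(left), len(right))
    width = max(len(line) for line in left)
    left += [""] * (height - len(left))
    right += [""] * (height - len(right))
    return [l.ljust(width) + " " + r for l, r in zip(left, right)]


report_a = Above(Text("Q1"), Text("100"))
report_b = Above(Text("Q2"), Text("250"))
page = Beside(report_a, report_b)
lines = render(page)
```

Putting two reports side by side is now one constructor call on the symbolic form, and a different `render` function could target a different printer without touching the reports themselves.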
Starting point is 00:11:38 And a favorite example of mine is like SQL and strings. Combining two strings together, even if both of those strings represent some SQL statement, there are going to be lots of different ways of combining strings. And most of those ways are not going to be valid SQL. So if you want to ensure that you have valid SQL at the end, you have to sort of recover the structure of the SQL and then sort of manipulate it in some way and then recreate or sort of generate the combined string. But if, let's say, you had a symbolic representation of the SQL syntax as a library,
Starting point is 00:12:14 you could then just work using that library and combine your SQL in a way that's legal, like in a way that is still legal SQL, and then generate the string as the sort of last possible step. So you're not like trying to do some kind of string munging or hacks. Yeah, the interesting thing about that is almost necessarily there will be some SQL, like if I'm using some intermediate form that's much more compositional, like almost necessarily there will be things like types of SQL I can't do.
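A minimal sketch of the "symbolic SQL, string last" idea follows. It covers only a tiny, hypothetical fragment of SQL (a `SELECT` with equality conditions joined by `AND`), the class and function names are invented for this example, and identifiers are not validated or escaped, so it is not a usable query builder.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Eq:
    column: str
    param: str  # name of a bind parameter, not a literal value


@dataclass(frozen=True)
class And:
    left: "Condition"
    right: "Condition"


Condition = Union[Eq, And]


@dataclass(frozen=True)
class Select:
    columns: tuple[str, ...]
    table: str
    where: Optional[Condition] = None

    def filtered(self, condition: Condition) -> "Select":
        # Combining happens on the structure, so the result is still a Select.
        combined = condition if self.where is None else And(self.where, condition)
        return Select(self.columns, self.table, combined)


def to_sql(node: Union[Select, Condition]) -> str:
    """Generate the string as the very last step."""
    if isinstance(node, Eq):
        return f"{node.column} = :{node.param}"
    if isinstance(node, And):
        return f"({to_sql(node.left)} AND {to_sql(node.right)})"
    sql = f"SELECT {', '.join(node.columns)} FROM {node.table}"
    if node.where is not None:
        sql += f" WHERE {to_sql(node.where)}"
    return sql


base = Select(("id", "total"), "sales")
query = base.filtered(Eq("region", "region")).filtered(Eq("year", "year"))
sql_text = to_sql(query)
```

By contrast, concatenating two query strings directly gives something like `"SELECT ... WHERE a = :a" + "SELECT ..."`, which is a perfectly good string and not SQL at all.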
Starting point is 00:12:48 So I'm limiting myself in using this intermediate form, would you say? I'm not sure. I mean, if your intermediate form can represent, if that is like the syntactic specification for SQL, then no, there won't be any legal SQL that you can't generate. Maybe I'm misunderstanding your question. No, no, I think that makes sense. SQL is large, so in practice there's things I may miss.
Starting point is 00:13:13 But yeah, if I were able to specify... The perfect case is that I have this intermediate representation that lets me compositionally form any possible SQL that I want to do. Where I think what you're saying is that if you have a string representation, that is true, right? You can have any valid SQL will fit into that, but so will an infinite amount of things that are not SQL. Yeah, exactly. Yeah. So it is true that you can have any combination of valid SQL in your string, but yeah, you can have lots and lots of junk. But if you have a precise specification, if you have an abstract representation
Starting point is 00:13:46 of SQL, then you can see where precision comes in here. You can have precisely SQL and nothing else. Yeah. So in a way, precision separates things that are sort of distinct so that we prevent confusion, right? So it separates the things that we don't want to have confused. So it avoids additional noise, so that there are no unnecessary elements. Yeah. So it is, to use some of your language from before, like it is a constraint on what you can do, right? A string can hold more things, but the representation that can only represent SQL is much more precise. Right. And this is the same with, you know, like everyday concept formation, like with the idea of the natural numbers, for instance. With regard to like arithmetic on the natural numbers, the numbers are very precise and there's no additional noise.
Starting point is 00:14:35 Like the fact that those numbers represent guitars or something, like that's just additional noise that you don't need. You don't want to consider, you don't need to consider, like, the fact that two guitars are different from some other guitars. It doesn't matter, because if you have two of something and then two of something else, you still have four. And so with regard to their cardinality, we want to regard them as being the same. This is funny, because this is a talk about abstraction. I feel like it's very abstract, but I suppose that makes sense. The place where I really found a lot of value in this concept, and I'm trying to think about how to communicate it well. The place that I found it most interesting was like in terms of polymorphism. So if I have a function that takes in an integer and returns an integer, then that is concrete as compared to if I have one that
Starting point is 00:15:27 takes in anything and returns something of the same type, right? Right. So that's now more abstract. So what does that abstraction buy us? Well, so it buys you that the, so the caller of the function can now select which type the function should operate on. And because the caller can pick absolutely any type, that means that you have no hope of writing a specialized program that works for every type. So you have to write a general purpose one, and there's only
Starting point is 00:15:58 one that could possibly work. You have to return the element of the type that you were given, because you have no idea up front when you're writing that program. You don't know what you could possibly be getting. Which is interesting because it means that the generic version is actually more precise, even though it's more abstract. Like the possible implementations are just one. Right. So it's a more abstract type, but it's a more precise specification for a program. And because the definition is so limited,
Starting point is 00:16:29 the amount of potential callers of it increase. Like there seems to be some relationship there. Yeah. Yeah. There's definitely a relationship there. That relationship is called an adjunction. Okay. What's an adjunction? I might regret asking that. So adjunction is an idea from category theory. So it's a relationship between two categories. So it's a pair of functors, one going from, if you have two categories, C and D, one of them goes from C to D and the other one goes from D to C. And they stand in this particular relationship such that they are, what's a sort of a layman's way,
Starting point is 00:17:02 they are sort of approximate inverses. Or one of them finds the most efficient solution to the other, and the other finds the most difficult problem that the other one can solve, if that makes sense. They're inverses. They're sort of like equalities, but they're in the opposite direction? Yeah, so they're in opposite directions, but they're not exactly inverses. They're sort of approximate inverses. That is,
Starting point is 00:17:26 if you go across the adjunction in one direction and then back again, you don't end up where you started. You end up either sort of below where you started or above where you started in some kind of, well, in the category, like in a hierarchy. Like you'll be able to either get to where you ended up from where you started, or you'll be able to get to where you started from where you ended up. So in our world of, in my world of writing software. Yeah. So I'm constructing a function. So I think what you're saying is like the more abstract and precise,
Starting point is 00:18:01 I write the definition of my function, that allows for more reuse or more potential callers of it. Yeah. So you can think of this as in terms, you can think of this in terms of variance, for instance, like if you're familiar with the way variance works in Scala, like in class hierarchies. Yeah. So like a function is contravariant in its argument. And so as you grow this type, the things that you could possibly receive, you make it easier to use. Like more, you can pass more things to it. Yeah, and at the same time, that limits what you can do
Starting point is 00:18:37 because you have to cover the most general case of all of them. Right, yeah, and conversely, if you shrink the type, you make it more difficult to use. But this particular function that we're talking about, like the identity function, like for all A, it takes an A and goes to A. You can see that it is giving a lot of freedom to the caller. It basically makes it really easy to call. You can call it with anything, but that in turn shrinks the things that you can return. You can only return one thing.
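The contrast between the concrete and the fully polymorphic signature can be sketched with Python type hints. One caveat belongs with it: Python does not enforce hints at run time, so the "only one implementation" argument here is about what a static type checker would accept for a total, side-effect-free function, not about what the interpreter prevents. The function names are made up for this illustration.

```python
from typing import TypeVar

A = TypeVar("A")


# int -> int: the type says very little. All of these fit the signature.
def increment(n: int) -> int:
    return n + 1


def negate(n: int) -> int:
    return -n


def always_zero(n: int) -> int:
    return 0


# A -> A for every A: the implementer knows nothing about the argument,
# so the only value of type A available to return is the one received.
def identity(a: A) -> A:
    return a


results = [identity(3), identity("three"), identity([3])]
```

The caller gains freedom (any type can be passed) exactly as the implementer loses it (there is nothing left to do but hand the value back).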
Starting point is 00:19:08 So it can be called by anything, but in fact, it does nothing. Right. So it is the most extreme example, I guess, right? Right, exactly. A less extreme example would be like functions in general. So if I am constraining myself to not do side effects, that obviously limits my function, right? Like what I can do to make something a function if there's no side effects. However, that means like more things could call it, I would say.
Starting point is 00:19:29 Yeah, that's right. Because if you do have side effects, those side effects may be inappropriate or impossible in lots of contexts. You know, like for instance, if you access the network, your function can't be called anywhere where the network is unavailable or where it's inappropriate or dangerous to access the network. Yeah, I've actually had this problem with, I think it's like the Java URL class where it actually does like a namespace lookup or something. That's a great example. I hit this problem, right? So you're like, what's going wrong here? I'm like, God, it's making a network call. Right. Why is it doing that?
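As a hedged illustration of why a hidden network call narrows where a function can be used, the sketch below compares two URLs in two ways. It is written in Python with invented function names and is not a description of how the Java class behaves internally. The first function is pure; the second needs a lookup, so it takes the resolver as a parameter and the caller decides whether a network is involved at all.

```python
from typing import Callable
from urllib.parse import urlsplit


def same_host(a: str, b: str) -> bool:
    """Pure: compares the host names as written. Usable anywhere."""
    return urlsplit(a).hostname == urlsplit(b).hostname


def same_address(a: str, b: str, resolve: Callable[[str], str]) -> bool:
    """The lookup is explicit: the caller supplies how names are resolved."""
    host_a = urlsplit(a).hostname or ""
    host_b = urlsplit(b).hostname or ""
    return resolve(host_a) == resolve(host_b)


# A hypothetical in-memory resolver, so the example runs without a network.
fake_dns = {"example.com": "192.0.2.1", "www.example.com": "192.0.2.1"}

by_name = same_host("https://example.com/a", "https://www.example.com/b")
by_address = same_address(
    "https://example.com/a", "https://www.example.com/b", fake_dns.__getitem__
)
```

In production code the resolver might be something like `socket.gethostbyname`; in a test or an offline tool it can be a dictionary, which is the kind of freedom a hard-wired network call takes away.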
Starting point is 00:20:06 The fact that it does a network call makes it not usable in a whole bunch of circumstances. Exactly. So the thing that you really, I guess, opened my eyes to is this back and forth relationship. And I think that like, as I write code, I actually want it to be reusable. I want composition. And that means that in this relationship, basically, I always want to go as abstract and precise as possible, right? Well, I think that depends on your goals, really. Because sometimes going very abstract and very precise might be costly in terms of labor or computation or something like that. So I think
Starting point is 00:20:44 that you got to weigh it against your other goals as well. So it's not necessarily always correct to, and by correct, I mean, it doesn't always align with your values to go completely abstract and absolutely precise. Yeah. So like, here's an example. Should I use explicit recursion or does it make more sense to use like a fold? To me, it seems like there's some like there's less power in a fold. I agree with that.
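The difference in power between explicit recursion and a fold can be seen in a small Python sketch (the function names are invented for this example). With recursion the author controls the whole traversal; with a fold the traversal is fixed and only a starting value and a combining step are supplied.

```python
from functools import reduce


def total_recursive(numbers: list[int]) -> int:
    # Explicit recursion: free to inspect, skip, or revisit elements at will.
    if not numbers:
        return 0
    return numbers[0] + total_recursive(numbers[1:])


def total_fold(numbers: list[int]) -> int:
    # Fold: each element is visited once, in order; we only say how to combine.
    return reduce(lambda acc, n: acc + n, numbers, 0)


sample = [1, 2, 3, 4]
recursive_result = total_recursive(sample)
fold_result = total_fold(sample)
```

Both give the same answer here. The fold version says less, which is what makes it easier to reason about and also why some list algorithms become awkward to express with it.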
Starting point is 00:21:09 And I think that therefore you should choose a fold. I mean, whenever you can. But often the fold might be very convoluted to do. In principle, it should be possible to write any function over a list using a fold. But it might be very convoluted. You might have to do some sort of intermediate steps and things like that. And there are lots of algorithms on lists
Starting point is 00:21:29 that are just easier to write using recursion. And which one to do, I think, just depends on what your goals are at the time. Yeah, it depends on the case, but you could be as abstract and precise as you possibly can in that case, I guess. This is the way I'm interpreting what you say. Like if I have a function that like the values of it are only going to be whole numbers,
Starting point is 00:21:52 like I shouldn't be returning like a double or a float. Right. Yeah, exactly. All right. What do you think is more precise? A like actor based system or like some sort of IO effect types? That's a hard thing to answer. I think that probably some kind of IO effect, you mean like a monadic IO? Yeah. Yeah, I mean, things that you can do with one, you can definitely do with the other.
Starting point is 00:22:17 But sort of an IO monad has an algebra. Like it has some laws that you can use to sort of reason about your code. It doesn't have a lot of laws, but you can, for instance, always know that if you break out some part of your program into a subroutine and call it from your main program, it's going to execute in the order that you expect, for instance. And other things like that. Whereas, like an actor system has no algebra at all. There are no laws of composition for actors, at least to my knowledge. I mean, there have been, you know,
Starting point is 00:22:56 systems to formalize the sort of mobile processes. I'm looking at my book here on Pi Calculus, for instance. So yeah, there are lots of sort of formalisms for talking about processes that communicate using messages, but we don't tend to talk about, like when we're writing those systems, we don't tend to write them sort of symbolically in the abstract like those formalisms do. That makes sense. I have a, in a piece of software I work on, there is like a, an actor that does some sort of job execution. It's very basic, but it will take any type of input into its message queue and then
Starting point is 00:23:31 do something with it. So if I were to replace it with some sort of, some sort of function that takes a specific input and then returns like IO unit or something, it seems to me that that is more precise because I've limited its input. Oh yeah, limiting the input I think is a good first step. But I think that you could go even more abstract than that. I mean, maybe not for that particular application. But you could, if you have some kind of network that is exchanging messages, you know, over lots of different actors, you could maybe write that symbolically, like you could have some kind of library that allows you to describe networks of actors talking to each other. And then as the sort of last step, compile that to actual
Starting point is 00:24:17 actors. Ah, very cool. If I have something that is like of type, like I just described, like type, like IO unit, or is that actually a function? Is IO unit an actual function? Yeah. If something returns that. Yeah, yeah. So it's constructing a value of type IO unit. So then, I mean, it depends on what language you're using or whatever, but assuming it's
Starting point is 00:24:40 something like Haskell or Scala, what you have, what you've constructed is sort of a script, which will then be passed to the runtime. In Haskell's case, you know, it'll be passed to the Haskell IO subsystem. Or in Scala case, it'll be passed to like, you know, the interpreter of the IO monad. So yeah, if you have a function from, you know, like X to IO unit, it is a function. It constructs a, it returns an actual value, which is first class, and you can manipulate it. Oh, I see what you're saying. So yeah, it does work. Yeah.
Starting point is 00:25:14 Okay, that's awesome. I guess what I was thinking is that, you know, a function, yeah, okay. This is interesting. I'm thinking out loud about this. Yeah, no problem. Because I'm thinking a function that takes in the same inputs and returns the same results should be equivalent. Those two functions should be equal, right?
Starting point is 00:25:32 Yeah. But I can write a function called not putStringLine that takes in a string and just returns IO unit, and it would not be equivalent to putStringLine. Right. Yeah, because it's a different function. Yeah. But in a language where you don't have monadic IO, your point is good.
Starting point is 00:25:48 Where, like, if you call read line twice, you'll get two different strings because, you know, you'll prompt the user or read from standard input twice. Yeah. And so they're not interchangeable at all. Whereas in something like Haskell, read line will just be a value of type IO unit,
Starting point is 00:26:05 and every read line is the same as every other. It's just the value and you can pass it around or do whatever you want with it. Because to the programmer, it's completely opaque, right? The only thing, well, if we ignore unsafePerformIO, the only thing that you can do with it is to return it back to the runtime. Yeah, because the distinction has to do with that actual stage at the end where everything gets run, like where you don't see. Off the end of main somewhere, this thing is being evaluated.
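The idea that an IO action is an ordinary value, a script handed to something else to run, can be imitated in Python. This is a toy sketch under loud assumptions: the `IO` class, `flat_map`, `unsafe_run` and `make_console` are all invented here, a list of canned lines stands in for standard input, and nothing in Python stops code from performing effects outside this wrapper the way Haskell's type system would.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

A = TypeVar("A")
B = TypeVar("B")


@dataclass(frozen=True)
class IO(Generic[A]):
    """A description of an effectful step. Building one runs nothing."""

    thunk: Callable[[], A]

    def flat_map(self, f: "Callable[[A], IO[B]]") -> "IO[B]":
        # Sequence two descriptions into a bigger description.
        return IO(lambda: f(self.thunk()).unsafe_run())

    def unsafe_run(self) -> A:
        # The "end of main": the only place effects actually happen.
        return self.thunk()


def make_console(lines: list[str]):
    """Hypothetical stand-in for stdin/stdout so the sketch runs anywhere."""
    pending = iter(lines)
    printed: list[str] = []
    read_line = IO(lambda: next(pending))

    def put_str_ln(text: str) -> IO[None]:
        return IO(lambda: printed.append(text))

    return read_line, put_str_ln, printed


read_line, put_str_ln, printed = make_console(["Ada", "Lovelace"])

# The same read_line value is used twice; it is one value, not two reads.
program = read_line.flat_map(
    lambda first: read_line.flat_map(
        lambda last: put_str_ln(f"{first} {last}")
    )
)

printed_before_run = list(printed)  # nothing has happened yet
program.unsafe_run()
```

Until `unsafe_run` is called, `program` is only a value that can be passed around, stored, or combined further, which is the sense in which every `read_line` is the same as every other.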
Starting point is 00:26:34 Right. So I think that if you explained this to yourself when you were writing printer programs in Delphi, would it have made sense? I don't think so. I think I was just unequipped to learn this at the time. I mean, I was thinking in terms of, you know, a program is a sequential thing, you know.
Starting point is 00:26:52 I understood pretty well, like, how a PC worked on the inside. You know, like, you know, there's a program counter and it's, like, loading into registers and, you know, moving stuff in memory. And, like, that's how I thought about programming. And so when I thought, oh, I've got to print, to me, that meant like, oh, I got to send instructions to the parallel port. You know, like personally, I found like your book, very inspiring. A lot of your talks, you seem to have a vast knowledge. So I saw this talk, you're talking about printing
Starting point is 00:27:20 poetry, and then somehow you connect it to, you know, like adjunctions. And how did you get where you are? How should we follow in your footsteps? Oh, don't follow in my footsteps. There be dragons. No, I just, I've just always been really interested in software and software development and always found that there is like something else to be, to learn something new on the horizon. And I guess I was looking for the sort of governing dynamics or the abstraction that captures in general the thing that I'm doing. I watched like several of your talks
Starting point is 00:27:54 and one thing I noticed that maybe I've failed to do here is like you hammer examples. I saw you give a talk about category theory at the whatever conference that was in San Francisco. And you didn't even really explain what it was, but just hit with examples one after another. Is this like how you learn or how you teach? Yeah, that's sort of my philosophy of teaching and also how I tend to learn. I think that this is the nature of abstraction because we started off talking about abstraction. The nature of figuring out an idea is to see lots of examples
Starting point is 00:28:28 and then take what is the same in those examples and what is different and sort of stamping that out as a concept. So if you line all of those things up, you can see in what way you can regard them as the same and in what way they vary. And then the ways in which they vary is the thing that you want to abstract out. So yeah, I definitely think that the way that we come to understand anything at all is to connect it with our existing experience. So seeing lots of examples and then forming that concept or abstraction ourselves. I don't think that an idea
Starting point is 00:29:06 can really be conveyed in any other way. Like trying to convey just the abstract idea, I think is sort of doomed to failure. You may see the structure of the abstract idea. Like for instance, if you don't know what a monad is, I can just show you the monad interface. I can say, yeah, it's this, and you'll know exactly what a monad is, but yet you will have no idea what a monad is, right? Yeah, exactly. Like you will have the syntax for monads, but you will have no semantics in which to interpret what you have learned. Yeah. A colleague of mine, he was recently, he was talking about applicative functors and he was going over the definition of like map two, like combining two things. And like I understand it because I've used map two and map whatever.
Starting point is 00:29:55 I don't know if it was all renamed. Like if the names were gone and I was just looking at the pure definition with the symbol that I would have understood it. I'm only relying on my ability to connect it to a specific case where I had used it. Yeah, this is one reason why often people find mathematics difficult. It's because lots of times they will just invent symbols. And so like the structure of the argument is all there right in front of you. But it's really difficult maybe to figure out what is being said because there's a lack of examples and all the symbols are, they don't refer to anything that connects with your existing
Starting point is 00:30:30 experience. You know, there might be like, just like, oh, gamma and sigma and X and Y. And that's, you know, that's it. Where it's like, yes, structurally that is correct. And like, I can, you know, use this paper or whatever it is to calculate something, but I don't know what it means. So like you talked about having a book on pi calculus there. So how do you overcome this? Obviously, you're able to take up these concepts. How are you putting meat on them? Probably by just conjuring up examples. So trying to connect what I'm looking at with something that I know already. Yeah, like I remember learning about applicative functors. And the way that I learned about applicative functors was just to experiment with lots of things that I knew fit that shape or that I thought might fit that shape.
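To make the "experiment with things that fit that shape" idea concrete, here is a minimal Python sketch of map2 written for three different shapes. It is an illustration only, not code from the interview: the function names (map2_optional, map2_list, map2_function) are hypothetical, and None is assumed to stand in for a missing value.

```python
from typing import Callable, List, Optional, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


def map2_optional(fa: Optional[A], fb: Optional[B], f: Callable[[A, B], C]) -> Optional[C]:
    """Combine two possibly-missing values; the result is missing if either one is."""
    if fa is None or fb is None:
        return None
    return f(fa, fb)


def map2_list(fa: List[A], fb: List[B], f: Callable[[A, B], C]) -> List[C]:
    """Combine every pairing of elements drawn from the two lists."""
    return [f(a, b) for a in fa for b in fb]


def map2_function(fa: Callable, fb: Callable, f: Callable) -> Callable:
    """Combine two functions that read the same input into one function."""
    return lambda env: f(fa(env), fb(env))


def add(x: int, y: int) -> int:
    return x + y


print(map2_optional(2, 3, add))           # 5
print(map2_optional(2, None, add))        # None
print(map2_list([1, 2], [10, 20], add))   # [11, 21, 12, 22]
print(map2_function(len, str.upper, lambda n, s: f"{s}:{n}")("abc"))  # ABC:3
```

Lining the three definitions up shows what stays the same (two wrapped values and a two-argument function go in, one wrapped value comes out) and what varies (how the wrapper combines them), which is the comparison the conversation describes.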
Starting point is 00:31:16 Do you think that there is a... Maybe I'll tell a story. So I tried to learn Haskell in like 2012. I had like my Real World Haskell book and at work I had some slack and I would just build things here and there with it. And I mean, I built some things, but I don't think the skills really set in. But now I work on a team of Scala developers who are really keen on functional programming. And we had somebody recently join the team, didn't know Scala, not that interested in FP. And I feel like just through osmosis,
Starting point is 00:31:50 he's learned like so much stuff without even really caring compared to me struggling back in the day. So I don't know if there's an aspect of learning through like, just like, it's just contagious. Like just working in an environment where people understand concepts, somehow there's a transfer.
Starting point is 00:32:05 I don't know what you think about that. Yeah, no, I totally agree. I think that you learn a lot more by doing, by experiencing the thing like, university not really having learned anything is that they don't care about the subject material and it doesn't connect with anything that they do care about. And so I think that that's really important to connect the science or the abstract idea with the everyday work that you're actually doing. And, you know, sort of, like, chew on it gradually until it starts to fall into place. And do you recommend, like, you yourself have given many talks, written books. Is that a method of learning or a method of teaching?
Starting point is 00:32:53 I think both. I mean, learning is just teaching yourself, right? Yeah. So I think that this is just a practical way of both learning and teaching basically anything. So, you know, I didn't really go to school to learn any of this. It's mostly just been through work that I've learned all this stuff.
Starting point is 00:33:11 So is it possible to become the future Runar via just working at your job where you're doing PHP development and then playing around with things on the weekend? Or do you need to be fully immersed? Yeah, I don't know what it means to be the next Runar. I think that you can absolutely become, you can become an expert in anything that interests you enough
Starting point is 00:33:33 and that you have enough exposure to in your day-to-day life. So it's not necessary to write books and to teach and to work somewhere where they apply these principles. You can just overcome that via self-interest? Well, you ask the question whether it's necessary to write and teach. I think that that's absolutely part of it. I think you learn a lot by teaching an idea to someone else, and you also learn a lot by writing your thoughts down.
Starting point is 00:34:01 So it kind of forces you to sort of face where your gaps are or where you might not have a very precise idea, like not precise enough to explain it to someone else. So yeah, no, I think that's definitely part of it. It's a sort of mind-body integration thing. Like I think, you know, it's just like learning to dance or play the guitar or whatever. You can't really know how to play the guitar abstractly. No one can really just tell you how to play the guitar. You got to do it. Yeah, that's a good point. So that makes me think of like open source, right? So there's certain things I've learned about functional programming or abstraction or about types just by having somebody tear into me in a PR. I can't get
Starting point is 00:34:40 that from reading your book, but I can like try to contribute to some sort of open source work and then have somebody tear into me there. Yeah, for sure. Well, I think that I used up all the time. One thing I wanted to ask you about was this Unison Computing that you're working on. Yeah. Do we have time to talk about that a little bit? Absolutely. So what is Unison Computing? Yeah. So Unison Computing is a small company that I co-founded with a couple of friends here in Boston. And so we're working on this language called Unison, which is a purely functional programming language, and we hope it will make it easier to write particularly distributed software. But we want to sort of revisit the entire developer experience and like the entire experience of caring for and owning software, starting with just how you interact with the compiler and then all the way to how you build large-scale
Starting point is 00:35:31 distributed systems. One thing I noticed when I looked at it, it seems like you don't represent code as text. Is that right? Yeah, that's right. So the code base is a persistent data structure. So essentially, the code base is a purely functional data structure that you can manipulate. It's an object that you can manipulate. So it's not a jumble of mutable text files on your disk. It's like an actual object that you can programmatically interact with. I feel like this ties into your idea earlier
Starting point is 00:36:02 about SQL and representing SQL as strings. Exactly. This is exactly like that. So we represent programs symbolically so that we can manipulate them very easily programmatically. And we can also communicate programs over the network. And so we can write distributed systems where to make a program run on a large network is no harder than just writing a program that runs on a single machine. So you write a program that describes your entire system, a distributed system, and then when you execute that program, that's the same thing as deploying
Starting point is 00:36:37 that distributed system. So it's almost like, this probably isn't how you would describe it, but it's almost like if your entire program was in the config for Kubernetes, then all you would need to do to deploy it is hand over that config, and it hands it out to all the nodes and then executes it. Yeah, it's basically like that. Except that, imagine if you could talk about Kubernetes locations as a first-class entity
Starting point is 00:37:01 in your programming language. Then you're getting closer to where we're trying to go. But, you know, communicating programs over the network and representing them symbolically is not new. Like Smalltalk did this, Lisp does this. But we are doing this as a typed, purely functional programming language. And what we communicate are identifiers. We don't start by, like, sending programs over the wire.
Starting point is 00:37:24 So Unison has a unique approach to names, which allows us to do this sort of unambiguously. What's the approach? Well, so, okay, so sort of as a little bit of a background, let's say I have a program that calculates the 24th Fibonacci number or something. And then I send you that program. You're another node on the network.
Starting point is 00:37:44 I send you that program and I say, execute this. And then you see this program, let's say like it's a Lisp program or something. And it has this expression in it, which is like Fibonacci of 24. Like, how do you know what Fibonacci is? And how do we know that we agree on what that is? Right. So there has to be some kind of resolution of free variables in the program that works across the network, across both space and also across time. Because the definition of what Fibonacci is, well, that's probably a bad example. But the definition of some function may change over time, like what a name means. So in Unison, a name is a cryptographic hash of the definition. So the Fibonacci function has a unique eternal identifier that we can unambiguously communicate.
Starting point is 00:38:36 If I send you that hash applied to the number 24, you either know exactly what I'm talking about, or you can unambiguously know that you don't know what I'm talking about. And then you can say, please tell me what this hash represents. I will send you the code, and then you can hash it. You verify that it matches that hash, and then away you go. It makes me think of two things. One is content addressable storage. So the key for a value is the hash of the contents. Yeah, it's exactly that idea. And then the other one is, are you familiar with this book The Little Typer and the Pie programming language, or dependent types in general, I guess? I am familiar with dependent types, but I haven't taken a look at The Little Typer.
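The hash-as-name exchange described above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not how Unison is implemented: the Node class, its methods, and name_of are hypothetical, SHA-256 is an arbitrary choice of hash, and the definition is hashed as plain source text purely to keep the example short.

```python
import hashlib
from typing import Dict, Optional


def name_of(definition: str) -> str:
    """A definition's name is the hash of its content (SHA-256 is an assumption)."""
    return hashlib.sha256(definition.encode("utf-8")).hexdigest()


class Node:
    """A hypothetical network node holding a content-addressed codebase."""

    def __init__(self) -> None:
        self.codebase: Dict[str, str] = {}

    def publish(self, definition: str) -> str:
        """Store a definition under its hash and return that hash."""
        h = name_of(definition)
        self.codebase[h] = definition
        return h

    def lookup(self, h: str) -> Optional[str]:
        """Return the definition for a hash, or None if this node has never seen it."""
        return self.codebase.get(h)

    def resolve(self, h: str, peer: "Node") -> str:
        """Use the local copy if known; otherwise ask the peer and verify the reply."""
        known = self.lookup(h)
        if known is not None:
            return known
        received = peer.lookup(h)
        if received is None or name_of(received) != h:
            raise LookupError(f"no verifiable definition for {h[:12]}")
        self.codebase[h] = received
        return received


sender, receiver = Node(), Node()
fib_source = "fib n = if n < 2 then n else fib (n - 1) + fib (n - 2)"
fib_hash = sender.publish(fib_source)

print(receiver.lookup(fib_hash) is None)                 # True: receiver knows it doesn't know
print(receiver.resolve(fib_hash, sender) == fib_source)  # True: fetched and verified
print(receiver.lookup(fib_hash) == fib_source)           # True: now cached locally
```

The point of the design is that the receiver never has to trust a human-chosen name: either the hash is already in its codebase, or it can fetch the definition and check the hash itself.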
Starting point is 00:39:17 So in The Little Typer, because the type system allows for, you know, functions that will evaluate some arguments to produce the type, right? So the type ends up being an actual function. To figure out if two types are the same actually means figuring out if two functions are equivalent, right? Which means that they kind of do the same process, because the type that a function returns can actually be determined by evaluating a function. So to see if two types are the same, they may need to determine if two functions are the same. And so they have to do this process. Yeah, yeah. So that gets into notions of like extensional versus
Starting point is 00:39:55 intensional equality. The kind of thing that we have is like an intensional equality, like two values are the same if they have the same implementation. Or rather, they have the same name if they have the same implementation. Makes sense. Which is sufficient for distributed. But if two functions return the same outputs for all inputs, they're not necessarily the same by your concept of equality. Right.
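A small Python sketch of that distinction follows. It is only an illustration: name_of is a hypothetical helper, hashing the source text is an assumption made for brevity, and checking a finite range of inputs merely approximates "the same outputs for all inputs".

```python
import hashlib


def name_of(definition: str) -> str:
    """Hypothetical content-derived name: the hash of the definition's source text."""
    return hashlib.sha256(definition.encode("utf-8")).hexdigest()


# Two different implementations of doubling, kept as source text so they can be hashed.
double_a_source = "lambda x: x + x"
double_b_source = "lambda x: 2 * x"

double_a = eval(double_a_source)
double_b = eval(double_b_source)

# Extensional view: do they agree on every input we try?
same_behaviour = all(double_a(n) == double_b(n) for n in range(-100, 101))

# Intensional view: do the implementations themselves, and so their names, coincide?
same_name = name_of(double_a_source) == name_of(double_b_source)

print(same_behaviour)  # True
print(same_name)       # False
```

Under a hash-of-implementation naming scheme the two definitions get different names even though no caller could tell them apart by their results.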
Starting point is 00:40:21 Yeah, you could, for instance, insert a call to the identity function on the outside of your function. You take the source code from my function, add identity of that, and you'd have a different function as far as Unison is concerned. Which is fine because then it just distributes it somehow. Yeah. The idea is that you should be able to distribute implementations unambiguously and that once somebody has discovered a function, it's sort of like always there. It's like discovering, you know, a new star in the sky or something, you know, like it has like an unambiguous place in this space. It's rather like it was always there and it was just
Starting point is 00:40:55 waiting for you to discover it. Isn't this like Smalltalk you mentioned earlier, like it has some sort of, you can save data and definitions into a workbook or something? Is that kind of a similar concept? Yeah, so in Smalltalk, the code base, or I don't remember what it's actually called, but all of your Smalltalk definitions are saved to a database in your local storage. And there are tools to migrate code between people without necessarily sharing just the text.
Starting point is 00:41:25 Yeah, so I mean, it's a similar idea to that kind of thing. Well, it sounds like the problem is that what about like version control and my favorite text editor and things like that? Yeah, so Smalltalk is very sort of like a closed environment, but we very much want to make sure that you can still use your favorite text editor and that you still use Git or whatever to version things.
Starting point is 00:41:48 So we've made sure that the repository format of the code base is very amenable to being versioned by Git. So then you don't version the text of the code. What you version are the binaries of the code base. Oh, are they actually binary? Yeah, the code is actually stored in a binary format. So how come binary and not like... Like when you described it to me,
Starting point is 00:42:12 it sounded like some sort of abstract syntax tree or something. Yeah, it is an abstract syntax tree, but it's stored in a very compact format. So instead of having text that represents the thing syntactically, like for instance, a block of code will start with some byte, which represents open block. So it's rather like a byte code,
Starting point is 00:42:34 like a Java byte code, for instance. And so what you version is the byte code, and you can always recover the source code using your code base. And your code base has mappings from like the hashes to the names that you originally used. And so the rendering of the code will look very similar to what you originally wrote.
Starting point is 00:42:53 Or you can keep the source code around if you want as well. But the source code is not the source of truth. That's just sort of the peanut shell. And then once you've cracked open the peanut, you can keep the shells or you can throw them away. Yeah, I feel like we should have started with this as an example because it definitely seems to fit your abstraction idea, right? So the binary can only represent valid programs
Starting point is 00:43:14 or it's more restricted than a text form? Yes, it's definitely more restricted than a text form. I mean, it certainly can represent invalid programs like the abstract syntax tree data type. The algebraic data type for the syntax tree is not absolutely precise, right? So you could, for instance, apply a value that isn't a function to a value, for instance.
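As a rough illustration of a syntax tree that rules out malformed text but still admits ill-typed programs, here is a minimal Python sketch. The node types (Num, Var, Lam, App) and the applies_non_function check are hypothetical and far simpler than a real compiler's tree or type checker.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: "Term"


@dataclass(frozen=True)
class App:
    fn: "Term"
    arg: "Term"


Term = Union[Num, Var, Lam, App]


def applies_non_function(term: Term) -> bool:
    """Tiny stand-in for a type check: is a number literal ever applied like a function?"""
    if isinstance(term, App):
        return (
            isinstance(term.fn, Num)
            or applies_non_function(term.fn)
            or applies_non_function(term.arg)
        )
    if isinstance(term, Lam):
        return applies_non_function(term.body)
    return False


well_formed = App(Lam("x", Var("x")), Num(24))  # (\x -> x) 24
ill_typed = App(Num(3), Num(24))                # 3 applied to 24: builds fine, means nothing

print(applies_non_function(well_formed))  # False
print(applies_non_function(ill_typed))    # True
```

The tree cannot contain an unbalanced bracket or a stray token, which text can, yet it happily builds the application of 3 to 24, so a separate checking pass is still needed.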
Starting point is 00:43:40 But that's why we have a type system, so that we can prevent you doing things like that. So how did you implement this? You're building a compiler and stuff? Yeah, so the compiler is built in Haskell. And we originally had a runtime that was written in Scala. And so we were running Unison programs on the JVM. And that's sort of the current implementation. We're migrating to a Haskell-based implementation of the runtime that's a lot simpler.
Starting point is 00:44:06 What was the, like I assume there was some disadvantage to the JVM? Well, we just weren't really very happy with our implementation of it. We were trying to be sort of too clever. And so the JVM-based implementation turned out to not buy us very much speed, which is the reason why we were trying to write it for the JVM in the first place. But the Haskell-based implementation is a lot easier to understand than the Scala-based one. So what kind of crazy distributed programs can I make with this language?
Starting point is 00:44:33 Oh, I mean, right now, none. I mean, it's still very much in development. And so we're sort of engaging the community of enthusiasts for this. But it's not really ready for everyday programmers to get their hands on it. I mean, you can, it's open source, it's on GitHub, but you can't really write any practical programs in this at the moment. But we're hoping to have an initial release of this
Starting point is 00:44:59 sometime this year. That's awesome. Where you can actually write some programs. So if people are interested, what should they do? Go to unisonweb.org. It's all one word, unisonweb.org. And then can we, I guess we can't use it yet. Can we contribute?
Starting point is 00:45:14 Can we learn things? Yeah. So GitHub has a number of outstanding issues and bugs and feature requests and things like that. So there are things that are easy to pick up, so if people want to contribute, they definitely can. And one of the ways in which you can contribute is porting libraries to Unison.
Starting point is 00:45:34 I mean, it might be kind of difficult at the moment because the compiler has bugs, and it's a lot in flux at the moment as of right now. But very soon we'll be able to accept contributions to things like something simple like a text manipulation library or something. Seems like a crazy ambitious project, Runar. You're rethinking distributed computing
Starting point is 00:45:59 but also just representing code as text. What else are you rethinking? We're rethinking how to interact with compilers. We want to ultimately rethink how code is edited, make that a more sort of fluid experience. Yeah, there are lots of ways that we can go with this, but the fundamental thing that allows us to rethink all that is to think of the code as being, you know, a data
Starting point is 00:46:27 structure, and having this idea of the sort of content addressable namespace. So those two things together, I think, will enable a lot of cool technology that, I don't know, it seems pretty space age at the moment. Well, I think it's awesome. Yeah, I mean, it just really hits home to me with your concept earlier because like an AST, the text file of my code can represent a lot of things that aren't valid. Like just code formatting, I guess. We have a lot of nice code formatting tools right now.
Starting point is 00:47:01 Yeah, that's just something that nobody should ever do. Like think of all the time that's just wasted on like formatting code. Like every time I just like make a new line and hit tab a couple of times or backspace to like make something line up like I die a little bit on the inside. Yeah, I agree. And like there's like weird things like like if I was doing like Scheme programming, I really would like to have a way to tell my editor to just treat brackets as indents, like Python style, and not actually see this ending line with a thousand ending brackets.
Starting point is 00:47:35 Yeah, absolutely. And that's another thing that I think that we'll be able to do is to have lots of different surface syntaxes for the language. So if you want a lot of brackets and you want to use S-expressions, you can. I mean, we're not there yet. We only have one syntax, which is very Haskell-like, but that's only because that's the aesthetic that we prefer. But there's no reason that we couldn't have
Starting point is 00:47:56 any number of surface syntaxes for this abstract language. You know, very much like Microsoft is doing with C# and F# and all of those languages that all run on the same CLR under the hood. But even more so, because Scala and Java both run on the JVM, but they're not the same language. This is more talking about the presentation could vary, right? But the actual semantics would be the same? Right, exactly.
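The idea of one abstract program with several surface syntaxes can be sketched as two renderers over the same tree. This Python sketch is an illustration only: the node types and the function names render_sexpr and render_juxtaposed are hypothetical, and neither output is claimed to match any real Unison syntax.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class App:
    fn: "Expr"
    arg: "Expr"


Expr = Union[Var, Num, App]


def render_atom(e: Expr) -> str:
    """Render a leaf node; both syntaxes print leaves the same way."""
    return e.name if isinstance(e, Var) else str(e.value)


def render_sexpr(e: Expr) -> str:
    """Bracket-heavy rendering: every application is wrapped in parentheses."""
    if isinstance(e, App):
        return f"({render_sexpr(e.fn)} {render_sexpr(e.arg)})"
    return render_atom(e)


def render_juxtaposed(e: Expr, nested: bool = False) -> str:
    """Haskell-like rendering: application by juxtaposition, brackets only around nested arguments."""
    if isinstance(e, App):
        text = f"{render_juxtaposed(e.fn)} {render_juxtaposed(e.arg, nested=True)}"
        return f"({text})" if nested else text
    return render_atom(e)


# One tree: add applied to (fib 24), then to 1.
program = App(App(Var("add"), App(Var("fib"), Num(24))), Num(1))

print(render_sexpr(program))       # ((add (fib 24)) 1)
print(render_juxtaposed(program))  # add (fib 24) 1
```

Because both strings come from the same stored tree, choosing a syntax becomes a display preference rather than a property of the program.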
Starting point is 00:48:20 That is a cool concept. You could get adoption by just having a mode that fits everybody's weird preferences. Yeah, yeah, yeah. Maybe we could make it easy to have sort of pluggable syntaxes. That's awesome. Well, thank you for your time, Runar. This has been a lot of fun. Yeah, likewise. It's been a blast. Thanks for having me. Thank you very much. That's the show. I hope you liked it.
Starting point is 00:48:41 If I'm totally honest, the interview with Runar went in a slightly different direction in the second half than I had planned for, but it ended up super interesting to hear about his new project, to hear about how he likes to learn and approach topics. Just a super interesting person to talk to. If you like the show, tell a friend about it and help spread the word and also maybe join our Slack channel. After the last episode about Scala Native, we had some interesting talks about big O notation and complexity in the channel. And also user Joe shared a file server open source project he made in F# where you can upload files, but you get to set kind of a time to live on them. Kind of an interesting project exploring functional programming and serving files. So yeah, until next time.