Signals and Threads - Language design with Leo White

Episode Date: October 21, 2020

Equal parts science and art, programming language design is very much an unsolved problem. This week, Ron speaks with Leo White, from Jane Street's Tools & Compilers team, about cutting-edge language features, future work happening on OCaml, and Jane Street's relationship with the broader open-source community. The conversation covers everything from the paradox of language popularity, to advanced type system features like modular implicits and dependent types. Listen in, no programming languages PhD required! You can find the transcript for this episode along with links to things we discussed on our website.

Transcript
Starting point is 00:00:00 Welcome to Signals and Threads, in-depth conversations about every layer of the tech stack from Jane Street. I'm Ron Minsky. Today we're going to have a conversation about programming language design with Leo White. Leo works in our compiler team here at Jane Street and he works mostly on what's called the front end of the compiler which means he spends a lot of time thinking about language features and about the type system that supports those language features. And in particular, he does that work in OCaml, which is the programming language we use on a day-to-day basis. So Leo, to get started, I'm curious, can you tell us a little bit about how you got involved in all
Starting point is 00:00:36 of this? How you got involved in programming language design in the first place? I had some interest in programming languages from a young age. I started programming at, I don't know, 11 or something, and I think I was fairly interested in how these mysterious things were created. The most formative thing I can think of in this regard is between my first and second years at university, I remember reading Benjamin Pierce's Types and Programming Languages. That really clicked. I just thought that stuff was great. And I just really wanted to work on it and explore those ideas. So I think that's kind of one of the main things that got me into it. I knew I wanted to do some programming language stuff. I started doing a PhD in that area. It was more about compilation than programming language design.
Starting point is 00:01:14 And I was working on a source to source C compiler. And I was writing that compiler in a Camel because a Camel had a good library for reading C files. And I already knew standard ML because it's the kind of, it's what they teach you in the first year at Cambridge. So anyway, so I was using a camel to write this compiler and I realized that there was a language feature that I wanted to have that would make it easier for me to write this C compiler. And since I kind of hated the work I was doing on the C compiler and was not that interested in it, I procrastinated by adding this feature to a camel instead. And that's kind of how I got into writing code for the camel compiler. And so, so yeah,
Starting point is 00:01:52 so I was writing this language feature and went through the process of trying to upstream it. And whilst I was doing that, and my PhD was coming to an end, Anil Madhavapedi was looking around for people to join a camel labs, which was this new project to work on the camel compiler that Jane Street was kind of funding in Cambridge. And so he's trying to look for people to work on the camel compiler and suddenly this cambridge address shows up with the you know implementing language features for a camel and so he thought well who's you know i should go and see who this person is and so he got in touch with me and that's kind of i started working at a camel labs as an ra after having finished my phd it's by the way worth noting that most people have the wisdom to not immediately go and dive into adding features to languages that they're just starting to use and in particular diving into the type system
Starting point is 00:02:29 yeah i think when anil saw someone who's willing to just go in and add a new type system feature he was pretty excited yeah i did not realize at the time that this wasn't something that you just did yeah i was yeah somewhat under supervised in my phd and clearly no one told me that this is a terrible idea and a bad way to spend your time. So I just went ahead and did it instead, and I really enjoyed it. I mean, it is, it's kind of terrible. It's difficult, like the type system is hard to dive into. It is, I would not recommend doing it without some assistance.
Starting point is 00:02:56 The thing you mentioned about how you got involved in OCaml is the thing that's always worried me a little bit about how languages are designed, which is to say languages are almost inevitably designed by people who write compilers. And so they go around and look at the features they would like for writing compilers and go ahead and add them to the language. And I always wonder, like, how much of a bias is this in how our languages are designed? I would say not much of one, mostly, because that is probably a fair accusation at a camel, which is very nice for writing compilers and has all the things you would want for writing a compiler. But that is not true of most languages. Most languages are kind of horrible to write compilers in. So I don't think it's applying
Starting point is 00:03:32 there. So you got excited about programming language design from reading Types in Programming Languages, which is Benjamin Pierce's extremely famous book in this area, or at least extremely famous in an extremely narrow group of people. And you've been doing it now for quite a while. Some of that work during your PhD, some at OCaml Labs, and a bunch of it here in the intermediate time. What do you find exciting about language design? Why do you like it? One of the main things I think I like about language design is that it sits at an interesting intersection between some very interesting pure mathematics, especially on the type system design side of things, and a more aesthetic kind of design style. A lot of it is about trying to
Starting point is 00:04:09 find things that fit within the mathematical constraints, but are also a design that people like aesthetically and enjoy using. Right, and this is one of the real challenges of programming language design. It is really not, at its heart, a scientific enterprise, or at least not at the end. This is the thing you often hear programming language people complain about, is the fact that they can't do any proper studies. Or rather, whether they can or can't do it, that in practice that is not how the field is driven. So people do try and do studies in how effective a programming language is, but it kind of hits up against all the things that make it hard to do experiments in the social
Starting point is 00:04:43 sciences. It seems like in principle you could manage it, but the experiments are just very hard to do, right? What you'd really need to do is get a load of professional programmers to agree to use all these different languages to solve the same problems for a prolonged period of time. And, you know, clearly this isn't going to happen. Yeah, it does seem completely and utterly hopeless to me because if the things you care about are what you can do in a programming language with large groups of people solving hard problems with significant amounts of time being devoted to it, which seems like, to me at least, the interesting problem, and I think a problem that a lot of programming language, and then they study the thing they can, which is something like how easy is it to understand the very first errors they see in their language, or how well do they do when they try and do stuff in their first few weeks or maybe few months of working at it.
Starting point is 00:05:36 And that just doesn't get at the actual thing you care about when you think about what makes programming effective in the real world. This kind of genuinely bothers quite a lot of people. I think you work in programming languages. They really would much prefer that it were a proper science, but it's much better thought of as a kind of intersection between some mathematics and design. And those are both things that we know how to study, right?
Starting point is 00:05:56 The scientific method is not the only approach to knowledge. With mathematics, obviously, you can try and prove useful theorems and interesting properties. And people successfully do research in design disciplines like, my favorite example is normally architecture, which I think is a very similar area to programming language design in that it's kind of constrained by mathematics. You have to make sure the building stays up and also is interesting from a design perspective. You want to make buildings that people enjoy being in and looking at and so on.
Starting point is 00:06:28 And there's more than just the raw structural things you have to check. When you think about the design of a building, you think about what are transit times, and people do all sorts of studies to try and get feels for how effective it is. But in the end, no one thinks that the final way to figure out whether a building is good is to do a randomized control trial, where you make four different buildings and you have this business. You have four equivalent businesses that you sit in the four different buildings and you see how it goes. And I think the scale of the economic investment required, I'm not sure if it's not quite as absurd or a little more absurd to do the programming language equivalent, but it's up there. Yeah, I think that that's a better way of looking at programming language design as a discipline. You know, there are two ways of handling design problems. One is
Starting point is 00:07:09 you can solve them in the market, right? You can, you know, go off and have people go and design programming languages and see which are successful. And we have that down. There's a whole system for doing that. And then there's an academic equivalent of this. And a lot of the work on OCaml in particular happens in a context which includes a lot of academic stuff. Do you have a feel for how you would make things like evaluation of papers different than it is now in order to accommodate for this? I don't have concrete suggestions there. I've always kind of wanted to actually seriously look into disciplines like architecture and so on and, you know, look at how they, I mean, they have journals, they have assessment of things, there must be mechanisms by which they do this, but I've never actually sat down and put serious time into seeing what
Starting point is 00:07:48 that is. I'm kind of more struck by the fact that when you're at, say, ICFP, which is the main functional programming language academic conference, that there are topics that the people there care about and think are important in the design of languages that are not ever the subject of talks. Like they will only talk about the mathematical side of things, or maybe some attempts at some social sciences style experiments, but they'll never talk that much about the aesthetic aspects of the design in the talks. In the hallway, where they're all chatting with each other, they'll talk about it there. And that seems wrong to me. That doesn't seem like the best way to do things. Beyond that, I haven't really given it serious thought.
Starting point is 00:08:26 It's one of the benefits of no longer being an academic. I don't have to actually solve that problem. So let's switch gears a little. You've now spent a bunch of time working on OCaml in particular. How do you think about how OCaml fits into the overall space of programming languages? So firstly, I would say it's a general purpose language. Most programming languages that people use get described as general purpose programming languages, but I would say it's particularly true in a camel. It really is not necessarily the best in like every area.
Starting point is 00:08:51 It is strong in a lot of areas. You can use it for scripting things and you can use it for low level, high performance programming, and you can get pretty far with it in both of those worlds. I think in terms of its identifying characteristics, it has good predictable performance, which I think is very important. So this, I think if you take an experienced a camel programmer and show them some camel code, they have a pretty good shot at being able to tell you roughly what the assembly that gets produced looks like. Like it is a fairly simple, predictable compilation story.
Starting point is 00:09:19 Coming back to like where a camel sits, the other like identifying characteristic is it's got a strong type system. Quite a lot of languages in the functional family have these strong type systems, but I think it's a pretty key part of what makes a camel a camel. Personally, I think I could probably no longer manage to write in a language without a strong type system. It really kind of over time shapes the way that you think about programming. It both affects how you design things, and it also, you get into the habit of offloading certain habits of carefulness onto the type system.
Starting point is 00:09:47 And then you get to think about other things and you get bad at doing all the tracking of those things in your head. Yeah. And also you spend time doing things like fairly subconsciously structure the code to make it so that if there's a mistake, the type system will check it. I noticed recently that I can sense when there are two variables in scope with the same type, because that's like danger, you know, like, oh no, I might confuse them. And I, you know, and I can really feel when that's happening. And whereas the rest of the time, oh, they're all different types is fine.
Starting point is 00:10:11 I can't really screw this up anyway. So strong type system. And then the last thing, which I think is more unique to a camel or at least more unique to the MLs is that a camel takes modularity very seriously. So what do I mean by that? I mean, the best way to describe what I mean by modularity is probably to talk about abstraction. So the definition of abstraction is essentially that you have some system. And what you want to be able to do is take one component of the system and replace it with another. And the mechanism by which you do that is you have some notion of interface that describes the behavior loosely of a component. And when you replace one component with another, you replace it with one that provides the same interface as the other one. And so this ability to kind of
Starting point is 00:10:48 swap things out is very powerful and kind of very pervasive in the design of a camel. Right. In particular, there's this feature called functors, right, which allow you to essentially write a module of code that is parametrized over a different module of code. The key thing in a camel, everything lives inside modules. And so functors, which are a module parameterized by another module. So basically just like a function from modules to module, they allow you to parameterize by absolutely anything. So you can take any part of your program and abstract it over any other part of your program.
Starting point is 00:11:21 Very powerful idea. And from a language design perspective, it really forces you to take the idea of an interface seriously. So unlike most languages, a camel kind of has separate files for the implementations and the interfaces, it has a separate essentially language for describing interfaces. And whenever you want to add a language feature to a camel, when you want to make a change to it, you have to think seriously about how would you express that feature in the interface as well as in the implementation. So I saw Simon Peyton Jones, who's one of the originators of GHC and Haskell and so on. And I saw him give a talk where he described laziness in Haskell as wearing a hair
Starting point is 00:11:56 shirt. So laziness in Haskell means that when you write an expression to compute something, it won't be evaluated immediately. It'll wait until you need the result of that computation before evaluating it. It can be useful in terms of avoiding doing computations that you don't actually need the result of, but it means you can't really know when computations happen. They could happen anytime, basically. Right. And in ordinary strict language, you can more or less look at the program and say, I write down this expression, I write down that expression, and kind of more or less in the same linear order as the lexical text of the program, that's the order in which things will execute. And in a lazy language, it's not like it's impossible to know, but it's really complicated.
Starting point is 00:12:32 It's hard to reason, but hard to predict where those evaluations are going to occur. Exactly. And so the reason this, I think Simon would describe that as like wearing the hair shirt is because that forces Haskell to essentially be a pure language. Like, no, you can't really have computations with side effects in a lazy language because, I mean, if you don't know when the side effects are going to happen, it's going to be a nightmare to reason about your program. So the decision to make the language lazy forces them to make sure that it is pure and remains pure. And I think that the modularity aspects of a camel are very similar. You know, they kind of force you to take abstraction and remains pure. And I think that the modularity aspects of a camel are very similar. You know, they kind of force you to take abstraction and interfaces seriously.
Starting point is 00:13:13 You just unwind a bit of terminology there. When we say about purity and side effects, just by a pure language, we mean a language that doesn't allow you to express side effects. And by side effects, we mean things like do mutation or interact with the outside world haskell has a kind of premium on being explicit about which parts of your program are pure computation just computing something without any of this kind of interaction with other parts of the program and other parts of the world and keeping to just this pure subset and keeping it really explicit what that is is nice because it's easier to reason about pure computations than stuff with effects in it. Yeah, exactly. Yeah. So I think that the modularity in a camel is very similar to that. It kind of affects what language features you can reasonably put in the language, or at the very least it affects the design of those language features because you have to be
Starting point is 00:13:56 able to accurately and conveniently express their interface, which is not the case for a lot of features that a lot of languages have. So Simon has actually said about Haskell that the next ML should be pure and the next Haskell should be strict. The way I read that is to say, if you were to build a new language now, he would keep the purity, but he thinks the laziness, strictness being the opposite of laziness, the laziness is not a feature he would add to the next language. In other words, he was happy to have laziness in Haskell because it really got them to invest deeply in purity, but the feature on its own was not necessarily the right default. I'm curious if you feel more or less the same way about modularity. Is OCaml's focus on modularity useful for its own sake or for the ways it structures the language
Starting point is 00:14:39 or both? So I would say both. The idea that you would make an ML and not... The next ML should not have no module system. That's my opinion. So the most recent thing that looks vaguely ML-ish is probably Rust. So Rust's original compiler was written in a camel, so I'm pretty sure they had some experience with MLs when they wrote it, and you can see some of that coming through in some of the language design. But it lacks a module system, and so maybe the next ML didn't have a module system. But I think that's a mistake. I think that's a shortcoming in Rust that makes me sad and makes me less likely to use the language. But it's a bit of a chicken and egg thing. I don't know whether I use a camel
Starting point is 00:15:19 because I like modularity or I like modularity because I use a camel. It's kind of hard to unwind that. Yeah, this is the general Stockholm syndrome problem of language design, which is you use a programming language and you get very used to it and it is hard to keep track of whether you like its features because you use them or whether you use it because you like them. Exactly. So you think modularity is useful for its own purposes. What are the ways in which it puts virtuous pressure on the language designer?
Starting point is 00:15:43 Why is it good that language designers don't add features that they don't have a really clear way of expressing the interfaces of and thinking about how you abstract over pieces of code that utilize those features? A good example of a language feature that would not work in a language of the module system would be Haskell's type classes, which for those who are not familiar with them, are also very similar to Rust's trait system. It's the same thing. These are basically systems for doing ad hoc polymorphism.
Starting point is 00:16:10 When you have a single function that you would like to work on multiple types, that's the kind of the polymorphism part, but you would like the behavior to vary on those different types. So a good example is addition. You probably want addition to work on integers and you also want an addition that works on floats. But the operation is actually very different amongst those two
Starting point is 00:16:28 things. And maybe you want addition to work on matrices and vectors. There's lots of different things on which addition might reasonably be defined. Yes, but it's actually a different function on all of them. And so that's why it's kind of ad hoc polymorphism. Maybe function overloading is the term that people know best for this. Yeah, operator overloading is a mechanism that gives you ad hoc polymorphism. So I would say that type classes, they were a major breakthrough in ad hoc polymorphism, and they're still kind of the gold standard to measure your ad hoc polymorphism features against. But it's a very kind of anti-modular feature. So let's say you want this addition function, and so you define a addition type class, and then you decide that you
Starting point is 00:17:03 want it to work on integers. So you have to, somewhere in your program, write this is how addition works on integers. And the important thing there is there really has to be, just in the entire program for type classes to work, there has to be just one place where you've written this is how addition works on integers. And so if you have one library that's written
Starting point is 00:17:20 an addition instance over here, and another one that's written another addition instance, you will not be able to link them together. And it's kind of anti-modular because you don't notice this problem until you get to link time like you try and link them together and it says oh no those two bits have to live in separate worlds they can't combined like this you've kind of broken some of the compositionality of your language i remember being actually quite surprised about this when i first heard about it because there's essentially no way of hiding this aspect of your implementation you cannot in any way wrap it up such that the particular choices you made about type classes
Starting point is 00:17:50 in the middle of your program aren't exposed. It's sort of anti-modular in a fairly in-your-face way. Yeah, like you are very clearly leaking details of your implementation. You give this example of different ways of doing addition, and then that I think makes it a little bit easier to understand why people are kind of okay with this, in the sense that, like, are there really multiple different kinds of addition on integers? And well, yes, actually there are. There are cases where you really want to have some kind of general operation, which you do want to make different choices in different places.
Starting point is 00:18:19 Addition is maybe not the most compelling example, although I'm sure there are good cases where you really do want to have different forms of addition on what is an underlying level of the same type. Yeah, well, so a good example would be maybe someone has built… so you start with you want a notion of addition, but maybe that's really you want addition as part of some algebraic structure. It seems like you wouldn't want two additions on ints, but actually depending on what people have built on top of that, you might find yourself going, actually I'd really like to say that the addition on ints is actually maximum and
Starting point is 00:18:47 things like that. You might just simply have a case where two different parts of the program have made different choices about what implementation they want, right? It might just be a performance difference, right? In this case, I want to try and do the operation this way. In this case, I want to do it that other way. And then again, you're in this strange place where you literally can't link things together. So I did some work on a language feature called modular implicits, which is kind of based on some mix of type classes and a feature in Scala called implicits. And this is basically a direct attempt to solve the question of how do you make type classes work with module. And you basically, you know, you remove that restriction to only have one of them
Starting point is 00:19:23 and you deal with the problem that that solves in a different way. To my mind, the result is a cleaner system that is certainly subjective and debatable. Yeah, that for me is a good example of the modularity kind of pushing the design in a direction that gives you a better end result. Right. I think this theme of the end result of what is actually a better language feature being subjective and debatable is, I think, a kind of persistent feature of this area of work. Let me poke in another direction, which is one of the things you talked about before
Starting point is 00:19:50 is this notion of mathematical simplicity. And I think it's fair to say that there's a community and approach to language design which really cares about mathematical simplicity. And there's a whole wing of people who design programming languages who really don't think about that at all. And I'm curious what you think about the trade-off there. What are the reasons why someone who's a mere user of a programming language should prefer using a language that was designed with mathematical simplicity in mind?
Starting point is 00:20:15 The phrase mathematical simplicity is probably an oversimplification. A lot of people maybe use the word, would say elegance or something like that. But I tend to think of the problem in terms of languages having sharp corners and the kind of the interplay of different language features. If your language features are in some mathematical sense, fairly simple and composable, then there will tend to be many fewer corners that you can cut yourself on in the language. I think a language that is not got this, I would say something like C++, which I mean, I've never really used it professionally. I've used it, you know, here and there, but it is a language made up of sharp corners. If you want to understand exactly what happens when you use
Starting point is 00:20:53 interesting interplays of templates and operator overloading and this, that and the other, you know, you can very easily end up with code that kind of does what you want mostly, but then you try and use it on a different type and suddenly it breaks for all kinds of confusing reasons. And what's happening there is that the features of the language are not really compositional. They're somewhat ad hoc and complex. And when they interact with each other, it is hard to reason about what they're doing. And it's worth saying this is not like a pointy-headed outsider critique of C++. This is how people inside of C++ think about this. I remember many years ago, I did an internship at Bell Labs and was trying to build a matrix library in c plus plus and this was pretty
Starting point is 00:21:30 early in the language's development and i was like i don't know how to do these two things together i want to have inheritance and i want to have operators that work in a reasonable way and like how am i supposed to make this work and my advisor said oh why don't you know go over to like to the sixth floor over there and talk to the c plus people? Because the people writing C++ were right there. And I basically went over and talked to them and they're like, oh yeah, you shouldn't use those features together. Yeah. The thing I always hear from people who've worked with C++ is always like, oh, it's a really good language as long as you stick to a subset, but none of them agree on what the subset is. Anyway, so I think that's a good example. So I mean, maybe another way to think about mathematical simplicity is really
Starting point is 00:22:04 that when you're looking at languages and the behavior of programs mathematically, really what you're doing is trying to formalize how you reason about programs. As programmers, we reason about programs all the time in many diverse and interesting ways. And people attempt to encode that in mathematics. And if you can encode how to think about your language feature, how to reason about its behavior mathematically, because it is simple, because if it was complicated, it'd be very hard to do the maths there. It's probably just easier to think about. It's probably easier for people to reason about it too. I'm less certain that this is true, but another thing that I've come
Starting point is 00:22:39 to think works out better in languages that are kind of mathematically grounded is that features end up doing more than you might think they would. Like an example that really struck me many years ago was when there's this thing that was added to a camel called GADTs, generalized algebraic data types. And without going into the details of what this language feature is or how generalized algebraic data types generalize on ordinary algebraic data types. It's a language feature for which the obvious use cases that I had heard about and seen seem to all be about the kind of things that you do in constructing compilers
Starting point is 00:23:14 of like representing abstract syntax trees or other representations of program fragments. And it seemed like a great tool for compiler writers, but maybe not a great tool for me since I am not really a compiler writer. And it turns out GADTs have been incredibly useful in a wide variety of ways, in ways that surprised us, and I think surprised people who added GADTs to the language. It turns out to have been, for us, very useful for doing performance optimization and various ways of
Starting point is 00:23:41 trying to get more control over the memory representation of objects in the language have been aided by GADTs in a way that I think a lot of people found pretty surprising. And I tie some of this uncanny generality of the features to the fact that the features hold together in a way that goes just beyond the initial use cases that were imagined by the people who added the feature. Yes, I think that one also kind of comes back to the thing you said right at the beginning about your concern that compiler writers tend to write features that help you write compilers. And GADT does seem like one of those. I would tell you that, you know, obviously all of your programming problems are secretly writing a compiler. And that's why these things generalize.
Starting point is 00:24:16 But I would say that because that brings all of programming into my domain. So I am somewhat biased. You spend a lot of time working on programming language design, but you never design a new language, or at least very rarely think about designing entirely new languages. The work you do is mostly about evolving a language forward. How do you think about the difference between designing new language features from scratch and figuring out how to extend an existing language to make it better? It is harder to evolve an existing language than to work on a new one. I tend to kind of think of writing a language from scratch as kind of doing the whole thing on easy mode. Where's the fun
Starting point is 00:24:50 there? I genuinely enjoy the challenge of trying to fit features into the existing language, which comes in many forms from frustration that keyboards only have so many symbols on them, and there just really isn't room for one more symbol in the language. I've yet to try and use a Unicode emoji for anything, but it'll get there eventually. We talked earlier about the interactions of features to just thinking hard about making sure that your feature is composable and sensible and simple and composes with the stuff that's already there. It also, you tend to hit technical debt in the language. The places where people took a shortcut in a language design perspective and said, oh, they will just make it do this. It'll be fine. Often turn out to be the places that are a bit
Starting point is 00:25:27 of a nightmare for you later when you want to add something else. I enjoy the challenge of that more. I also think that it's really the only feasible way to actually write stuff that's going to get used by real users to solve real problems. If I just make a toy language, the odds of anyone using it other than me are almost zero. The mechanics of how languages become popular are confusing and unknown, and you are very unlikely to produce the next big language. So to a certain extent, you don't really have a choice if you want to actually solve real problems and actually help real people with their programming problems. Part of the reason I think you have to spend almost all of your effort on language evolution is decent languages take forever.
Starting point is 00:26:07 Jane Street started using OCaml 18 years ago. And I remember at the time thinking it was already like a pretty good language, like had a good implementation. In fact, surprisingly good given that it was an academic language, since academics don't usually produce reasonable software. And we've been working on it for a long time. And there's still a lot of deficiencies and lots of things to fix. Writing a really great programming language is literally a decades-long affair.
Starting point is 00:26:32 And so if you want to do great things, it's much, much easier if you can glom onto some existing language and figure out how to evolve it than just come up with some grand new idea. New languages, when they do break through, they tend to always be built around some new idea. Rust, for instance, breaking through is kind of the idea of having some kind of linearity style typing or ownership typing in a language kind of breaking through. I doubt that Rust has the best ownership story that there will ever be. They're the first one.
Starting point is 00:27:01 It's the kind of the big breakthrough. Basically, I just like taking other people's language stuff and moving it onto OCaml and then doing it after they've already done the hard work of working out where all the problems are, which you normally do through trial and error by implementing the wrong thing and then finding out that it was the wrong thing later. In fact, OCaml has kind of a habit of this, which is language features tend to arrive, I guess, depending on what perspective you have, kind of late or early. You can look at various features in languages like
Starting point is 00:27:29 OCaml and think of OCaml as really cutting edge. Lots of languages now are starting in the last five, 10 years to get type inference. OCaml had type inference from the mid-90s. Of course, that's a feature that had already existed in ML, in the ML language that preceded it, for 20 years, right? So even that feature that now seems kind of cutting edge was already kind of old hat when OCaml picked it up. And OCaml over time has lots of examples of picking up features after they have been bounced through lots of other language communities. But it takes such an incredibly long period of time for language features to get popular that you can sort of still be somewhere in the middle and seem both cutting edge and incredibly stodgy at the same time. I guess my favorite example of here is garbage collection because it is the single most obviously useful
Starting point is 00:28:14 advance in programming language. People can argue about how valuable types are, and I'm a really big proponent of types and all of that, but garbage collection, garbage collection is clearly a useful language feature. And it was invented in the mid f50s and got popular in like 95 with Java. You could sort of argue like that's the point where it really hit the mainstream. Yeah. Algebraic data types, another one. That's like Hope in 1980-something. Which, I mean, it's just from a programming language design perspective, like the lack of some types, the lack of types that represent something being either this or this, it's just kind of insane. It's just so obviously dual to records and structs, which have been in everything since forever. I'm always amazed when a new language comes out without them. What's the
Starting point is 00:28:55 most recent big language that came out without? Pretty go. How is there not a variant type in there? It's insane. To say that again, because I think it's really easy to miss, there's this thing that every language on earth does, which is give you a good way of essentially expressing conjunction. I have this and that and the other. And whether you call it a struct or a record or a class, there's always a way of doing that and, that conjunction. And then there's, turns out there's the obvious complement to that, which is disjunction. I would like to have a type that represents this or that or the other. And there are always ways of encoding it. There's union types in C, and you can have things that inherit from the same class in
Starting point is 00:29:31 object-oriented languages and all of that. But it's always clunky and second class and doesn't give you the kind of type checking and static guarantees that you really want. And it is, I agree, the most screamingly obviously useful feature. And I guess among mainstream languages, Swift now has it. And Rust. Swift and Rust. And that's it. I guess you could argue Scala is, again, another one of these things that's maybe falling into the mainstream-ish.
Starting point is 00:29:58 So Scala does, although even there, they have to represent it in a way that kind of fits into the JVM's view of the world. So they kind of look like classes still, which is a bit weird. So another constraint around evolving languages which you didn't talk about is backwards compatibility and the pressure from all the existing source code that's out there, right? Even if you want to make relatively subtle changes in the semantics of a language, it can be very hard to get those changes rolled out because there's millions or billions of lines of code that depend on the current behavior and figuring out where it is and how to evolve that forward can be extremely challenging. The most extreme,
Starting point is 00:30:37 terrible case of this I know of is the Python 2 to Python 3 transition, where people tried to make a whole bunch of changes to the language and it took, I mean, I think it's starting to get finished up now, 15 years or something after it started. I don't know the exact timing, but it's been an incredibly long process. I'm curious how you think about the role that popularity plays in the ability to evolve your language. Oh, that's interesting. Yeah. So, I mean, popularity, I think, does make it harder to evolve your language. I mean, it kind of trivially does when the least popular language is used only by me, and then it's easy. I can just change all my codes and that's fine. There's no compatibility issue. Yeah. So the more popular your language is,
Starting point is 00:31:19 the more difficult it is going to be to change it, at least to change it in a backwards and compatible way. It's worth saying that, like, you know, you can make, especially if you're careful, you can make a lot of changes in a way that just doesn't break anything. And I think where possible, you should just, you should do that. But sometimes it's unavoidable. There's just, you know, something that you implemented earlier is just wrong. Or, you know, there's just some tiny corner that's preventing a new feature from working how it should.
Starting point is 00:31:42 And so you just kind of really need to fix an old feature before you can add this new one. That definitely does get more difficult if your language is more popular. A lot of people are kind of fairly obsessed with the popularity of their language. I'm not indifferent, but not a thing I spend my time thinking about very much.
Starting point is 00:32:00 There's like a certain amount of popularity that is required for you to have an active community that is producing useful libraries and doing good things. And you want to make sure that you have that. Beyond that, I'm less clear on the, you know, the cost versus the benefits. It's a mixed bag. So your goal isn't to make OCaml the most popular language on earth, sounds like. What are your goals for OCaml? I think one thing that's worth saying is that like, to a certain extent, I'm still fairly selfish in my approach to the language design. Like I still kind of think that the point of working on OCaml is to make it easier for me to write programs.
Starting point is 00:32:34 And I try and generalize that to encompass everyone or, you know, at the very least, the other people at Jane Street. But I think that in terms of where ideas come from, I think they kind of come from there. I said earlier that a camel was very general purpose, but it's certainly not great in every niche, in every use case. I'd really like to make it competitive with Rust and C++ and C for writing low-level hand-tuned code. That seems plausible to me. I can see a path by which you add certain features and make it so that you can do that in a way that is convenient. I mean, you can get that in a way that is convenient. I mean, you can get pretty far right now. You can avoid allocating and so on and keep your latencies, so you avoid garbage collection latencies. And we have some upcoming features,
Starting point is 00:33:13 things like unbox types that we're currently working on here, similar to the unbox types in Haskell, that'll give you better control of your layout. So Aki, what do you think of as the limitations of OCaml as a competitor for languages like C++ and Rust? And to be clear, at Jane Street, we in fact spend a lot of time writing very high performance programs in OCaml. So as you said, it is a thing that you can do, but there are certainly ways in which it's not optimal. So I'm kind of curious if you could try and lay out what you think the core limitations are. Sure. So I guess control of memory layout is just crucial to performance on a modern processor, and you don't have as much control of your memory layout in a camel as you do in those languages.
Starting point is 00:33:54 There is some cost to using garbage collection, so sometimes you would like to avoid it, which you can do now in a camel. But when you do it in a camel but when you do it in a camel you lose the ability to create things like records and structs of data and the variant types we were just talking about you lose the ability to use those really you're you know so you you go from a language which is generally higher level and easier to program in the c to actually losing some of the things that c has because a camel relies on the garbage collector to provide those features. The thing that I think is maybe worth understanding about OCaml as a language is it has a bunch of lovely abstractions and then an almost brutally simple way of representing those on the heap. You get very little choice in how memory layout works, and that's in some ways nice.
Starting point is 00:34:42 It leads to an overall simpler implementation. There's lots of benefits from simplicity, but it means that it's sometimes grotesquely more expensive than it should be. Like a fine example is just every single element in a record takes a full machine word. So if you would like to represent a single character as a field in a record, that will cost you 64 bits. And if you would like to represent a bool, which is a single bit, logically, that's going to cost you 64 bits. It kind of doesn't matter what you do. You always take a word. Yeah, I mean, it's worse than that.
Starting point is 00:35:12 Like my favorite would be if you use like a camel's, you know, normal support for 32-bit integers. It's like 10 words per 32-bit integer because you've kind of, you've got to box the thing and then the box thing is a few words. So that's clearly room for improvement. That is not of, you've got to box the thing and then the box thing is a few words. So that's clearly room for improvement. That's, that is not the, you know, the smallest and simplest way to represent 32 bits of data. And so that specific case is something we're looking at quite a lot with the, with the unbox types, which kind of is our approach to fixing that aspect.
Starting point is 00:35:37 So another thing that these low-level languages give you is the ability to allocate things on the stack and refer to those, have pointers into your stack, basically. That gives you a much cheaper than garbage collection way of allocating data without having to copy all, you know, you can make a struct on your stack and then pass the pointer forwards down to functions that you call, which gives you a cheap way of grouping some data together and passing around a reference to that data. So that's another feature that you just can't really do in a camel. The key issue there is that when you're relying on the stack frame for storing the data, the idea is that when you're relying on the stack frame for storing the data,
Starting point is 00:36:05 the idea is that when the function called the allocated returns, the thing goes away, and you have to make sure at that point that there are no references that have leaked out. Yes. There. So you have to, in some sense, control where the data moves around at the type level to make that idiom safe. Yes. Yeah.
Starting point is 00:36:21 If you want to keep the kind of safety that a camel users are used to, then you have to do that. I mean, you can also just be C and let people screw it up, but that's, you know, not great. That's kind of the thing that leads to tons of security bugs. So it's not ideal. Yeah. Those are probably the main things that I think of for the low level languages provide that we don't really have. Another one I suppose is having facilities for conveniently reasoning about resources that you hold. This is kind of by necessity. Those languages lack a garbage collector. So when you allocate memory, this is a resource that you have to pay attention to because you've got to free it yourself. And so they provide a lot of useful mechanisms to help you with that process.
Starting point is 00:36:59 But the mechanisms that are good for helping you manage a piece of memory that you have allocated are also great for managing a file you've opened or a database connection that you've made. Things that need closing when you're done with them. So those languages have some great features for that. And I think OCaml could definitely benefit from brazenly stealing them. Those are all examples of making OCaml a better low-level systems programming kind of language. What other ways do you see that the language could be improved? The type system is expressive, but it is not the most expressive. There are more interesting
Starting point is 00:37:35 type systems that let you express more properties of your program. In particular, I'm thinking of the dependently typed languages. This is theorem provers like Coq and I guess Lean and languages like Agda and Idris. Dependently typed languages are maybe a bit difficult to explain in a short period of time. A classic example would be the ability to have a type for lists of length n, where n is not some fixed constant. n is like, you know, a variable in your program. Like, you know, someone passes you a number n and then a list that's as long as that. And the type system knows that it's exactly that long. And, you know, so if n is zero, it knows that the thing is empty. And if it's one, it knows that you don't have to worry about, you know, you can just take the first element, you don't have
Starting point is 00:38:15 to think about whether it's empty or not. That's the quintessential example of a dependent type. But that's very expressive. And it's called dependent because the type depends on the value, essentially? Yeah, yeah. So the type of those lists was depending on the value of the integer or the natural number that you would, you know, n, basically. So anyway, I'd like to push a camel in some ways in that direction. So Haskell is doing this in a big way. There's been long-running projects to kind of make Haskell more expressive like this.
Starting point is 00:38:43 And I would like to do similar things in OCaml, especially kind of around the module system, which is already vaguely related to dependently typed languages in some ways. If you could make a pitch to a grizzled old systems programmer like myself as to why I should care about this fancy type theoretic dependent types nonsense, like why will this make my life better as an OCaml programmer to have easier access to dependent types in. Like, why will this make my life better as an OCaml programmer to have easier access to dependent types in the language? It's quite easy to make that argument. What GADTs are giving you is what this is giving you, but less broken. GADTs are to some extent a broken version of dependent types that are not actually quite the thing you want. And so since, as you
Starting point is 00:39:20 said earlier, you've already been convinced of the use of GADTs. I can, you know, it's not that hard. It's hard to take you the rest of the way. And I'd probably do that. I'm not quite sure how you take someone who's never written anything using something like GADTs and show them that this is how they want to write their programs. Can you maybe come up with an example of a time some Jane Street programmer came to you with a practical problem of like, I am trying to achieve X and I can't figure out how to jam it through the type system that having proper dependent types would have made easier? Yeah, sure. So I think that people currently use in Jane Street, they will write code where they have a type and the
Starting point is 00:39:55 type is parameterized by some notion of permission, basically, whether like, whether this is a read thing or a writing thing, a file is an obvious example for this, you know, is this, is this a file handle rather, you know, is this a read only file handle or a write thing. A file is an obvious example of this. Is this a file handle rather? Is this a read-only file handle or a write-only file handle? Or is it read and write? So the way that they achieve that is they make up a type called read and a type called write. And then they have their file type be a parameterized type. It has a type parameter. And that type parameter is either going to be the read type or the write type. That's kind of how they achieve that. But this is clearly kind of nonsense. This is just like
Starting point is 00:40:28 an encoding to get the type checker to check what they actually want. Read is not a type like int is a type. There are no values of type read. It's kind of just a fiction they're creating to get the behaviour that they're after. And so dependent types would let you do that properly by actually, rather than indexing your type by these made up types, you would index your type by an actual value of a sum type, something literally just like an enum in C that's just either read or is write. You would have a read file and a write file, and there would be an actual data type that you could actually write functions on and things like that. This actually reflects a problem we've run into a lot, which is there's lots of places where you try and get some extra power out of the type system. And so you do
Starting point is 00:41:09 some clever encoding to get the type system to check an extra property that you want. But I think people are often surprised at how error prone it is to get that encoding just right. Because as you say, it's kind of disassociated with what you're really doing. OCaml programmers are in some sense spoiled. They're used to the fact that there's a large class of errors that the type system just automatically captures for them. And they're kind of used to things clicking together. But when you start doing these kind of encoding tricks
Starting point is 00:41:36 to add extra rules into the type system that aren't naturally there, they end up being kind of free-floating and it's very easy to just screw it up. And you think you've encoded some permissioning thing correctly, and you could just get it wrong, and it's very hard to notice that you got it wrong. So it sounds like having something that's kind of more organized around dependent types would ground this kind of coding, so it would be less likely to just be wrong in a silent
Starting point is 00:42:02 way. Yeah, that's right. I mean, every time people are doing that encoding, dependent types are roughly the feature that lets you actually just directly write what they're trying to do. And that's always going to be easier if you're not trying to encode something,
Starting point is 00:42:13 if you're just trying to, you know, you can just write down what you actually mean. It's just simpler. And like, that's how I think of that feature is that it's really, it's about doing a thing that people already do. As soon as you give them anything that's a bit like GADTs,
Starting point is 00:42:24 they straight away start doing this kind of encoding stuff. And it's really, that's telling you that like, this already do. As soon as you give them anything that's a bit like GADTs, they straight away start doing this kind of encoding stuff. And it's really that's telling you that like, this wasn't the feature you should have given them in the first place. You really, it's dependent times what they were after. You know, and it's a, it's a ton more work than implementing GADTs. So, you know, there's a reason one comes first. Anyway, I would, I would love to add that to OCaml in some capacity or other.
Starting point is 00:42:41 So I want to switch gears yet again. Can you talk a little bit about what it's like working with the open source community around OCaml in some capacity or other. So I want to switch gears yet again. Can you talk a little bit about what it's like working with the open source community around OCaml? I think one thing that's interesting about the position that Jane Street has is that we're kind of not at the two endpoints that organizations often are when it comes to working on a programming language. I think by far the most common situation to be in is you are victim of the language that you are using, which is to say there's a programming language, it has certain properties, and you're going to use it and figure out how to deal with those properties, good and bad, and you have very little ability to affect things. And then there are organizations that
Starting point is 00:43:18 are totally on the other end, where they are the designers of a new programming language. Think Google with respect to Go or Mozilla with respect to Rust. I think in those cases, it's an organization that created a programming language and got to set up, even if there is a larger community and they got to set that community up
Starting point is 00:43:37 and decide how it was going to run. And we had a very different experience. We started as mere supplicants. We were using the language and didn't really have the power to make changes ourselves and would occasionally ask nicely for things. And over time, we've moved to a spot where we are among the significant contributors to the language. And I'm kind of curious how you think about negotiating that situation, especially when it is, as it really is the case, that we don't have the deciding say. It's not, in a sense, our language. It's a language that we participate in the community of, but we definitely don't have
Starting point is 00:44:08 the last word. We definitely don't have the last word. I mean, you can go to GitHub and observe us attempt and fail to get various things merged upstream, to see that we don't have the authority to just go and put whatever we like in. Yeah. So I think firstly, it's worth pointing out how much we benefit from this. If we were to, like, fork and just manage our own language and do our own thing off to the side, we would lose, you know, the review from the upstream developers who've been working on OCaml for 30 years and really, you know, are the kind of people you want reviewing your code. Yeah. All the libraries and tools in OCaml which we can use that come from the excellent open source community. So there's a lot of benefit to, I think, working with an existing language and working with an existing community that you don't just kind of own.
Starting point is 00:44:55 And I think if you look at places where they have kind of spawned the language, they often go to great lengths to try and give the ownership away, essentially, to try and get some of this. F Sharp, I think, is a good example of that, where Microsoft is trying to let the community take a bigger and bigger role because, you know, the input that you get from doing so is great. At the same time, it obviously kind of slows us down and means that we have to stop and think about things. So I think effective communication with upstream is kind of the key to having this process be successful and useful for everyone involved. It's not a thing we've always got right.
Starting point is 00:45:30 Sometimes we'll develop something completely internally for far too long and not really show it upstream and then just kind of drop this finished thing. And I think that that is a lack of communication, and that's something that you should try and avoid. You can see how it happens. It's very tempting to just kind of not show people a thing until you've got it just right. You want to add things to the language and for it to make progress. And the process of upstreaming
Starting point is 00:45:53 is obviously, you know, the mechanism by which that is slowed down. But I think that that's kind of by necessity. I think language design is quite a slow process, and should be quite a slow process. The backwards compatibility things that we were talking about earlier mean that, you know, if you put the wrong thing in, you will be stuck with it kind of indefinitely, or you will have to do some huge amount of work later to undo it. So you really want to be sure that you're doing the right thing in language design. So I think that necessarily makes everything kind of move a bit slowly.
Starting point is 00:46:20 And that can be a frustrating process, but the pros outweigh the cons. You talk about ways in which, you know, we've not always done the ideal job in communicating the changes that we want to make upstream. And it sounds like OCaml itself is also evolving in this regard. The larger community is trying to get more organized and get better about the way in which it communicates and discusses new features. Can you talk a little bit more about how that process has evolved? I mean, so when I first submitted a feature to OCaml, they used, what was it? Mantis for bug tracking, some old web program.
Starting point is 00:46:50 You would just kind of copy-paste a big patch onto that bug tracker and be like, here you go, maybe apply this to the... I think it was SVN. It switched from CVS to SVN for its version control management before I started. So that process was obviously not ideal and did not encourage external contribution very much. I think over time it switched to putting the source on GitHub, using the kind of GitHub pull requests and issues for managing the process of contributing code. And I think that's aligned with, like, more people contributing to the compiler. Recently we've created an RFCs repo. So like a request for change,
Starting point is 00:47:23 I guess is what it should stand for in this case. So basically somewhere where people can go and say, here is a design document explaining something I think I would like to add to the language or add to the standard library or change about the implementation of the compiler. And so the point of that is to really let you collaborate earlier in the process. GitHub is very good for collaborating on the code, but you start by writing all the code and then you start collaborating. So the idea here is to move that, get to use the same workflow, but earlier on. So you start by writing the design and you collaborate on the design and then you go away and write some code.
Starting point is 00:47:55 Rust is the great example of how they manage this process of people submitting design ideas. Their process is more open than ours. They really are encouraging just anyone to come along and write ideas. I don't think we have the kind of manpower to deal with the kind of flow of design things that they get. It's a little bit more of a tool for the existing compiler developers, but still it's, I think, a step in the right direction towards increasing collaboration and increasing communication. So you were talking a little bit before about forking and why it's not really a great idea to just straight up fork the language. And indeed, that's something I get asked about a lot. People are like, oh,
Starting point is 00:48:29 Jane Street has a few thousand people and lots of resources. Why doesn't Jane Street just fork? And I think you did a good job of explaining the downsides of that. But I wonder how you think about forking a little, which is to say some of the constraints you were describing around language evolution have to do with the problems of backwards compatibility. Another constraint that you talked about, essentially, is the need for feedback. You're saying, in some sense, what OCaml has been doing, the MO has been, wait for someone else to try something out in some other language, work out the theory, write a few papers, try a few awkward implementations in other languages. And then when you see the dust settle, well, okay, pick something up and try and get the best version of that into OCaml.
Starting point is 00:49:08 But it's super tempting to want to do some of that experimentation. And a place like Jane Street feels like, in some sense, a safer place to do it, because one of the things that we can do is fix things. We have these kinds of workflows around these large monorepos where we have an enormous amount of control, and we can add a feature in, expose it only internally, and only turn it on in a few places. And then when we discover, inevitably, that we did a terrible job of it, we can rip it out and just update all the code that used it, or modify the feature in significant ways, and worry much less about these kinds of constraints
Starting point is 00:49:45 of backwards compatibility and the unbounded usage of the feature, and by dint of that get a lot of extra feedback which could hopefully help us iterate and innovate faster. What do you think the trade-offs are here? Should we do more of that? I think it's basically, it would be good to be able to do this, and the thing that places limits on it is the management of that kind of fork: you know, rebasing large changes over a moving code base is a painful and expensive process. And that's the kind of cost you're going to pay to have this forked version of the compiler. To be honest, I think we are leaning more and more in this direction. I think we kind of, you know, we've always had our own
Starting point is 00:50:22 branch because, you know, you want to backport a bug fix or something from, you know, before you've moved to the next version. And I think that the patches that we are having on our own branch are getting bigger. So I think we are leaning in this direction. But yeah, I think the limit on it comes from just the technical aspects of how you manage that fork. It's striking how, in some sense, crude our method of extending some other person's software is. We literally represent the changes we want to make as a set of textual patches, right? Where you just say, oh, let's like remove this line and put some other line instead.
Starting point is 00:50:52 The surprise there is not how hard that is to maintain, but that it kind of works at all. The thing you can do to try and improve it is obviously to kind of improve the abstractions in the compiler itself, you know, to kind of put some APIs in place, put some decent interfaces on things that don't have to move all the time. That makes it easier for things to evolve independently without breaking each other.
Starting point is 00:51:13 There are whole language communities that work in some sense this way and have easier-to-use ways of extending the compiler. I think of languages like Racket, which is a descendant of Scheme, itself a descendant of Lisp, which is a language that goes back to the mid-50s. But if you look at a language ecosystem like Racket, everything is designed around making the compiler extensible so that people can write their own extensions without having to fork. Can you imagine a future where we would have more capabilities of that kind inside a language like OCaml? To me, that kind of Scheme
Starting point is 00:51:48 style extensibility, working out how you do that in a typed language in a way where you maintain type safety, that's the thing I think of as the holy grail. That's what a lot of work in programming language design is, I think, kind of working towards. I'm not sure that people doing that work are explicitly thinking in those terms, but certainly that's how I see it. Yeah. So I would basically just say that, like, yeah, that'd be great, but that is an open research problem. And can you give a little more texture as to why it's hard? Why does a type system make this kind of extensibility more difficult? So if you want to add a new type system feature to a language without breaking its safety,
Starting point is 00:52:22 the code of that feature, like the implementation that the user is writing, is going to need to have certain properties, and you are going to have to show statically that the code that they've written has those properties. So essentially you're going to end up needing a type system that is very expressive in order to express the type of language extension that doesn't break everything, which is kind of roughly what you want to do. If you've ever tried to implement, or prove correct, prove sound at least, a type system in a theorem prover, which I have done, it is a painful and slow and very time-consuming experience. I've been working on a type system implementation in the Coq theorem prover for, like, I don't know, four or five years. I proved it sound after about two, but it was still two years of work to get to that bit.
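For context, "sound" here has a standard technical meaning: such a proof typically establishes the classic progress and preservation lemmas, roughly:

\text{Progress: } \vdash e : \tau \;\Longrightarrow\; e \text{ is a value} \;\lor\; \exists e'.\ e \longrightarrow e'
\text{Preservation: } \vdash e : \tau \;\land\; e \longrightarrow e' \;\Longrightarrow\; \vdash e' : \tau

Together these say a well-typed program can never get stuck in a state the type system didn't account for.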
Starting point is 00:53:07 Is maybe a way of saying this that, in order to make the language modular and extensible at the level of the definition of the language, you essentially have to make the proof of correctness of its type system also modular? Yeah, I think that's a reasonable way of thinking about it, yeah. And also, what I just described, with the writing of the proof, it's kind of like you write the thing, and then afterwards you kind of prove it to a certain extent. And actually, you'd probably want to do this in a more, like, correct-by-construction way,
Starting point is 00:53:31 where you'd only let people express things that actually worked. One day, I'm, you know, hopeful that we will have that kind of ability. But yeah, it's difficult. So we talked about this a little earlier, but one of the things I often worry about when I think about how OCaml is evolving forward is the kind of Stockholm syndrome you get of just kind of getting used to the feature set you have, thinking that these features are great and all I need, and learning how to encode all the things you want to do in this, no matter how awkward the encoding.
Starting point is 00:53:59 I think one way of escaping from the Stockholm syndrome is just to think about other languages and what they provide and, you know, think about what useful features you can steal. And I'm curious if you have any thoughts about features you really like in other languages that you wish OCaml would have. First of all, we've already mentioned a couple of things. We mentioned type classes and dependent types, ad hoc polymorphism. Well, you just mentioned Scheme. So macros. Yeah. So I think of macros in Scheme and Lisp as kind of doing two things that are related. So one is extending the language to add new language features, which we were just talking about. It also is for meta programming, where you want to just have some code that generates some code, which is, like, a slightly simpler problem. So there
Starting point is 00:54:35 are ways of doing that. You can write macros that, you know, generate correctly typed code. And that's sufficient for this problem, which is much easier. So, um, I have some work in that area. What are the kinds of problems that you think are best solved by writing a program that writes a program rather than just like the normal one-step thing of writing a program that does the thing that you want? I think about this a lot in terms of making the abstraction free. Let's say I have some problem and I want to write some code that does something. And I have a piece of data that represents the solution to that problem. And one way to implement my solution in code is to have a function that reads that data and executes the solution that it represents. And so at runtime, that's going to be literally
Starting point is 00:55:14 reading in this description and then kind of acting on it. And that's a nice way to write code, but it has a cost. You're kind of reading and following the instructions live at runtime. It's like, you know, it takes time. It's actually, you're spending computation on that. If you replace that with something that's staged, something that's kind of based on macros, you do that reading at compile time and just generate the kind of expanded version of the code. And so, you know, the performance improves. So it's kind of a way of being able to write nice high level code that has the performance of low level code. Can you make that example a little more concrete? You're saying you have some data that represents a way of solving a problem. What's an example of how that would come up in real life?
Starting point is 00:55:52 I worked with an intern who implemented some macro stuff for OCaml. There was a good example that he'd written, based on a library that Oleg Kiselyov had written for writing code operating over lists of data or streams of data. And essentially you would describe a series of operations that you wanted to do on the list. So you kind of go, oh, I want to map over the list. I'll map over this, adding one to it, then I'll filter out the elements that are even, and then I'll map over the list again and divide by two everything in the list. And then I'll run a function that kind of takes the maximum number out of all of them, something like that. This is clearly a nonsense program, but you get the idea. And so that's the kind of high-level description you would write. Right, the very specific thing is
Starting point is 00:56:33 a nonsense program, but what you're describing is a very real way of constructing programs, as these kinds of chained sequences of transformations. That shows up all the time; that part does not feel made up to me at all. Yeah, yeah. But anyway, so what the library would do is it would read that description and, at compile time, it would produce code that was just always a single while loop operating on the list. You know, rather than iterating over the list,
Starting point is 00:56:57 making a new intermediate list and then making another intermediate list and another intermediate list, which, you know, has an obvious cost. It would all just become a single while loop that just read the first list and did the necessary things to work out what the final answer was. And that was a static guarantee: the tools you were given in the input language are such that, no matter how they're composed, you could always produce just the single loop.
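As a rough illustration of the shape of that transformation (a hand-written sketch, not the intern's library or Kiselyov's actual code), here is the pipeline written naively, where each stage allocates an intermediate list, next to the single-pass code a fusing, staged library would effectively generate:

(* Naive pipeline: three intermediate lists are built and thrown away. *)
let naive xs =
  xs
  |> List.map (fun x -> x + 1)
  |> List.filter (fun x -> x mod 2 <> 0)   (* drop the even elements *)
  |> List.map (fun x -> x / 2)
  |> List.fold_left max min_int

(* What fusion produces: one pass, no intermediate lists at all. *)
let fused xs =
  List.fold_left
    (fun acc x ->
      let x = x + 1 in
      if x mod 2 <> 0 then max acc (x / 2) else acc)
    min_int xs

The point is that the fused version is derived mechanically at compile time; the programmer still gets to write the high-level chained version.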
Starting point is 00:57:18 And part of the point of this example is that the method of optimization, the fact that you can collapse all of these things together, is something that is under the explicit control of the person who's providing the library for doing these transformations. And I think this highlights a thing that comes up in a lot of the different design work you've been involved in: this choice between two different ways of thinking about optimization.
Starting point is 00:57:39 One kind of optimization is you have a nice, general, idiomatic way of writing programs, and then you try, at the level of the compiler, to find ways of transforming that into machine code that operates reasonably efficiently. And then there's this quite different approach of trying to give very precise control to the programmer, where you really take the performance behavior as a first-class part of the semantics that you're presenting to the programmer. And it sounds like your point here about macros is that it's an example of the latter. It's a way of giving control over performance-relevant details to the person who's writing the program.
Starting point is 00:58:12 Yeah, I think that's a good description of it, yeah. So I guess the feature in other languages would either be kind of delimited continuations in the Lisps, or the async-await style thing that's in almost everything now. Async await is specifically tools for writing concurrency in what I would call direct style, so conveniently expressing concurrency. Delimited continuations are a more expressive version of that, where you can write more general, what I would call, control effects. I'll just give some examples. I think that'll give people the idea. Async await is a good example; generators, coroutines.
Starting point is 00:58:49 The people may have seen a lot of ways of conveniently writing complex programs that generate things that essentially morally look like lists, but behind them have some computation that's generating the things on the list. Yep. And so the feature that we want to have to account for that is algebraic effects. And that's kind of an attempt to steal basically those kinds of ideas, especially the limited continuation style of things. I think algebraic effects is quite a good example, actually, of a couple of the things that we were talking about. So on the one hand, it's a good example
Starting point is 00:59:17 of how a mathematically simple approach can be beneficial, because they're very mathematically grounded. There's a very elegant mathematical explanation of what algebraic effects are. And that simple mathematical construct is able to express a load of ad hoc constructs that have been added to other languages. It lets you do async await and generators, yeah, and coroutines. And depending on how you implement them,
Starting point is 00:59:38 you can have backtracking search and things like that, all of which are just provided by a single language feature. So that's a good example of that. Another thing it's an example of is the idea that you can go and take an idea from somewhere like Scheme and then work out how to type check it
Starting point is 00:59:51 It's funny. I think the people who work on languages like Scheme and Racket and all of that actually think in a very type-oriented way. In fact, a bunch of those people wrote some key type theory papers. But there's this kind of superpower they have. Operating in an untyped context gives you a lot of freedom to do exciting experimentation
Starting point is 01:00:10 around language features, which is really quite hard work to do in a typeful context. So yeah, the idea that you see what are the interesting things they do, wait a decade or two to see which one turns out to be really valuable, and then spend the effort to figure out how to lift that into a language where you have a type system as well. Yeah. So this is, I think this is just generally always a good idea. That macros design we talked about was, yeah, very much about looking at Racket's macro system and working out how to make it work with OCaml's module system and with typing. And yeah, algebraic effects, there's a technical term for it: they're essentially macro expressible with delimited continuations. That means that you could do a totally local transformation. So you could take a program written with delimited
Starting point is 01:00:48 continuations and turn it into one written with algebraic effects and vice versa, entirely locally. You don't have to look at the whole program to see how to do this. You can just do it locally, which means it's a good way of saying that they are equally expressive. But that translation is not type-preserving. So essentially, from a dynamic semantics perspective, in terms of what happens at runtime, they're the same language feature basically, but one is easy to type and the other is not. And so algebraic effects are kind of that; that's one way of thinking about them.
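To give a flavor, here is a small sketch of a Python-style generator written with the effect handlers that eventually shipped in OCaml 5, after this episode was recorded; the Yield effect and the handler are illustrative, not a fixed generator API:

open Effect
open Effect.Deep

(* A Yield effect: performing it hands an int to whatever handler is installed. *)
type _ Effect.t += Yield : int -> unit Effect.t

let yield x = perform (Yield x)

(* A generator body written in ordinary direct style. *)
let squares () =
  for i = 1 to 5 do
    yield (i * i)
  done

(* The handler decides what yielding means; here we just collect the values. *)
let to_list f =
  let acc = ref [] in
  match_with f ()
    { retc = (fun () -> List.rev !acc);
      exnc = raise;
      effc = (fun (type a) (eff : a Effect.t) ->
        match eff with
        | Yield x ->
            Some (fun (k : (a, _) continuation) ->
              acc := x :: !acc;
              continue k ())   (* resume the generator where it yielded *)
        | _ -> None) }

let () = to_list squares |> List.iter (Printf.printf "%d ")  (* 1 4 9 16 25 *)

The "dynamically bound" flavor discussed later in the episode is visible here: yield doesn't statically know its handler; it transfers control to whichever handler dynamically encloses it.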
Starting point is 01:01:28 One of the things I've always really liked about OCaml is that OCaml is a language which is pretty easy to read, which is to say I can look at an OCaml program and without too much effort understand what it's doing and why. And part of that easiness of reading is that it's relatively explicit. This is a funny thing to say for a language that has type inference. But it's still kind of the case that for the most part you can pretty straightforwardly trace through OCaml programs and, without a lot of guessing, figure out what it's doing. And some of the features that you've been working on over the years, and in particular this kind of ad hoc polymorphism style feature, modular implicits, in some ways compromise this, which is to say it adds a lot more interesting inference on the part of the compiler to decide what needs to be done. In particular, you write down your program and then some type inference happens where it figures out what the
Starting point is 01:02:09 types of different pieces are, and then it decides which concrete code to dispatch based on the types. So I'm wondering how you think about trade-offs around explicitness and implicitness, and what we gain and what we lose as we add more of these kinds of powerful search-based techniques for deciding what code to dispatch into the language. Sure. So with modular implicits, I think there is a certain amount of implicitness necessary for any kind of ad hoc polymorphism. Kind of, in a way, the point is being able to write 1 plus 2 and 1.5 plus 2.5 and having those both do the thing people think they will do. And, like, you're clearly missing some information there.
Starting point is 01:02:49 So some amount of adding some implicitness is kind of the point. And then I think from there, we tried to basically design the system with the least amount of implicitness possible, like keeping things as explicit as you could whilst getting this little bit of... I don't like "implicitness". Is there a better adjective? Implicitly implicit? Yeah, I've got nothing. Yeah. Anyway, trying to have the least amount of that that we could get away with. So, I mean, you can tell from the name, the feature was kind of based on some of the work in
Starting point is 01:03:21 Scala on implicits. And Scala implicits have a reputation for being extremely implicit and hard to reason about. And I often have to reassure people that whilst we took inspiration from there, we did not copy all of it, and that a lot of the aspects of Scala's design that make it extremely implicit are not really present in our version. Can you say a little bit more about what the aspects are of how implicitness works in Scala, and how that can be confusing, in ways that you've tried to defend against in the context of modular implicits? Right. So the basic idea of both language features is that you look for something based on its type. In the OCaml
Starting point is 01:03:59 modular implicits case, it's a module you're looking for; in Scala, it's just kind of any value. But either way, you're looking for something based on its type. So I'm just going to say, assume that you're looking for an int, which is not likely to be the kind of thing you want to look for by this mechanism, but it's easy to think about. You know, you've got a list of possible things, and you're trying to find the one that's an int, and you're going to use that one. And in Scala, the places where they search for that int are complicated, because you have the type of the thing that the method is being called on. So you go to its class and look inside of it there and see if there's an int in there
Starting point is 01:04:30 that looks good. And then you go to, you know, the classes of the arguments, the other arguments, and see if they've got anything. Then you go to the friend classes of those classes and see if maybe there's something good there. And it's just, it's difficult to understand where it's searching. It's just a bit more complicated. I'm not sure there's much they could do about this.
Starting point is 01:04:45 I just think, but in modular implicits, instead, it's got a very simple notion of scope. You're just looking, kind of, basically up the file to see if it's there. You know, maybe you can include other files, but again, it's in a very simple way. Some of these things are not things that one should exactly blame the Scala designers for, in the sense that they're to some degree solving a harder problem, which is that Scala is a language that has to cope with something like a Java-style object-oriented system and also something like an OCaml-style, Hindley-Milner kind of type system.
Starting point is 01:05:15 And doing those together increases, in a to some degree unavoidable way, the complexity of the language. Probably there are things that one could have done differently, but there's a sense in which the problem in OCaml is fundamentally simpler. Yeah, exactly. There's not things one could have done differently, but there's a sense in which the problem in OCaml is fundamentally simpler. Yeah, exactly. There's not much they could have done on that part. There are some other things they've done where they could have gone another way. If you're looking for something and you find two possible candidates, there are some rules by which they decide which is the best candidate. Whereas in modular implicits, if you find two, you just error and say,
Starting point is 01:05:43 no, it wasn't obvious which one you meant; both of these would have worked. And that, I think, makes things quite a lot safer. If you think about reasoning about the code, that's a lot easier if you don't have to worry about that problem.
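For readers who haven't seen the feature, here is a rough sketch in the proposed syntax from the modular implicits paper by White, Bour, and Yallop; this is a design proposal, not something mainline OCaml compiles today, and the module names are just illustrative:

(* A signature describing the capability we want resolved by type. *)
module type Show = sig
  type t
  val show : t -> string
end

(* Implicit modules are the candidates the search may pick from. *)
implicit module Show_int = struct
  type t = int
  let show = string_of_int
end

implicit module Show_float = struct
  type t = float
  let show = string_of_float
end

(* {S : Show} is an implicit module argument, filled in at each call
   site by searching the implicit modules in scope for one whose t
   matches the argument's type. *)
let show {S : Show} (x : S.t) = S.show x

let () =
  print_endline (show 5);    (* resolves to Show_int *)
  print_endline (show 1.5)   (* resolves to Show_float *)

If two implicit modules in scope both matched, the proposal rejects the program as ambiguous rather than ranking the candidates, which is exactly the safety property being described above.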
Starting point is 01:06:06 I still suspect it's the case that, as with lots of language features, how well it works in practice also depends on the approach and discipline with which people start writing libraries to use it. I think once you have this notion of implicit selection, you could probably design libraries that use it all over the place, or you could design libraries that use it in a small number of places where it's high leverage. And I suspect those kinds of choices probably affect in practice how easy the language feature is to use.
Starting point is 01:06:32 less and trust them less with it. And often that's the right decision. But there's always a trade off there. You know, algebraic effects are also I think, another thing that adds a certain amount of implicitness. Originally, we thought about adding them untyped without any types tracking the effect. Depends how you look at it. But one way to think about that is that we looked at GoTo and thought the problem with it was that the labels weren't dynamically bound. The problem with GoTo is that you statically can tell where you're jumping to, whereas clearly what you need is dynamically bound GoTo. Dynamically bound GoTo considered awesome.
Starting point is 01:06:59 Yeah, exactly. Which sounds, yes, sorry. There's a very famous paper, "Go To Statement Considered Harmful", about how GoTo is a terrible idea and far too implicit in how it does things. And also, dynamic binding is historically thought of as mostly a mistake in a lot of language design things for, again, being really horrendously implicit. And so the idea that combining the two would be a great idea is amusing. But actually, in practice, it is a great idea. They only look like that from a particular angle, and if you look at them a different way
Starting point is 01:07:27 you can see that it's totally fine and that none of the issues that you might have with something like that should arise. Okay, well, thanks very much. This has been a lot of fun. Thanks. You can find links with more information about the topics we discussed
Starting point is 01:07:44 including some of Leo's talks and papers about things like algebraic effects and modular implicits, along with a full transcript of the episode at signalsandthreads.com. Thanks for joining us, and see you next week!
