CoRecursive: Coding Stories - Abstraction and Learning with Runar Bjarnason
Episode Date: March 15, 2019
What is abstraction? Can we have a precise definition of abstraction that, once understood, makes writing software simpler? Runar has thought a lot about abstraction and how we can choose the proper level of abstraction for the software we write. In this interview, he explains these concepts using examples from the real world, from SQL, from effectful computing, and many other areas. We also talk about how to learn and acquire the skills necessary to understand abstract concepts like very polymorphic code and category theory. Runar also explains his latest project, Unison Computing, and how it uses the correct level of abstraction to rethink several foundational ideas in software development.
Links:
Constraints Liberate
Maximally Powerful, Minimally Useful
Unison Computing
Webpage for show
Transcript
Hello, this is Adam Gordon Bell. Join me as I learn about building software.
This is CoRecursive.
Yeah, it's a sort of mind-body integration thing. I think that it's just like learning to dance or play the guitar or whatever.
You can't really know how to play the guitar abstractly. No one can really just tell you how to play the
guitar. You got to do it. That was Runar Bjarnason. He co-wrote Functional Programming in Scala,
a book that was very influential for me and for a number of people I know. But we aren't talking
about that today. Actually, go back to episode five for that interview. Today we talk about abstraction and precision and also the
relationship between constraints and freedom. My take on Runar's ideas is that when writing
software, if you constrain yourself to the least powerful abstraction that will work,
then you will have more freedom to compose things at a higher level.
Runar, I interviewed you probably more than a year ago for the podcast, and we talked a lot about the basics of functional programming from your book, Functional Programming in Scala,
that I guess you co-wrote. And a little bit at the end, we started talking about abstraction.
So thanks for coming back.
And I was hoping we could dig in more on that topic.
Yeah, my pleasure.
Absolutely.
So in a broad sense, what is abstraction?
What is abstraction?
Abstraction is a sort of mental process that you go through all the time.
Like it's the essential function of the human mind.
And it's just like taking a bunch of things that we have observed or encountered or experienced in some way and then grouping them by their similarities, but in a way that they're different.
Like grouping them by like a specific way in which they differ.
Do you have an example?
Well, like for instance, size.
We can have an idea of size.
It's an abstract idea.
So we'll have a bunch of things, and we'll note that, you know, they're all different in some particular way.
And we'll call that way size.
But with regard to this idea, we're going to regard them as the same in every other respect.
So this is a way of constructing sort of a new entity in our minds that we can manipulate and sort of have.
And that we can give it a name.
And that refers to this particular difference among things.
And then we're going to then basically regard those things as being the
same in every other respect. So it's like, if I have like an apple and a car, when I talk about
their size abstractly, I'm not worrying about their caloric value, but merely that an apple
is smaller than a car. Right. Yeah. So they have different sizes, but in every other respect, with regard to this particular idea of
their size, they are incomparable or the same. And, you know, you can start with a really
imprecise idea. Like you can just say, well, the apple is small and the car is big. So there are
only two sizes, big and small. But then as we get more and more abstract ideas, we can start to be very precise about how we compare, for instance,
sizes or colors or shapes or programs. Yeah, you use the word precision there. Actually, maybe
before we get to precision. So how does this apply to the world of actual software development or
code? So the way I'm sort of looking at this in my mind is that we're viewing the things that we observe, so out there in the real world.
We're sort of looking at them and sort of lining them all up in some way.
In such a way that we sort of punch a hole through the whole stack of things that we're integrating under some abstraction.
And then we're going to sort of fill that hole with some quantity or quality. Like for instance with size
we're going to just line everything up so it has a size shaped hole in it and then we're going to
fill that with either big small or like an integer or square meters or whatever it is. I'm sort of
viewing it as you know like a stack of things that with like a hole punched through it that we can then fill with some other idea.
And then the same kind of thing happens in programming when we abstract.
That is, we take a program and we want the program to vary in some way.
Like, for instance, we might want to take the sum of the numbers 1, 2, and 3,
but that's not very abstract.
So we might want to be able to take the sum of any numbers.
And so we sort of punch a hole in our program and say,
well, we're gonna actually take those things as an argument.
We're gonna call the numbers like X, Y, and Z.
And so we have number shaped holes in our program
that we're then later going to fill with some values.
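As a tiny illustration of the "number-shaped holes" idea, here is a minimal Scala sketch (the function name is just for illustration, not from the episode):

```scala
// Concrete: the numbers are fixed inside the program.
val concreteSum: Int = 1 + 2 + 3

// Abstract: we "punch holes" in the program and take the numbers as arguments.
// x, y, and z are the number-shaped holes that a caller fills in later.
def sum3(x: Int, y: Int, z: Int): Int = x + y + z

val filled: Int = sum3(1, 2, 3) // fills the holes with 1, 2, and 3
```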
So like variables are a form of abstraction. Is that the example? Yeah. Yeah,
exactly. So a variable you'll note can't be absent. You can't ignore the fact that it needs
to have some value, but it could be any value. It stands for absolutely any value of a particular
type, a value that fits that sort of hole in the shape. This is an interesting example. So like one plus two plus three,
or let's just say one plus two is like a concrete addition,
but like X plus Y is abstract.
And I think you said before that somehow that's more precise.
How is X plus Y more precise?
Oh, is that more precise?
I think that it's certainly more abstract.
Yes, yes.
One plus two is already pretty abstract.
Because like even the idea of like one and two, these are abstract ideas.
Like we're not specifying one of what or two of what.
So these are ideas that work universally for anything that we might want to consider one or two of, right?
Yeah, like it's just counting.
We're not counting a specific thing, but just the number itself.
Right, right.
So that's an example like of where, so just at that level is an example of where like
an abstraction buys us precision.
Like out in the world, when we're like looking at the actual universe, it's sort of undifferentiated
and it's just this one big interconnected mess.
But we can abstract out, like, two of something. Like, we can notice that there's a pattern, that, oh, everyone has some configuration of eyes, uh, or legs, or, like, you know, there might be a pair of sheep in the field and two stars in the sky or something, and they all have this same relationship, which is that there's two of them, right?
So then we have this notion of two.
And then, you know, from there, we get all the natural numbers and arithmetic.
And now we can start to be very precise about adding things together and subdividing groups of things.
So we can do very precise arithmetic now with these abstract quantities. Whereas like with actual
stars and sheep and eyes, we can't really. Yeah. And if we're talking precisely about eyes,
that's sort of a symbol, like there is no universal eye, right? We're referring to
a whole bunch of different people's eyes as like a combined concept.
Right. Yeah, exactly. So even there, that's abstract.
So to make it more
specific, because I feel like I'm going to end up otherwise in a really philosophical place.
Sure. I mean, maybe we will regardless. You had this example of actually writing printer code
and how abstraction was lacking or something. I was wondering if you could share that.
Oh, yeah. So in the talk I was talking about this time, this was during the 90s sometime, when somebody hired me to write a program to make sales reports. And they had to be printed on this dot matrix printer that they had in their office. And they had their data in an SQL table. And so I just took their Paradox tables or whatever it was, and I just essentially dumped them out onto the printer using very specialized code, using like the printer control codes. I wasn't using any kind of abstraction or anything like that. So talking very concretely to this particular printer. And then the client comes back and says, like, okay, well,
you know, I want to manipulate these reports.
You know, like, can I move this thing to over here, for instance?
Or like, you know, can we take two reports and put them side by side on a page or something?
That becomes basically impossible without rewriting my whole program.
Because the program isn't able to talk about the location of things on the page or whatever.
It can only talk about like how the printer is being controlled
in this particular instance.
So I was missing this sort of abstract layer.
It should have been talking about the reports
sort of in the abstract layout and other things like that.
Like even if I had known that PostScript existed,
which I didn't.
So that would have afforded me the ability to talk precisely about where things are
on the page and things like that. So yeah, PostScript or PDFs is like an abstraction
that lets you specify like where things would go on a page without being specific to a printer.
Right. And so yeah, the point I was trying to make with that one was one of compositionality. So, you know, I didn't have a compositional program,
so I wasn't able to deconstruct the program
that I had written into, you know, smaller elements
that I could then rearrange.
You know, it was all just one monolithic block of code.
So how does abstraction relate to composition?
If you had written more abstractly to PostScript,
how is that more compositional,
or is it not? Yeah, it certainly would have been more compositional, or like any kind of
sort of algebraic abstraction, like where you can take things apart and then recombine them,
would have been better. I mean, it would have allowed me to do what I wanted to do.
And I mean, the way that relates is that you can, you know, once things are very abstract,
you can sort of manipulate them symbolically according to some rules.
So there would be some rules of composition or combination.
For instance, like if you have some kind of layout language, you might be able to say
like this thing appears above this other thing, or these two things appear side by side.
Like that might be a rule of composition.
But then the actual rendering of one element and the rendering of another element,
if you combine those renderings together, you should get the same thing
as if you were rendering the combined elements.
Does that make sense?
Yeah, it's almost like an intermediate representation
and then some sort of printer-specific interpreter or something.
Yeah. The thing that the abstraction buys you is to allow, again, this sort of some but any
idea that you can talk about. You have some printer, but it could be any printer, as long
as it has a mapping from your abstraction
down to how you actually control the printer.
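Here is a hedged Scala sketch of the kind of layout algebra being described; Doc, Beside, Above, and render are illustrative names, not PostScript or any real library:

```scala
// A tiny symbolic layout language: documents compose abstractly,
// and only the render step knows anything about the output device.
sealed trait Doc
case class Text(s: String)           extends Doc
case class Beside(l: Doc, r: Doc)    extends Doc // side by side
case class Above(top: Doc, bot: Doc) extends Doc // stacked vertically

// One possible "printer": render to plain text lines.
def render(d: Doc): List[String] = d match {
  case Text(s)      => List(s)
  case Above(t, b)  => render(t) ++ render(b)
  case Beside(l, r) =>
    val ls    = render(l)
    val rs    = render(r)
    val width = ls.map(_.length).maxOption.getOrElse(0)
    ls.map(_.padTo(width, ' '))
      .zipAll(rs, " " * width, "")
      .map { case (a, b) => a + b }
}

// The compositionality point: rendering Beside(l, r) agrees with
// rendering l and r separately and then combining the renderings.
```

The design choice is that everything above render composes symbolically; swapping the printer only means swapping the final interpretation.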
Yeah. I have this quote from you.
It says, we should work symbolically and only go to a final representation
at the latest possible time.
So this, I suppose, is an example of that.
We would map things to this small language for representing layout,
and then at the latest, like right when we,
before we send it to the printer,
we kind of interpret that into the printer codes.
Right, yeah.
Because if you kind of think about it,
there'll be lots of different ways
that you can combine things
that are sort of in a very large language.
And a favorite example of mine is like SQL and strings.
Combining two strings together, even if both of those strings
represent some SQL statement, there are going to be lots of different ways of combining strings.
And most of those ways are not going to be valid SQL. So if you want to ensure that you have valid
SQL at the end, you have to sort of recover the structure of the SQL and then sort of manipulate
it in some way and then recreate or sort of generate the combined string.
But if, let's say, you had a symbolic representation
of the SQL syntax as a library,
you could then just work using that library
and combine your SQL in a way that's legal,
like in a way that is still legal SQL,
and then generate the string as the sort of last
possible step. So you're not like trying to do some kind of string munging or hacks.
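A rough Scala sketch of the contrast (the Select type here is a toy, not a real SQL library):

```scala
// String munging: any two strings concatenate, including into invalid SQL.
val base   = "SELECT name FROM users"
val broken = base + "WHERE age > 30" // missing space: no longer valid SQL

// A symbolic representation can only be combined in ways that stay SQL-shaped,
// and the string is generated as the last possible step.
final case class Select(columns: List[String], table: String, where: Option[String] = None) {
  def filter(predicate: String): Select = copy(where = Some(predicate))
  def render: String =
    s"SELECT ${columns.mkString(", ")} FROM $table" +
      where.map(w => s" WHERE $w").getOrElse("")
}

val ok = Select(List("name"), "users").filter("age > 30").render
// "SELECT name FROM users WHERE age > 30"
```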
Yeah, the interesting thing about that is almost necessarily there will be some SQL,
like if I'm using some intermediate form that's much more compositional, like almost necessarily
there will be things like types of SQL I can't do.
So I'm limiting myself in using this intermediate form, would you say?
I'm not sure.
I mean, if your intermediate form can represent,
if that is like the syntactic specification for SQL,
then no, there won't be any legal SQL that you can't generate.
Maybe I'm misunderstanding your question.
No, no, I think that makes sense.
SQL is large, so in practice there's things I may miss.
But yeah, if I were able to specify... The perfect case is that I have this intermediate representation
that lets me compositionally form any possible SQL
that I want to do.
Where I think what you're saying is that if you have a string representation,
that is true, right? Any valid SQL will fit into that, but so will an infinite
amount of things that are not SQL. Yeah, exactly. Yeah. So it is true that you can have any
combination of valid SQL in your string, but yeah, you can have lots and lots of junk. But if you have
a precise specification, if you have an abstract representation
of SQL, then you can see where precision comes in here. You can have precisely SQL and nothing else.
Yeah. So in a way, precision separates things that are sort of distinct so that we prevent confusion, right? So it separates the things that we don't want to have confused, so it avoids additional noise, so that there are no unnecessary elements.
Yeah, so it is, to use some of your language from before, like, it is a constraint on what you can do, right? A string can hold more things, but the representation that can only represent SQL is much more precise.
Right. And this is the same with, uh, you know, like everyday concept formation, like with the idea of the natural numbers, for instance.
With regard to like arithmetic on the natural numbers, the numbers are very precise and there's no additional noise.
Like the fact that those numbers represent guitars or something like that's just additional noise that you don't need.
You don't want to consider, you don't need to consider, like, the fact that two guitars are different from some other guitars. It doesn't matter, because if you have two of something and then two of something else, you still have four. And so with regard to their cardinality, we want to regard them as being the same.
This is funny, because this is a talk about abstraction. I feel like it's very abstract, but I suppose that makes sense. The place where I really found a lot of value in this concept,
and I'm trying to think about how to communicate it well. The place that I found it most interesting
was like in terms of polymorphism. So if I have a function that takes in an integer and returns an
integer, then that is concrete as compared to if I have one that
takes in anything and returns something of the same type, right?
Right.
So that's now more abstract.
So what does that abstraction buy us?
Well, so it buys you that the, so the caller of the function can now select which type
the function should operate on. And because the
caller can pick absolutely any type, that means that you have no hope of writing a specialized
program that works for every type. So you have to write a general purpose one, and there's only
one that could possibly work. You have to return the element of the type that you were given,
because you have no idea up front when you're writing that program.
You don't know what you could possibly be getting.
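In Scala terms, that argument looks like this (ignoring nontermination, exceptions, and other loopholes):

```scala
// The caller chooses A, so the body can't inspect it or conjure one up.
// The only total, side-effect-free implementation hands back what it was given.
def identity[A](a: A): A = a

identity(42)      // 42
identity("hello") // "hello"
```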
Which is interesting because it means that the generic version is actually more precise,
even though it's more abstract.
Like the possible implementations are just one.
Right.
So it's a more abstract type, but it's a more precise specification for a program. And because the definition is so limited,
the amount of potential callers of it increase. Like there seems to be some relationship there.
Yeah. Yeah. There's definitely a relationship there. That relationship is called an adjunction.
Okay. What's an adjunction? I might regret asking that.
So adjunction is an idea from category theory.
So it's a relationship between two categories.
So it's a pair of functors, one going from, if you have two categories, C and D,
one of them goes from C to D and the other one goes from D to C.
And they stand in this particular relationship such that they are, what's a sort of a layman's way,
they are sort of approximate inverses.
Or one of them finds the most efficient solution to the other,
and the other finds the most difficult problem that the other one can solve,
if that makes sense.
They're inverses.
They're sort of like equalities, but they're in the opposite direction?
Yeah, so they're in opposite directions, but they're not exactly inverses.
They're sort of approximate inverses. That is,
if you go across the adjunction in one direction and then back again, you don't end up where you
started. You end up either sort of below where you started or above where you started in some kind of,
well, in the category, like in a hierarchy. Like you'll be able to either get to where you ended up from where you started,
or you'll be able to get to where you started from where you ended up.
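For reference, the standard definition he is paraphrasing can be stated compactly (this is textbook category theory, not wording from the episode):

```latex
% An adjunction F \dashv G between functors F : \mathcal{C} \to \mathcal{D}
% and G : \mathcal{D} \to \mathcal{C} is a bijection, natural in c and d:
\mathrm{Hom}_{\mathcal{D}}(F\,c,\ d) \;\cong\; \mathrm{Hom}_{\mathcal{C}}(c,\ G\,d)
% The "approximate inverses" are the unit and counit:
%   \eta : \mathrm{Id}_{\mathcal{C}} \Rightarrow G \circ F, \qquad
%   \varepsilon : F \circ G \Rightarrow \mathrm{Id}_{\mathcal{D}}
```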
So in our world of, in my world of writing software.
Yeah.
So I'm constructing a function.
So I think what you're saying is like the more abstract and precise,
I write the definition of my function, that allows for more reuse or more
potential callers of it. Yeah. So you can think of this as in terms, you can think of this in terms
of variance, for instance, like if you're familiar with the way variance works in Scala, like in
class hierarchies. Yeah. So like a function is contravariant in its argument. And so as you grow this type,
the things that you could possibly receive,
you make it easier to use.
Like more, you can pass more things to it.
Yeah, and at the same time, that limits what you can do
because you have to cover the most general case of all of them.
Right, yeah, and conversely, if you shrink the type,
you make it more difficult to use.
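A small Scala illustration of that variance point (Animal and Dog are just placeholder types):

```scala
// Functions are contravariant in their input: a function that accepts any
// Animal can be used wherever a function accepting only Dog is expected.
class Animal
class Dog extends Animal

val describeAnimal: Animal => String = _ => "some animal"
val describeDog: Dog => String = describeAnimal // compiles: wider input type

// Growing the input type makes the function easier to call,
// but its body can only rely on what every Animal provides.
```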
But this particular function that we're talking about, like the identity function,
like for all A, it takes an A and goes to A. You can see that it is giving a lot of freedom
to the caller. It basically makes it really easy to call. You can call it with anything,
but that in turn shrinks the things that you can return. You can only return one thing.
So it can be called by anything, but in fact, it does nothing.
Right.
So it is the most extreme example, I guess, right?
Right, exactly.
A less extreme example would be like functions in general. So if I am constraining myself to not do side effects,
that obviously limits my function, right?
Like what I can do to make something a function if there's no side effects.
However, that means like more things could call it, I would say.
Yeah, that's right.
Because if you do have side effects, those side effects may be inappropriate or impossible in lots of contexts.
You know, like for instance, if you access the network, your function can't be called anywhere where the network is unavailable or where it's inappropriate or dangerous to access the network.
Yeah, I've actually had this problem with, I think it's like the Java URL class where it actually does like a namespace lookup or something.
That's a great example.
I hit this problem, right? So you're like, what's going wrong here?
I'm like, God, it's making a network call.
Right. Why is it doing that?
The fact that it does a network call makes it not usable in a whole bunch of circumstances.
Exactly.
So the thing that you really, I guess, opened my eyes to is this back and forth relationship.
And I think that, like, as I write code, I actually want it to be reusable. I want composition. And that means that in this relationship, basically, I always want to go as abstract and precise as possible,
right? Well, I think that depends on your goals, really. Because sometimes going very abstract and
very precise might be costly in terms of labor or computation or something like that. So I think
that you got to weigh it against your other goals as well.
So it's not necessarily always correct to, and by correct, I mean, it doesn't always
align with your values to go completely abstract and absolutely precise.
Yeah.
So like, here's an example.
Should I use explicit recursion or does it make more sense to use like a fold?
To me, it seems like there's some like there's less power in a fold.
I agree with that.
And I think that therefore you should choose a fold.
I mean, whenever you can.
But often the fold might be very convoluted to do.
In principle, it should be possible to write any function over a list using a fold.
But it might be very convoluted.
You might have to do some sort of intermediate steps
and things like that.
And there are lots of algorithms on lists
that are just easier to write using recursion.
And which one to do, I think,
just depends on what your goals are at the time.
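For instance, summing a list both ways in Scala:

```scala
// With a fold: the recursion scheme is fixed; you only supply the step.
def sumFold(xs: List[Int]): Int = xs.foldLeft(0)(_ + _)

// With explicit recursion: more power and more ways to get it wrong
// (missed base cases, non-tail recursion, and so on).
def sumRec(xs: List[Int]): Int = xs match {
  case Nil    => 0
  case h :: t => h + sumRec(t)
}
```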
Yeah, it depends on the case,
but you could be as abstract and precise
as you possibly can in that case, I guess.
This is the way I'm interpreting what you say.
Like if I have a function that like the values of it are only going to be whole numbers,
like I shouldn't be returning like a double or a float.
Right. Yeah, exactly.
All right. What do you think is more precise?
A like actor based system or like some sort of IO effect types?
That's a hard thing to answer.
I think that probably some kind of IO effect, you mean like a monadic IO?
Yeah.
Yeah, I mean, things that you can do with one, you can definitely do with the other.
But sort of an IO monad has an algebra.
Like it has some laws that you can use to sort of reason about your code.
It doesn't have a lot of laws, but you can, for instance, always know that if you break out some
part of your program into a subroutine and call it from your main program, it's going to execute
in the order that you expect, for instance. And other things like that. Whereas, like an actor system has no algebra at all.
There are no laws of composition for actors,
at least to my knowledge.
I mean, there have been, you know,
systems to formalize the sort of mobile processes.
I'm looking at my book here on Pi Calculus, for instance.
So yeah, there are
lots of sort of formalisms for talking about processes that communicate using messages,
but we don't tend to talk about, like when we're writing those systems, we don't tend to write them
sort of symbolically in the abstract like those formalisms do. That makes sense. I have a, in a
piece of software I work on, there is like a, an actor that does some sort of
job execution. It's very basic, but it will take any type of input into its message queue and then
do something with it. So if I were to replace it with some sort of, some sort of function that
takes a specific input and then returns like IO unit or something, it seems to me that that is
more precise because I've limited its
input. Oh yeah, limiting the input I think is a good first step. But I think that you could go
even more abstract than that. I mean, maybe not for that particular application. But you could,
if you have some kind of network that is exchanging messages, you know, over lots of different actors, you could maybe
write that symbolically, like you could have some kind of library that allows you to describe
networks of actors talking to each other. And then as the sort of last step, compile that to actual
actors. Ah, very cool. If I have something that is like of type, like I just described, like type,
like IO unit, or is that actually a function?
Is IO unit an actual function?
Yeah.
If something returns that.
Yeah, yeah.
So it's constructing a value of type IO unit.
So then, I mean, it depends on what language you're using or whatever, but assuming it's
something like Haskell or Scala, what you have, what you've constructed is sort of a script, which will then be passed to the runtime.
In Haskell's case, you know, it'll be passed to the Haskell IO subsystem.
Or in Scala case, it'll be passed to like, you know, the interpreter of the IO monad.
So yeah, if you have a function from, you know, like X to IO unit, it is a function.
It constructs a, it returns an actual value, which is first class, and you can manipulate it.
Oh, I see what you're saying.
So yeah, it does work.
Yeah.
Okay, that's awesome.
I guess what I was thinking is that, you know, a function, yeah, okay.
This is interesting.
I'm thinking out loud about this.
Yeah, no problem.
Because I'm thinking a function that takes in the same inputs
and returns the same results should be equivalent.
Those two functions should be equal, right?
Yeah.
But I can write a function called notPutStrLn
that takes in a string and just returns IO unit,
and it would not be equivalent to putStrLn.
Right. Yeah, because it's a different function.
Yeah.
But in a language where you don't have monadic IO,
your point is good.
Where, like, if you call read line twice,
you'll get two different strings
because, you know, you'll prompt the user
or read from standard input twice.
Yeah.
And so they're not interchangeable at all.
Whereas in something like Haskell,
read line will just be a value of type IO String,
and every read line is the same as every other.
It's just the value and you can pass it around or do whatever you want with it.
Because to the programmer, it's completely opaque, right?
The only thing, well, if we ignore unsafe perform IO,
the only thing that you can do with it is to return it back to the runtime.
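Here is a minimal sketch of why such an IO value is "just a value" you can pass around. This uses a toy IO type, not Haskell's IO or any real effect library:

```scala
// A toy IO: a description of an effect, not the effect itself.
final case class IO[A](unsafeRun: () => A) {
  def map[B](f: A => B): IO[B]         = IO(() => f(unsafeRun()))
  def flatMap[B](f: A => IO[B]): IO[B] = IO(() => f(unsafeRun()).unsafeRun())
}

// Every reference to readLine is the same opaque value; nothing runs yet.
val readLine: IO[String] = IO(() => scala.io.StdIn.readLine())

val program: IO[Unit] =
  readLine.flatMap(a => readLine.map(b => println(s"read $a then $b")))

// Only at "the end of main" does a runtime actually execute it:
// program.unsafeRun()
```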
Yeah, because the distinction has to do with that actual stage at the end
where everything gets run, like where you don't see.
Off the end of main somewhere, this thing is being evaluated.
Right.
So I think that if you explained this to yourself
when you were writing printer programs in Delphi,
would it have made sense?
I don't think so.
I think I was just unequipped to learn this at the time.
I mean, I was thinking in terms of, you know,
a program is a sequential thing, you know.
I understood pretty well, like, how a PC worked on the inside.
You know, like, you know, there's a program counter
and it's, like, loading into registers and, you know,
moving stuff in memory.
And, like, that's how I thought about programming.
And so when I thought, oh, I've got to print, to me, that meant like, oh, I got to send instructions
to the parallel port. You know, like, personally, I found your book very inspiring. A lot of your talks, you seem to have a vast knowledge. So I saw this talk, you're talking about printing poetry, and then somehow you connect it to, you know, like, adjunctions. And how did you get
where you are? How should we follow in your footsteps? Oh, don't follow in my footsteps.
There'd be dragons. No, I just, I've just always been really interested in software and software
development and always found that there is like something else to be, to learn something new on
the horizon. And I guess I was looking for the sort of governing dynamics
or the abstraction that captures in general
the thing that I'm doing.
I watched like several of your talks
and one thing I noticed that maybe I've failed to do here
is like you hammer examples.
I saw you give a talk about category theory
at the whatever conference that was in San Francisco
and you didn't even really explain what it was, but just hit with examples one after another. Is this like
how you learn or how you teach? Yeah, that's sort of my philosophy of teaching and also how I tend
to learn. I think that this is the nature of abstraction because we started off talking about abstraction. The nature of figuring out an idea is to see lots of examples
and then take what is the same in those examples and what is different
and sort of stamping that out as a concept.
So if you line all of those things up,
you can see in what way you can regard them as the same
and in what way they vary.
And then the ways in which they vary is the thing that you want to abstract out.
So yeah, I definitely think that the way that we come to understand anything at all is to connect it with our existing experience.
So seeing lots of examples and then forming that concept or abstraction ourselves. I don't think that an idea
can really be conveyed in any other way. Like trying to convey just the abstract idea, I think
is sort of doomed to failure. You may see the structure of the abstract idea. Like for instance,
if you don't know what a monad is, I can just show you the monad interface. I can say, yeah, it's this,
and you'll know exactly what a monad is, but yet you will have no idea what a monad is, right?
Yeah, exactly.
Like you will have the syntax for monads, but you will have no semantics in which to interpret what you have learned.
Yeah. A colleague of mine, he was recently, he was talking about applicative functors and he was going over the definition of like map two, like combining two things.
And like I understand it because I've used map two and map whatever.
I don't know, if it was all renamed, like if the names were gone and I was just looking at the pure definition with the symbols, that I would have understood it.
I'm only relying on my ability to connect it to a specific case where I had used it.
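For context, a map2 for Option in Scala looks something like this (a standalone sketch, not the exact definition from that conversation):

```scala
// Combine two independent optional values with a binary function.
def map2[A, B, C](oa: Option[A], ob: Option[B])(f: (A, B) => C): Option[C] =
  for { a <- oa; b <- ob } yield f(a, b)

map2(Some(2), Some(3))(_ + _)           // Some(5)
map2(Some(2), Option.empty[Int])(_ + _) // None
```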
Yeah, this is one reason why often people find mathematics difficult.
It's because lots of times they will just invent symbols.
And so like the structure of the argument is all there right in front of you.
But it's really difficult maybe to figure out what is being said because there's a lack of
examples and all the symbols are, they don't refer to anything that connects with your existing
experience. You know, there might be like, just like, oh, gamma and sigma and X and Y. And that's,
you know, that's it. Where it's like, yes, structurally that is correct. And like, I can,
you know, use this paper or whatever it is to calculate something, but I don't know what it means.
So like you talked about having a book on pi calculus there. So how do you overcome this?
Obviously, you're able to take up these concepts. How are you putting meat on them?
Probably by just conjuring up examples. So trying to connect what I'm looking at with something that
I know already. Yeah, like I remember learning about applicative functors.
And the way that I learned about applicative functors was just to experiment with lots of things that I knew fit that shape or that I thought might fit that shape.
Do you think that there is a...
Maybe I'll tell a story.
So I tried to learn Haskell in like 2012.
I had like my real world Haskell book and
at work I had some slack and I would just build things here and there with it. And I mean, I built
some things, but I don't think the skills really set in. But now I work on a team of Scala developers
who are really keen on functional programming. And we had somebody recently join the team, didn't know Scala, not that interested in FP.
And I feel like just through osmosis,
he's learned like so much stuff
without even really caring
compared to me struggling back in the day.
So I don't know if there's an aspect of learning
through like, just like, it's just contagious.
Like just working in an environment
where people understand concepts,
somehow there's a transfer.
I don't know what you think about that.
Yeah, no, I totally agree.
I think that you learn a lot more by doing, by experiencing the thing. Like, the reason people come out of university not really having learned anything is that they don't care about the subject material and it doesn't connect with anything that they do care about.
And so I think that that's really important to connect the science or the abstract idea with the everyday work that you're actually doing.
And, you know, sort of, like, chew on it gradually until it starts to fall into place.
And do you recommend, like, you yourself,
give them any talks, written books?
Is that a method of learning or a method of teaching?
I think both.
I mean, learning is just teaching yourself, right?
Yeah.
So I think that this is just a practical way
of both learning and teaching basically anything.
So, you know, I didn't really go to school to learn any of this.
It's mostly just been through work
that I've learned all this stuff.
So is it possible to become the future Runar
via just working at your job
where you're doing PHP development
and then playing around with things on the weekend?
Or do you need to be fully immersed?
Yeah, I don't know what it means to be the next Runar.
I think that you can absolutely become,
you can become an expert in anything that interests you enough
and that you have enough exposure to in your day-to-day life.
So it's not necessary to write books and to teach
and to work somewhere where they apply these principles.
You can just overcome that via self-interest?
Well, you ask the question whether it's necessary to write and teach.
I think that that's absolutely part of it.
I think you learn a lot by teaching an idea to someone else,
and you also learn a lot by writing your thoughts down.
So it kind of forces you to sort of face where your gaps are or where you might not have
a very precise idea, like not precise enough to explain it to someone else. So yeah, no,
I think that's definitely part of it. It's a sort of mind-body integration thing. Like I think,
you know, it's just like learning to dance or play the guitar or whatever. You can't really know how to play the guitar abstractly. No one can really just tell you how to play the guitar. You got to do it.
Yeah, that's a good point. So that makes
me think of like open source, right? So there's certain things I've learned about functional
programming or abstraction or about types just by having somebody tear into me in a PR. I can't get
that from reading your book, but I can like try to contribute to some sort of open source work and then have somebody tear into me there. Yeah, for sure. Well, I think that I used up all
the time. One thing I wanted to ask you about was this Unison Computing that you're working on.
Yeah. Do we have time to talk about that a little bit? Absolutely. So what is Unison Computing?
Yeah. So Unison Computing is a small company that I co-founded with a couple of friends here in Boston. And so we're working on this language called Unison, which is a purely functional programming language, and we hope to make it easier to write, in particular, distributed software. But we want to sort of revisit the entire developer experience, and like the entire experience of caring for and owning software, starting with
just how you interact with the compiler and then all the way to how you build large-scale
distributed systems. One thing I noticed when I looked at it, it seems like you don't represent
code as text. Is that right? Yeah, that's right. So the code base is a persistent data structure.
So essentially, the code base is a purely functional data structure
that you can manipulate.
It's an object that you can manipulate.
So it's not a jumble of mutable text files on your disk.
It's like an actual object that you can programmatically interact with.
I feel like this ties into your idea earlier
about SQL and representing SQL as strings.
Exactly. This is exactly like that.
So we represent programs symbolically so that we can manipulate them very easily programmatically.
And we can also communicate programs over the network.
And so we can write distributed systems where to make a program run on a large network
is no harder than just writing a program
that runs on a single machine. So you write a program that describes your entire system,
a distributed system, and then when you execute that program, that's the same thing as deploying
that distributed system. So it's almost like, this probably isn't how you would describe it,
but it's almost like if your entire program was in the config for Kubernetes,
then all you would need to do to deploy it
is give that config, hands it out to all the nodes,
and then executes it.
Yeah, it's basically like that.
Except that, imagine if you could talk about
Kubernetes locations as a first-class entity
in your programming language.
Then you're getting closer to where we're trying to go.
But, you know, communicating programs over the network
and representing them symbolically is not new.
Like Smalltalk did this, Lisp does this.
But we are doing this as a typed, purely functional programming language.
And what we communicate are identifiers.
We don't start by, like, sending programs over the wire.
So Unison has a unique approach to names, or identifiers,
which allows us to do this sort of unambiguously.
What's the approach?
Well, so, okay, so sort of as a little bit of a background,
let's say I have a program that calculates the 24th Fibonacci number or something.
And then I send you that program.
You're another node on the network.
I send you that program and I say, execute this. And then you see this program, let's say like
it's a Lisp program or something. And it has this expression in it, which is like Fibonacci of 24.
Like, how do you know what Fibonacci is? And how do we know that we agree on what that is?
Right. So there has to be some kind of resolution of free variables in the program that works across the network, across both space and also across time.
Because the definition of what Fibonacci is, well, that's probably a bad example.
But the definition of some function may change over time, like what a name means.
So in Unison, a name is a cryptographic hash of the definition.
So the Fibonacci function has a unique eternal identifier that we can unambiguously communicate.
If I send you that hash applied to the number 24, you either know exactly what I'm talking about,
or you can unambiguously know that you don't know what I'm
talking about. And then you can say, please tell me what this hash represents. I will send you the
code, and then you can hash it. You verify that it matches that hash, and then away you go.
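A rough Scala sketch of the content-addressing idea, as an analogy only; this is not Unison's actual hashing scheme or wire format:

```scala
import java.security.MessageDigest

// "Name" a definition by a cryptographic hash of its content.
def nameOf(definitionSource: String): String =
  MessageDigest.getInstance("SHA-256")
    .digest(definitionSource.getBytes("UTF-8"))
    .map(b => f"$b%02x")
    .mkString

val fibSource = "fib n = if n < 2 then n else fib (n - 1) + fib (n - 2)"
val fibHash   = nameOf(fibSource)

// A peer that receives (fibHash, 24) either already knows the hash,
// or asks for the source and checks that nameOf(source) == fibHash.
```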
It makes me think of two things. One is content addressable storage. So the key for a value is
the hash of the contents. Yeah, it's exactly that idea.
And then the other one is, are you familiar with this LittleTyper book and the Pi programming language or dependent types in general, I guess?
I am familiar with dependent types, but I haven't taken a look at LittleTyper.
So in LittleTyper, because the type system allows for, you know, like functions that
will evaluate some arguments to produce the type,
right? So the type ends up being an actual function. To figure out if two types are the same
actually means figuring out if any two functions are equivalent, right? Which means that they kind
of do the same process, because they need to be able to say that the type this function returns is actually this; the type can actually be determined by evaluating a function. So to see
if two types are the same, they may need to determine if two functions are the same. And so
they have to do this process. Yeah, yeah. So that gets into notions of like extensional versus
intensional equality. The kind of thing that we have is like an intensional equality, like two values are the same if they have the same implementation.
Or rather, they have the same name if they have the same implementation.
Makes sense.
Which is sufficient for distributed.
But if two functions return the same outputs for all inputs,
they're not necessarily the same by your concept of equality.
Right.
Yeah, you could, for instance, insert an identity, a call to the identity function, on the outside of your function.
You take the source code from my function, add identity of that,
and you'd have a different function as far as Unison is concerned.
Which is fine because then it just distributes it somehow.
Yeah. The idea is that you should be able to distribute implementations unambiguously
and that once somebody has discovered a function, it's sort of like always
there. It's like discovering, you know, a new star in the sky or something, you know, like it has
like an unambiguous place in this space. It's rather like it was always there and it was just
waiting for you to discover it. Isn't this like Smalltalk you mentioned earlier, like it has some
sort of, you can save data and definitions into a workbook or something? Is that kind of a similar concept?
Yeah, so in Smalltalk, the code base,
or I don't remember what it's actually called,
but all of your Smalltalk definitions are saved to a database
in your local storage.
And there are tools to migrate code between people
without necessarily sharing just the text.
Yeah, so I mean, it's a similar idea to that kind of thing.
Well, it sounds like the problem is that
what about like version control
and my favorite text editor and things like that?
Yeah, so Smalltalk is very sort of like a closed environment,
but we very much want to make sure
that you can still use your favorite text editor
and that you still use Git or whatever to version things.
So we've made sure that the repository format of the code base
is very amenable to being versioned by Git.
So then you don't version the text of the code.
What you version are the binaries of the code base.
Oh, are they actually binary?
Yeah, the code is actually stored in a binary format.
So how come binary and not like...
Like when you described it to me,
it sounded like some sort of abstract syntax tree or something.
Yeah, it is an abstract syntax tree,
but it's stored in a very compact format.
So instead of having text that represents the thing syntactically,
like for instance, a block of code
will start with some byte,
which represents open block.
So it's rather like a byte code,
like a Java byte code, for instance.
And so what you version is the byte code,
and you can always recover the source code
using your code base.
And your code base has mappings
from like the hashes to the names that you originally used.
And so the rendering of the code will look very similar
to what you originally wrote.
Or you can keep the source code around if you want as well.
But the source code is not the source of truth.
That's just sort of the peanut shell.
And then once you've cracked open the peanut,
you can keep the shells or you can throw them away.
Yeah, I feel like we should have started with this as an example
because it definitely seems to fit your abstraction idea, right?
So the binary can only represent valid programs
or it's more restricted than a text form?
Yes, it's definitely more restricted than a text form.
I mean, it certainly can represent invalid programs
like the abstract syntax tree data type.
The algebraic data type for the syntax tree
is not absolutely precise, right?
So you could, for instance, apply a value
that isn't a function to a value, for instance.
But that's why we have a type system,
so that we can prevent you doing things like that.
So how did you implement this? You're building a compiler and stuff?
Yeah, so the compiler is built in Haskell.
And we originally had a runtime that was written in Scala.
And so we were running Unison programs on the JVM.
And that's sort of the current implementation.
We're migrating to a Haskell-based implementation of the runtime that's a lot simpler.
What was the, like I assume there was some disadvantage to the JVM?
Well, we just weren't really very happy with our implementation of it.
We were trying to be sort of too clever.
And so the JVM-based implementation turned out to not buy us very much speed,
which is the reason why we were trying to write it for the JVM in the first place.
But the Haskell-based implementation is a lot easier to understand
than the Scala-based one.
So what kind of crazy distributed programs can I make with this language?
Oh, I mean, right now, none.
I mean, it's still very much in development.
And so we're sort of engaging the community of enthusiasts for this.
But it's not really ready for everyday programmers
to get their hands on it.
I mean, you can, it's open source, it's on GitHub,
but you can't really write any practical programs in this at the moment.
But we're hoping to have an initial re-release of this
sometime this year.
That's awesome.
Where you can actually write some programs.
So if people are interested, what should they do?
Go to unisonweb.org.
It's all one word, unisonweb.org.
And then can we, I guess we can't use it yet.
Can we contribute?
Can we learn things?
Yeah.
So GitHub has a number of outstanding issues and bugs
and feature requests and things like that.
So there are things that are easy to pick up,
so if people want to contribute, they definitely can.
And one of the ways in which you can contribute
is porting libraries to Unison.
I mean, it might be kind of difficult at the moment
because the compiler has bugs,
and it's a lot in flux right now.
But very soon we'll be able to accept contributions
to things like something simple like
a text manipulation library or something.
Seems like a crazy ambitious project, Runar.
You're rethinking distributed computing
but also just representing code as text.
What else are you rethinking?
We're rethinking how to interact with compilers.
We want to ultimately rethink how code is edited,
make that a more sort of fluid experience.
Yeah, there are lots of ways that we can go with this,
but the fundamental thing that allows us to rethink all that
is to think of the code as being, you know, a data structure, and having this idea of the sort of content-addressable namespace. So those two things together, I think, will enable a lot of cool technology that, I don't know, it seems pretty space age at the moment.
Well, I think it's awesome.
Yeah, I mean, it just really hits home to me with your concept earlier
because like an AST, the text file of my code
can represent a lot of things that aren't valid.
Like just code formatting, I guess.
We have a lot of nice code formatting tools right now.
Yeah, that's just something that nobody should ever do.
Like think of all the time that's just wasted on like formatting code.
Like every time I just like make a new line and hit tab a couple of times or backspace to like make something line up like I die a little bit on the inside.
Yeah, I agree.
And like there's like weird things like like if I was doing like scheme programming, I
really would like to have a way to tell my editor
to just treat brackets as indents, like Python style,
and not actually see this ending line with a thousand ending brackets.
Yeah, absolutely.
And that's another thing that I think that we'll be able to do
is to have lots of different surface syntaxes for the language.
So if you want a lot of brackets and you want to use S expressions, you can.
I mean, we're not there yet.
We only have one syntax, which is very Haskell-like,
but that's only because that's the aesthetic that we prefer.
But there's no reason that we couldn't have
any number of surface syntaxes for this abstract language.
You know, very much like Microsoft is doing
with C Sharp and F Sharp and all of those languages that are all, under the hood, the same CLR.
But even more so because the Scala and Java both run on the JVM,
but they're not the same language.
This is more talking about the presentation could vary, right?
But the actual semantics would be the same?
Right, exactly.
That is a cool concept.
You could get adoption by just having a mode
that fits everybody's weird preferences.
Yeah, yeah, yeah. Maybe we could make it easy to have sort of pluggable syntaxes.
That's awesome. Well, thank you for your time, Runar. This has been a lot of fun.
Yeah, likewise. It's been a blast. Thanks for having me.
Thank you very much.
That's the show. I hope you liked it.
If I'm totally honest, the interview with Runar went in a slightly different direction in the second half than I had planned for, but it ended up super
interesting to hear about his new project, to hear about how he likes to learn and approach topics.
Just a super interesting person to talk to. If you like the show, tell a friend about it and
help spread the word and also maybe join our Slack channel. After the last episode about Scala Native, we had some interesting talks about
big O notation and complexity in the channel. And also user Joe shared a file server open source
project he made in F sharp where you can upload files, but you get to set kind of a time to live
on them. Kind of an interesting project exploring functional programming and serving files. So yeah,
until next time.